Version/2.0.0 #26
Merged
Releasing v1.0
Merging Develop
Release: 1.1.0
Creates the additive Phase 1 package structure off of version/2.0.0: manifest/, providers/, session/, planning/, cli/. Each new package ships with a docstring describing its Phase 1 role and its hooks for later phases (telemetry, AI assistants, Kubernetes/cloud providers, ML advisor).

scalable/manifest/schema.py defines the frozen v1 schema dataclasses (ManifestModel, ProjectConfig, TargetConfig, ComponentConfig, TaskConfig) and SCHEMA_VERSION = 1. The schema is intentionally implemented with stdlib dataclasses so manifest validation works without the optional [ai] extra (resolves Phase 1 plan section 9 open question #1).

scalable/manifest/errors.py declares the ManifestError hierarchy used by the parser, validator, and Phase 4 AI migration assistant.

scalable/cli/main.py is a Phase 1 stub for the [project.scripts] entry point; the real validate / plan --dry-run wiring lands in WU-10.

pyproject.toml:
- version bumped to 2.0.0a1
- pyyaml pinned explicitly
- empty placeholder extras for ai/cloud/kubernetes registered so pip install scalable[ai] resolves cleanly from day one
- scalable console script registered
- packages.find used so the new sub-packages are picked up by setuptools

Verified: existing 73 unit tests pass unchanged; ruff clean on all new modules. No public API removed or renamed.

Refs plans/v2.0.0_phase1_plan.md WU-1.
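For readers who have not opened scalable/manifest/schema.py, the following is a minimal sketch of what a stdlib-dataclass schema with a version check might look like. Only the names ManifestModel, TaskConfig, ManifestError, and SCHEMA_VERSION come from the commit message. The field names, the SchemaVersionError subclass, the check_version helper, and the reading of "frozen" as `frozen=True` are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple

SCHEMA_VERSION = 1  # value stated in the commit message


class ManifestError(Exception):
    """Base of the error hierarchy (name taken from the commit message)."""


class SchemaVersionError(ManifestError):
    """Hypothetical subclass; the real hierarchy is not shown in this PR text."""


@dataclass(frozen=True)
class TaskConfig:
    name: str        # hypothetical field
    component: str   # hypothetical field


@dataclass(frozen=True)
class ManifestModel:
    version: int                        # hypothetical field
    tasks: Tuple[TaskConfig, ...] = ()  # a tuple keeps the instance immutable


def check_version(manifest: ManifestModel) -> None:
    """Reject manifests written for a different schema version (hypothetical helper)."""
    if manifest.version != SCHEMA_VERSION:
        raise SchemaVersionError(
            f"expected schema version {SCHEMA_VERSION}, got {manifest.version}"
        )


manifest = ManifestModel(version=1, tasks=(TaskConfig("fit", "model"),))
check_version(manifest)  # passes silently for a matching version
try:
    manifest.version = 2  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    print("manifest is immutable")
```

Because nothing here imports beyond the standard library, a validator built this way would run in an environment where none of the optional extras are installed, which is the property the commit message calls out.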
Phase 1: provider abstraction + scalable.yaml manifest foundation
…sing phase 2 progress towards telemetry and deterministic advising
Implements Phase 3 of the v2.0.0 roadmap:
- KubernetesProvider over Dask Kubernetes Operator
- AWSBatchProvider over dask-cloudprovider (Fargate/EC2)
- GCPProvider scaffold (validation only; build_cluster deferred)
- ArtifactStore protocol with local and fsspec backends
- RemoteCacheBackend for opt-in remote cache (SCALABLE_CACHE_REMOTE)
- Manifest overlays (overlays: block + targets[*].overlay)
- CostEstimate primitives and static cost tables
- scalable run CLI verb
- Settings: cache_remote_uri, default_storage, runs_dir_remote
- Telemetry: CostEvent, cost.jsonl stream, cost in report
- Provider protocol: optional estimate_cost() method
- Public API: Phase 3 exports with optional-dep guards
- Docs: cloud.rst, kubernetes.rst, artifacts.rst, overlays.rst, cost.rst
- Example manifests: gke, aws, overlays
- 238 unit tests passing, ruff clean

Version bumped to 2.0.0a3.
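The "optional estimate_cost() method" bullet above describes a pattern that is easy to get wrong, so here is a small sketch of one way a caller might handle a protocol method that providers are free to omit. The names CostEstimate, build_cluster, and estimate_cost appear in the commit message; every field, signature, rate, and class body below is an assumption for illustration and may differ from the real scalable code.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class CostEstimate:
    usd_per_hour: float  # hypothetical field


class Provider(Protocol):
    """Required surface only; estimate_cost is deliberately left out."""

    name: str

    def build_cluster(self) -> object: ...


class LocalProvider:
    """Hypothetical provider that has no notion of cost."""

    name = "local"

    def build_cluster(self) -> object:
        return object()


class PricedProvider:
    """Hypothetical provider that opts in to cost estimation."""

    name = "priced"
    usd_per_worker_hour = 0.25  # made-up rate, not a real price

    def build_cluster(self) -> object:
        return object()

    def estimate_cost(self, workers: int) -> CostEstimate:
        return CostEstimate(usd_per_hour=workers * self.usd_per_worker_hour)


def estimate_or_none(provider: Provider, workers: int) -> Optional[CostEstimate]:
    """Call estimate_cost() when the provider defines it, otherwise return None."""
    method = getattr(provider, "estimate_cost", None)
    if callable(method):
        return method(workers)
    return None


print(estimate_or_none(LocalProvider(), 4))
print(estimate_or_none(PricedProvider(), 4))
```

Looking the method up with `getattr` keeps existing providers valid without forcing them to add a stub, which seems consistent with the additive approach the earlier phases describe.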
Phase 3: cloud + Kubernetes execution, artifact stores, overlays, cost
Implements the Phase 4 deliverables from the v2.0.0 development plan:
- AI assistant subsystem (scalable.ai) with pluggable LLM backend protocol and heuristic-only fallback mode
- Component onboarding assistant (scalable init-component)
- Failure diagnosis assistant (scalable diagnose)
- Plan explanation assistant (scalable explain)
- Workflow composition assistant (scalable compose)
- Manifest migration assistant (scalable migrate)
- ScalableSession.plan(objective=, policy=) now functional with heuristic-based resource/worker adjustments
- Prompt template system for all assistants
- Settings: SCALABLE_AI_BACKEND, SCALABLE_AI_MODEL, SCALABLE_AI_ENDPOINT
- Populated [project.optional-dependencies] ai extra
- Version bumped to 2.0.0a4
- 356 unit tests passing, ruff clean

All AI features work without an LLM backend via deterministic heuristic fallbacks. LLM enhancement is opt-in. All outputs are reviewable artifacts - never auto-executed.

Ref: plans/v2.0.0_phase4_plan.md
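The "pluggable LLM backend protocol and heuristic-only fallback mode" can be pictured with a short sketch. It is not the scalable.ai implementation: the LLMBackend protocol, its complete() method, the diagnose() function, and the matching rules are all hypothetical names chosen to illustrate the idea that a deterministic answer is always available and the LLM is an optional layer on top.

```python
from typing import Optional, Protocol


class LLMBackend(Protocol):
    """Hypothetical backend interface: one text-in, text-out call."""

    def complete(self, prompt: str) -> str: ...


def heuristic_diagnosis(log_text: str) -> str:
    """Deterministic pattern matching; needs no network and no extra packages."""
    lowered = log_text.lower()
    if "memoryerror" in lowered or "out of memory" in lowered:
        return "Likely memory exhaustion: consider raising the task memory request."
    if "timed out" in lowered or "timeout" in lowered:
        return "Likely timeout: consider a longer walltime or fewer tasks per worker."
    return "No known pattern matched: inspect the full log."


def diagnose(log_text: str, backend: Optional[LLMBackend] = None) -> str:
    """Return a diagnosis string for review; nothing is executed on the caller's behalf."""
    baseline = heuristic_diagnosis(log_text)
    if backend is None:
        return baseline  # heuristic-only mode
    prompt = f"Heuristic finding: {baseline}\nLog:\n{log_text}\nRefine the diagnosis."
    try:
        return backend.complete(prompt)
    except Exception:
        # Any backend failure degrades to the deterministic answer.
        return baseline


print(diagnose("worker-3: MemoryError while loading input"))
```

Computing the heuristic result first means it can serve both as the fallback and as context for the LLM prompt, so the two modes cannot drift far apart.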
Co-authored-by: crvernon <3947069+crvernon@users.noreply.github.com>
Agent-Logs-Url: https://github.com/JGCRI/scalable/sessions/b7e62493-29e0-4a5f-9bdb-28a778012e68
Co-authored-by: crvernon <3947069+crvernon@users.noreply.github.com>
[WIP] Fix failing GitHub Actions job 'ruff + mypy'
Agent-Logs-Url: https://github.com/JGCRI/scalable/sessions/fe9e5b5a-f73f-4999-8e77-194af9b7b931
Co-authored-by: crvernon <3947069+crvernon@users.noreply.github.com>
Phase 4: AI assistant features
- Add scalable.ml package: LearnedAdvisor, AdaptiveScaler, FeatureExtractor, ResourceModel, HyperparameterSearch, cross_validate_advisor
- Add scalable.emulation package: @emulatable decorator, EmulatorRegistry, EmulatorDispatch, ActiveLearner, GradientBoostingEmulator, RandomForestEmulator, uncertainty calibration
- Add scalable advise CLI command with ML-backed recommendations
- Add EmulationEvent to telemetry events
- Add Phase 5 settings (ML cache, emulator registry, enable flags)
- Add [ml] optional dependency extra (scikit-learn, dask-ml, joblib)
- Bump version to 2.0.0a5
- 75 new unit tests, 431 total passing
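To make the emulation bullet concrete, the sketch below shows one possible shape for a decorator that routes calls to a surrogate only when the surrogate reports low uncertainty. Only the decorator name `emulatable` comes from the commit message. Its parameters, the module-level registry, the register_emulator helper, and the (prediction, uncertainty) return convention are assumptions, and the real EmulatorRegistry and EmulatorDispatch classes may work quite differently.

```python
from functools import wraps
from typing import Callable, Dict, Tuple

# A surrogate maps an input to (prediction, uncertainty); hypothetical convention.
Surrogate = Callable[[float], Tuple[float, float]]

_REGISTRY: Dict[str, Surrogate] = {}  # stand-in for a real emulator registry


def register_emulator(name: str, surrogate: Surrogate) -> None:
    """Hypothetical helper that makes a surrogate available under a name."""
    _REGISTRY[name] = surrogate


def emulatable(name: str, max_uncertainty: float = 0.1):
    """Use the surrogate when it is confident, otherwise run the real function."""

    def decorator(func: Callable[[float], float]) -> Callable[[float], float]:
        @wraps(func)
        def wrapper(x: float) -> float:
            surrogate = _REGISTRY.get(name)
            if surrogate is not None:
                prediction, uncertainty = surrogate(x)
                if uncertainty <= max_uncertainty:
                    return prediction
            return func(x)  # no surrogate yet, or it is too uncertain

        return wrapper

    return decorator


@emulatable("square")
def expensive_square(x: float) -> float:
    return x * x


def toy_surrogate(x: float) -> Tuple[float, float]:
    # Pretend the surrogate was trained on [0, 10] and is unsure elsewhere.
    return x * x, (0.0 if 0.0 <= x <= 10.0 else 1.0)


register_emulator("square", toy_surrogate)
print(expensive_square(3.0), expensive_square(50.0))
```

Looking up the surrogate at call time rather than at decoration time lets a function be marked emulatable before any emulator has been trained for it.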
Version/2.0.0 phase5 ml emulation
This pull request introduces significant new features and improvements across the codebase, focusing on advanced ML optimization, model emulation, expanded AI assistant capabilities, and enhanced CI coverage. It also updates documentation, environment configuration, and the test matrix to support new functionality and platforms.
Major feature additions and improvements:
Machine Learning Optimization and Emulation:
- ML optimization subsystem (scalable.ml) with learned resource prediction, adaptive scaling, feature extraction, model wrappers, distributed hyperparameter search, and model quality assessment. Adds a model emulation subsystem (scalable.emulation) for surrogate modeling, uncertainty-aware dispatch, and active learning strategies. Public APIs and CLI commands are provided for ML-backed recommendations and emulation. (CHANGELOG.md R8-R285)
AI Assistant and Cloud/Kubernetes Support:
Testing and CI Enhancements:
Documentation and Configuration:
- CHANGELOG.md with detailed release notes for all new features, breaking changes, and tests. Updates the README to reflect new capabilities, optional dependency groups, and system requirements. Adds an .env.example for OpenAI configuration.
Notable grouped changes:
1. ML Optimization & Emulation
- scalable.ml and scalable.emulation subsystems: learned resource prediction, adaptive scaling, surrogate modeling, uncertainty calibration, and CLI integration. (CHANGELOG.md R8-R285)
- [ml] extra for ML features. (CHANGELOG.md R8-R285)
2. AI Assistants & Cloud/Kubernetes
- AI assistant commands (init-component, diagnose, explain, compose, migrate) with LLM backend support and heuristic fallback. (CHANGELOG.md R8-R285)
3. CI and Testing
4. Documentation & Config
- CHANGELOG.md and README.md for new features, usage, and requirements.
- .env.example for OpenAI credentials and model configuration. (.env.example R1-R13)
5. Maintenance & Versioning
These changes collectively advance the project to a new phase with robust ML, emulation, AI, and cloud-native capabilities, while ensuring strong test coverage and clear documentation.