Version/2.0.0 phase5 ml emulation #25
Merged
- Add scalable.ml package: LearnedAdvisor, AdaptiveScaler, FeatureExtractor, ResourceModel, HyperparameterSearch, cross_validate_advisor
- Add scalable.emulation package: @emulatable decorator, EmulatorRegistry, EmulatorDispatch, ActiveLearner, GradientBoostingEmulator, RandomForestEmulator, uncertainty calibration
- Add scalable advise CLI command with ML-backed recommendations
- Add EmulationEvent to telemetry events
- Add Phase 5 settings (ML cache, emulator registry, enable flags)
- Add [ml] optional dependency extra (scikit-learn, dask-ml, joblib)
- Bump version to 2.0.0a5
- 75 new unit tests, 431 total passing
Here is the pull request description for Phase 5:
Phase 5: ML Optimization and Emulation
Branch: version/2.0.0-phase5-ml-emulation → version/2.0.0
Plan: plans/v2.0.0_phase5_plan.md
Prior phases: Phase 1 (#20), Phase 2 (#21), Phase 3 (#22), Phase 4 (#23)
Version: 2.0.0a5
Summary
Phase 5 is the capstone phase of the v2.0.0 roadmap. It adds ML-backed resource prediction and scientific model emulation as first-class capabilities, completing Scalable's evolution from a Slurm-oriented launcher into a portable scientific-model execution control plane with optional ML services.
All ML features degrade gracefully to Phase 2 heuristics when training data is insufficient or scalable[ml] is not installed. Emulation is opt-in (off by default) and uncertainty-aware: predictions are never silently substituted for full model runs.
What's new
ML Optimization (scalable.ml)
- LearnedAdvisor: ML-backed resource recommendations using gradient boosting, random forest, or quantile regression trained on telemetry history. Returns the same ResourceRecommendation payload as Phase 2's heuristic advisor.
- AdaptiveScaler: real-time adaptive worker scaling with configurable queue-depth thresholds, min/max worker bounds, and cooldown periods to prevent thrashing.
- FeatureExtractor: engineers features from telemetry records (rolling aggregates, task identity hashing) and user-provided input features for ML consumption.
- ResourceModel: unified sklearn wrapper with fit/predict/confidence intervals, model persistence via joblib, and percentile fallback when sklearn is unavailable.
- HyperparameterSearch: Dask-ML distributed tuning (hyperband, successive halving, random) with sklearn RandomizedSearchCV fallback.
- cross_validate_advisor(): model quality assessment with MAE, RMSE, R², and coverage metrics.
Model Emulation (scalable.emulation)
- @emulatable decorator: marks functions as emulation-capable with declared inputs, outputs, domain bounds, uncertainty requirements, and confidence thresholds.
- EmulatorRegistry: versioned emulator management with filesystem persistence, domain validation, and joblib serialization.
- EmulatorDispatch: confidence-gated routing between emulator and full model with full provenance recording of every dispatch decision.
- ActiveLearner: intelligent scenario selection using expected improvement, maximum uncertainty, or random acquisition strategies.
- GradientBoostingEmulator and RandomForestEmulator: surrogate model implementations with tree-based uncertainty estimation.
- calibrate_emulator(): uncertainty calibration assessment (coverage, sharpness).
CLI
scalable advise: ML-backed resource recommendations from the command line. Supports --task, --target, --model-type, --confidence, and --format (text/json). Degrades to the heuristic advisor when ML is unavailable.
Telemetry
EmulationEvent: new event type tracking emulator dispatch decisions (source, confidence, fallback reason, domain validity).
Configuration
- SCALABLE_ML_CACHE_DIR: trained ML model cache location
- SCALABLE_EMULATOR_DIR: emulator registry storage
- SCALABLE_ML: enable/disable ML features (default: enabled)
- SCALABLE_EMULATION: enable/disable emulation (default: disabled)
- SCALABLE_EMULATION_CONFIDENCE: default confidence threshold (0.9)
Dependencies
[project.optional-dependencies] ml extra: scikit-learn >= 1.3, dask-ml >= 2023.3.24, joblib >= 1.3
Stats
scalable/ml/ and scalable/emulation/
scalable advise)
Design principles honored
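One behavior the summary commits to is that ML features fall back to heuristics when training data is insufficient or scalable[ml] is not installed. The sketch below illustrates that fallback pattern only. It is not the code from this PR: Recommendation, recommend, percentile_heuristic, and the MIN_TRAINING_RECORDS threshold are hypothetical names and values, and the single input-size feature is an assumption chosen to keep the example small.

```python
import math
from dataclasses import dataclass

# Hypothetical threshold; the PR does not state how much data counts as "sufficient".
MIN_TRAINING_RECORDS = 20


@dataclass
class Recommendation:
    """Hypothetical stand-in for a resource recommendation payload."""
    memory_gb: float
    source: str  # "ml" or "heuristic"


def percentile_heuristic(memory_history, pct=95):
    """Nearest-rank percentile of observed memory use."""
    if not memory_history:
        raise ValueError("no telemetry available")
    ordered = sorted(memory_history)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend(sizes, memory_history, new_size, ml_enabled=True):
    """Use a learned model when possible, otherwise fall back to the heuristic."""
    enough_data = len(memory_history) >= MIN_TRAINING_RECORDS
    if ml_enabled and enough_data:
        try:
            # Optional dependency: only present when the ML extra is installed.
            from sklearn.ensemble import GradientBoostingRegressor
        except ImportError:
            pass  # fall through to the heuristic below
        else:
            model = GradientBoostingRegressor(random_state=0)
            model.fit([[s] for s in sizes], memory_history)
            predicted = float(model.predict([[new_size]])[0])
            return Recommendation(predicted, "ml")
    # Same payload type either way, so callers need not care which path ran.
    return Recommendation(percentile_heuristic(memory_history), "heuristic")


# Three records is below the threshold, so the heuristic path is taken.
sparse = recommend([1, 2, 3], [1.0, 2.0, 4.0], new_size=5)
```

Returning the same payload type from both paths mirrors the note above that LearnedAdvisor returns the same ResourceRecommendation payload as the Phase 2 heuristic advisor.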
After this merge
The version/2.0.0 branch will contain the complete v2.0.0 feature set across all 5 phases and can proceed toward release candidate stabilization.
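For reviewers who want a concrete picture of the confidence-gated routing described for EmulatorDispatch, here is a minimal sketch of the idea. It is an illustration under assumptions, not the scalable.emulation implementation: GatedDispatcher and DispatchRecord are hypothetical names, the emulator is assumed to return a (prediction, confidence) pair, and the domain is reduced to a single scalar interval. The record fields follow the ones listed for EmulationEvent (source, confidence, fallback reason, domain validity).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class DispatchRecord:
    """Hypothetical provenance record for one dispatch decision."""
    source: str                      # "emulator" or "full_model"
    confidence: Optional[float]      # None when the emulator was never consulted
    fallback_reason: Optional[str]   # None when the emulator result was used
    in_domain: bool


@dataclass
class GatedDispatcher:
    """Routes a call to the emulator only when it is enabled, in-domain, and confident."""
    full_model: Callable[[float], float]
    emulator: Callable[[float], Tuple[float, float]]  # assumed (prediction, confidence)
    bounds: Tuple[float, float]
    threshold: float = 0.9   # matches the default confidence threshold listed above
    enabled: bool = False    # emulation is opt-in
    log: List[DispatchRecord] = field(default_factory=list)

    def run(self, x: float) -> float:
        low, high = self.bounds
        in_domain = low <= x <= high
        if not self.enabled:
            return self._fallback(x, None, "emulation disabled", in_domain)
        if not in_domain:
            return self._fallback(x, None, "input outside emulator domain", in_domain)
        prediction, confidence = self.emulator(x)
        if confidence < self.threshold:
            return self._fallback(x, confidence, "confidence below threshold", in_domain)
        self.log.append(DispatchRecord("emulator", confidence, None, in_domain))
        return prediction

    def _fallback(self, x, confidence, reason, in_domain):
        # Every fallback is recorded, so no substitution happens silently.
        self.log.append(DispatchRecord("full_model", confidence, reason, in_domain))
        return self.full_model(x)


# Toy model and emulator, purely illustrative.
dispatcher = GatedDispatcher(
    full_model=lambda x: x * x,
    emulator=lambda x: (x * x + 0.01, 0.95 if x < 5 else 0.5),
    bounds=(0.0, 10.0),
    enabled=True,
)
results = [dispatcher.run(2.0), dispatcher.run(7.0), dispatcher.run(20.0)]
```

After these three calls, dispatcher.log holds one record per decision, which is the property the PR describes as provenance recording of every dispatch decision.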