
Version/2.0.0 phase5 ml emulation #25

Merged: crvernon merged 4 commits into version/2.0.0 from version/2.0.0-phase5-ml-emulation on May 20, 2026


@crvernon
Member

Here is the pull request description for Phase 5:


Phase 5: ML Optimization and Emulation

Branch: version/2.0.0-phase5-ml-emulation → version/2.0.0
Plan: plans/v2.0.0_phase5_plan.md
Prior phases: Phase 1 (#20), Phase 2 (#21), Phase 3 (#22), Phase 4 (#23)
Version: 2.0.0a5


Summary

Phase 5 is the capstone phase of the v2.0.0 roadmap. It adds ML-backed resource prediction and scientific model emulation as first-class capabilities, completing Scalable's evolution from a Slurm-oriented launcher into a portable scientific-model execution control plane with optional ML services.

All ML features degrade gracefully to Phase 2 heuristics when training data is insufficient or scalable[ml] is not installed. Emulation is opt-in (off by default) and uncertainty-aware — predictions are never silently substituted for full model runs.
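
The degradation rule above can be illustrated with a small sketch. It is not Scalable's implementation: `recommend_memory_gb`, `ml_available`, the `min_samples` cutoff, and the nearest-rank percentile heuristic are all hypothetical names and choices made for illustration. Only the behaviour (fall back when the ML extra is missing or history is too short) comes from the description.

```python
import importlib.util
import math


def ml_available() -> bool:
    """True when scikit-learn is importable (assumed proxy for the ml extra)."""
    return importlib.util.find_spec("sklearn") is not None


def percentile(values, q):
    """Nearest-rank percentile of a non-empty sequence, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_memory_gb(history_gb, learned_predict=None, min_samples=30, q=95):
    """Return (recommendation, source), where source is "ml" or "heuristic".

    `learned_predict` is a hypothetical callable standing in for a trained model.
    """
    if not history_gb:
        raise ValueError("no telemetry history to base a recommendation on")
    use_ml = (
        learned_predict is not None
        and ml_available()
        and len(history_gb) >= min_samples
    )
    if use_ml:
        return learned_predict(history_gb), "ml"
    # Fallback path: a simple percentile of observed usage.
    return percentile(history_gb, q), "heuristic"


if __name__ == "__main__":
    print(recommend_memory_gb([1.0, 2.0, 3.0, 4.0]))
```

The caller always receives the same `(value, source)` shape, so downstream code does not need to know which path produced the number.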


What's new

ML Optimization (scalable.ml)

  • LearnedAdvisor — ML-backed resource recommendations using gradient boosting, random forest, or quantile regression trained on telemetry history. Returns the same ResourceRecommendation payload as Phase 2's heuristic advisor.
  • AdaptiveScaler — real-time adaptive worker scaling with configurable queue-depth thresholds, min/max worker bounds, and cooldown periods to prevent thrashing.
  • FeatureExtractor — engineers features from telemetry records (rolling aggregates, task identity hashing) and user-provided input features for ML consumption.
  • ResourceModel — unified sklearn wrapper with fit/predict/confidence intervals, model persistence via joblib, and percentile fallback when sklearn is unavailable.
  • HyperparameterSearch — Dask-ML distributed tuning (hyperband, successive halving, random) with sklearn RandomizedSearchCV fallback.
  • cross_validate_advisor() — model quality assessment with MAE, RMSE, R², and coverage metrics.
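
To make the AdaptiveScaler description concrete, here is a minimal sketch of threshold-based scaling with worker bounds and a cooldown. `ScalerSketch`, its field names, and the doubling/halving policy are assumptions for illustration; the actual class in `scalable.ml` may use a different interface and policy.

```python
from dataclasses import dataclass


@dataclass
class ScalerSketch:
    """Hypothetical stand-in for an adaptive scaler; not the Scalable API."""

    min_workers: int = 1
    max_workers: int = 16
    scale_up_depth: float = 20.0    # queued tasks per worker that triggers growth
    scale_down_depth: float = 2.0   # queued tasks per worker that triggers shrink
    cooldown_s: float = 60.0        # minimum seconds between scaling changes
    _last_change: float = float("-inf")

    def decide(self, workers: int, queued: int, now: float) -> int:
        """Return the target worker count for the current queue depth."""
        # Inside the cooldown window: hold steady to prevent thrashing.
        if now - self._last_change < self.cooldown_s:
            return workers
        per_worker = queued / max(workers, 1)
        if per_worker > self.scale_up_depth:
            target = workers * 2
        elif per_worker < self.scale_down_depth:
            target = workers // 2
        else:
            target = workers
        # Clamp to the configured bounds.
        target = max(self.min_workers, min(self.max_workers, target))
        if target != workers:
            self._last_change = now
        return target


if __name__ == "__main__":
    scaler = ScalerSketch()
    print(scaler.decide(workers=4, queued=200, now=0.0))
    print(scaler.decide(workers=8, queued=400, now=10.0))
```

The cooldown is checked before any threshold so that a burst of queue activity shortly after a scaling change cannot trigger a second change.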

Model Emulation (scalable.emulation)

  • @emulatable decorator — marks functions as emulation-capable with declared inputs, outputs, domain bounds, uncertainty requirements, and confidence thresholds.
  • EmulatorRegistry — versioned emulator management with filesystem persistence, domain validation, and joblib serialization.
  • EmulatorDispatch — confidence-gated routing between emulator and full model with full provenance recording of every dispatch decision.
  • ActiveLearner — intelligent scenario selection using expected improvement, maximum uncertainty, or random acquisition strategies.
  • GradientBoostingEmulator and RandomForestEmulator — surrogate model implementations with tree-based uncertainty estimation.
  • calibrate_emulator() — uncertainty calibration assessment (coverage, sharpness).
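
The two calibration quantities named above can be sketched as follows, assuming the common definitions: coverage is the fraction of observed values that fall inside the predicted interval, and sharpness is the mean interval width. The function name `calibration_report` and its return keys are hypothetical and do not reflect the signature of `calibrate_emulator()`.

```python
def calibration_report(y_true, lower, upper, nominal=0.9):
    """Summarize interval calibration for paired observations and intervals.

    coverage     : share of y_true values inside [lower, upper]
    sharpness    : mean interval width (smaller is sharper)
    coverage_gap : coverage minus the nominal level the intervals target
    """
    n = len(y_true)
    if n == 0 or len(lower) != n or len(upper) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    coverage = hits / n
    sharpness = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return {
        "coverage": coverage,
        "sharpness": sharpness,
        "coverage_gap": coverage - nominal,
    }


if __name__ == "__main__":
    report = calibration_report(
        y_true=[1.0, 2.0, 3.0, 4.0],
        lower=[0.0, 1.5, 3.5, 3.0],
        upper=[2.0, 2.5, 4.5, 5.0],
    )
    print(report)
```

A negative `coverage_gap` suggests the intervals are too narrow for the nominal level; a large positive gap paired with wide intervals suggests they are overly conservative.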

CLI

  • scalable advise — ML-backed resource recommendations from the command line. Supports --task, --target, --model-type, --confidence, --format (text/json). Degrades to heuristic advisor when ML unavailable.
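
For readers who want to see how the listed flags fit together, the following `argparse` sketch mirrors them. Only the flag names and the text/json choices come from the description above; whether any flag is required, what defaults apply, and which values `--model-type` accepts are not stated, so this sketch leaves them unset. It is not the parser shipped with Scalable.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for an `advise` subcommand (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="scalable")
    sub = parser.add_subparsers(dest="command", required=True)

    advise = sub.add_parser("advise", help="resource recommendations")
    advise.add_argument("--task", help="task to get a recommendation for")
    advise.add_argument("--target", help="execution target")
    advise.add_argument("--model-type", help="learned model family")
    advise.add_argument("--confidence", type=float, help="confidence level")
    # The text/json choices are the only values listed in the description.
    advise.add_argument("--format", choices=("text", "json"), default="text")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["advise", "--task", "run_model", "--format", "json"]
    )
    print(args.command, args.task, args.format)
```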

Telemetry

  • EmulationEvent — new event type tracking emulator dispatch decisions (source, confidence, fallback reason, domain validity).
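
A sketch of how a dispatch decision might populate such an event is shown below. The four recorded fields follow the list above; everything else (`DispatchRecord`, the `dispatch` function, the reason strings, and the emulator returning a `(value, confidence)` pair) is a hypothetical shape chosen for illustration, not the `EmulatorDispatch` or `EmulationEvent` API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class DispatchRecord:
    """Hypothetical provenance record for one dispatch decision."""

    source: str                      # "emulator" or "full_model"
    confidence: Optional[float]      # None when the emulator was not consulted
    fallback_reason: Optional[str]   # None when the emulator result was used
    in_domain: bool


def dispatch(
    inputs: Dict[str, float],
    bounds: Dict[str, Tuple[float, float]],
    emulator: Callable[[Dict[str, float]], Tuple[float, float]],
    full_model: Callable[[Dict[str, float]], float],
    threshold: float = 0.9,
    enabled: bool = False,
):
    """Route to the emulator only when enabled, in-domain, and confident."""
    in_domain = all(
        name in inputs and lo <= inputs[name] <= hi
        for name, (lo, hi) in bounds.items()
    )
    if not enabled:
        record = DispatchRecord("full_model", None, "emulation_disabled", in_domain)
        return full_model(inputs), record
    if not in_domain:
        record = DispatchRecord("full_model", None, "out_of_domain", False)
        return full_model(inputs), record
    value, confidence = emulator(inputs)
    if confidence < threshold:
        record = DispatchRecord("full_model", confidence, "low_confidence", True)
        return full_model(inputs), record
    return value, DispatchRecord("emulator", confidence, None, True)


if __name__ == "__main__":
    result, record = dispatch(
        {"t": 1.0},
        {"t": (0.0, 10.0)},
        emulator=lambda x: (x["t"] * 2.0 + 0.01, 0.95),
        full_model=lambda x: x["t"] * 2.0,
        enabled=True,
    )
    print(result, record)
```

Every return path yields a record, so a result can always be traced back to either the emulator or the full model along with the reason for any fallback.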

Configuration

  • SCALABLE_ML_CACHE_DIR — trained ML model cache location
  • SCALABLE_EMULATOR_DIR — emulator registry storage
  • SCALABLE_ML — enable/disable ML features (default: enabled)
  • SCALABLE_EMULATION — enable/disable emulation (default: disabled)
  • SCALABLE_EMULATION_CONFIDENCE — default confidence threshold (0.9)
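
As an illustration of how these settings and their stated defaults could be read from the environment, consider the sketch below. The variable names and defaults (ML enabled, emulation disabled, confidence 0.9) are taken from the list above; the accepted boolean spellings, the `Phase5Settings` container, and the use of `None` for unset directories are assumptions, since the description does not say how Scalable parses them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _env_bool(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Parse a boolean flag; the accepted spellings are an assumption."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name} must be a boolean-like value, got {raw!r}")


@dataclass(frozen=True)
class Phase5Settings:
    """Hypothetical container; not Scalable's settings class."""

    ml_enabled: bool
    emulation_enabled: bool
    emulation_confidence: float
    ml_cache_dir: Optional[str]
    emulator_dir: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Phase5Settings:
    env = os.environ if env is None else env
    confidence = float(env.get("SCALABLE_EMULATION_CONFIDENCE", "0.9"))
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("SCALABLE_EMULATION_CONFIDENCE must be within [0, 1]")
    return Phase5Settings(
        ml_enabled=_env_bool(env, "SCALABLE_ML", True),
        emulation_enabled=_env_bool(env, "SCALABLE_EMULATION", False),
        emulation_confidence=confidence,
        ml_cache_dir=env.get("SCALABLE_ML_CACHE_DIR"),
        emulator_dir=env.get("SCALABLE_EMULATOR_DIR"),
    )


if __name__ == "__main__":
    print(load_settings({}))
```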

Dependencies

  • New [project.optional-dependencies] ml extra: scikit-learn >= 1.3, dask-ml >= 2023.3.24, joblib >= 1.3

Stats

  • 26 files changed (+4,214 lines)
  • 14 new source modules across scalable/ml/ and scalable/emulation/
  • 1 new CLI command (scalable advise)
  • 75 new unit tests — all passing
  • 431 total unit tests — zero regressions from Phases 1–4
  • ruff lint clean on all new modules

Design principles honored

| Principle | How it's satisfied |
| --- | --- |
| AI proposes; Scalable disposes | ML predictions validated by deterministic policy before execution |
| Every plan is inspectable | Recommendations include feature importances, confidence intervals, training provenance |
| Manual overrides always win | Users can pin resources, disable ML, or force full-model execution |
| Emulators are opt-in and labeled | Default disabled; every result records whether it was emulated or full-model |
| Offline-compatible | All models trained/cached locally; no remote inference required |
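
The first and third principles can be sketched together: an ML proposal is clamped by a deterministic policy, and any user-pinned value replaces the proposal outright. `apply_policy`, the resource names, and the choice to leave pinned values unclamped are illustrative assumptions, not Scalable's policy engine.

```python
from typing import Dict, Optional, Tuple


def apply_policy(
    proposed: Dict[str, float],
    limits: Dict[str, Tuple[float, float]],
    pinned: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Combine an ML proposal with deterministic limits and user pins.

    - Pinned resources are returned exactly as the user set them.
    - Every other resource is clamped into its (low, high) limit.
    """
    pinned = pinned or {}
    final = {}
    for name, (low, high) in limits.items():
        if name in pinned:
            final[name] = pinned[name]            # manual override wins
        else:
            final[name] = max(low, min(high, proposed[name]))
    return final


if __name__ == "__main__":
    plan = apply_policy(
        proposed={"memory_gb": 512, "cores": 2},
        limits={"memory_gb": (1, 128), "cores": (1, 64)},
        pinned={"cores": 8},
    )
    print(plan)
```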

After this merge

The version/2.0.0 branch will contain the complete v2.0.0 feature set across all 5 phases and can proceed toward release candidate stabilization.

crvernon added 4 commits May 19, 2026 20:17
- Add scalable.ml package: LearnedAdvisor, AdaptiveScaler, FeatureExtractor,
  ResourceModel, HyperparameterSearch, cross_validate_advisor
- Add scalable.emulation package: @emulatable decorator, EmulatorRegistry,
  EmulatorDispatch, ActiveLearner, GradientBoostingEmulator,
  RandomForestEmulator, uncertainty calibration
- Add scalable advise CLI command with ML-backed recommendations
- Add EmulationEvent to telemetry events
- Add Phase 5 settings (ML cache, emulator registry, enable flags)
- Add [ml] optional dependency extra (scikit-learn, dask-ml, joblib)
- Bump version to 2.0.0a5
- 75 new unit tests, 431 total passing
crvernon force-pushed the version/2.0.0-phase5-ml-emulation branch from 0777b09 to 2efbe9d on May 20, 2026 00:50
crvernon merged commit a3b68a6 into version/2.0.0 on May 20, 2026
12 checks passed