Local hyperparameter search and explainability — no cloud required.
Corter runs hyperparameter search on tabular CSV data, fits a scikit-learn model with the best settings, and reports feature importance and short text insights. Everything runs on your machine via the CLI, Python API, or optional Flask web dashboard.
- CLI workflow: `corter init`, `corter run`, `corter web`
- YAML configuration: task, model, HPO, and XAI settings in one file
- Terminal UI: live progress with Rich during optimization
- Web dashboard: monitor runs at `http://localhost:5000` when using `--web` or `corter web`
- Explainability: permutation importance; optional SHAP when `corter-ml[xai]` is installed
- Random search over a YAML-defined search space
- Bayesian optimization (`bayesian`) via the Optuna TPE sampler (`pip install optuna` or `corter-ml[bayesian]`)
- SciPy global (`scipy_de`) and local (`scipy_local`) strategies
- Parallel trials via joblib
- Early stopping when scores stop improving
- Resumable runs: progress is saved to `corter_checkpoint.json` after each trial (set `checkpoint_path: null` to disable)
- Per-trial explanation snapshots: feature importance and insights are appended to `explanation_snapshots.json` after each trial
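
The resumable-run behavior in the list above can be pictured as a loop that reloads finished trials before continuing. The following is a minimal sketch and not Corter's actual implementation: the JSON layout (`{"trials": [...]}`), the helper names (`load_checkpoint`, `save_checkpoint`, `run_trials`), and the toy objective are all assumptions made for illustration.

```python
import json
import os
import random
import tempfile


def load_checkpoint(path):
    """Return previously completed trials, or an empty list if none exist."""
    if path is None or not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)["trials"]  # assumed layout, not Corter's real schema


def save_checkpoint(path, trials):
    """Write all completed trials via a temp file so an interrupt cannot corrupt the checkpoint."""
    if path is None:  # mirrors `checkpoint_path: null`
        return
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"trials": trials}, fh)
    os.replace(tmp_path, path)  # atomic rename within the same directory


def run_trials(n_trials, objective, path, seed=0):
    """Run up to n_trials, skipping any trials the checkpoint already holds."""
    trials = load_checkpoint(path)
    for i in range(len(trials), n_trials):
        rng = random.Random(seed + i)  # per-trial seed keeps resumed runs reproducible
        params = {"n_estimators": rng.randint(50, 200)}
        trials.append({"trial": i, "params": params, "score": objective(params)})
        save_checkpoint(path, trials)  # persist after every trial
    return trials


def toy_objective(params):
    """Hypothetical stand-in for a cross-validated score."""
    return 1.0 - abs(params["n_estimators"] - 120) / 200


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = os.path.join(tmp, "corter_checkpoint.json")
        first = run_trials(3, toy_objective, ckpt)    # simulate an interrupted run
        resumed = run_trials(5, toy_objective, ckpt)  # continues at trial index 3
        print(len(first), len(resumed))
```

Seeding each trial from its index is one way to make a resumed run produce the same trials as an uninterrupted one; whether Corter does this is not stated in this README.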
Supported `model.name` values:

| Name | Aliases |
|---|---|
| `random_forest` | `rf` |
| `gradient_boosting` | `gbm` |
| `xgboost` | `xgb` |
| `lightgbm` | `lgbm`, `lgb` |
| `catboost` | `cb` |
| `logistic_regression` | `logreg`, `logistic` |
| `ridge` | |
| `svc` | `svm` |

XGBoost, LightGBM, and CatBoost require the boosting extra: `pip install corter-ml[boosting]`
Classification or regression is selected by `task` in the config (or inferred when `task: auto`).
Numeric feature columns are used automatically. If `target_column` is omitted, Corter uses a column named `target`, `label`, or `y` (case-insensitive); otherwise it falls back to the last column in the CSV.
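
The target-column fallback described above can be sketched in a few lines. This is an illustration rather than Corter's code: the function name `detect_target` is hypothetical, and the priority order among `target`, `label`, and `y` when several are present is an assumption.

```python
import csv
import io


def detect_target(columns, target_column=None):
    """Pick the target column name following the fallback rule described above."""
    if not columns:
        raise ValueError("CSV header is empty")
    if target_column is not None:
        if target_column not in columns:
            raise ValueError(f"target_column {target_column!r} not found in CSV header")
        return target_column
    # Assumed priority: "target" first, then "label", then "y" (case-insensitive).
    for candidate in ("target", "label", "y"):
        for name in columns:
            if name.lower() == candidate:
                return name
    return columns[-1]  # last column as the final fallback


# Read only the header row of a small in-memory CSV.
header = next(csv.reader(io.StringIO("age,income,Label\n31,52000,1\n")))
print(detect_target(header))  # Label
```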
- Permutation importance (always)
- SHAP values when `shap` is installed (`pip install corter-ml[xai]`)
- Drift checks and generated insight strings
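
For readers unfamiliar with permutation importance, the idea is to shuffle one feature column at a time and measure how much the score drops. The sketch below uses only the standard library and a toy model; it is not Corter's implementation (scikit-learn ships its own `sklearn.inspection.permutation_importance`). The default `n_repeats=8` simply mirrors the `permutation_repeats: 8` value shown in the configuration sample, and all function names are illustrative.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=8, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances


def predict(row):
    """Toy model: predicts 1 when the first feature is positive, ignores the second."""
    return int(row[0] > 0)


data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [predict(row) for row in X]

scores = permutation_importance(predict, X, y)
print(scores)  # feature 0 scores well above feature 1, which stays at 0.0
```

Because the toy model never reads the second feature, shuffling it cannot change any prediction, so its importance is exactly zero.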
- CLI: `corter init`, `run`, `web`, `version`
- Python: `Corter.from_yaml(...)` and `core.run("data.csv")`
- Web UI: Flask app in `web_ui.py`; `corter_web.py` pushes live updates during a run

Install from PyPI:

    pip install corter-ml

Or install from source:

    git clone https://github.com/pizenkov13-boop/Corter.git
    cd Corter
    pip install -e .

Optional extras:

    pip install corter-ml[boosting]  # XGBoost, LightGBM, CatBoost
    pip install corter-ml[xai]       # SHAP support
    pip install corter-ml[bayesian]  # Optuna for strategy: bayesian
    pip install corter-ml[dev]       # pytest, black, mypy

Create a starter config:

    corter init

Example `config.yaml`:

    task: classification
    target_column: target
    model:
      name: random_forest
      params:
        n_estimators: 100
    hpo:
      strategy: random  # random | bayesian | scipy_de | scipy_local
      n_trials: 24
      parallel_trials: 4
      enable_early_stop: true
      cv_folds: 5
      scoring: accuracy
      search_space:
        n_estimators:
          low: 50
          high: 200
          type: int
    xai:
      use_shap: true
      top_k_features: 10
    tui:
      show_live: true
      refresh_hz: 4

Run the optimization:

    corter run data.csv
    corter run data.csv --web  # optimization + dashboard
    corter run data.csv -c other.yaml

Results are written to `results.json` by default. After each run, Corter also writes `results_report.html` (best score, parameters, holdout metrics, feature importance, insights). Optional PDF: `core.run(data, export_pdf=True)` or `pip install corter-ml[reports]`.
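
To make the `search_space` and early-stopping settings from the example config more concrete, here is a minimal sketch of random search that samples from a `low`/`high`/`type` specification and stops once `patience` consecutive trials fail to improve the best score by more than `min_delta`. It is an illustration under stated assumptions, not Corter's optimizer: the function names, the `float` type branch, and the toy objective (standing in for a cross-validated score) are all hypothetical.

```python
import random


def sample_params(search_space, rng):
    """Draw one configuration from a {name: {low, high, type}} search space."""
    params = {}
    for name, spec in search_space.items():
        if spec["type"] == "int":
            params[name] = rng.randint(spec["low"], spec["high"])  # inclusive bounds
        elif spec["type"] == "float":  # assumed type; only `int` appears in the README
            params[name] = rng.uniform(spec["low"], spec["high"])
        else:
            raise ValueError(f"unsupported type: {spec['type']!r}")
    return params


def random_search(search_space, objective, n_trials, patience=5, min_delta=0.001, seed=0):
    """Random search that stops after `patience` trials without sufficient improvement."""
    rng = random.Random(seed)
    best_score, best_params, stale = float("-inf"), None, 0
    history = []
    for _ in range(n_trials):
        params = sample_params(search_space, rng)
        score = objective(params)
        history.append(score)
        if score > best_score + min_delta:
            best_score, best_params, stale = score, params, 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stop: no meaningful gain for `patience` trials
    return best_params, best_score, history


def toy_objective(params):
    """Hypothetical stand-in for a cross-validated score; peaks at n_estimators=120."""
    return 1.0 - abs(params["n_estimators"] - 120) / 200


space = {"n_estimators": {"low": 50, "high": 200, "type": "int"}}
best_params, best_score, history = random_search(space, toy_objective, n_trials=24)
print(best_params, round(best_score, 3), f"{len(history)} trials run")
```

With a constant objective this loop stops after `patience + 1` trials: one to set the baseline and `patience` more without improvement.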

From Python:

    from corter import Corter

    core = Corter.from_yaml("config.yaml")
    result = core.run("data.csv")
    print(result["best_cv_score"])
    print(result["best_params"])
    print(result["insights"])

Start the web dashboard:

    corter web
    # open http://127.0.0.1:5000

During `corter run data.csv --web`, the dashboard receives live updates from the optimizer.

Configuration reference (`config.yaml`):

    task: auto  # auto | classification | regression
    target_column: target  # optional: auto-detect target/label/y or last column

    hpo:
      strategy: random  # random | bayesian | scipy_de | scipy_local
      n_trials: 50
      parallel_trials: 4
      enable_early_stop: true
      patience: 5
      min_delta: 0.001
      cv_folds: 5
      scoring: accuracy  # or f1_weighted, neg_mean_squared_error, etc.
      search_space: { ... }
      checkpoint_path: corter_checkpoint.json  # resume if interrupted; null to disable

    xai:
      use_shap: true  # requires corter-ml[xai]
      shap_sample_size: 100
      top_k_features: 10
      permutation_repeats: 8
      drift_threshold: 0.15
      explanation_snapshots_path: explanation_snapshots.json  # null to disable
      snapshot_top_k: 5
      snapshot_permutation_repeats: 3

CLI reference:

    corter init [-o config.yaml]
    corter run <data.csv> [-c config.yaml] [--web] [--output results.json]
    corter web [--host 127.0.0.1] [--port 5000]
    corter version

Running the scripts directly:

    python corter.py data.csv -c config.yaml
    python corter_web.py config.yaml data.csv  # optimization with web updates
    gunicorn web_ui:app  # production-style web only (see Procfile)

Internal runs on synthetic tabular data (~1k samples, Random Forest, 50 trials, 8-core CPU):
| HPO configuration | Time | vs sequential |
|---|---|---|
| Sequential | 180s | 1.0× |
| + early stopping | 108s | 1.7× |
| + parallel (4 workers) | 45s | 4.0× |
| Combined | 36s | 5.0× |
XAI re-analysis on synthetic data (~1k × 50 features): caching reduced repeat runs from ~90s to ~15–27s on subsequent passes.
Benchmarks are based on internal testing on synthetic data. Results may vary.
Full methodology: TECHNICAL_REPORT.md.
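
The "vs sequential" column in the benchmark table above is the sequential time divided by each configuration's time. A short check of that arithmetic, using only the figures from the table:

```python
# Wall-clock times in seconds, copied from the benchmark table above.
times = {
    "Sequential": 180,
    "+ early stopping": 108,
    "+ parallel (4 workers)": 45,
    "Combined": 36,
}

baseline = times["Sequential"]
speedups = {name: baseline / seconds for name, seconds in times.items()}

for name, factor in speedups.items():
    print(f"{name:<24} {times[name]:>4}s  {factor:.1f}x")
```

Note that the combined figure (5.0×) is lower than the product of the two individual gains (roughly 1.7 × 4.0 ≈ 6.7×), so in these measurements the two optimizations do not compose multiplicatively.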

    ┌──────────────────────────────────────────────┐
    │                    Corter                    │
    ├──────────────────────────────────────────────┤
    │ CLI (corter_pkg)   │ corter.py    │ web_ui   │
    ├────────────────────┴──────────────┴──────────┤
    │ HyperparameterAutopilot → fit best model     │
    │ SemanticDiagnostics     → insights           │
    │ CorterDashboard (Rich TUI)                   │
    └──────────────────────────────────────────────┘

- Fork and clone the repository
- Install the development dependencies: `pip install -e ".[dev]"`
- Make changes and run formatters/tests as appropriate
- Open a pull request
MIT — see LICENSE.
- scikit-learn — models and metrics
- SHAP — optional explainability
- Rich — terminal UI
- Flask — web dashboard
Made with ❤️ by the Corter Team