Architecture
Giacomo Saccaggi edited this page Jun 19, 2026 · 1 revision
scomp_link/
├── cli.py # 13 CLI commands
├── core.py # ScompLinkPipeline orchestrator
├── preprocessing/
│ ├── data_processor.py # Preprocessor (polars backend)
│ ├── feature_engineer.py # FeatureEngineer (sklearn-compatible)
│ └── data_quality.py # DataQualityReport
├── models/
│ ├── model_factory.py # Decision-tree model selection
│ ├── regressor_optimizer.py # GridSearchCV + Boruta
│ ├── classifier_optimizer.py # GridSearchCV classification
│ ├── ensemble_optimizer.py # Voting/stacking
│ ├── advanced_tuning.py # Optuna, Halving, EarlyStopping
│ ├── forecaster.py # TimeSeriesForecaster
│ ├── anomaly_detector.py # Multi-method tabular
│ ├── ts_anomaly_detector.py # Conv1D + ARIMA residuals
│ ├── contrastive_text.py # BERT contrastive learning
│ ├── supervised_text.py # spaCy/sentence-transformers
│ ├── supervised_img.py # CNN (TensorFlow)
│ └── unsupervised_img.py # Image clustering
├── validation/
│ ├── model_validator.py # Metrics + HTML report
│ ├── advanced_cv.py # LOOCV, Bootstrap
│ └── fairness.py # FairnessMetrics
├── explainability/
│ └── explainer.py # ShapExplainer, LimeExplainer
├── monitoring/
│ └── drift_detector.py # DriftDetector (PSI + KS)
├── persistence/
│ └── artifact.py # ScompArtifact (.scomp format)
└── utils/
    ├── logger.py # get_logger(), set_verbosity()
    ├── decorators.py # @timer, @retry, @cache, etc.
    ├── report_html.py # HTML report builder
    ├── plotly_utils.py # Chart generation
    └── highcharts.py # Highcharts visualizations
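
The tree lists `@timer` and `@retry` in `utils/decorators.py` without showing them. As a rough illustration of how such cross-cutting decorators are commonly written, here is a minimal standard-library sketch. The signatures, the `last_elapsed` attribute, and the `flaky_load` function are assumptions made for this example, not the project's actual API.

```python
import functools
import time


def timer(func):
    """Store the wall-clock duration of the most recent call on `last_elapsed`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even when the wrapped call raises.
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None  # hypothetical attribute, not from scomp_link
    return wrapper


def retry(times=3, exceptions=(Exception,), delay=0.0):
    """Call the wrapped function up to `times` times on the listed exceptions."""
    if times < 1:
        raise ValueError("times must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)

        return wrapper

    return decorator


calls = {"count": 0}


@timer
@retry(times=3, exceptions=(ConnectionError,))
def flaky_load():
    """Hypothetical loader that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "loaded"
```

Stacking `@timer` outside `@retry` means the recorded duration covers all attempts, which is usually what a timing log should report.
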
| Pattern | Where | Purpose |
|---|---|---|
| Facade | ScompLinkPipeline | Simplified interface over the ML stack |
| Factory | ModelFactory.get_model() | String → model instance |
| Strategy | EnsembleOptimizer | Swappable voting/stacking |
| Builder | ScompArtifact | Fluent .set_x().set_y().save() |
| Decorator | utils/decorators.py | Cross-cutting concerns (timing, retry) |
| Lazy Import | All optional deps | torch/tensorflow loaded only when needed |
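
The Factory and Lazy Import rows can be combined in a single function: a string key is looked up in a registry, and the module behind it is imported only when that key is requested. The sketch below is an assumption about the general shape, not the real `ModelFactory.get_model()`. The registry contents are hypothetical, and standard-library classes stand in for heavy optional dependencies such as torch or tensorflow so that the example runs anywhere.

```python
import importlib
from typing import Any, Dict, Tuple

# Hypothetical registry: model name -> (module path, attribute name).
# Nothing listed here is imported until the name is actually requested.
_REGISTRY: Dict[str, Tuple[str, str]] = {
    "fraction": ("fractions", "Fraction"),  # stand-in for a heavy estimator
    "decimal": ("decimal", "Decimal"),      # stand-in for another backend
}


def get_model(name: str, *args: Any, **kwargs: Any) -> Any:
    """Return a new instance for `name`, importing its module on demand."""
    try:
        module_path, attr = _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown model {name!r}; expected one of: {known}") from None

    module = importlib.import_module(module_path)  # deferred (lazy) import
    return getattr(module, attr)(*args, **kwargs)


half = get_model("fraction", 1, 2)  # imports `fractions` only at this point
```

Keeping the import inside the function means a user who never asks for an optional backend does not need it installed, at the cost of surfacing a missing dependency at call time rather than at import time.
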
User Data (pandas/polars)
↓
Preprocessor (polars internally)
↓
FeatureEngineer (sklearn fit/transform)
↓
ModelFactory → selects estimator
↓
Optimizer (GridSearch / Optuna / Halving)
↓
Validator (metrics + HTML report)
↓
ScompArtifact.save() → .scomp file
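
The flow above is a linear chain in which each stage consumes the previous stage's output. A minimal sketch of that idea in plain Python follows. The stage functions are toy stand-ins invented for illustration: they do not call polars, sklearn, or any `scomp_link` class, and the real `ScompLinkPipeline` may be organised differently.

```python
from typing import Any, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(data: Any, stages: List[Stage]) -> Any:
    """Pass `data` through each named stage in order and return the result."""
    for _name, stage in stages:
        data = stage(data)
    return data


def preprocess(rows):
    """Toy preprocessing step: drop missing values."""
    return [r for r in rows if r is not None]


def engineer(rows):
    """Toy feature engineering step: add a squared feature to each row."""
    return [(x, x * x) for x in rows]


def summarize(features):
    """Toy stand-in for validation: report the shape of the feature matrix."""
    return {"n_rows": len(features), "n_features": len(features[0])}


STAGES: List[Stage] = [
    ("preprocess", preprocess),
    ("engineer", engineer),
    ("summarize", summarize),
]

result = run_pipeline([1.0, None, 2.0, 3.0], STAGES)
# result == {"n_rows": 3, "n_features": 2}
```

Naming each stage makes it straightforward to log, time, or swap one step without touching the others.
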
- 212 tests across 7 test files
- pytest with --cov for coverage (61%)
- CI runs on Python 3.10, 3.11, 3.12, 3.13
- Tests use synthetic data (no external datasets needed)
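
To illustrate what a test on synthetic data can look like, here is a small pytest-style sketch. The generator, the least-squares helper, and the test name are hypothetical and use only the standard library; the project's actual tests are not shown on this page and may be structured differently.

```python
import random


def make_synthetic_regression(n_rows=100, slope=2.0, intercept=1.0,
                              noise=0.1, seed=0):
    """Generate (xs, ys) with ys = slope * xs + intercept + Gaussian noise."""
    rng = random.Random(seed)  # local generator keeps the data reproducible
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_rows)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def test_fit_recovers_known_parameters():
    """The fit should land close to the parameters used to generate the data."""
    xs, ys = make_synthetic_regression(n_rows=500, seed=42)
    slope, intercept = fit_line(xs, ys)
    assert abs(slope - 2.0) < 0.1
    assert abs(intercept - 1.0) < 0.1
```

Because the true parameters are known in advance, the assertion checks the model against ground truth rather than against a stored snapshot, and the fixed seed keeps CI runs deterministic.
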