Architecture

Giacomo Saccaggi edited this page Jun 19, 2026 · 1 revision

Package Structure

scomp_link/
├── cli.py                           # 13 CLI commands
├── core.py                          # ScompLinkPipeline orchestrator
├── preprocessing/
│   ├── data_processor.py            # Preprocessor (polars backend)
│   ├── feature_engineer.py          # FeatureEngineer (sklearn-compatible)
│   └── data_quality.py              # DataQualityReport
├── models/
│   ├── model_factory.py             # Decision-tree model selection
│   ├── regressor_optimizer.py       # GridSearchCV + Boruta
│   ├── classifier_optimizer.py      # GridSearchCV classification
│   ├── ensemble_optimizer.py        # Voting/stacking ensembles
│   ├── advanced_tuning.py           # Optuna, Halving, EarlyStopping
│   ├── forecaster.py                # TimeSeriesForecaster
│   ├── anomaly_detector.py          # Multi-method tabular anomaly detection
│   ├── ts_anomaly_detector.py       # Conv1D + ARIMA residuals
│   ├── contrastive_text.py          # BERT contrastive learning
│   ├── supervised_text.py           # spaCy/sentence-transformers
│   ├── supervised_img.py            # CNN (TensorFlow)
│   └── unsupervised_img.py          # Image clustering
├── validation/
│   ├── model_validator.py           # Metrics + HTML report
│   ├── advanced_cv.py               # LOOCV, Bootstrap
│   └── fairness.py                  # FairnessMetrics
├── explainability/
│   └── explainer.py                 # ShapExplainer, LimeExplainer
├── monitoring/
│   └── drift_detector.py            # DriftDetector (PSI + KS)
├── persistence/
│   └── artifact.py                  # ScompArtifact (.scomp format)
└── utils/
    ├── logger.py                    # get_logger(), set_verbosity()
    ├── decorators.py                # @timer, @retry, @cache, etc.
    ├── report_html.py               # HTML report builder
    ├── plotly_utils.py              # Chart generation
    └── highcharts.py                # Highcharts visualizations

Design Patterns

| Pattern | Where | Purpose |
|---|---|---|
| Facade | ScompLinkPipeline | Simplified interface over the ML stack |
| Factory | ModelFactory.get_model() | String → model instance |
| Strategy | EnsembleOptimizer | Swappable voting/stacking |
| Builder | ScompArtifact | Fluent .set_x().set_y().save() |
| Decorator | utils/decorators.py | Cross-cutting concerns (timing, retry) |
| Lazy Import | All optional deps | torch/tensorflow loaded only when needed |
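
The Factory, Lazy Import, and Builder rows can be illustrated with a minimal, standard-library-only sketch. It is not the scomp_link implementation: the registry contents, the `get_model` function shown here, and the `ArtifactBuilder` class are hypothetical stand-ins, and JSON is used in place of the `.scomp` format, which this page does not describe.

```python
import importlib
import json

# Hypothetical registry: short name -> "module:attribute".
# Stdlib targets stand in for heavy optional dependencies (torch, tensorflow).
_REGISTRY = {
    "fraction": "fractions:Fraction",
    "decimal": "decimal:Decimal",
}


def get_model(name, *args, **kwargs):
    """Factory: resolve a string key to an instance.

    The target module is imported inside the call, so it is only loaded
    when a caller actually asks for it (lazy import).
    """
    try:
        target = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown model name: {name!r}") from None
    module_name, attr = target.split(":")
    module = importlib.import_module(module_name)  # deferred until this call
    return getattr(module, attr)(*args, **kwargs)


class ArtifactBuilder:
    """Hypothetical fluent builder: each setter returns self so calls chain."""

    def __init__(self):
        self._fields = {}

    def set_x(self, x):
        self._fields["x"] = x
        return self

    def set_y(self, y):
        self._fields["y"] = y
        return self

    def save(self, path):
        # JSON is an assumption made for this sketch only.
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(self._fields, fh)
        return path
```

With this sketch, `get_model("fraction", 1, 2)` returns a `Fraction`, and `ArtifactBuilder().set_x([1, 2]).set_y([0, 1]).save("demo.json")` writes both fields in one chained expression.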

Data Flow

User Data (pandas/polars)
    ↓
Preprocessor (polars internally)
    ↓
FeatureEngineer (sklearn fit/transform)
    ↓
ModelFactory → selects estimator
    ↓
Optimizer (GridSearch / Optuna / Halving)
    ↓
Validator (metrics + HTML report)
    ↓
ScompArtifact.save() → .scomp file
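
The stages above can be mimicked with a toy, standard-library-only sketch. Every function name here is hypothetical and none of it is scomp_link code: the "model" is fixed to a one-parameter line through the origin (standing in for whatever ModelFactory would select), the optimizer is a plain exhaustive grid search, and the final report is returned as a JSON string instead of being written to a `.scomp` file.

```python
import json


def preprocess(rows):
    """Preprocessing stage: drop rows that contain a missing value."""
    return [r for r in rows if None not in r]


def engineer(rows):
    """Feature stage: min-max scale the single feature to [0, 1]."""
    xs = [x for x, _ in rows]
    lo, hi = min(xs), max(xs)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    return [((x - lo) / span, y) for x, y in rows]


def mse(slope, rows):
    """Validation metric: mean squared error of y = slope * x."""
    return sum((slope * x - y) ** 2 for x, y in rows) / len(rows)


def optimize(rows, grid):
    """Optimizer stage: exhaustive search for the lowest-error slope."""
    return min(grid, key=lambda s: mse(s, rows))


def run_pipeline(rows, grid):
    """Run the stages in the same order as the data-flow diagram."""
    clean = preprocess(rows)
    feats = engineer(clean)
    best = optimize(feats, grid)
    report = {"slope": best, "mse": mse(best, feats)}
    return json.dumps(report)  # stand-in for saving an artifact


if __name__ == "__main__":
    data = [(0, 0.0), (5, 1.0), (None, 3.0), (10, 2.0)]
    print(run_pipeline(data, grid=[1.0, 2.0, 3.0]))
```

Each stage takes the previous stage's output and nothing else, which is the property that lets a facade such as ScompLinkPipeline chain them behind one call.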

Testing

  • 212 tests across 7 test files
  • Run with pytest; coverage measured with --cov is 61%
  • CI runs on Python 3.10, 3.11, 3.12, 3.13
  • Tests use synthetic data (no external datasets needed)
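
A test built on synthetic data might look like the following sketch. It is not taken from the project's test suite: `make_regression`, `fit_slope`, and the tolerance are hypothetical, chosen only to show a seeded generator feeding a pytest-style assertion without any external dataset.

```python
import random


def make_regression(n=50, slope=3.0, noise=0.1, seed=0):
    """Generate a reproducible synthetic dataset: y = slope * x + noise."""
    rng = random.Random(seed)  # local generator, so tests do not share state
    xs = [rng.uniform(0, 1) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, noise) for x in xs]
    return xs, ys


def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def test_fit_recovers_slope():
    """pytest collects this function by its test_ prefix."""
    xs, ys = make_regression()
    assert abs(fit_slope(xs, ys) - 3.0) < 0.2
```

Seeding a local `random.Random` instance keeps the data identical across runs and Python versions in CI, so a failure points at the code under test and not at the data.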
