Pipeline Persistence

Giacomo Saccaggi edited this page Jun 19, 2026 · 1 revision

The .scomp Format

A .scomp file is a ZIP archive that bundles an entire ML pipeline:

model.scomp (ZIP)
├── __magic__           # File identification (6 bytes)
├── manifest.json       # Python version, package versions, timestamp
├── model.pkl           # Fitted model (pickle)
├── preprocessor.pkl    # Fitted preprocessor (optional)
├── config.json         # task_type, target_col, feature_cols, etc.
├── metrics.json        # r2, rmse, mae, accuracy, etc.
├── feature_schema.json # Column names, dtypes, min/max/mean
└── sample_data.parquet # Small training sample (for drift detection)
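
Because the archive is a regular ZIP, its contents can be checked with the Python standard library without importing scomp_link. The sketch below is illustrative only: `inspect_scomp` and `EXPECTED_MEMBERS` are hypothetical names, and treating these four members as mandatory is an assumption drawn from the layout above, not a documented rule.

```python
import json
import zipfile

# Member names come from the layout above; which ones are mandatory is an assumption.
EXPECTED_MEMBERS = {"__magic__", "manifest.json", "model.pkl", "config.json"}


def inspect_scomp(path):
    """Return (member_names, manifest_dict, missing_members) for a .scomp archive."""
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
        # manifest.json is plain JSON, so it can be read without unpickling anything.
        manifest = json.loads(zf.read("manifest.json")) if "manifest.json" in names else {}
    missing = EXPECTED_MEMBERS - names
    return sorted(names), manifest, sorted(missing)
```

Calling `inspect_scomp('production_model.scomp')` reads only the ZIP directory and the JSON manifest, so it never loads the pickled model.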

Saving

from scomp_link import ScompArtifact

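# model, scaler, and X_train are assumed to exist already:
# a fitted model, a fitted preprocessor, and the training features.
artifact = ScompArtifact()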
artifact.set_model(model)
artifact.set_preprocessor(scaler)          # optional
artifact.set_config(task_type='regression', target_col='price')
artifact.set_metrics({'r2': 0.92, 'rmse': 1.23})
artifact.set_feature_schema(X_train)       # stores dtypes, ranges
artifact.set_sample_data(X_train, max_rows=200)  # for drift detection
artifact.set_metadata(author='team', version='v2')
artifact.save('production_model.scomp')

Loading & Predicting

from scomp_link import ScompArtifact

loaded = ScompArtifact.load('production_model.scomp')

# predict() chains preprocessor → model automatically,
# so new_data is passed in raw (not preprocessed) form
predictions = loaded.predict(new_data)

# Access components
model = loaded.model
config = loaded.config
metrics = loaded.metrics
schema = loaded.feature_schema
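
Since the model is stored as a pickle and the manifest records the Python version, it can be useful to compare that version with the running interpreter before loading. The following is a minimal sketch under stated assumptions: `python_version_matches` is a hypothetical helper, and the manifest key name `python_version` is a guess, since the exact key names in manifest.json are not given here.

```python
import json
import sys
import zipfile


def python_version_matches(path, key="python_version"):
    """Compare the major.minor version recorded in manifest.json with this interpreter.

    Returns True or False, or None when the manifest does not carry the key.
    The key name is an assumption; adjust it to the real manifest layout.
    """
    with zipfile.ZipFile(path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
    recorded = manifest.get(key)
    if recorded is None:
        return None
    # Keep only major and minor, e.g. "3.11.4" becomes (3, 11).
    recorded_major_minor = tuple(int(part) for part in str(recorded).split(".")[:2])
    return recorded_major_minor == tuple(sys.version_info[:2])
```

A `False` result does not guarantee that loading fails, but it is a reasonable signal to rebuild the artifact or switch environments first.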

CLI

# Save during training
scomp-link run --data train.csv --target y --task regression --save-artifact model.scomp

# Load and predict
scomp-link predict --artifact model.scomp --data new.csv --output preds.csv

# Inspect
scomp-link info --artifact model.scomp

# Compare multiple versions
scomp-link compare --artifacts v1.scomp v2.scomp v3.scomp
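
For a scripted comparison, the metrics of several artifacts can also be read directly from each archive's metrics.json. This is a rough sketch, not how `scomp-link compare` is implemented: `collect_metrics` and `best_by` are hypothetical helpers, and the code assumes metrics.json holds a flat mapping of metric names to numbers, as in the `set_metrics` example.

```python
import json
import zipfile


def collect_metrics(paths):
    """Map each artifact path to the dict stored in its metrics.json."""
    results = {}
    for path in paths:
        with zipfile.ZipFile(path) as zf:
            results[path] = json.loads(zf.read("metrics.json"))
    return results


def best_by(metrics_by_path, metric, higher_is_better=True):
    """Return the path with the best value for `metric`, skipping artifacts that lack it."""
    candidates = {p: m[metric] for p, m in metrics_by_path.items() if metric in m}
    if not candidates:
        return None
    pick = max if higher_is_better else min
    return pick(candidates, key=candidates.get)
```

For example, `best_by(collect_metrics(['v1.scomp', 'v2.scomp']), 'rmse', higher_is_better=False)` returns the path of the artifact with the lowest stored RMSE.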