Cascade - Small-scale MLOps Library

A lightweight, modular MLOps library that aims to make ML development more efficient for small teams and individuals.

Cascade brings MLOps features to small projects while demanding little in return. Most small-scale ML projects do not need a full MLOps setup.

Included in the Model Lifecycle section of the Awesome MLOps list

Installation

pip install cascade-ml

More information on installation can be found in the documentation

Docs

Go to Cascade documentation

Usage Examples

This section is organized by the problems Cascade can help you solve. The examples below are the simplest demonstrations of what the library can do; see the documentation for more.

ETL pipeline tracking

Data processing pipelines need to be versioned and tracked as part of model experiments.
To track and version everything about the data, Cascade provides Datasets: special wrappers that encapsulate operations on data.

from pprint import pprint
from cascade import data as cdd
from sklearn.datasets import load_digits
import numpy as np


X, y = load_digits(return_X_y=True)
pairs = list(zip(X, y))

ds = cdd.Wrapper(pairs)
ds = cdd.RandomSampler(ds)

train_ds, test_ds = cdd.split(ds)
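train_ds = cdd.ApplyModifier(
    train_ds,
    # Shift the features by one small random offset; keep the label unchanged.
    # The parentheses make the lambda return a (features, label) tuple.
    lambda pair: (pair[0] + np.random.random() * 0.1 - 0.05, pair[1]),
)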

pprint(train_ds.get_meta())

The metadata lists every stage of the pipeline, starting with the most recently applied one.

Full pipeline metadata:

[{"comments": [],
  "description": null,
  "len": 898,
  "links": [],
  "name": "cascade.data.apply_modifier.ApplyModifier",
  "tags": [],
  "type": "dataset"},
 {"comments": [],
  "description": null,
  "len": 898,
  "links": [],
  "name": "cascade.data.range_sampler.RangeSampler",
  "tags": [],
  "type": "dataset"},
 {"comments": [],
  "description": null,
  "len": 1797,
  "links": [],
  "name": "cascade.data.random_sampler.RandomSampler",
  "tags": [],
  "type": "dataset"},
 {"comments": [],
  "description": null,
  "len": 1797,
  "links": [],
  "name": "cascade.data.dataset.Wrapper",
  "obj_type": "<class 'list'>",
  "tags": [],
  "type": "dataset"}]
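
To illustrate why the metadata above contains one entry per stage, here is a minimal sketch in plain Python. It is not Cascade's implementation: the classes `SketchDataset` and `SketchModifier` are hypothetical names made up for this example, and they only mimic the idea of a wrapper that reports its own metadata followed by the metadata of the dataset it wraps.

```python
from typing import Any, Callable, Dict, List, Sequence


class SketchDataset:
    """Hypothetical wrapper (not part of Cascade) around any sequence."""

    def __init__(self, data: Sequence[Any]) -> None:
        self._data = data

    def __len__(self) -> int:
        return len(self._data)

    def __getitem__(self, index: int) -> Any:
        return self._data[index]

    def get_meta(self) -> List[Dict[str, Any]]:
        # A single entry that describes this stage only
        return [{"name": type(self).__name__, "len": len(self), "type": "dataset"}]


class SketchModifier(SketchDataset):
    """Hypothetical stage that applies a function to every item lazily."""

    def __init__(self, dataset: SketchDataset, func: Callable[[Any], Any]) -> None:
        super().__init__(dataset)
        self._func = func

    def __getitem__(self, index: int) -> Any:
        return self._func(self._data[index])

    def get_meta(self) -> List[Dict[str, Any]]:
        # Own entry first, then everything the wrapped dataset reports
        return super().get_meta() + self._data.get_meta()


base = SketchDataset([1, 2, 3])
doubled = SketchModifier(base, lambda x: x * 2)
meta = doubled.get_meta()
```

Each wrapper only needs to know about the dataset directly beneath it, so stacking more stages extends the metadata list without extra bookkeeping in user code.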

See all datasets in the zoo
See the tutorial in the documentation

Experiment tracking

Cascade provides a rich set of tools for tracking ML experiments. You can track the history of model changes and save and restore models, along with their metadata, in a structured manner.

import random
from cascade.models import Model
from cascade.repos import Repo

model = Model()
model.add_metric('acc', random.random())

repo = Repo('./repo')

line = repo.add_line('baseline')
line.save(model, only_meta=True)

A Repo is a collection of Lines, and a Line typically holds a series of experiments on one model type. Lines can also store data pipelines.

Full model metadata:

[
    {
        "name": "cascade.models.model.Model",
        "description": null,
        "tags": [],
        "comments": [],
        "links": [],
        "type": "model",
        "created_at": "2024-08-25T19:15:24.658259+00:00",
        "metrics": [
            {
                "name": "acc",
                "value": 0.4323295098641783,
                "created_at": "2024-08-25T19:15:24.658356+00:00"
            }
        ],
        "params": {},
        "path": "/home/user/repo/baseline/00000",
        "slug": "rustling_finicky_hoatzin",
        "saved_at": "2024-08-25T19:15:25.548339+00:00",
        "python_version": "3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]",
        "user": "user",
        "host": "hostname"
    }
]
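
The `path` field above suggests a repo/line/numbered-experiment directory layout. The sketch below imitates that layout with the standard library only, as a way to picture how a Repo, its Lines, and individual experiments relate. It is an illustration under assumptions, not Cascade's code: the helper `save_experiment_meta` and the file name `meta.json` are hypothetical choices for this example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_experiment_meta(repo_root: Path, line_name: str, meta: Dict[str, Any]) -> Path:
    """Hypothetical helper: store meta in the next numbered folder of a line."""
    line_dir = repo_root / line_name
    line_dir.mkdir(parents=True, exist_ok=True)
    # Experiments are numbered in the order they are saved: 00000, 00001, ...
    next_index = sum(1 for entry in line_dir.iterdir() if entry.is_dir())
    experiment_dir = line_dir / f"{next_index:05d}"
    experiment_dir.mkdir()
    # The file name is an assumption made for this sketch
    (experiment_dir / "meta.json").write_text(json.dumps(meta, indent=2))
    return experiment_dir


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "repo"
    first = save_experiment_meta(root, "baseline", {"metrics": [{"name": "acc", "value": 0.43}]})
    second = save_experiment_meta(root, "baseline", {"metrics": [{"name": "acc", "value": 0.51}]})
    saved = json.loads((second / "meta.json").read_text())
    layout = [first.relative_to(root).as_posix(), second.relative_to(root).as_posix()]
```

Keeping one folder per experiment means a line's history can be inspected with ordinary file tools, which fits the small-scale setting the library targets.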

See the tutorial in the documentation

Cascade UI

Cascade features a web-based experiment dashboard. You can install it with:

pip install cascade-ui

Then locate your Cascade Workspace and run:

cascade ui

Cascade UI is a separate project that provides a visual interface for Cascade experiments. For a more detailed explanation, see the UI docs.

Who could find Cascade useful

ML engineers and researchers who work in small teams or individually. The cost of integrating a large-scale MLOps solution can be too high for such projects, and Cascade aims to bridge this gap.

Principles

The key principles of Cascade are:

Elegance - ML code should be about ML, with a minimum of meta-code
Flexibility - easy to build prototypes and to integrate existing projects with Cascade (don't pay for what you don't use)
Reusability - code can be reused in similar projects with no effort
Traceability - everything should have metadata

Contributing

Pull requests and issues are welcome! For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests and docs as appropriate.

License

Apache License 2.0

Versions

This project uses Semantic Versioning - https://semver.org/

Cite the code

If you used the code in your research, please cite it with:

@software{ilia_moiseev_2023_8006995,
  author       = {Ilia Moiseev},
  title        = {Oxid15/cascade: Lightweight ML Engineering library},
  month        = jun,
  year         = 2023,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.8006995},
  url          = {https://doi.org/10.5281/zenodo.8006995}
}
