dstack helps teams run ML workflows in a configured cloud, manage dependencies, and version data.
Docs • Examples • Quickstart • Slack • Twitter
- Workflows as code: Define your ML workflows as code, and run them in a configured cloud via the command-line.
- Reusable artifacts: Save data, models, and environment as workflows artifacts, and reuse them across projects.
- Built-in containers: Workflow containers are pre-built with Conda, Python, etc. No Docker is needed.
You can use the dstack CLI from both your IDE and your CI/CD pipelines. For debugging, you can run workflows locally, or attach interactive dev environments (e.g. VS Code or JupyterLab) to them.
- Install the dstack CLI locally
- Configure the cloud credentials locally (e.g. via ~/.aws/credentials)
- Define ML workflows in .dstack/workflows.yaml (within your existing Git repository)
- Run ML workflows via the dstack run CLI command
- Use other dstack CLI commands to manage runs, artifacts, etc.
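The steps above can be sketched as a single shell session (each command is detailed in the sections below):

```shell
pip install dstack    # install the CLI
dstack config         # select an AWS region and S3 bucket
dstack run train      # run a workflow defined in .dstack/workflows.yaml
dstack ps -a          # check the status of all runs
```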
When you run a workflow via the dstack CLI, it provisions the required compute resources (in a configured cloud account), sets up the environment (Python, Conda, CUDA, etc.), fetches your code, downloads dependencies, saves artifacts, and tears down the compute resources.
[Demo video: dstack-run-gpu-1.mp4]
Use pip to install dstack locally:

pip install dstack

The dstack CLI needs your AWS account credentials to be configured locally (e.g. in ~/.aws/credentials, or via the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables).
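If you haven't set up credentials before, the ~/.aws/credentials file follows the standard AWS shared-credentials format (the values below are placeholders):

```ini
[default]
aws_access_key_id = <your-access-key-id>
aws_secret_access_key = <your-secret-access-key>
```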
Before you can use the dstack CLI, you need to configure it:
dstack config

It will prompt you to select an AWS region where dstack will provision compute resources, and an S3 bucket where dstack will store state and output artifacts.
AWS profile: default
AWS region: eu-west-1
S3 bucket: dstack-142421590066-eu-west-1
EC2 subnet: none

Step 1: Create a .dstack/workflows.yaml file, and define there how to run the script, where to load the data from, how to store output artifacts, and what compute resources are needed to run it.
workflows:
  - name: train
    provider: bash
    deps:
      - tag: mnist_data
    commands:
      - pip install -r requirements.txt
      - python src/train.py
    artifacts:
      - path: ./checkpoint
    resources:
      interruptible: true
      gpu: 1

Use deps to add artifacts of other workflows as dependencies. You can refer to other workflows via the name of the workflow or via the name of a tag.
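As for src/train.py itself, it can be any script: dstack only expects that the files you want to keep are written under the path declared in artifacts. A minimal, hypothetical stand-in (the function name and the metrics file are illustrative, not part of dstack's API; a real script would train a model):

```python
import json
import os

def train(checkpoint_dir: str = "checkpoint") -> dict:
    # Placeholder for a real training loop; here we only produce a small
    # metrics file so there is something under the `artifacts` path for
    # dstack to save.
    metrics = {"val_acc": 0.968, "val_loss": 0.108}  # placeholder numbers
    os.makedirs(checkpoint_dir, exist_ok=True)
    with open(os.path.join(checkpoint_dir, "metrics.json"), "w") as f:
        json.dump(metrics, f)
    return metrics

if __name__ == "__main__":
    print(train())
```

Anything the script writes to ./checkpoint is uploaded as an output artifact when the run finishes.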
Step 2: Run the workflow via dstack run:
dstack run train

It will automatically provision the required compute resources and run the workflow. You'll see the output in real time:
Provisioning... It may take up to a minute. ✓
To interrupt, press Ctrl+C.
Epoch 4: 100%|██████████████| 1876/1876 [00:17<00:00, 107.85it/s, loss=0.0944, v_num=0, val_loss=0.108, val_acc=0.968]
`Trainer.fit` stopped: `max_epochs=5` reached.
Testing DataLoader 0: 100%|██████████████| 313/313 [00:00<00:00, 589.34it/s]
Test metric DataLoader 0
val_acc 0.965399980545044
val_loss            0.10975822806358337

Step 3: Use the dstack ps command to see the status of runs:
dstack ps -a
RUN TARGET SUBMITTED OWNER STATUS TAG
angry-elephant-1 download 8 hours ago peterschmidt85 Done mnist_data
wet-insect-1      train     1 weeks ago   peterschmidt85  Running

Step 4: Use other dstack CLI commands to manage runs, artifacts, tags, secrets, and more.