dstack helps teams run ML workflows in a configured cloud, manage dependencies, and version data.
Docs • Examples • Quickstart • Slack • Twitter
- Workflows as code: Define your ML workflows as code, and run them in a configured cloud via the command-line.
- Reusable artifacts: Save data, models, and environment as workflows artifacts, and reuse them across projects.
- Built-in containers: Workflow containers are pre-built with Conda, Python, etc. No Docker is needed.
You can use the dstack CLI from both your IDE and your CI/CD pipelines. For debugging, you can run workflows locally, or attach interactive dev environments (e.g. VS Code or JupyterLab) to them.
- Install the dstack CLI locally
- Configure the cloud credentials locally (e.g. via ~/.aws/credentials)
- Define ML workflows in .dstack/workflows.yaml (within your existing Git repository)
- Run ML workflows via the dstack run CLI command
- Use other dstack CLI commands to manage runs, artifacts, etc.
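The steps above can be sketched as a single shell session (each command is detailed in the sections below):

```shell
pip install dstack    # install the CLI
dstack config         # select an AWS region and S3 bucket
dstack run train      # run a workflow defined in .dstack/workflows.yaml
dstack ps -a          # check the status of all runs
```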
When you run a workflow via the dstack CLI, it provisions the required compute resources (in a configured cloud account), sets up the environment (Python, Conda, CUDA, etc.), fetches your code, downloads dependencies, saves artifacts, and tears down the compute resources.
[Demo video: dstack-run-gpu-1.mp4]
Use pip to install dstack locally:

pip install dstack

The dstack CLI needs your AWS account credentials to be configured locally (e.g. in ~/.aws/credentials, or via the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables).
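If you haven't set up credentials before, the ~/.aws/credentials file follows the standard AWS shared-credentials format (the values below are placeholders):

```ini
[default]
aws_access_key_id = <your-access-key-id>
aws_secret_access_key = <your-secret-access-key>
```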
Before you can use the dstack CLI, you need to configure it:
dstack config

It will prompt you to select an AWS region where dstack will provision compute resources, and an S3 bucket where dstack will store state and output artifacts.
AWS profile: default
AWS region: eu-west-1
S3 bucket: dstack-142421590066-eu-west-1
EC2 subnet: none

Step 1: Create a .dstack/workflows.yaml file, and define there how to run the script, where to load the data from, how to store output artifacts, and what compute resources are needed to run it.
workflows:
  - name: train
    provider: bash
    deps:
      - tag: mnist_data
    commands:
      - pip install -r requirements.txt
      - python src/train.py
    artifacts:
      - path: ./checkpoint
    resources:
      interruptible: true
      gpu: 1

Use deps to add artifacts of other workflows as dependencies. You can refer to other workflows via the name of the workflow or via the name of a tag.
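As for src/train.py itself, it can be any script: dstack only expects that the files you want to keep are written under the path declared in artifacts. A minimal, hypothetical stand-in (the function name and the metrics file are illustrative, not part of dstack's API; a real script would train a model):

```python
import json
import os

def train(checkpoint_dir: str = "checkpoint") -> dict:
    # Placeholder for a real training loop; here we only produce a small
    # metrics file so there is something under the `artifacts` path for
    # dstack to save.
    metrics = {"val_acc": 0.968, "val_loss": 0.108}  # placeholder numbers
    os.makedirs(checkpoint_dir, exist_ok=True)
    with open(os.path.join(checkpoint_dir, "metrics.json"), "w") as f:
        json.dump(metrics, f)
    return metrics

if __name__ == "__main__":
    print(train())
```

Anything the script writes to ./checkpoint is uploaded as an output artifact when the run finishes.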
Step 2: Run the workflow via dstack run:
dstack run train

It will automatically provision the required compute resources and run the workflow. You'll see the output in real time:
Provisioning... It may take up to a minute. ✓
To interrupt, press Ctrl+C.
Epoch 4: 100%|██████████████| 1876/1876 [00:17<00:00, 107.85it/s, loss=0.0944, v_num=0, val_loss=0.108, val_acc=0.968]
`Trainer.fit` stopped: `max_epochs=5` reached.
Testing DataLoader 0: 100%|██████████████| 313/313 [00:00<00:00, 589.34it/s]
Test metric DataLoader 0
val_acc 0.965399980545044
val_loss            0.10975822806358337

Step 3: Use the dstack ps command to see the status of runs:
dstack ps -a
RUN TARGET SUBMITTED OWNER STATUS TAG
angry-elephant-1 download 8 hours ago peterschmidt85 Done mnist_data
wet-insect-1      train     1 weeks ago   peterschmidt85  Running

Step 4: Use other dstack CLI commands to manage runs, artifacts, tags, secrets, and more.