121 changes: 105 additions & 16 deletions docs/docs/concepts/volumes.md
@@ -1,12 +1,16 @@
# Volumes

Volumes allow you to persist data between runs. `dstack` allows you to create and attach volumes to
dev environments, tasks, and services.
Volumes allow you to persist data between runs. `dstack` supports two kinds of volumes: [network volumes](#network-volumes)
and [instance volumes](#instance-volumes).

> Volumes are currently supported with the `aws`, `gcp`, and `runpod` backends.
## Network volumes

`dstack` allows you to create and attach network volumes to dev environments, tasks, and services.

> Network volumes are currently supported with the `aws`, `gcp`, and `runpod` backends.
Support for other backends and SSH fleets is coming soon.

## Define a configuration
### Define a configuration

First, create a YAML file in your project folder. Its name must end with `.dstack.yml` (e.g. `.dstack.yml` or `vol.dstack.yml`
are both acceptable).
@@ -38,7 +42,7 @@ If you use this configuration, `dstack` will create a new volume based on the sp
See [.dstack.yml](../reference/dstack.yml/volume.md) for all the options supported by
volumes, along with multiple examples.

## Create, register, or update a volume
### Create, register, or update a volume

To create or register the volume, simply call the `dstack apply` command:

@@ -59,10 +63,11 @@ Volume my-new-volume does not exist yet. Create the volume? [y/n]: y

Once created, the volume can be attached to dev environments, tasks, and services.

## Attach a volume
### Attach a volume { #attach-network-volume }

Dev environments, tasks, and services let you attach any number of volumes.
To attach a volume, simply specify its name using the `volumes` property and specify where to mount its contents:
Dev environments, tasks, and services let you attach any number of network volumes.
To attach a network volume, simply specify its name using the `volumes` property
and specify where to mount its contents:

<div editor-title=".dstack.yml">

@@ -77,6 +82,10 @@ ide: vscode
volumes:
- name: my-new-volume
path: /volume_data

# You can also use the short syntax in the `name:path` form
# volumes:
# - my-new-volume:/volume_data
```

</div>
@@ -89,19 +98,19 @@ and its contents will persist across runs.
to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
attach volumes to `/workflow` or any of its subdirectories.
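
As a minimal sketch of the limitation above, the configuration below mounts a network volume outside `/workflow` and shows, commented out, a mount path that would fall under it. The run name `train` and the `ls` command are hypothetical placeholders, not taken from the `dstack` docs:

```yaml
type: task
# Hypothetical run name, for illustration only
name: train

# Hypothetical command that just lists the mounted data
commands:
  - ls /volume_data

volumes:
  # Allowed: the mount path is outside /workflow
  - name: my-new-volume
    path: /volume_data

  # Not allowed per the limitation above: the path is under /workflow
  # - name: my-new-volume
  #   path: /workflow/data
```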

## Manage volumes
### Manage volumes { #manage-network-volumes }

### List volumes
#### List volumes

The [`dstack volume list`](../reference/cli/index.md#dstack-gateway-list) command lists created and registered volumes:
The [`dstack volume list`](../reference/cli/index.md#dstack-volume-list) command lists created and registered volumes:

```
$ dstack volume list
NAME BACKEND REGION STATUS CREATED
my-new-volume aws eu-central-1 active 3 weeks ago
```

### Delete volumes
#### Delete volumes

When the volume isn't attached to any active dev environment, task, or service, you can delete it using `dstack delete`:

@@ -112,19 +121,99 @@ $ dstack delete -f vol.dstack.yaml
If the volume was created using `dstack`, it will be physically destroyed along with the data.
If you've registered an existing volume, it will be de-registered with `dstack` but will keep the data.

## Instance volumes

> Instance volumes are currently supported on all backends except `runpod`, `vastai` and `kubernetes`.

Unlike [network volumes](#network-volumes), which are persistent external resources mounted over the network,
instance volumes are part of the instance storage. Basically, an instance volume is a filesystem path
(a directory or a file) mounted inside the run container.

As a consequence, the contents of an instance volume are specific to the instance
where the run is executed. Data persistence, integrity, and even existence are guaranteed only if the subsequent run
is executed on the exact same instance and there are no other runs in between.

### Manage volumes { #manage-instance-volumes }

You don't need to create or delete instance volumes, and they are not displayed in the
[`dstack volume list`](../reference/cli/index.md#dstack-volume-list) command output.

### Attach a volume { #attach-instance-volume }

Dev environments, tasks, and services let you attach any number of instance volumes.
To attach an instance volume, specify the `instance_path` and `path` in the `volumes` property:

<div editor-title=".dstack.yml">

```yaml
type: dev-environment
# A name of the dev environment
name: vscode-vol

ide: vscode

# Map the instance path to any container path
volumes:
- instance_path: /mnt/volume
path: /volume_data

# You can also use the short syntax in the `instance_path:path` form
# volumes:
# - /mnt/volume:/volume_data
```

</div>

### Use cases { #instance-volumes-use-cases }

Despite the limitations, instance volumes can still be useful in some cases:

=== "Cache"

For example, if runs regularly install packages with `pip install`, include an instance volume in the run configuration
to reuse the pip cache between runs:

<div editor-title=".dstack.yml">

```yaml
type: task

volumes:
- /dstack-cache/pip:/root/.cache/pip
```

</div>

=== "Network storage with SSH fleet"

If you manage your own instances, you can mount network storage (e.g., NFS or SMB) to the hosts and access it in the runs.
If you mount the same network storage to all the fleet instances at the same path, e.g. `/mnt/nfs-storage`,
you can treat the instance volume as shared persistent storage:

<div editor-title=".dstack.yml">

```yaml
type: task

volumes:
- /mnt/nfs-storage:/storage
```

</div>

## FAQ

##### Can I use volumes across backends?
##### Can I use network volumes across backends?

Since volumes are backed by cloud network disks, you can only use them within the same cloud. If you need to access
data across different backends, you should either use object storage or replicate the data across multiple volumes.

##### Can I use volumes across regions?
##### Can I use network volumes across regions?

Typically, network volumes are associated with specific regions, so you can't use them in other regions. Often,
volumes are also linked to availability zones, but some providers support volumes that can be used across different
availability zones within the same region.

##### Can I attach volumes to multiple runs or instances?
##### Can I attach network volumes to multiple runs or instances?

You can mount a volume in multiple runs. This feature is currently supported only by the `runpod` backend.
You can mount a volume in multiple runs. This feature is currently supported only by the `runpod` backend.
35 changes: 28 additions & 7 deletions docs/docs/reference/dstack.yml/dev-environment.md
@@ -258,7 +258,11 @@ volumes:
Once you run this configuration, the contents of the volume will be attached to `/volume_data` inside the dev
environment, and its contents will persist across runs.

??? info "Limitations"
??? Info "Instance volumes"
If data persistence is not a strict requirement, you can also use
ephemeral [instance volumes](../../concepts/volumes.md#instance-volumes).

!!! info "Limitations"
When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
attach volumes to `/workflow` or any of its subdirectories.
@@ -306,10 +310,27 @@ The `dev-environment` configuration type supports many other options. See below.
type:
required: true

## `volumes`
## `volumes[n]` { #_volumes data-toc-label="volumes" }

#SCHEMA# dstack._internal.core.models.volumes.VolumeMountPoint
overrides:
show_root_heading: false
type:
required: true
=== "Network volumes"

#SCHEMA# dstack._internal.core.models.volumes.VolumeMountPoint
overrides:
show_root_heading: false
type:
required: true

=== "Instance volumes"

#SCHEMA# dstack._internal.core.models.volumes.InstanceMountPoint
overrides:
show_root_heading: false
type:
required: true

??? info "Short syntax"

The short syntax for volumes is a colon-separated string in the form of `source:destination`:

* `volume-name:/container/path` for network volumes
* `/instance/path:/container/path` for instance volumes
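
The two short forms listed above can be sketched in one dev environment configuration. The names and paths reuse the examples from the volumes concept page, except `/instance_data`, which is a hypothetical container path. Listing a network volume and an instance volume in the same `volumes` list is an assumption here; the docs only show them separately:

```yaml
type: dev-environment
name: vscode-vol

ide: vscode

volumes:
  # Network volume: short for `name: my-new-volume` + `path: /volume_data`
  - my-new-volume:/volume_data
  # Instance volume: short for `instance_path: /mnt/volume` + `path: /instance_data`
  # (/instance_data is a hypothetical path; combining both kinds is an assumption)
  - /mnt/volume:/instance_data
```

In these examples, network volume sources are bare names and instance volume sources are absolute paths.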
38 changes: 32 additions & 6 deletions docs/docs/reference/dstack.yml/service.md
@@ -442,6 +442,15 @@ volumes:
Once you run this configuration, the contents of the volume will be attached to `/volume_data` inside the service,
and its contents will persist across runs.

??? Info "Instance volumes"
If data persistence is not a strict requirement, you can also use
ephemeral [instance volumes](../../concepts/volumes.md#instance-volumes).

!!! info "Limitations"
When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
attach volumes to `/workflow` or any of its subdirectories.

The `service` configuration type supports many other options. See below.

## Root reference
@@ -501,10 +510,27 @@ The `service` configuration type supports many other options. See below.
type:
required: true

## `volumes`
## `volumes[n]` { #_volumes data-toc-label="volumes" }

#SCHEMA# dstack._internal.core.models.volumes.VolumeMountPoint
overrides:
show_root_heading: false
type:
required: true
=== "Network volumes"

#SCHEMA# dstack._internal.core.models.volumes.VolumeMountPoint
overrides:
show_root_heading: false
type:
required: true

=== "Instance volumes"

#SCHEMA# dstack._internal.core.models.volumes.InstanceMountPoint
overrides:
show_root_heading: false
type:
required: true

??? info "Short syntax"

The short syntax for volumes is a colon-separated string in the form of `source:destination`:

* `volume-name:/container/path` for network volumes
* `/instance/path:/container/path` for instance volumes
33 changes: 27 additions & 6 deletions docs/docs/reference/dstack.yml/task.md
@@ -433,6 +433,10 @@ volumes:
Once you run this configuration, the contents of the volume will be attached to `/volume_data` inside the task,
and its contents will persist across runs.

??? Info "Instance volumes"
If data persistence is not a strict requirement, you can also use
ephemeral [instance volumes](../../concepts/volumes.md#instance-volumes).

!!! info "Limitations"
When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
@@ -490,10 +494,27 @@ The `task` configuration type supports many other options. See below.
type:
required: true

## `volumes[n]`
## `volumes[n]` { #_volumes data-toc-label="volumes" }

#SCHEMA# dstack._internal.core.models.volumes.VolumeMountPoint
overrides:
show_root_heading: false
type:
required: true
=== "Network volumes"

#SCHEMA# dstack._internal.core.models.volumes.VolumeMountPoint
overrides:
show_root_heading: false
type:
required: true

=== "Instance volumes"

#SCHEMA# dstack._internal.core.models.volumes.InstanceMountPoint
overrides:
show_root_heading: false
type:
required: true

??? info "Short syntax"

The short syntax for volumes is a colon-separated string in the form of `source:destination`:

* `volume-name:/container/path` for network volumes
* `/instance/path:/container/path` for instance volumes