Closed
2 changes: 1 addition & 1 deletion content/docs/command-reference/destroy.md
@@ -16,7 +16,7 @@ usage: dvc destroy [-h] [-q | -v] [-f]
directory from the <abbr>workspace</abbr>.

Note that the <abbr>cache directory</abbr> will be removed as well, unless it's
[set to an external location](/doc/use-cases/shared-development-server#configure-the-external-shared-cache)
[set to an external location](/doc/use-cases/shared-development-server#configure-the-shared-cache)
(by default a local cache is located in `.dvc/cache`). If you were using
[symlinks for linking](/doc/user-guide/large-dataset-optimization) data from the
cache, DVC will replace them with the latest versions of the actual files and
64 changes: 38 additions & 26 deletions content/docs/use-cases/shared-development-server.md
@@ -1,36 +1,42 @@
# Shared Development Server

Some teams may prefer using one single shared machine to run their experiments.
This allows better resource utilization, such as the ability to use multiple
GPUs, centralized data storage, etc. With DVC, you can easily set up shared data
storage on a server accessed by several users or for any other reason, in a way
that enables almost instantaneous <abbr>workspace</abbr> restoration/switching
speed for everyone – similar to `git checkout` for your code.
Some teams may prefer using a single shared machine to run their experiments.
This allows better resource utilization, such as GPU access, centralized data
storage, etc. With DVC, you can easily set up shared data storage on a server
with multiple users or processes. This enables near-instantaneous
<abbr>workspace</abbr> restoration and switching speeds for everyone – a
**checkout for data**.

![](/img/shared-server.png)
![](/img/shared-server.png) _Shared DVC project data_

Not only can several users share a single cache for a project, but in fact
multiple projects can use the same cache. This is useful when the datasets of
these projects overlap, since DVC detects and eliminates data storage redundancy
automatically.
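
The deduplication described above can be illustrated with a minimal sketch of a content-addressed store. This is not DVC's implementation: the `store` helper, the project directories, and the file names are hypothetical, and `md5sum` from GNU coreutils is assumed to be available. It only shows why identical data coming from two projects ends up occupying a single cache entry.

```shell
# Throwaway locations; every name below is hypothetical.
cache=$(mktemp -d)   # stands in for the shared cache
work=$(mktemp -d)    # holds two pretend projects
mkdir "$work/project-a" "$work/project-b"

# Copy a file into the cache under a path derived from its content hash.
store() {
    sum=$(md5sum "$1" | cut -d ' ' -f 1)
    prefix=$(printf '%s' "$sum" | cut -c 1-2)
    rest=$(printf '%s' "$sum" | cut -c 3-)
    mkdir -p "$cache/$prefix"
    # Identical content maps to the same path, so it is only stored once.
    [ -e "$cache/$prefix/$rest" ] || cp "$1" "$cache/$prefix/$rest"
}

# The same data appears in both projects under different names.
printf 'same data\n' > "$work/project-a/train.csv"
printf 'same data\n' > "$work/project-b/input.csv"

store "$work/project-a/train.csv"
store "$work/project-b/input.csv"

# Count the files that actually landed in the cache.
find "$cache" -type f | wc -l
```

Both files hash to the same value, so the second `store` call finds the entry already present and copies nothing.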

> Note that `dvc gc` can be dangerous in this scenario. See its `--projects`
> option.

## Preparation

Create a directory external to your <abbr>DVC projects</abbr> to be used as a
shared <abbr>cache</abbr> location for everyone's projects:
Create a directory outside your <abbr>DVC projects</abbr> to be used as a shared
<abbr>cache</abbr> location:

```dvc
$ mkdir -p /home/shared/dvc-cache
```

Make sure that the directory has proper permissions, so that all your colleagues
can write to it, and can read cached files written by others. The most
straightforward way to do this is to make all users members of the same group,
and have the shared cache directory owned by that group.
can write to it, and can read cached files owned by others.

## Transfer existing cache (optional)
> E.g. make all users members of a group that owns the shared cache directory.
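
One possible way to apply such permissions is sketched below. It assumes a Linux-like system and uses a throwaway directory by default so it can be tried safely. The `CACHE_DIR` variable is an assumption made for this sketch, and the commented `sudo` lines (with `ourgroup` as the group name) show roughly how the same idea might look for the real shared location.

```shell
# Hypothetical variable; defaults to a scratch directory rather than
# /home/shared/dvc-cache so that no elevated privileges are needed.
CACHE_DIR="${CACHE_DIR:-$(mktemp -d)/dvc-cache}"
mkdir -p "$CACHE_DIR"

# On the real shared directory, an administrator would first hand it to the
# common group (assumed here to be called "ourgroup"), for example:
#   sudo chgrp -R ourgroup /home/shared/dvc-cache

# rwx for owner and group, r-x for others. The leading 2 sets the setgid bit,
# so entries created inside inherit the directory's group.
chmod 2775 "$CACHE_DIR"

# Show the resulting mode and ownership.
ls -ld "$CACHE_DIR"
```

The setgid bit is a design choice rather than a requirement: it keeps new cache entries in the shared group even when users have different primary groups.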

You can skip this part if you are setting up a new DVC project where the local
<abbr>cache directory</abbr> (`.dvc/cache` by default), hasn't been used.
## Transfer existing cache (if any)

If you did work on the <abbr>DVC projects</abbr> previously and wish to transfer
its existing cache to the shared cache directory, you will simply need to move
its contents from the old location to the new one:
> Not needed for new DVC projects where the local cache hasn't been used.

For existing DVC projects to work on a new shared cache directory, first you'll
need to move their cache contents from the old location:

```dvc
$ mv .dvc/cache/* /home/shared/dvc-cache
@@ -46,26 +52,32 @@ $ sudo find /home/shared/dvc-cache -type f -exec chmod 0664 {} \;
$ sudo chown -R myuser:ourgroup /home/shared/dvc-cache/
```

## Configure the external shared cache
## Configure the shared cache

Tell DVC to use the directory we've set up above as the <abbr>cache</abbr> for
your <abbr>project</abbr>:
Tell DVC to use the directory we've set up above as _external
<abbr>cache</abbr>_ for your <abbr>project</abbr>:

```dvc
$ dvc cache dir /home/shared/dvc-cache
```

And tell DVC to set group permissions on newly created or downloaded cache
files:
And configure DVC to set group permissions on cached assets, and to enable all
[link types](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache):

```dvc
$ dvc config cache.shared group
$ dvc config cache.type 'reflink,symlink,hardlink,copy'
```

> See `dvc cache dir` and `dvc config cache` for more information.
⚠️ Note that you can't manually modify tracked data with the above `cache.type`
value. Soft/hard links are disabled by default for this reason, but are needed
for a reliable shared cache.
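
To see why in-place edits are a problem with link-based cache types, here is a minimal sketch that does not involve DVC at all. The directory and file names are hypothetical. A hard link is used to stand in for a linked workspace file, and the edit made through it shows up in the "cached" copy as well.

```shell
# Throwaway sandbox; all names are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/cache" "$demo/workspace"

# A file standing in for a cache entry.
printf 'v1\n' > "$demo/cache/abc123"

# Link it into the workspace instead of copying it.
ln "$demo/cache/abc123" "$demo/workspace/data.txt"

# Edit the workspace file in place.
printf 'edited\n' >> "$demo/workspace/data.txt"

# The cache entry now contains the edit too, since both names
# refer to the same underlying file.
cat "$demo/cache/abc123"
```

DVC provides the `dvc unprotect` command for making a tracked file safe to edit, which is the usual way around this limitation.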

> See `dvc cache dir` and `dvc config cache` for more details on the above
> steps.

If you're using Git, commit changes to your project's config file (`.dvc/config`
by default):
If using Git, commit the changes to your project's configuration (in
`.dvc/config` by default):

```dvc
$ git add .dvc/config
2 changes: 1 addition & 1 deletion content/docs/user-guide/managing-external-data.md
@@ -31,7 +31,7 @@ Currently, the following types (protocols) of external outputs (and
In order to specify an external output for a stage file, use the usual `-o` or
`-O` options of `dvc run`, but with the external path or URL to the file in
question. For <abbr>cached</abbr> external outputs (`-o`) you will need to
[set up an external cache](/doc/use-cases/shared-development-server#configure-the-external-shared-cache)
[set up an external cache](/doc/use-cases/shared-development-server#configure-the-shared-cache)
in the same external/remote file system first.

> Avoid using the same location of the