Commit

use-cases: address pending feedback from /pull/821
Specifically #821 (review)
jorgeorpinel committed Dec 10, 2019
1 parent 2be074f commit a9a9bda
Showing 2 changed files with 34 additions and 35 deletions.
67 changes: 33 additions & 34 deletions static/docs/command-reference/get.md
@@ -1,10 +1,10 @@
# get

Obtain a file or directory from any <abbr>DVC project</abbr> or Git repository
Download a file or directory from any <abbr>DVC project</abbr> or Git repository
(e.g. hosted on GitHub) into the current working directory.

> Unlike `dvc import`, this command does not track the obtained files (does not
> create a DVC-file).
> Unlike `dvc import`, this command does not track the downloaded files (does
> not create a DVC-file).
## Synopsis

@@ -15,14 +15,13 @@ Download/copy files or directories from DVC repository.
Documentation: <https://man.dvc.org/get>
positional arguments:
url URL of Git repository with DVC project to download
from.
path Path to a file or directory within a DVC repository.
url URL of Git repository with DVC project to download from.
path Path to a file or directory within a DVC repository.
```

## Description

Provides an easy way to obtain files or directories tracked in any <abbr>DVC
Provides an easy way to download files or directories tracked in any <abbr>DVC
repository</abbr>, both by Git (e.g. source code) and DVC (e.g. datasets, ML
models). The file or directory in path is copied to the current working
directory. (For remote URLs, it works like downloading with wget, but supporting
@@ -36,28 +35,31 @@ external <abbr>project</abbr>. Both HTTP and SSH protocols are supported for
online repositories (e.g. `[user@]server:project.git`). `url` can also be a
local file system path to an "offline" repository.

The `path` argument of this command is used to specify the location of the file
or directory within the source project. If the file is a
[DVC-file](/doc/user-guide/dvc-file-format) the source project must have a
default [DVC remote](/doc/command-reference/remote) configured.
The `path` argument of this command is used to specify the location, within the
source repository at `url`, of the target(s) to be downloaded. It can point to
any file or directory in the source project, including all files tracked by Git.
Note that data tracked by DVC should be specified in one of the
[DVC-files](/doc/user-guide/dvc-file-format) of the source repository. (In this
case, a default [DVC remote](/doc/command-reference/remote) needs to be
configured in the project, containing the actual data.)

> See `dvc get-url` to obtain data from other supported URLs.
> See `dvc get-url` to download data from other supported URLs.
After running this command successfully, the data found in the `url` `path` is
created in the current working directory, with its original file name.

## Options

- `-o`, `--out` - specify a path (directory and/or file name) to the desired
location to place the obtained file in. The default value (when this option
location to place the downloaded file in. The default value (when this option
isn't used) is the current working directory (`.`) and original file name. If
an existing directory is specified, then the output will be placed inside of
it.

- `--rev` - specific
[Git revision](https://git-scm.com/book/en/v2/Git-Internals-Git-References)
(such as a branch name, a tag, or a commit hash) of the DVC repository to
obtain the file from. The tip of the default branch is used by default when
download the file from. The tip of the default branch is used by default when
this option is not specified.

- `-h`, `--help` - prints the usage/help message, and exit.
@@ -67,18 +69,14 @@ created in the current working directory, with its original file name.

- `-v`, `--verbose` - displays detailed tracing information.

## Example: Retrieve a model from a DVC remote
## Example: Get a DVC-tracked model file

> Note that `dvc get` can be used from anywhere in the file system, as long as
> DVC is [installed](/doc/install).
We can use `dvc get` to obtain the resulting model file from our
We can use `dvc get` to download the resulting model file from our
[get started example repo](https://github.com/iterative/example-get-started), a
<abbr>DVC project</abbr> external to the current working directory. The desired
<abbr>output</abbr> file would be located in the root of the external project
(if the
[`train.dvc` stage](https://github.com/iterative/example-get-started/blob/master/train.dvc)
was reproduced) and named `model.pkl`.
<abbr>DVC project</abbr> hosted on GitHub:

```dvc
$ dvc get https://github.com/iterative/example-get-started model.pkl
@@ -96,18 +94,18 @@ is found, that specifies `model.pkl` in its outputs (`outs`). DVC then
its
[config file](https://github.com/iterative/example-get-started/blob/master/.dvc/config)).

> A recommended use for obtaining binary files from DVC repositories, as done in
> this example, is to place a ML model inside a wrapper application that serves
> as an [ETL](https://en.wikipedia.org/wiki/Extract,_transform,_load) pipeline
> or as an HTTP/RESTful API (web service) that provides predictions upon
> request. This can be automated leveraging DVC with
> A recommended use for downloading binary files from DVC repositories, as done
> in this example, is to place a ML model inside a wrapper application that
> serves as an [ETL](https://en.wikipedia.org/wiki/Extract,_transform,_load)
> pipeline or as an HTTP/RESTful API (web service) that provides predictions
> upon request. This can be automated leveraging DVC with
> [CI/CD](https://en.wikipedia.org/wiki/CI/CD) tools.
The same example applies to raw or intermediate <abbr>data artifacts</abbr> as
well, of course, for cases where we want to obtain those files or directories
well, of course, for cases where we want to download those files or directories
and perform some analysis on them.

## Examples: Retrieve a file from a git repository
## Examples: Get a Git-tracked model file

We can also use `dvc get` to retrieve any file or directory that exists in a git
repository.
@@ -121,11 +119,12 @@ install.sh
## Example: Compare different versions of data or model

`dvc get` has the `--rev` option, to specify which version of the repository to
obtain a <abbr>data artifact</abbr> from. It also has the `--out` option to
specify the target path. Combining these two options allows us to do something
we can't achieve with the regular `git checkout` + `dvc checkout` process – see
for example the [Get Older Data Version](/doc/get-started/older-versions)
chapter of our _Get Started_ section.
download a <abbr>data artifact</abbr> from. It also has the `--out` option to
specify the location to place the artifact within the workspace. Combining these
two options allows us to do something we can't achieve with the regular
`git checkout` + `dvc checkout` process – see for example the
[Get Older Data Version](/doc/get-started/older-versions) chapter of our _Get
Started_ section.
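
As a rough illustration of combining the two options, the sketch below assembles a `dvc get` command line in Python. The helper name `build_dvc_get_command` is hypothetical and not part of DVC. The repository URL, the `9-bigrams-model` revision, and the `model.bigrams.pkl` output name are taken from this example section. The script only prints the command and does not run it.

```python
import shlex


def build_dvc_get_command(url, path, rev=None, out=None):
    """Assemble a `dvc get` argument list (hypothetical helper, not part of DVC)."""
    cmd = ["dvc", "get", url, path]
    if rev is not None:
        # --rev: branch name, tag, or commit hash of the source repository.
        cmd += ["--rev", rev]
    if out is not None:
        # -o / --out: where to place the downloaded file or directory.
        cmd += ["-o", out]
    return cmd


repo = "https://github.com/iterative/example-get-started"
cmd = build_dvc_get_command(
    repo, "model.pkl", rev="9-bigrams-model", out="model.bigrams.pkl"
)
# Print the command as a single shell-style line for inspection.
print(shlex.join(cmd))
```

If DVC is installed and the repository is reachable, the same list could be passed to `subprocess.run(cmd, check=True)` to perform the download. Keeping the arguments in a list avoids shell-quoting problems with unusual file names.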

Let's use the
[get started example repo](https://github.com/iterative/example-get-started)
@@ -159,7 +158,7 @@ get the most recent one, we use a similar command, but with
`-o model.bigrams.pkl` and `--rev 9-bigrams-model` or even without `--rev`
(since it's the latest version anyway). In fact, in this case using `dvc pull`
with the corresponding [DVC-files](/doc/user-guide/dvc-file-format) should
suffice, obtaining the file as just `model.pkl`. We can then rename it to make
suffice, downloading the file as just `model.pkl`. We can then rename it to make
its version explicit:

```dvc
2 changes: 1 addition & 1 deletion static/docs/use-cases/data-registry.md
@@ -85,7 +85,7 @@ use-cases/cats-dogs
└── dogs [400 image files]
```

In a local DVC project, we could have obtained this dataset at this point with
In a local DVC project, we could have downloaded this dataset at this point with
the following command:

```dvc
