Showing 11 changed files with 108 additions and 61 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Versioned storage | ||
|
||
What if we could **combine data and ML model versioning features with large file | ||
storage** solutions like traditional hard drives, NAS, or cloud services such as | ||
Amazon S3 and Google Drive? DVC brings together the best of both worlds by | ||
implementing easy synchronization between the data <abbr>cache</abbr> and | ||
on-premises or cloud storage for sharing. | ||
|
||
![](/img/model-versioning-diagram.png) _DVC's hybrid versioned storage_ | ||
|
||
> Note that [remote storage](/doc/command-reference/remote) is optional in DVC: | ||
> no server setup or special services are needed, just the `dvc` command-line | ||
> tool. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# Add Output to a Stage | ||
|
||
There are situations where we have executed a stage (either by writing | ||
`dvc.yaml` manually and using `dvc repro`, or with `dvc run`), but later notice | ||
that some of the output files or directories it creates, which are already in | ||
the <abbr>workspace</abbr>, are missing from the `outs` field of `dvc.yaml`. Follow | ||
the steps below to add existing files or directories as <abbr>outputs</abbr> to | ||
a stage without executing it again, which can be expensive or time-consuming | ||
and is unnecessary. | ||
|
||
We start with an example stage, `prepare`, which has a single output. To add a missing | ||
output `data/validate` to this stage, we can edit `dvc.yaml` like this: | ||
|
||
```git | ||
 stages: | ||
   prepare: | ||
     cmd: python src/prepare.py | ||
     deps: | ||
       - src/prepare.py | ||
     outs: | ||
       - data/train | ||
+      - data/validate | ||
``` | ||
|
||
> Note that you can also use `dvc run` with the `-f` and `--no-exec` options to | ||
> add another output to the stage: | ||
> | ||
> ```dvc | ||
> $ dvc run -f --no-exec \ | ||
>           -n prepare \ | ||
>           -d src/prepare.py \ | ||
>           -o data/train \ | ||
>           -o data/validate \ | ||
>           python src/prepare.py | ||
> ``` | ||
> | ||
> `-f` overwrites the stage in `dvc.yaml`, while `--no-exec` updates the stage | ||
> without executing it. | ||
|
||
Finally, we need to run `dvc commit` to save the newly specified output(s) to | ||
the <abbr>cache</abbr> (and to update the corresponding hash values in | ||
`dvc.lock`): | ||
|
||
```dvc | ||
$ dvc commit | ||
``` |
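
The `dvc.yaml` edit described in this guide could also be scripted. The following is a minimal sketch, not part of DVC itself: it assumes `dvc.yaml` has already been parsed into a plain Python mapping (for example, with a YAML library of your choice), and the helper name `add_output` is hypothetical. It only handles outputs written as plain path strings.

```python
import copy


def add_output(dvc_yaml: dict, stage: str, path: str) -> dict:
    """Return a copy of a parsed dvc.yaml mapping with `path` listed in the stage's `outs`.

    Hypothetical helper for illustration; assumes `outs` entries are plain strings.
    """
    updated = copy.deepcopy(dvc_yaml)  # leave the caller's mapping untouched
    stages = updated.get("stages", {})
    if stage not in stages:
        raise KeyError(f"stage {stage!r} not found in dvc.yaml")
    outs = stages[stage].setdefault("outs", [])
    if path not in outs:  # avoid listing the same output twice
        outs.append(path)
    return updated


# The example stage from this guide, as it would look after parsing dvc.yaml.
pipeline = {
    "stages": {
        "prepare": {
            "cmd": "python src/prepare.py",
            "deps": ["src/prepare.py"],
            "outs": ["data/train"],
        }
    }
}

updated = add_output(pipeline, "prepare", "data/validate")
print(updated["stages"]["prepare"]["outs"])  # ['data/train', 'data/validate']
```

After writing the updated mapping back to `dvc.yaml`, running `dvc commit` is still needed, as described above, so that the new output is saved to the cache and `dvc.lock` is updated.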