How to doc - Adding dependency to stage #1913

Closed
wants to merge 20 commits into from
2 changes: 1 addition & 1 deletion content/docs/sidebar.json
@@ -100,7 +100,7 @@
"slug": "how-to",
"source": false,
"children": [
- "add-output-to-stage",
+ "add-dependency-or-output-to-stage",
Contributor

@jorgeorpinel jorgeorpinel Nov 9, 2020

Once merged with my changes, the slug should be add-deps-or-outs-to-a-stage and the doc title should be Add Dependencies or Outputs to a Stage but if that's too long for the nav label, let's use something else like "Add Deps/Outs to a Stage".

Contributor

* Once merged with my changes

Contributor Author

@jorgeorpinel I merged the how-to branch. If you check "Files changed" in this PR, it shows 12 files changed, but here (https://github.com/iterative/dvc.org/compare/guide/how-to) it shows 6 files changed (in the guide/how-to branch). Is there some problem or conflict? The changes in those extra 5-6 files (shown under this PR) have already been merged to master. Please help.

Contributor

Sorry, I don't know what you did in commits 6ecc645 and da629e9. You're gonna have to abandon this branch and just take over PR #1914. Please update the title and description and switch to that branch to apply again all of your changes here @imhardikj.

"undo-adding-data",
"update-tracked-files"
]
@@ -0,0 +1,50 @@
# Add Dependency or Output to Stage

There are situations where we have executed a stage (either by writing
`dvc.yaml` manually and using `dvc repro`, or with `dvc run`), but later notice
that some of the dependencies, or the output files/directories it creates, which
are already in the <abbr>workspace</abbr>, are missing from `dvc.yaml` (`deps`
Contributor

Just adding "dependencies or" everywhere is not going to be enough. Please think the changes through and then request my review. Thanks @imhardikj

Contributor Author

Pushed new update.

and `outs` fields, respectively). Follow the steps below to add existing files or
directories as <abbr>dependencies</abbr> or <abbr>outputs</abbr> of a stage
without re-executing it, which can be expensive or time-consuming and is
unnecessary here.

We start with an example stage, `prepare`, which has a single dependency and a
single output. To add a missing dependency `data/data.csv` and a missing output
`data/validate` to this stage, we can edit `dvc.yaml` like this:

```git
 stages:
   prepare:
     cmd: python src/prepare.py
     deps:
+      - data/data.csv
       - src/prepare.py
     outs:
       - data/train
+      - data/validate
```
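
This approach only makes sense for paths that already exist in the
<abbr>workspace</abbr>, so it can help to confirm that before declaring them. The
sketch below is one way to do so; `check_paths` is a hypothetical helper written
for this example, not a DVC command, and it assumes you run it from the project
root.

```shell
# Hypothetical helper (not part of DVC): report which paths exist.
# Returns non-zero if at least one path is missing.
check_paths() {
  missing=0
  for path in "$@"; do
    if [ -e "$path" ]; then
      echo "found: $path"
    else
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Paths taken from the example above.
check_paths data/data.csv data/validate ||
  echo "Create or restore the missing paths before editing dvc.yaml."
```

A plain `ls` on the same paths gives the same information; the helper only makes
the check reusable for longer lists.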

> Note that you can also use `dvc run` with the `-f` and `--no-exec` options to
> add another dependency/output to the stage:
>
> ```dvc
> $ dvc run -f --no-exec \
>           -n prepare \
>           -d data/data.csv \
>           -d src/prepare.py \
>           -o data/train \
>           -o data/validate \
>           python src/prepare.py
> ```
>
> `-f` overwrites the stage in `dvc.yaml`, while `--no-exec` updates the stage
> without executing it.

Finally, we need to run `dvc commit` to save the newly specified output(s) to
the <abbr>cache</abbr> and to update the hash values of the stage's
dependencies and outputs in `dvc.lock`:

```dvc
$ dvc commit
```
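
To confirm the result, a short follow-up along these lines may help. It assumes
the project is also a Git repository, which the text above does not state, and
the commit message is only illustrative.

```dvc
$ dvc status
$ git add dvc.yaml dvc.lock
$ git commit -m "Add missing dependency and output to prepare stage"
```

If `dvc commit` went through, `dvc status` should no longer list `prepare` as
changed. Committing `dvc.yaml` and `dvc.lock` to Git keeps the stage definition
and its recorded hashes together.
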
46 changes: 0 additions & 46 deletions content/docs/user-guide/how-to/add-output-to-stage.md

This file was deleted.