import-url/update: add --no-download flag #3773

Merged: 1 commit merged into iterative:main on Aug 29, 2022

Conversation

Contributor

@dtrifiro dtrifiro commented Jul 18, 2022

Add documentation for the new `--no-download` flag of `dvc import-url` and `dvc update`.

related: iterative/dvc#8024 (per iterative/dvc#7918)

@jorgeorpinel jorgeorpinel added the labels "A: docs" (Area: user documentation (gatsby-theme-iterative)) and "C: ref" (Content of /doc/*-reference) Jul 18, 2022
Contributor

@jorgeorpinel jorgeorpinel left a comment

Let's reorg/rewrap this under 69 cols (to avoid horizontal scrolling) and to match the order in the Options section. Committing:

content/docs/command-reference/import-url.md
content/docs/command-reference/import-url.md
Comment on lines 141 to 142
finish the operation(s)); or if the target data already exist locally and you
want to "DVCfy" this state of the project (see also `dvc commit`).
Contributor

Not directly related to the change in question but the description of --no-exec could be better. For starters, maybe say something like "create a value-less import .dvc file"? (for contrast with --no-download which includes *some* values)

This last sentence (seen above) especially is pretty hard to understand. What is "to DVCfy"? 🤔

Contributor

Also instead of "download everything later" it should be "import the data later" now, I think. Since there's a separate --no-download option now (so let's only use term download there). BUT it could state "implies --no-download".

p.s. this could be done in a separate PR but I think it's related enough to this change to address now.

p.p.s. should also apply to dvc import --no-exec.

Contributor Author

I would get rid of the second part ("DVCfy..."). It's not clear what it's supposed to do, and the command will fail if a file with the same name as what is being imported already exists locally.

Contributor

@jorgeorpinel jorgeorpinel left a comment

Suggestion on actual change

content/docs/command-reference/import-url.md (outdated)
content/docs/command-reference/update.md (outdated)
@jorgeorpinel
Contributor

Hey @dtrifiro another note for future reference: please open PRs to this repo directly on the upstream (no fork). See https://github.com/iterative/dvc.org/wiki/2.-Pull-requests-for-core-DVC-changes for context. Thanks!

@jorgeorpinel jorgeorpinel added the "⌛ status: wait-core-merge" (Waiting for related product PR merge/release) label Jul 26, 2022
@dberenbaum
Contributor

@dtrifiro Do you think it's realistic to get this drafted, reviewed, and merged this sprint?

@jorgeorpinel jorgeorpinel removed the "⌛ status: wait-core-merge" (Waiting for related product PR merge/release) label Aug 25, 2022
@dtrifiro dtrifiro marked this pull request as ready for review August 25, 2022 17:05
@dtrifiro dtrifiro force-pushed the import-url-no-download branch 2 times, most recently from cd68df1 to d146a87 August 25, 2022 17:08
@dberenbaum
Contributor

In case @jorgeorpinel didn't already mention it, please open future docs PRs from this repo instead of a fork. That way, the changes will be deployed into an example app when you open the PR and it's easy for people to see what the final docs will look like. 🙏 No need to bother for this PR.

Contributor

@dberenbaum dberenbaum left a comment

@dtrifiro LGTM except for 2 minor comments.

@jorgeorpinel Do you want to take another look or leave it to me to approve/merge?

@dberenbaum dberenbaum merged commit 00958f7 into iterative:main Aug 29, 2022
@dberenbaum
Contributor

@jorgeorpinel I merged, but feel free to leave comments if you have them

@dtrifiro dtrifiro deleted the import-url-no-download branch August 30, 2022 21:51
Contributor

@jorgeorpinel jorgeorpinel left a comment

feel free to leave comments

A few, minor:

Comment on lines +137 to +138
- `--no-exec` - create the import `.dvc` file but don't download `url` or get
checksums (assumes that the data source is valid). This is useful if you need
Contributor

checksums - no context for this term/idea. Not mentioned anywhere in the Description. Not even "hashes" are mentioned.

Contributor

Idea: could at least link to https://dvc.org/doc/user-guide/project-structure/dvc-files#output-entries. And I'd prefer the term "file hash" but up to you.

to define the project imports quickly, and import the data later (use
`dvc update` to finish the operation(s)).

- `--no-download` - create the import `.dvc` with data checksums but without
Contributor

import .dvc file

Comment on lines 106 to 107
- `--no-exec` - create the import `.dvc` file but don't download `url` (assumes
that the data source is valid). This is useful if you need to define the
Contributor

Checksums not mentioned here (inconsistent?)

Comment on lines 13 to 16
  usage: dvc import-url [-h] [-q | -v] [-j <number>] [--file <filename>]
-                       [--no-exec] [--to-remote] [-r <name>]
+                       [--no-exec | --no-download] [--to-remote] [-r <name>]
                        [--desc <text>]
                        url [out]
Contributor

This is now too long when rendered. See https://dvc.org/doc/command-reference/import-url

💅🏼 but let's regroup the args? Same for the other 2 files.
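
For readers unfamiliar with the `[--no-exec | --no-download]` notation in the usage line above: it marks the two flags as mutually exclusive. A minimal sketch of that constraint, using Python's standard `argparse` module, is below. This is an illustration only; the `build_parser` and `parse` helpers and the example URL are hypothetical and are not taken from DVC's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real CLI parser; only the two flags
    # discussed in this thread and the positional arguments are modelled.
    parser = argparse.ArgumentParser(prog="dvc import-url")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--no-exec", action="store_true")
    group.add_argument("--no-download", action="store_true")
    parser.add_argument("url")
    parser.add_argument("out", nargs="?")
    return parser


def parse(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports usage errors by exiting; treat that as a rejection.
        return None


if __name__ == "__main__":
    print(build_parser().format_usage())
    print(parse(["--no-download", "https://example.com/data.csv"]))
    print(parse(["--no-exec", "--no-download", "https://example.com/data.csv"]))
```

Passing either flag alone parses normally, while passing both makes `argparse` reject the command line, which is what the `|` in the usage string communicates.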

Comment on lines +111 to +114
- `--no-download` - create the import `.dvc` with data checksums but without
downloading the associated data. This is useful if you need to track changes in
remote data but do not (yet) need to download data to the local workspace.
Data can be later downloaded using `dvc pull`.
Contributor

Should we mention update --no-download in these?
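
To make the contrast between the two flags explicit, here is a small Python model of the behaviour described in the option texts quoted in this thread. It is a sketch for comparison, not DVC code: the `ImportBehaviour` class and `import_url_behaviour` function are hypothetical names, and the values only restate what the quoted descriptions say.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportBehaviour:
    writes_dvc_file: bool
    records_hashes: bool
    downloads_data: bool
    follow_up: str  # command the quoted docs name for finishing the import


def import_url_behaviour(no_exec: bool = False, no_download: bool = False) -> ImportBehaviour:
    """Summarise what `dvc import-url` does under each flag, per the quoted docs."""
    if no_exec and no_download:
        # The usage line lists the flags as alternatives: [--no-exec | --no-download]
        raise ValueError("--no-exec and --no-download are mutually exclusive")
    if no_exec:
        # .dvc file only: no download and no hashes; finish with `dvc update`
        return ImportBehaviour(True, False, False, "dvc update")
    if no_download:
        # .dvc file with hashes but no data; fetch it later with `dvc pull`
        return ImportBehaviour(True, True, False, "dvc pull")
    # Default: hashes are recorded and the data is downloaded right away
    return ImportBehaviour(True, True, True, "")


if __name__ == "__main__":
    for flags in ({}, {"no_exec": True}, {"no_download": True}):
        print(flags, import_url_behaviour(**flags))
```

Seen this way, `--no-exec` is the "value-less" variant mentioned earlier in the review, while `--no-download` records hashes so that changes in the remote data can be tracked without fetching it.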

with `-r`). Use `dvc pull` to get the data locally.
- `--no-download` - Update data checksums in the `.dvc` file (`md5`, `etag`, or
`checksum` fields) without actually downloading the latest data. See
`dvc import-url --no-download`/`dvc import --no-download` for more context.
Contributor

💅🏼

Suggested change
- `dvc import-url --no-download`/`dvc import --no-download` for more context.
+ `dvc import-url --no-download` or `dvc import --no-download` for more context.

@dtrifiro
Contributor Author

dtrifiro commented Sep 8, 2022

Thanks for the feedback @jorgeorpinel, here's a new PR: main...import-url-no-download-fixes

we can move the discussion there

Labels
A: docs Area: user documentation (gatsby-theme-iterative) C: ref Content of /doc/*-reference

3 participants