added a BC: workspace document #2197
Not exactly. File contents are organized in the cache with a special file structure (see https://dvc.org/doc/user-guide/project-structure/internal-files#structure-of-the-cache-directory).
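To make the "special file structure" concrete, here is a minimal Python sketch of a content-addressed layout. It assumes the scheme described on the linked page, where the first two characters of a file's MD5 hash name a subdirectory and the remaining characters name the file. The function name `cache_path` and the default `.dvc/cache` base directory are illustrative assumptions; the real base path and hashing details may differ between DVC versions.

```python
import hashlib
from pathlib import PurePosixPath


def cache_path(content: bytes, cache_dir: str = ".dvc/cache") -> PurePosixPath:
    """Return where content would live in a content-addressed cache.

    Hypothetical helper for illustration only, not DVC's actual code.
    """
    digest = hashlib.md5(content).hexdigest()  # 32 hexadecimal characters
    # First two characters name the directory, the remaining 30 name the file.
    return PurePosixPath(cache_dir) / digest[:2] / digest[2:]


if __name__ == "__main__":
    print(cache_path(b"hello"))
```

Because the path depends only on the content, two identical files map to the same cache entry regardless of their names in the workspace.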
That part is correct. And contradicts the previous part 🙂 (because "visible part" implies there's a hidden part which must be in other dirs).
Let's open this paragraph with that sentence.
But actually `add` can also download data to the workspace (see the `--out` and `--to-remote` options). Also, `import*` commands download AND track data. You may want to rephrase this part accordingly 🙂
Sometimes I think there is no command that hasn't got a duplicate, somehow :) I try to mention commands in passing. If we considered each and every option of the commands, we'd need to duplicate the command reference here, IMHO.
We may just delete the commands if you would like.
I can also list all possibilities for each functionality, like "In the workspace, you can [...] (`dvc add --out`, `dvc import-url`, ...)", but I think this will turn the document into a list of commands and options.
No need to list every command usage of course, agreed!
My point was that these 3 commands mentioned actually overlap in a way that makes the current text slightly incorrect. In any case, the main use case of `add` is not to "add" but to "track", actually. Please check each cmd ref to try to find the right terms when needed 🙂
"Download" is correct for `get`/`import` but `add` can also download (and they can all "transfer") so I'd avoid that term probably. And in fact I wouldn't even mention `get` here, since it doesn't require a DVC project/workspace. For `import` I'd try to use the cmd name as the relevant action (to "import") I guess...
Then again, `import*` also track the downloaded data 😅 ("adds"). Maybe it should be a single sentence about tracking, with `add`, `import`, and `import-url` all in the same parenthesis.
`dvc checkout` versions data with Git. Can you clarify a bit?
Modified in 6cc0f6c
That list got a little long? Prob no need for "machine learning" here since the term is found elsewhere in the doc...
Good one! But needs some clarifying (sync what with what?).
Also, people usually need checkout before commit (which is included in add/repro). So maybe something like:
"When switching between |repository| versions, use `dvc checkout` to sync DVC-tracked data with Git-tracked |metafiles|. If you manually modify the workspace status, use `dvc commit` to record the changes. And if needed, tracked data can be removed..."
This sentence looks more suitable for the use cases or the user guide. Here I would like to mention just the capabilities. Conceptually, `dvc checkout` is similar to `git checkout` and `dvc commit` is akin to `git commit`. This could be turned into a bullet list, maybe. I feel this paragraph is too dense.
But reading again, this looked more like a political ad than technical documentation. Maybe the "you can"s are spurious. Just "[...] (`dvc add`)", "[...] (`dvc get`, ...)" is better.
What do we mean by this?
We mean the user can live in the workspace and use dvc to do what they normally do with files. Create, copy, rename etc.
I see. Yes it may be a relevant note. This sentence seems to belong more to the previous paragraph though? Something like:
"There's usually no need to modify tracked data manually as DVC provides commands to safely perform any update needed, but if you do, use `dvc commit` to register the changes..."
(see my previous comment, these would have to be merged somehow).
these -> the?
But I'm still not sure I'm getting how this paragraph relates to the concept of workspace. It may just need some rephrasing, because mentioning `dvc.yaml` and `*.dvc` files could definitely be relevant. Or fit those files into the intro paragraph (probably best).
these refer to typical operations from the previous paragraph.
I think introducing `.dvc` files in the intro is a bit distracting. Could you check e85a0d7 for changes to this?
I think we can definitely incorporate mentions of `.dvc` files and `dvc.yaml` earlier, closer to the main concept definition. Metafiles are one of the most important contents of the workspace, along with the corresponding data. So the workspace is the metafiles, the corresponding data, and any Git-tracked assets (mainly code). `.dvc/` is not considered part of the workspace. The workspace is analogous to the working tree in Git.
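As a rough illustration of that definition, the Python sketch below classifies project-relative paths. The helper names `in_workspace` and `is_metafile` are hypothetical, paths are assumed to be relative to the project root, and excluding `.git/` is an assumption made by analogy with Git's working tree (only `.dvc/` is stated above).

```python
from pathlib import PurePosixPath

# Internal directories treated as outside the workspace.
# ".dvc" follows the comment above; ".git" is an assumption by analogy.
INTERNAL_DIRS = {".dvc", ".git"}


def in_workspace(path: str) -> bool:
    """True if a project-relative path is not under an internal directory."""
    parts = PurePosixPath(path).parts
    return bool(parts) and parts[0] not in INTERNAL_DIRS


def is_metafile(path: str) -> bool:
    """True for the metafiles named in this thread: dvc.yaml and *.dvc files."""
    p = PurePosixPath(path)
    return p.name == "dvc.yaml" or p.suffix == ".dvc"


if __name__ == "__main__":
    for candidate in ["data/raw.csv", "data/raw.csv.dvc", "dvc.yaml", ".dvc/config"]:
        print(candidate, in_workspace(candidate), is_metafile(candidate))
```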