guide: make proper DVC File guide + separate reference? #2059

Closed
jorgeorpinel opened this issue Dec 27, 2020 · 11 comments · Fixed by #2098
Labels
A: docs Area: user documentation (gatsby-theme-iterative) C: guide Content of /doc/user-guide type: discussion Requires active participation to reach a conclusion. type: enhancement Something is not clear, small updates, improvement suggestions

Comments

@jorgeorpinel
Contributor

jorgeorpinel commented Dec 27, 2020

Some/parts of the docs in https://dvc.org/doc/user-guide/dvc-files feel more like a reference than a guide. While it's arguable whether that is a problem (e.g. other docs in that section have the same situation), it seems we agree that an actual guide for writing dvc.yaml would be great to have, since it's such a major component of DVC projects. Some ideas (from #2052 (review)):

we could separate the dvc.yaml format explanation?
Like a short dvc.yaml reference, then a section about "stage definition" explaining what keys/values it takes, then "variables/parametrization", and then introduce loops. This way, concepts get built up step by step.

split it into subsections - one about regular simple pointers .dvc, second about dvc.yaml and start writing a proper guide in it, start building proper structure

similar docs (spacy, ansible, etc.)
I like docker-compose. Examples with one or two lines of text
to write intro and help people write the yaml themselves, that fusion in spacy projects is nicer.
we could also take a look at pypyr: They have separated the yaml definitions nicely into sections/concepts

And from #2052 (review)

we should have these references: there should be some general description, then basic layout, then other syntax like templating, loops, etc. Again, let's refer to some good examples that describe languages

And #2052 (comment)

The biggest concern is still structure... these titles (inside DVC Files) do not look good to me... They should be higher level.

  • .dvc file - is clearly a reference thing. Where do we put it?
  • do we need two separate dvc.yaml sections?
  • advanced-dvc.yaml - is too heavy and a mix of a reference (precise spec) and guide (story-like)
@jorgeorpinel jorgeorpinel added A: docs Area: user documentation (gatsby-theme-iterative) type: enhancement Something is not clear, small updates, improvement suggestions labels Dec 27, 2020
@jorgeorpinel jorgeorpinel added the type: discussion Requires active participation to reach a conclusion. label Jan 12, 2021
@jorgeorpinel
Contributor Author

@shcheklein I moved all pending structure discussions from #2052 to here (see desc.). On your Q from #2052 (comment):

do we need two separate dvc.yaml sections? why do we consider basic templating an advanced feature?

Ideally not (let's come up with a definite org here) but for now it works so we clearly separate the 2.0 features into https://dvc.org/doc/user-guide/dvc-files/advanced-dvc.yaml.

And we can call it "basic" templating but the explanation takes a while (params files, global vars, local vars, etc. etc.) so maybe it's not that basic. But if we completely extract the reference part then yeah, it should be able to fit in the single dvc.yaml guide along with multi-stages — it will still be long though but maybe that's OK/inevitable.

@jorgeorpinel
Contributor Author

jorgeorpinel commented Jan 12, 2021

I don't understand why we don't use right-side TOC instead. I find traversing through those pages difficult

@skshetry yes, maybe a single dvc.yaml guide with several H2 headers (even if it's long) will be part of the solution here. But there's one more reason we sometimes avoid that, which you can see in cmd refs with multiple Examples (e.g. https://dvc.org/doc/command-reference/remote): you may notice there are some issues, like having to scroll in a small space (see also iterative/gatsby-theme-iterative#37, iterative/gatsby-theme-iterative#38).

Can dvc.yaml fit a single page - good question. I really hope so, though I'm worried about some points...

@shcheklein it could be one guide page and one reference page somewhere else. I think we may need to extract the reference part to a new top section for DVC File/Format refs though (DVC "Language" ref). Put .dvc, dvc.yaml/lock, .dvcignore, and maybe even .dvc/ dir (internals) there... But let me finish researching other sites and come up with a final proposal ⌛

shcheklein pushed a commit that referenced this issue Jan 12, 2021
@shcheklein
Member

@jorgeorpinel

you may notice there are some issues, like having to scroll in a small space (see also iterative/gatsby-theme-iterative#37, iterative/gatsby-theme-iterative#38).

those glitches should not determine layout to my mind in any way. Those are issues that should be fixed.

But let me finish researching other sites and come up with a final proposal ⌛

yep!

@jorgeorpinel jorgeorpinel added the status: research Writing concrete steps for the issue label Jan 15, 2021
@jorgeorpinel
Contributor Author

Pending research (1/2)

On the terminology, I'm going for "${} expression", "replace", "generate" (details in #2052 (comment))

@jorgeorpinel
Contributor Author

jorgeorpinel commented Jan 15, 2021

(2/2)

On the organization of docs (Can dvc.yaml fit a single page?), here are some raw results:

Hidden 👎
  • Helm has
    (1) a long, monolithic "Topic" (guide + ref) for Charts (Chart.yaml), and
    (2-13) a series of guides about templating elsewhere. No dedicated schema ref.
    Verdict: 👎 Their docs lack easy nav menus and seem pretty hard to browse/revisit. The topic is spread across at least 13 pages.
  • Kubernetes
    (1) ConfigMap (K8s objects e.g. pod.yml) are explained in the "Overview/Configuration" section.
    (2) A disconnected how-to Config. Pod w/ ConfigMap page is provided too.
    (3) An entirely separate site for the underlying API ref exists as well.
    Verdict: 🆗 Considering their huge docs, you can find what you need (not super easy: the left nav is quite complicated). The right-side ToC nav helps explore long pages. The main guide seems too verbose though.

  • spaCy Projects (project.yml) is all
    (1) stuffed into a single, very long page. The top section is an intro guide, and the remaining sections mix guide + ref.
    Verdict: 🆗 Smart graphic design helps a lot (e.g. project.yml samples floated right, clear schema ref. tables). But the page is very long and all the nav (external and ToC) is on the left side, while the right side has space to spare.

    BTW their project.yml looks a lot like our dvc.yaml (fields description, vars, commands with deps/outputs)

Hidden 👎
  • Ansible playbooks (playbook.yml) are mentioned heavily in
    (1-4) several descriptive docs in their User Guide, and described properly in
    (5) a buried guide. The YAML schema is totally separate.
    Verdict: 👎 Too many guides in different places to cover the topic. The familiar readthedocs-site could be a good thing but the nav is super complicated.
  • docker-compose.yml is introduced in
    (1) the Compose docs, but
    (2) the Compose file page is an example-based "guided ref".
    Verdict: 👎 🆗 The schema ref. is guided and complete, with examples and some nice design elements (e.g. expandable file samples). Right-side ToC nav is long and always expanded, but somehow seems appropriate. There's a confusing, parallel doc in a GH repo though (and seems abandoned e.g. has broken links).

  • pypyr pipelines (pipelinename.yaml) are introduced in the CLI ref and explained shortly after in
    (1) its own section of docs (mix of guide + ref).
    Verdict: 👍 Single place to learn the schema and go back to as a ref. The docs site UI is so-so though.

    That said it's obviously a simple project so it's easier to organize effectively.

@jorgeorpinel jorgeorpinel removed the status: research Writing concrete steps for the issue label Jan 15, 2021
@jorgeorpinel
Contributor Author

See #2098 for a WIP proposal to address all this.

p.s. what about https://dvc.org/doc/user-guide/dvc-internals and https://dvc.org/doc/user-guide/dvcignore, should we move them into the DVC Files nav section?

@jorgeorpinel
Contributor Author

about dvc.yaml and start writing a proper guide in it, start building proper structure

Question on this: won't it be pretty repetitive with https://dvc.org/doc/start/data-pipelines (once we rewrite that page)? Meaning, how should the basic dvc.yaml User Guide differ from the Get Started: Data Pipelines?

@shcheklein
Member

The purpose of the Get Started is to give a brief overview; it can be actionable so that people can try it. In our case it's already too long (mostly because we tend to grow all the documents, and from time to time we need to DRY them).

The purpose of the User Guide is to provide comprehensive information (e.g. language spec). To some extent even Command Reference can be considered part of the User Guide for example.

It means that these documents should be very different in language and in how they are structured; otherwise something is probably not right.

@iesahin iesahin added the C: guide Content of /doc/user-guide label Oct 21, 2021
4 participants