start: complete setup for data pipelines #3998

Closed
wants to merge 6 commits into from

jorgeorpinel
Contributor

@jorgeorpinel jorgeorpinel commented Sep 27, 2022

Per #1943

... the user needs to do a few additional steps...

  • run git init
  • run dvc init
  • fetch data/data.xml
    UPDATE: Or just check out the appropriate tag from the example repo
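
As a rough illustration of the three manual steps listed above, here is a minimal shell sketch. The data source (the `dataset-registry` URL and path) is an assumption, since this thread does not say where `data/data.xml` comes from, and the `run` and `prepare_from_scratch` helpers are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch of the manual preparation steps: git init, dvc init, fetch the data.
# Set DRY_RUN=1 to print the commands instead of executing them.

# Assumption: the data lives in this registry (not stated in the PR thread).
DATA_REPO="https://github.com/iterative/dataset-registry"
DATA_PATH="get-started/data.xml"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: initialize Git and DVC in the current directory,
# then download the raw dataset to data/data.xml.
prepare_from_scratch() {
    run git init &&
    run dvc init &&
    run mkdir -p data &&
    run dvc get "$DATA_REPO" "$DATA_PATH" -o data/data.xml
}
```

Setting `DRY_RUN=1` before calling `prepare_from_scratch` in an empty directory prints the plan without touching the file system, which makes it easy to review the steps first.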

@jorgeorpinel jorgeorpinel added the 🐛 type: bug, A: docs, and C: start labels Sep 27, 2022
@shcheklein shcheklein temporarily deployed to dvc-org-start-fix-pipes-v2ffvm September 27, 2022 02:50 Inactive
@jorgeorpinel
Contributor Author

p.s. not sure this is wanted per #1943 (comment):

Making it completely reproducible and the chapters independent turns it into something like a hands-on tutorial (which Get Started is not). That means a longer read time, more distractions for users who want to get the idea quickly, etc.

Co-authored-by: Restyled.io <commits@restyled.io>
@shcheklein shcheklein temporarily deployed to dvc-org-start-fix-pipes-v2ffvm September 27, 2022 03:03 Inactive
@github-actions
Contributor

github-actions bot commented Sep 27, 2022

e985647

Link Check Report

All 2 links passed!

Co-authored-by: Jorge Orpinel <jorgeorpinel@users.noreply.github.com>
@shcheklein shcheklein temporarily deployed to dvc-org-start-fix-pipes-v2ffvm September 27, 2022 19:43 Inactive
@jorgeorpinel jorgeorpinel removed the 🐛 type: bug label Sep 28, 2022
Co-authored-by: Dave Berenbaum <dave@iterative.ai>
@shcheklein shcheklein temporarily deployed to dvc-org-start-fix-pipes-v2ffvm September 28, 2022 04:24 Inactive
Contributor Author

@jorgeorpinel jorgeorpinel left a comment

180° turn: let's just mention the Data Versioning chapter (and corresponding example repo tag) as preparation then?

p.s. maybe the Pipelines chapter should come before Data and Model Access BTW?

content/docs/start/data-management/data-pipelines.md (3 outdated review threads, resolved)
Comment on lines -62 to -63
Please also add or commit the source code directory with Git at this point.

Contributor Author

Note: I'm leaving this sentence out. Also, it's repetitive given "This should be a good time to commit the changes with Git" which is at the end of https://dvc.org/doc/start/data-management/data-pipelines#dependency-graphs-dag (after stage add).

@shcheklein shcheklein temporarily deployed to dvc-org-start-fix-pipes-v2ffvm September 28, 2022 08:05 Inactive
[Data Versioning](/doc/start/data-management/data-versioning) chapter.
You can get there by cloning the `2-track-data` tag of our
[example-get-started](https://github.com/iterative/example-get-started)
repo.
Contributor Author

I guess this part is also needed? Please commit if so:

Suggested change
- repo.
+ repo and using `dvc pull` to get the DVC-tracked data.
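
The preparation described in this suggestion (clone the `2-track-data` tag, then pull the DVC-tracked data) could look roughly like the sketch below. The `run` and `prepare_from_tag` helpers are hypothetical names for this example, and it assumes the example repo has a readable default DVC remote configured so that `dvc pull` works without extra setup.

```shell
#!/bin/sh
# Sketch: reproduce the end state of the Data Versioning chapter from the
# example repo. Set DRY_RUN=1 to print the commands instead of executing them.

EXAMPLE_REPO="https://github.com/iterative/example-get-started"
TAG="2-track-data"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: runs in a subshell so the caller's working directory
# is left unchanged. Cloning a tag leaves the new repo in detached HEAD state.
prepare_from_tag() (
    run git clone --branch "$TAG" "$EXAMPLE_REPO" &&
    run cd example-get-started &&
    run dvc pull
)
```

Compared with initializing everything by hand, this keeps the chapter's prerequisites to a single pointer at a known repo state, which matches the lighter-touch direction proposed in this review.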

@jorgeorpinel jorgeorpinel self-assigned this Sep 29, 2022
@jorgeorpinel jorgeorpinel added the p2-nice-to-have label Sep 29, 2022
@dberenbaum
Contributor

dberenbaum commented Sep 30, 2022

Also, I don't believe this should be a priority. Even if we do this, at best we'll get some small improvement. It's not worth it. We should be doing major updates (whatever that means).

@jorgeorpinel What do you think about closing this PR? It seems like we went into a detailed review without first resolving the question of whether it's worth doing at all.

Edit: Similar discussion applies to #4000

Co-authored-by: Restyled.io <commits@restyled.io>
@shcheklein shcheklein temporarily deployed to dvc-org-start-fix-pipes-v2ffvm October 4, 2022 06:17 Inactive
@jorgeorpinel
Contributor Author

If you don't approve of it, feel free to close it to avoid further discussion on a non-priority, @dberenbaum. Otherwise, if the change can be valuable, please approve and merge (since the work has been done, after all). No worries either way!

@dberenbaum
Contributor

Thanks @jorgeorpinel! Despite the narrow scope, I still think we would need more discussion here about what should be added and deleted, so I'm going to close this to avoid distracting from priorities.

@dberenbaum dberenbaum closed this Oct 4, 2022
@jorgeorpinel jorgeorpinel deleted the start/fix-pipes-setup branch November 21, 2022 21:16
Labels
A: docs Area: user documentation (gatsby-theme-iterative) C: start Content of /doc/start p2-nice-to-have Less of a priority at the moment. We don't usually deal with this immediately.

3 participants