Introduce artifacts: section in DVC and make it work with GTO #9219

Closed
13 of 20 tasks
aguschin opened this issue Mar 21, 2023 · 6 comments
Labels
A: artifacts (Related to `artifacts` section in `dvc.yaml`); epic (Umbrella issue (high level). Include here: list of tasks/PRs, details, Qs, timelines, etc); feature (is a feature); p1-important (Important, aka current backlog of things to do)

Comments

@aguschin
Contributor

aguschin commented Mar 21, 2023

Summary / Background

To make working with the Model Registry (including Studio MR) more accessible for DVC users, we are merging artifacts.yaml into DVC. A more detailed discussion is in iterative/gto#337.

Release scope

Described in iterative/mlem.ai#323, but summarized here. For now I'm excluding what we considered to be extra features (they can be found in the docs proposal, marked as [extra for now]):

Follow-ups after release (p1)

Follow-ups after release (p2)

Related issues

@aguschin added the epic and feature labels Mar 21, 2023
@shcheklein
Member

Thanks @aguschin, sounds like a good plan, overall. A few questions:

allow it to be read from a file instead of writing it in dvc.yaml completely, e.g. registry: artifacts.yaml

can we cut the scope? is it needed?

updating docs

does it make sense to start with this?

Do we need to include an item that simplifies GTO? (removes the registry files mechanics).

@aguschin
Contributor Author

can we cut the scope? is it needed?

Maybe we can, but I'm trying to find a way not to break current users' workflows. GTO doesn't have that many users, and I wouldn't like to lose them. This item can be postponed though: there will be some time before we implement this, and before we release it. So we can prioritize other things and get back to it before the release. Not to mention that some users have asked for this specifically.

On how we can avoid breaking user workflows: even if we support registry: artifacts.yaml right away, existing commits won't have that line in dvc.yaml, so artifacts in those commits won't be discovered automatically. But you can add the line to dvc.yaml and delay the migration, while still using GTO for some time (since artifacts.yaml is still there). This will allow a smooth transition.
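To make the trade-off concrete, here is a minimal sketch of what a reader of dvc.yaml would have to do if the pointer were supported. Everything in it is an assumption for illustration: the `registry` key is only the proposal discussed in this thread, `resolve_artifacts` and `load_yaml` are hypothetical names, and the precedence rule on name clashes was never decided here.

```python
from typing import Callable, Dict

Artifacts = Dict[str, dict]


def resolve_artifacts(dvc_yaml: dict, load_yaml: Callable[[str], dict]) -> Artifacts:
    """Collect artifact annotations declared inline or via a `registry:` pointer.

    `dvc_yaml` is the already-parsed dvc.yaml; `load_yaml` reads and parses
    another YAML file from the same revision (hypothetical helper).
    """
    artifacts: Artifacts = {}
    registry_path = dvc_yaml.get("registry")  # hypothetical key from this proposal
    if registry_path:
        # Legacy case: annotations still live in GTO's artifacts.yaml.
        artifacts.update(load_yaml(registry_path) or {})
    # Assumption: inline declarations win when a name appears in both places.
    artifacts.update(dvc_yaml.get("artifacts") or {})
    return artifacts


# In-memory stand-in for files stored in a Git revision.
files = {"artifacts.yaml": {"rf": {"type": "model", "path": "models/rf.pkl"}}}

legacy = resolve_artifacts({"registry": "artifacts.yaml"}, files.get)
migrated = resolve_artifacts(
    {"artifacts": {"rf": {"type": "model", "path": "models/rf.pkl"}}}, files.get
)
```

Both layouts resolve to the same mapping, which is what would let a project add the pointer first and move the entries later. The extra file read is also the added parsing step that every consumer of dvc.yaml would need to repeat.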

does it make sense to start with this?

IIRC, @dberenbaum wanted to take it and add it to the DVC docs? Although I had the impression there are some WIP changes proposed by @daavoo. Happy to help there if needed - writing docs is so much faster for me than implementing this in DVC (for now 😅 ).

@dberenbaum
Collaborator

updating docs

does it make sense to start with this?

@shcheklein I added iterative/dvc.org#4423 to discuss docs. Since @aguschin already drafted iterative/mlem.ai#323, is there something missing from there that you'd like to see before implementation?

allow it to be read from a file instead of writing it in dvc.yaml completely, e.g. registry: artifacts.yaml

can we cut the scope? is it needed?

I asked @aguschin the same thing 😄, and he explained that he will try to do it to make it easier for existing users to transition if it's low effort. I do worry about documenting this and making dvc.yaml even more complex, but I think we could try to minimize this. I would even consider not documenting it for now if it's strictly a legacy thing to help existing GTO users.

@shcheklein
Member

can we cut the scope? is it needed?

Folks, we need to be extra extra mindful about complexity and scope. Can we help the existing users (and we don't have many, right?) to transition to the new format manually, or with some short written instructions?

Complicating parsing in Studio, the dvc.yaml format, etc. unfortunately doesn't sound like a very easy thing to do.

I feel your concerns and I like the deep care for the users though @aguschin !!
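As a rough illustration of how small such a manual transition could be, the sketch below moves parsed artifacts.yaml entries under an `artifacts` key of a parsed dvc.yaml. It is not an official tool: `migrate` is a hypothetical helper, and the field renames (`description` to `desc`, `custom` to `meta`) are assumptions about how the two formats line up that should be checked against the docs proposal.

```python
from copy import deepcopy

# Assumed renames between GTO's artifacts.yaml fields and the dvc.yaml section.
RENAMES = {"description": "desc", "custom": "meta"}


def migrate(gto_artifacts: dict, dvc_yaml: dict) -> dict:
    """Return a copy of `dvc_yaml` with GTO entries added under `artifacts`."""
    result = deepcopy(dvc_yaml)
    section = result.setdefault("artifacts", {})
    for name, entry in gto_artifacts.items():
        if name in section:
            continue  # keep what is already declared in dvc.yaml
        section[name] = {RENAMES.get(key, key): value for key, value in entry.items()}
    return result


gto_artifacts = {
    "rf": {"type": "model", "path": "models/rf.pkl", "description": "Random forest"}
}
new_dvc_yaml = migrate(gto_artifacts, {"stages": {}})
```

The inputs are plain dictionaries, so any YAML library can be used to load artifacts.yaml and write dvc.yaml back. Existing `artifacts` entries are left untouched, which makes the function safe to run more than once.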

is there something missing from there that you'd like to see before implementation?

I don't know (your call). I was asking specifically because it sometimes uncovers more things. If you feel that it's stable already and we have a sense of all the changes, etc., then we're good to go.

@dberenbaum added the p1-important label Mar 22, 2023
@daavoo added the A: artifacts label Mar 24, 2023
@dberenbaum
Collaborator

@aguschin We are aiming to include this as part of a 3.0 release at the end of this quarter. Do you expect that we can get all of the items above done? Do you think there is more that needs to be added before release?

@aguschin
Contributor Author

aguschin commented May 2, 2023

@dberenbaum, yes, I'm sure we'll get it done. And it looks like the issue's top comment already lists all the items to do.

aguschin added a commit to iterative/gto that referenced this issue May 22, 2023
…346)

Related to iterative/dvc#9219

One question I'd like to discuss: instead of removing `gto describe`
(and maybe `gto annotate` and `gto remove`), we could keep them. At
least for `describe` it's pretty trivial to support - we can call DVC
API to read annotations.

Also this can be helpful if we're planning to implement the command in
DVC - here we can implement and check it works and satisfies user needs.
4 participants