
Add API & CLI to parse artifact history #37

Closed
aguschin opened this issue Mar 5, 2022 · 6 comments · Fixed by #72
Labels
external-product (External products require this feature), p1-high (High priority)

Comments

@aguschin
Contributor

aguschin commented Mar 5, 2022

Something like gto history $ALIAS should answer the following questions:

  1. In which commits the model was present.
  2. In which commits descriptions/tags were changed.
  3. In which commits versions were registered.
  4. In which commits envs were promoted.

History can be linear either by commits (ordered by position in the commit history) or by time (ordered by timestamp).
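
To make the two orderings concrete, here is a minimal sketch, not GTO's actual API. The HistoryEvent class, its fields, and the sample data are all hypothetical. It assumes an event's timestamp can differ from its commit's position, for example when a version is registered on an older commit.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Sequence


@dataclass(frozen=True)
class HistoryEvent:
    """Hypothetical record of one thing that happened to an artifact."""
    commit: str          # short commit hash the event is attached to
    timestamp: datetime  # when the event itself happened
    kind: str            # "commit", "description", "registration" or "promotion"
    detail: str


def by_time(events: Iterable[HistoryEvent]) -> List[HistoryEvent]:
    """History aligned with time: order events by their own timestamp."""
    return sorted(events, key=lambda e: e.timestamp)


def by_commit_order(
    events: Iterable[HistoryEvent], commit_order: Sequence[str]
) -> List[HistoryEvent]:
    """History aligned with commits: order events by commit position.

    The sort is stable, so events on the same commit keep their input order.
    """
    position = {sha: i for i, sha in enumerate(commit_order)}
    return sorted(events, key=lambda e: position[e.commit])


# Made-up sample data, oldest commit first.
COMMIT_ORDER = ["abc1234", "def5678"]
EVENTS = [
    HistoryEvent("abc1234", datetime(2022, 3, 1), "commit", "model present"),
    HistoryEvent("abc1234", datetime(2022, 3, 5), "registration", "v1 registered"),
    HistoryEvent("def5678", datetime(2022, 3, 3), "description", "description changed"),
]

if __name__ == "__main__":
    for event in by_time(EVENTS):
        print(event.timestamp.date(), event.commit, event.kind, event.detail)
```

With this sample, the registration sorts last by time but stays next to its commit when ordered by commits.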

Something like what we see in Yaroslav's Studio UI mockups should work well.

@aguschin
Contributor Author

aguschin commented Mar 9, 2022

One more thing: we may be able to pin down the commits in which the artifact actually changed (i.e. the artifact under the path changed or its enrichments changed). One option is to do this via yourpackage.gto modules, which GTO can discover automatically if yourpackage is installed (e.g. DVC, MLEM, or any other enrichment tool).
Then the history could look like this:

- abc1234: metrics and plots changed (DVC)        <- click to learn more 
- 1234abc: artifact description changed (GTO)
- 123abcd: model framework changed, metrics and plots changed (MLEM + DVC)
- (other relevant and easily interpreted things)
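
The discovery idea could be sketched roughly as follows. This is only an illustration of the proposed convention, not an existing GTO feature: the function name discover_enrichments and the submodule parameter are hypothetical, and only standard importlib calls are used.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Dict, Iterable


def discover_enrichments(
    packages: Iterable[str], submodule: str = "gto"
) -> Dict[str, ModuleType]:
    """Import `<package>.<submodule>` for every package that provides it.

    Packages that are not installed, or that have no such submodule,
    are silently skipped.
    """
    found: Dict[str, ModuleType] = {}
    for package in packages:
        name = f"{package}.{submodule}"
        try:
            # find_spec imports the parent package and raises
            # ModuleNotFoundError when the parent is not installed.
            spec = importlib.util.find_spec(name)
        except ImportError:
            spec = None
        if spec is not None:
            found[package] = importlib.import_module(name)
    return found


if __name__ == "__main__":
    # Which of these tools (if installed) ship a `gto` submodule?
    print(sorted(discover_enrichments(["dvc", "mlem"])))
```

Each discovered module would then need some agreed hook that reports what changed in a given commit. That interface is left open here because the thread does not define it.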

This can be implemented outside of GTO, DVC and MLEM, for example as custom code in Studio BE. I would prefer, though, to make this accessible to everyone without Studio, in the GTO CLI/API. Obviously, Studio will provide a much smoother UI for it, but the option to see a meaningful history of your artifact in the CLI sounds like a very powerful feature.

@mike0sv what do you think about it? Can we try to draft an implementation for mlem.gto to see how it works together?
CC @dmpetrov

aguschin added this to the First GTO release milestone Mar 9, 2022
aguschin added the external-product and p1-high labels Mar 9, 2022
@aguschin
Contributor Author

aguschin commented Mar 9, 2022

This is needed for Studio BE in the first release of MR. The alternative is to implement this on their side, but this is a very powerful feature that I would like to see implemented in GTO anyway. Let's see if we can do that now, so that Studio BE can rely on our internal implementation from the start.

@aguschin
Contributor Author

aguschin commented Mar 9, 2022

Hi folks, tagging you after MLEM call so we can discuss this issue here.
@dmpetrov @shcheklein @mike0sv @madhur-tandon @omesser

@aguschin
Contributor Author

Another question: should we return the internal history representation (e.g. BaseLabel, BaseVersion, BaseCommit Python class instances), like we do in MLEM? Or should we return a JSON representation with simple types?
CC @mike0sv to discuss nuances of this.

@mike0sv
Contributor

mike0sv commented Mar 10, 2022

Since you will still have to install gto, and those classes are pydantic models, I'd say we can return them. Users can call .dict() or .json() on them if they want to.
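
As a rough illustration of that trade-off, here is a dependency-free sketch that uses standard-library dataclasses as a stand-in for the pydantic models. The class names, the get_history function, and the sample data are hypothetical and are not GTO's real classes. The API returns typed objects, and the caller converts them to simple types only when needed.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class VersionSketch:
    """Hypothetical stand-in for a class such as BaseVersion."""
    name: str
    commit: str


@dataclass
class ArtifactHistorySketch:
    """Hypothetical container returned by the API."""
    alias: str
    versions: List[VersionSketch] = field(default_factory=list)


def get_history(alias: str) -> ArtifactHistorySketch:
    # Hard-coded sample data; a real implementation would inspect the repo.
    return ArtifactHistorySketch(
        alias=alias, versions=[VersionSketch(name="v1", commit="abc1234")]
    )


history = get_history("mymodel")  # typed objects for Python callers
as_dict = asdict(history)         # plain dicts, lists and strings
as_json = json.dumps(as_dict)     # JSON text for non-Python consumers

if __name__ == "__main__":
    print(history.versions[0].name)
    print(as_json)
```

With pydantic models the conversion step would be the .dict() or .json() call mentioned above instead of asdict and json.dumps.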

@mike0sv
Contributor

mike0sv commented Mar 10, 2022

also tagging iterative/mlem#168 here
