MLEM Release blog post #3575
Gatsby Cloud Build Report (dvc.org): 🎉 Your build was successful! Build time: 1m.
ML model registries give your team key capabilities:

- Collect and organize model [versions] from different sources effectively,
  preserving their data provenance and lineage information.
- Share metadata including [metrics and plots][mp] to help use and evaluate
  models.
- A standard interface to access all your ML artifacts, from early-stage
  [experiments] to production-ready models.
- Deploy specific models on different environments (dev, shadow, prod, etc.)
  without touching the applications that consume them.
- For security, control who can manage models, and audit their usage trails.

Many of these benefits are built into DVC: Your [modeling process] and
[performance data][mp] become **codified** in Git-based <abbr>DVC
repositories</abbr>, making it possible to reproduce and manage models with
standard Git workflows (along with code). Large model files are stored
separately and efficiently, and can be pushed to [remote storage] -- a scalable
access point for [sharing].

To make a Git-native registry (on top of DVC or not), one option is to use [GTO]
(Git Tag Ops). It tags ML model releases and promotions, and links them to
artifacts in the repo using versioned annotations. This creates abstractions for
your models, which lets you **manage their lifecycle** freely and directly from
Git.
This part I took from @jorgeorpinel's PR: #3333
If you want to reuse explanations from other places that's fine but rephrase them in your own words (the way you understand it). Blog posts should have a consistent author's voice IMO.
It's OK to have very small sections (an admonition, a sentence or two) copy/pasted between blog and docs.
@jendefig @jorgeorpinel @jurv11 I would be glad to get some comments. I added my part of the text very quickly and this is WIP, so I'm not sure you need to provide very detailed feedback for this iteration. Does the structure work? Do some examples seem irrelevant? Did I miss any big ideas that should be demonstrated down the road? Thanks!
We’re excited to announce the launch of our latest open source offering,
[MLEM](https://mlem.ai)! MLEM is a tool that automatically extracts meta
information like environment and frameworks from models and standardizes that
information into a human-readable format within Git. ML teams can then use the
model information for deployment into downstream production apps and services.
MLEM easily connects to solutions like Heroku to dramatically decrease model
deployment time.
picture: 2022-05-24/mlem-rocket.png
author: aguschin
# commentsUrl: TODO
tags:
- Machine Learning
- Deployment
- Model Registry
- MLOps
---

We built MLEM to address issues that MLOps teams have around managing model
information as they move them from training and development to production and,
ultimately, retirement. MLEM is meant to help teams automate the collection of
information around how the model was trained, what the model is for, and
operational requirements around deployment.

Just like all our [other](https://dvc.org) [tools](https://cml.dev), MLEM uses
your Git service to store model information and connects with CI/CD solutions
for deployment (like Heroku). This Git-based model
([one of our core philosophies](https://iterative.ai/why-iterative/)) aligns
model operations and deployment with software development teams – information
and automation is all based on familiar DevOps tools – so that deploying any
model into production is that much faster.

With MLEM, ML teams get:

- Human-readable information about a model for search and documentation
- One-step automated deployment across any cloud
- Fast model registry setup based on Git
This part I took from @jurv11's doc.

MLEM takes the different models (on the MLEM rocket ship with the MLEM dog) and delivers them for deployment to the different stars in space. It's too late to change the image.

@aguschin the structure makes sense.

P.S. I think I know what the issue is: we mention Git in the abstract and intro but never explain (in the codification section) that you can version .mlem files with Git, bringing you to GitOps. That context is a missing piece of the puzzle rn.
This is looking really good! Added some thoughts/changes/questions/comments
information around how the model was trained, what the model is for, and
operational requirements around deployment.

Just like all our [other](https://dvc.org) [tools](https://cml.dev), MLEM uses
Instead of doing these two links, maybe we should send to... OK, never mind. I thought we had a product page at iterative.ai, but it's just a drop-down. cc: @jurv11 @julieg18, we should add this to the website list if we don't have it on there yet. There's the pricing page, which shows all the tools, but that's not where we would want to send people in this case.
[gitops]: https://www.gitops.tech/

MLEM is a core building block for a Git-based ML model registry, together with
other Iterative tools, like GTO and DVC.
GTO: other than at ODSC East and to those who have found the repo, we haven't really exposed GTO. We probably need more links/explanation/docs/repo pointers here.
Also, I'm realizing we need to address that in the image for Twitter...
Yes, I think we need something. I'm also thinking about a technical page that explains how to set up MLEM + GTO + DVC together.
Nice work @aguschin!
Some comments:
- The intro is very long: a whole screen of text that goes into explanations about Git, etc. without first giving me even an idea of what the tool is about. My 2c: start simpler with "With MLEM, ML teams get:", then some before/after side by side, then some deployment magic. Explanations can go in the middle. That's a bit extreme; the best format is probably something in between :)
- "... codification": not sure this is the best term. Codification is still niche, so it's probably better to avoid it, be more explicit, or use it together with an explanation.
- DVC pipelines: if we want to include them, let's do a separate section at the end that describes storage and pipelines. Otherwise it makes the text too complicated; we can't expect people to know DVC, etc.
- "The main goal of MLEM is to provide you a single tool that enables any kind of model productionization scenarios.": why don't we mention this at the very beginning of the blog post?
- Git-native: on the fence here about using it in the title 🤔
- What's next: need to put in an image and make it more actionable? Start - can be an emoji, etc. Can we run some competition or some viral thingy on Twitter here? cc @jendefig

With MLEM, ML teams get:

- **Model metadata codification**: Human-readable information about a model for
It looks like codification is only for search and docs, but this undersells it. This meta-information is needed to deploy things in the first place, to build clients faster, etc. This is the main purpose.
Ideally we can converge this into a single value prop: packaging models to deploy, with everything else coming as a benefit on top?
Otherwise we start with some philosophy, then go into codification, and only after that into deployment, and only after that into the model registry. It feels like it should present things the other way around: the high-level solution / value prop first, then the implementation details. Or at least they should come really close to each other.
I hope it makes sense :) Happy to brainstorm on this more if needed.
If "codification" is too niche or technical, maybe speak of the user benefits, like "reliable, standard metadata".
Thank you for your review, @shcheklein!
- It seems I can't distribute the first section's paragraphs ("We built MLEM to address...", "Just like all our...", and "Capturing model-specific...") anywhere except the second section, "Model metadata codification", at least in their current form. I can try to rewrite them and move them to the rest of the document, but after addressing your other comments that may not be needed anymore. Please let me know WDYT.
- I think we need to use "codify". It sounds great and explains what MLEM does with meta-information in a single word, which is good for quicker explanations later. I've provided some description of codification right after the first occurrence of the word. Do you think it's enough?
- Removed the DVC code examples.
- I think this is addressed now.
- "Productionize your models with MLEM in a Git-native way", maybe?
- I put in a picture with a dog asking for the stars for now :)
Is this ready for release, @aguschin?
Yes, unless @shcheklein or @dmpetrov wants to provide some feedback. If you need this ASAP, I think it's OK to take it as is.
Found some grammar/typos
Docker Image, or export it as some special format (like `.onnx` which is coming
soon).

```shell |
@iterative/websites do we have syntax highlighters ready for MLEM?
We have a Gatsby Cloud issue that is preventing us from merging #3396. It's already available on other websites.
@yathomasi, we can't use the `cli` highlighter here yet either, right?
No description provided.