This repository has been archived by the owner on Jan 5, 2024. It is now read-only.

start: clarify existing content #265

Closed
wants to merge 14 commits into from

Conversation

and move some info to the UG index page

rel. #238 (comment)
@jorgeorpinel jorgeorpinel added A: docs Area: user documentation (gatsby-theme-iterative) C: get-started Content of /doc/get-started labels Dec 13, 2022
@jorgeorpinel jorgeorpinel requested a review from a team December 13, 2022 19:33
Contributor Author

@jorgeorpinel jorgeorpinel left a comment

Explanation of changes so far 👇🏼

content/docs/get-started/index.md (three review threads, resolved)
Comment on lines +6 to +9
## Codification: the MLEM way

Please choose from the navigation sidebar to the left, or click the `Next`
button below ↘
Saving machine learning models to files or loading them back into Python objects
may seem like a simple task at first. For example, the `pickle` and `torch`
Contributor Author

@jorgeorpinel jorgeorpinel Dec 13, 2022

Moved all this to the User Guide

... here (edited a little 🙂)

p.s. we can expand on this page, probably summarizing the usage basics and removing some details from the sub-pages (separate task)

Contributor

what exactly do you suggest moving?

Contributor Author

@jorgeorpinel jorgeorpinel Dec 27, 2022

@aguschin here I was just explaining what I did (moving the codification background info to the UG).

If you referred to "removing some details from the sub-pages (separate task)", to start with, the info from Basic concepts and Working with models could be put in the UG index (ideally rewritten in a way that explains why you should learn this paradigm, similar to what we're doing in DVC).

But again, that would be a separate task (not for this PR). And I can help.

Contributor

I suggest moving it to basic-concepts (see here)

Contributor Author

Hmm. I don't love that page in general (Basic Concepts). What's its purpose? It only covers MLEM Object, which is a cornerstone concept, so it can probably also be absorbed into the guide index page (or somewhere else).

Also, I rewrote most of the text here @omesser , so it's not the same version as what you have in https://github.com/iterative/mlem.ai/pull/281/files#diff-9e5ea1c7881a8d2248846bcfbd5ed700fb26749b25a3fc0fcb335e7c82489c17

Restyle start: clarify existing content
@shcheklein shcheklein temporarily deployed to mlem-ai-start-edits-ojgyvycqih December 13, 2022 19:40 Inactive
@github-actions github-actions bot commented Dec 13, 2022

Link Check Report

All 28 links passed!

@jorgeorpinel jorgeorpinel marked this pull request as ready for review December 14, 2022 01:16
Comment on lines 58 to +60
## Making requests

You also can create MLEM Client for your deployment to make some requests:
You also can create a MLEM client for your deployment from Python code
Contributor Author

@jorgeorpinel jorgeorpinel Dec 14, 2022

I linked to this section from the Docker deployment section of the GS (which is the main/happy path). However, the GS mentions curl and mlem apply-remote, which are not covered here.

Should this section of the guide be more comprehensive?

Contributor Author

@jorgeorpinel jorgeorpinel Dec 14, 2022

  • Also, there are several similar sections in other guides (specifically 1, 2, 3, 4). Should they be consolidated somehow?

Contributor

However the GS mentions curl and mlem apply-remote, which are not covered here

mentioning and giving a link to doc/user-guide/serving/index.md should be enough I think

Contributor Author

OK, I see. The structure of the UG is a bit confusing then, because we would link from a section about deployment to a guide about serving when there's a specific guide about deployment... ⏳

Contributor Author

@jorgeorpinel jorgeorpinel Dec 27, 2022

I updated the link, but the guide and/or terminology is def. still quite confusing.

  • We should clarify somehow (separate task)

p.s. it may be a UX issue (product question) but hopefully we can at least help with good docs until that's addressed.

Contributor

➕ Agree we should work on terminology 🙏
I think @jorgeorpinel nailed an important UX gap here.
I'm gonna guess this is completely clear for whoever works on mlem because of how it's built and behaves, but in the "wild", serving and deploying models are often used interchangeably, and that can cause a lot of confusion.
Definitely a separate discussion - a change in the fundamental abstraction/terminology will cause major changes in the docs.

Comment on lines -26 to +28
Servers automatically create endpoints from model methods with payload schemas
corresponding to serialized dataset types.
The server is now running and listening for requests on the URL shown
above.Endpoints are created automatically from model methods (using the
`sample_data` [provided earlier](#saving-your-model)). You can open the
Contributor Author

@jorgeorpinel jorgeorpinel Dec 14, 2022

Just some edits for consistency with the language used in the GS.

  • But TBH this page, as well as the Building and Deploying index pages, are not in guide format. If I remember correctly they are copies of old versions of the Get Started... Should rework 😄 (separate task)

## Get Started with MLEM
-->

<admon type="info">
Contributor

@aguschin aguschin Dec 20, 2022

This looks strange right at the top of the tutorial https://mlem-ai-start-edits-ojgyvycqih.herokuapp.com/doc/get-started?tab=Heroku-app. I'd argue we don't need this admon

Contributor Author

@jorgeorpinel jorgeorpinel Dec 27, 2022

Well, it's 3rd-party tool documentation, so it should be separated somehow IMO. Would you prefer a hidden <details> section? I wouldn't hide it since the steps are required to follow the tutorial. In my current proposal, it looks like this:


p.s. if the admonition seems too loud to you, maybe we can use an old-school block quote (like this paragraph). Lmk


1. Model methods: Like `predict` and `predict_proba`
2. Input data schema: Describes the dataframe (Iris dataset)
3. Python Requirements: `sklearn` and `pandas` in this case, with the specific
versions used to train the model

<admon type='tip'>
<admon type="info">
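For context in this thread: the three items listed above are roughly what MLEM codifies into the model's `.mlem` sidecar file. An illustrative sketch only (field names and pinned versions are approximate, not MLEM's exact schema):

```yaml
# Illustrative only - approximate field names, not MLEM's exact schema
object_type: model
model_type:
  type: sklearn
  methods:           # exposed as endpoints when serving
    - predict
    - predict_proba
input_schema:
  type: dataframe    # inferred from the sample_data (Iris features)
requirements:        # hypothetical pins captured at save time
  - scikit-learn==1.0.2
  - pandas==1.4.0
```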
Contributor

💡 looked more appealing to me than ℹ️. Just complaining; if that's our standard for all docs websites, please ignore.

Contributor Author

@jorgeorpinel jorgeorpinel Dec 27, 2022

Well, is this a tip or is it related info? The way each one looks could be changed, but that's a design question IMO. No strong opinion though, feel free to revert this if you prefer.

Comment on lines -170 to -182
#### Why do it the MLEM way ?

Saving models to files or loading them back into python objects may seem like a
deceptively simple task at first. For example, `pickle` and `torch` libraries
can serialize/deserialize model objects to/from files. However, MLEM adds some
"special sauce" by inspecting the objects and serializing their metadata into
`.mlem` files and intelligently using this later on. This metadata is necessary
to reliably enable actions like packaging and serving of different models types
down in various ways. MLEM allows us to automate a lot of the pain points we
would hit later on in our ML workflow by codifying and managing this metadata
about our models (or other objects) for us.

</details>
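For contrast with the removed passage above: plain `pickle` round-trips the object but records none of that metadata. A minimal stdlib-only sketch (the `TinyModel` class is hypothetical, not from the docs):

```python
import io
import pickle

class TinyModel:
    """Stand-in for a trained model object."""
    def predict(self, xs):
        return [x * 2 for x in xs]

# Plain pickling stores only the object's bytes...
buf = io.BytesIO()
pickle.dump(TinyModel(), buf)
buf.seek(0)
restored = pickle.load(buf)
print(restored.predict([1, 2, 3]))  # -> [2, 4, 6]

# ...but nothing in the stream says which methods to expose as endpoints
# or which packages to install. That's the metadata MLEM serializes into
# `.mlem` files alongside the artifact.
```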
Contributor

My take: I'd keep this section. It's a bit too abstract for my taste, so I suggest putting an example here, saying something like:

  • we need to know there is predict and predict_proba to build a REST API interface. Same for input data schema.
  • we need to know it's pandas and numpy to pip install requirements in containers on Heroku/AWS

I'd keep it - with examples it can be easier to understand why MLEM is powerful.
WDYT?

Contributor Author

@jorgeorpinel jorgeorpinel Dec 27, 2022

I'd keep this section

If we want to present the GS along these lines (give more motivation) then we should incorporate the ideas into the main text. E.g. instead of "Before we explore everything that MLEM can do, you need to save a machine learning model with MLEM." (here) we can use an explanation similar to the text above.

But if we put it in a hidden section that comes after another hidden section, to me that signals we already decided this is not really important for the GS -- and I would agree it's best to keep the GS actionable (not theoretical). Thus I moved it to a longer-format doc (that's linked several times from this page).

putting an example

If I understand correctly, this example would be disconnected from the main story in the GS, which would complicate things for the reader. We could def. add that example in the UG though, as another way to expand on it.

Contributor

I suggest moving it away from the GS in my recent GS PR here: #281
user-guide/basic-concepts seems to make sense to me

Contributor Author

So the change in this PR seems to point in the right direction. We have to merge one PR at a time, so I'll focus on this one for now, but I'm happy to solve any conflicts that arise in #281, unless you guys think it's easier to do #281 first (lmk).

jorgeorpinel added a commit that referenced this pull request Dec 27, 2022
Automated style fixes for #265, created by [Restyled][].

The following restylers [made fixes](https://restyled.io/gh/iterative/repos/mlem.ai/jobs/2680383):

- [prettier](https://prettier.io/docs/en/)

To incorporate these changes, merge this Pull Request into the original. We recommend using the Squash or Rebase strategies.

**NOTE**: As work continues on the original Pull Request, this process will re-run and update (force-push) this Pull Request with updated style fixes as necessary. If the style is fixed manually at any point (i.e. this process finds no fixes to make), this Pull Request will be closed automatically.

Sorry if this was unexpected. To disable it, see our [documentation][].

[restyled]: https://restyled.io
[documentation]: https://github.com/restyled-io/restyled.io/wiki/Disabling-Restyled
@jorgeorpinel jorgeorpinel added the C: user-guide Content of /doc/user-guide label Dec 27, 2022
@omesser
Contributor

omesser commented Dec 27, 2022

General comment - I was working on the GS too for the past few days (see #281) and the get-started changes are conflicting :)
The changes I made there move around or remove pieces that were edited here. @jorgeorpinel - would love your review there as well if you can.

Contributor

@omesser omesser left a comment

Nice! Mostly LGTM; some comments here and there. Still some uncertainty about serving/deploying that we need to live with until it's clarified.

Servers automatically create endpoints from model methods with payload schemas
corresponding to serialized dataset types.
The server is now running and listening for requests on the URL shown
above.Endpoints are created automatically from model methods (using the
Contributor

Suggested change
above.Endpoints are created automatically from model methods (using the
above. Endpoints are created automatically from model methods (using the

Note, that serving the model requires you to have the correct packages to be
installed. You can check out how to create a `venv` with right packages with
MLEM, or how to serve the model in a
This requires the correct packages to be installed. You can check out how to
Contributor

Suggested change
This requires the correct packages to be installed. You can check out how to
This requires the correct packages to be installed for the server to serve the model.

(This is more for @aguschin since I'm questioning the original contents.) The connection to the venv and Docker guides is not obvious to me here. Maybe it's better to mention that the needed requirements are inferred from the model metadata extracted when saving it, and link to model codification. wdyt @aguschin?

Contributor

yes, that's a good idea.

corresponding to serialized dataset types.
The server is now running and listening for requests on the URL shown
above.Endpoints are created automatically from model methods (using the
`sample_data` [provided earlier](#saving-your-model)). You can open the
Contributor

Suggested change
`sample_data` [provided earlier](#saving-your-model)). You can open the
`sample_data` provided when [saving the model](#saving-your-model)) to infer the payload schema. You can open the

Each server implementation also has its client counterpart (e.g. `HTTPClient`
for FastAPI). Clients can be used to make requests. Since a server also exposes
the model interface description, the client will know what methods are available
and handle serialization for you. You can use them via `mlem apply-remote`:
Contributor

Clients can be used to make requests

Mentioning the servers is important here (maybe even "their corresponding servers").

Same for serialization and deserialization - not sure why this was cut out. There are two directions here: serializing requests made from the client, and deserializing the result. Mentioning both is more accurate and also exemplifies the value of using MLEM to instantiate a client instead of implementing one yourself.
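To illustrate the two directions mentioned here, a stdlib-only sketch of what a generated client handles for you (hypothetical helper names, not MLEM's actual client code):

```python
import json

def serialize_request(payload):
    # Client -> server: wrap the Python object in the JSON body the
    # server's endpoint expects (shape is illustrative).
    return json.dumps({"data": payload}).encode("utf-8")

def deserialize_response(body):
    # Server -> client: turn the JSON reply back into Python objects.
    return json.loads(body.decode("utf-8"))

# Both directions of the round trip that a hand-rolled client would
# otherwise have to implement and keep in sync with the server:
request_body = serialize_request([[5.1, 3.5, 1.4, 0.2]])
result = deserialize_response(b'{"prediction": [0]}')
```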


```cli
$ mlem apply-remote http test_x.csv --host="0.0.0.0" --port=8080 --json
$ mlem apply-remote http test_x.csv --json \
--host="0.0.0.0" --port=8080
Contributor

Aesthetics/OCD - if not one line, break lines fully for CLI options:

$ mlem apply-remote http test_x.csv \
   --json \
   --host="0.0.0.0" \
   --port=8080


@jorgeorpinel
Contributor Author

Not sure whether I should continue to work on this per #281 (comment).

@jorgeorpinel jorgeorpinel removed their assignment Jan 11, 2023
@aguschin
Contributor

@jorgeorpinel, may I kindly ask you to resolve conflicts and apply your changes on top of what was merged? I'm afraid I'll lose your intentions if I do that myself.

@omesser
Contributor

omesser commented Jan 16, 2023

@jorgeorpinel - might be easier for you to split 🙏 :

  • all changes outside of get-started in this PR - most of them were reviewed to some extent, so they're probably close to mergeable
  • get-started - which would probably require a review/rewrite (due to "Getting started - more changes" #281) instead of conflict management

@omesser
Contributor

omesser commented Jan 31, 2023

No activity in a long time, so I plucked the non-conflicting changes into #301. Hope it's easier to follow up there.

@omesser omesser closed this Jan 31, 2023
Labels
A: docs Area: user documentation (gatsby-theme-iterative) C: get-started Content of /doc/get-started C: user-guide Content of /doc/user-guide
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues: none yet.

5 participants