add smolvla to model-libraries.ts #1529
Conversation
Let's add a nice model inference snippet as well?
```
--policy.device=cuda \\
--wandb.enable=true`,
];
```
We usually demonstrate simple inference snippets. Does that make sense for smolvla, or not really?
Agree with @pcuenca that an inference example feels more natural at this stage: if a user has pushed their model, it means it has already been trained (at least I expect).
FYI, `export const smolvla = (model: ModelData): string[]` returns a list, so it could return both a training and an inference example.
Yes, I see what you mean. Unfortunately, however, we cannot use smolvla_base for inference directly; its only use case for now is fine-tuning.
It is not meant to perform zero-shot inference and will cause many errors, so we strongly recommend that people fine-tune the model first.
We could have pushed an inference snippet for any model fine-tuned on top of smolvla_base, but that is not really applicable here, I think. wdyt?
(originally posted in DMs on slack)
> However, unfortunately we cannot use smolvla_base for inference directly, the only use case for smolvla_base for now is fine-tuning.
Interesting! What I would suggest:
- if `modelId` is `smolvla_base` => then provide the finetune command
- if `modelId != smolvla_base` => then provide the "how to use" command + the finetune command
Since `export const smolvla = (model: ModelData): string[]` is a method and you have access to `ModelData`, you can easily add some simple logic like this to the snippet generation (see the sketch below).
I see there are 380+ models on https://huggingface.co/models?other=lerobot and only 1 is not finetuned (if I understand correctly?) so it would be a shame not to show the inference command for all of them. What do you think?
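A minimal sketch of what that branching could look like, assuming the `ModelData` import used elsewhere in `model-libraries-snippets.ts`; the `--policy.device` / `--wandb.enable` flags come from the diff above, while the rest of both commands is illustrative, not the final snippet from this PR:

```ts
import type { ModelData } from "./model-data";

export const smolvla = (model: ModelData): string[] => {
	// Illustrative fine-tune command; the real snippet in this PR carries more flags.
	const finetuneSnippet = `python lerobot/scripts/train.py \\
  --policy.path=${model.id} \\
  --policy.device=cuda \\
  --wandb.enable=true`;

	// smolvla_base is fine-tune only: no zero-shot inference, so no "how to use" snippet.
	if (model.id === "lerobot/smolvla_base") {
		return [finetuneSnippet];
	}

	// Fine-tuned checkpoints get an inference snippet first, then the fine-tune one.
	// (Hypothetical eval command, for illustration only.)
	const inferenceSnippet = `python lerobot/scripts/eval.py --policy.path=${model.id}`;
	return [inferenceSnippet, finetuneSnippet];
};
```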
Left a few comments. Agree on the importance of having an inference example (or both train and inference if that's more natural).
Also, I'm wondering if it shouldn't be `lerobot` entirely instead of `smolvla`. Is `smolvla` really considered a library on its own, or is it more a subpart of the library (e.g. LeRobot has many policies, just as transformers has many architectures)?
Using correct tags, e.g.

```yaml
library_name: lerobot
tags:
  - smolvla
```

we should be able to display high-level information related to the library (i.e. repo name, pretty label, repoUrl, docsUrl, etc.) while having dedicated snippets for each policy (`export const smolvla = (model: ModelData): string[]` is a method and we have access to the model tags, so we can do interesting stuff there).
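For instance, a rough sketch of that tag-based dispatch, with hypothetical per-policy helpers (only an optional `tags` array is assumed to exist on `ModelData`):

```ts
import type { ModelData } from "./model-data";

// Hypothetical per-policy snippet helpers.
const smolvlaSnippets = (model: ModelData): string[] => [/* smolvla-specific snippets */];
const defaultLerobotSnippets = (model: ModelData): string[] => [/* generic lerobot snippets */];

// One top-level `lerobot` entry that branches on the model card tags.
export const lerobot = (model: ModelData): string[] =>
	model.tags?.includes("smolvla") ? smolvlaSnippets(model) : defaultLerobotSnippets(model);
```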
```
`# git clone https://github.com/huggingface/lerobot.git
cd lerobot
conda create -y -n lerobot python=3.10
conda activate lerobot

# Install ffmpeg (required for video processing)
conda install ffmpeg=7.1.1 -c conda-forge

# Install LeRobot with the SmolVLA extra dependencies
pip install -e ".[${model.id}]"
```
I feel that all of this should be simplified to:
```
`# !pip install "git+https://github.com/huggingface/lerobot.git#egg=lerobot[smolvla]"
```
i.e. no need for editable mode, no need to download the repo just to install it, no need for conda instructions (users can figure that part by themselves, depending on what they prefer).
If users want a more detailed installation guide, they can go to https://github.com/huggingface/lerobot?#installation
nice! thanks!
Thanks for the review! @Wauplin I changed the tag to
As discussed in DMs, here is my suggestion for LeRobot snippets based on tags. For this to work, all smolvla models must include `tags: - smolvla` in their model card metadata. Doing this will make them listed at https://huggingface.co/models?other=smolvla. This is typically something the lerobot library should do automatically, so that users don't have to do it themselves. Updating the lerobot library will benefit all future models. For existing ones, you might want to open PRs on the Hub to correct them (similar to https://huggingface.co/lerobot/smolvla_base/discussions/7).
Since it all relies on model card metadata (tags, base_model, etc.), it's important that the LeRobot scripts take care of them. I do think this will be highly valuable to users, not just in the context of code snippets but also for discoverability on the Hub.
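As an illustration, the model card front matter for a fine-tune could then look like this; beyond `library_name: lerobot` and the `smolvla` tag discussed above, the exact field values are an assumption, not something prescribed in this thread:

```yaml
library_name: lerobot
tags:
  - smolvla
base_model: lerobot/smolvla_base
```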
Thanks for updating @danaaubakirova! Except for a small nit, the PR looks good to merge now. Pinging @pcuenca @Vaibhavs10 if you'd like to have a quick look.
@danaaubakirova as I mentioned above, in parallel to this PR it's very important for the LeRobot library to include more metadata when generating new models (library_name, tags, base_model, citation, etc.). A tighter integration will benefit users in the long run. If you need any help or advice about this, please let me know 🤗
Note: CI failure is unrelated
TYSM @Wauplin! Sure, we can work on generating more informative and well-organised metadata when pushing models using lerobot, and keep you in the loop in case we have any questions :)
Great suggestions @Wauplin, thanks a lot!
We could potentially:
- Use a simplified install command (as @Wauplin mentioned), plus the inference snippet if the model is not the base one.
- Use the more complicated install + fine-tune example when it is.
Currently, IIUC, we are showing all the snippets if the model is not the base one. Is that what we intended?
Happy to merge after Wauplin's approval anyway.
I will leave the final decision to @danaaubakirova! In my understanding, users could want to retrain/finetune an already finetuned model, but maybe that's not the main use case? In any case, both are fine for me. @danaaubakirova let us know what you think!
Hi @pcuenca, thanks for the fix!
Let me know if I can improve this PR further given the constraints I described above :) P.S. One downside of user-uploaded, fine-tuned models is that they're tied to specific datasets and don't generalize well, so they can't be freely reused across different setups/users.
Let's merge it "as-is" then, and we can reevaluate later. It looks like the most important thing for now is to have correct snippets for https://huggingface.co/lerobot/smolvla_base + a "Use with LeRobot" button on all the others.