
add smolvla to model-libraries.ts #1529


Merged — 30 commits merged into huggingface:main on Jun 25, 2025
Conversation

danaaubakirova (Contributor)
No description provided.

@Vaibhavs10 (Member) left a comment

Let's add a nice model inference snippet as well?

--policy.device=cuda \\
--wandb.enable=true`,
];

@pcuenca (Member)

We usually demonstrate simple inference snippets. Does that make sense for smolvla, or not really?

Contributor

agree with @pcuenca that an inference example feels more natural at this stage, since if a user has pushed their model, it means it has already been trained (at least I expect).

FYI, export const smolvla = (model: ModelData): string[] => [...] returns a list, so it could return both a training and an inference example.
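For illustration, a minimal sketch of what that could look like (the CLI commands are placeholders, not the exact strings from this PR):

// Hypothetical sketch — assumes a ModelData type is in scope and uses placeholder lerobot CLI commands.
export const smolvla = (model: ModelData): string[] => [
  // Fine-tuning snippet
  `python lerobot/scripts/train.py \\
  --policy.path=${model.id} \\
  --policy.device=cuda \\
  --wandb.enable=true`,
  // Inference snippet (illustrative)
  `python lerobot/scripts/eval.py --policy.path=${model.id}`,
];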

danaaubakirova (Contributor, Author)

Yes, I see what you mean. However, unfortunately we cannot use smolvla_base for inference directly; the only use case for smolvla_base for now is fine-tuning. It is not meant to perform zero-shot and will cause many errors, hence we strongly recommend that people fine-tune the model first.

We could push the inference snippet for any model fine-tuned on top of smolvla_base, but that is not really applicable here, I think. wdyt?

@Wauplin (Contributor)

(originally posted in DMs on slack)

> However, unfortunately we cannot use smolvla_base for inference directly, the only use case for smolvla_base for now is fine-tuning.

Interesting! What I would suggest:

  • if modelId is smolvla_base => then provide the finetune command
  • if modelId != smolvla_base => then provide the "how to use" command + the finetune command

Since export const smolvla = (model: ModelData): string[] => [...] is a method and you have access to ModelData, you can easily add some simple logic like this to the snippet generation, as sketched below.
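A minimal sketch of that branching, assuming the lerobot/smolvla_base repo id and placeholder commands:

// Hypothetical sketch — the commands are placeholders; only the branching logic matters here.
export const smolvla = (model: ModelData): string[] => {
  const finetune = `python lerobot/scripts/train.py --policy.path=${model.id} ...`;
  const inference = `python lerobot/scripts/eval.py --policy.path=${model.id} ...`;
  return model.id === "lerobot/smolvla_base" ? [finetune] : [inference, finetune];
};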

I see there are 380+ models on https://huggingface.co/models?other=lerobot and only 1 is not finetuned (if I understand correctly?) so it would be a shame not to show the inference command for all of them. What do you think?

@Wauplin (Contributor) left a comment

Left a few comments. Agree on the importance of having an inference example (or both train and inference if that's more natural).

Also, I'm wondering if it shouldn't be lerobot entirely instead of smolvla. Is smolvla really considered to be a library on its own, or is it more a subpart of the library (e.g. LeRobot has many policies, as transformers has many architectures)?
Using correct tags, e.g.

library_name: lerobot
tags:
    - smolvla

we should be able to display high-level information related to the library (i.e. repo name, pretty label, repoUrl, docsUrl, etc.) while having dedicated snippets for each policy (export const smolvla = (model: ModelData): string[] => [...] is a method and we have access to the model tags, so we can do interesting stuff there; see the sketch below).
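For instance, a hypothetical top-level lerobot entry point could dispatch on the policy tag (the function name and the fallback snippet are assumptions, not this PR's code):

// Hypothetical sketch — dispatch per-policy snippets based on model card tags.
export const lerobot = (model: ModelData): string[] => {
  if (model.tags?.includes("smolvla")) {
    return smolvla(model); // policy-specific snippets
  }
  // Generic fallback for other policies (placeholder)
  return [`# See https://github.com/huggingface/lerobot for usage`];
};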

Comment on lines 1642 to 1651
`# git clone https://github.com/huggingface/lerobot.git
cd lerobot
conda create -y -n lerobot python=3.10
conda activate lerobot

# Install ffmpeg (required for video processing)
conda install ffmpeg=7.1.1 -c conda-forge

# Install LeRobot with the SmolVLA extra dependencies
pip install -e ".[${model.id}]"
@Wauplin (Contributor)

I feel that all of this should be simplified to:

Suggested change

-`# git clone https://github.com/huggingface/lerobot.git
-cd lerobot
-conda create -y -n lerobot python=3.10
-conda activate lerobot
-# Install ffmpeg (required for video processing)
-conda install ffmpeg=7.1.1 -c conda-forge
-# Install LeRobot with the SmolVLA extra dependencies
-pip install -e ".[${model.id}]"
+`# !pip install "git+https://github.com/huggingface/lerobot.git#egg=lerobot[smolvla]"

i.e. no need for editable mode, no need to download the repo just to install it, no need for conda instructions (users can figure that part by themselves, depending on what they prefer).

If users want more detailed installation guide, they can go to https://github.com/huggingface/lerobot?#installation

danaaubakirova (Contributor, Author)

nice! thanks!


danaaubakirova and others added 5 commits June 23, 2025 14:49
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
@danaaubakirova (Contributor, Author)

Thanks for the review! @Wauplin I changed the tag to lerobot in model-libraries following your reasoning about smolvla being the tag and lerobot being the main library. Lmk if this looks good!

@Wauplin (Contributor) left a comment

As discussed in DMs, here is my suggestion for LeRobot snippets based on tags. For this to work, all smolvla models must include tags: - smolvla in their model card metadata. Doing this will make them listed at https://huggingface.co/models?other=smolvla. This is typically something the lerobot library should do automatically so that users don't have to do it themselves. Updating the lerobot library will benefit all future models. For existing ones, you might want to open PRs on the Hub to correct them (similar to https://huggingface.co/lerobot/smolvla_base/discussions/7).

Since it all relies on model card metadata (tags, base_model, etc.), it's important that the LeRobot scripts take care of them. I do think this will be highly valuable to users, not just in the context of code snippets but also for discoverability on the Hub.
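For instance, a fine-tuned model's card metadata could look like this (the base_model value is an illustrative assumption):

library_name: lerobot
tags:
    - smolvla
base_model: lerobot/smolvla_base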

Co-authored-by: Lucain <lucainp@gmail.com>
@Wauplin (Contributor) left a comment

Thanks for updating @danaaubakirova! Except for a small nit, the PR looks good to merge now. Pinging @pcuenca @Vaibhavs10 if you'd like to have a quick look.

@danaaubakirova as I mentioned above, in parallel with this PR it's very important for the LeRobot library to include more metadata when generating new models (library_name, tags, base_model, citation, etc.). A tighter integration will benefit users in the long run. If you need any help or advice about this, please let me know 🤗


Note: CI failure is unrelated

@danaaubakirova (Contributor, Author)


TYSM @Wauplin! Sure, we can work on generating more informative and well-organised metadata when pushing models using lerobot, and keep you in the loop in case we have any questions :)

@pcuenca (Member) left a comment

Great suggestions @Wauplin, thanks a lot!

We could potentially:

  • Use a simplified install command (as @Wauplin mentioned), plus the inference snippet if the model is not the base one.
  • Use the more complicated install + fine-tune example when it is.

Currently, IIUC we are showing all the snippets if the model is not the base one, is that what we intended?

Happy to merge after Wauplin's approval anyway.

@Wauplin (Contributor) commented Jun 24, 2025


I will leave the final decision to @danaaubakirova! In my understanding, users could want to retrain/finetune an already finetuned model but maybe that's not the main use case? In any case, both are fine for me. @danaaubakirova let us know what you think!

@danaaubakirova (Contributor, Author)

Hi @pcuenca, thanks for the fix!

  • The simplified install command pip install "git+https://github.com/huggingface/lerobot.git#egg=lerobot[smolvla]" doesn't handle our environment setup. We still need a specific ffmpeg version installed via conda, so I'd point people to the install guide rather than rely on the one-liner, or at least set up a conda environment before the installation (a possible middle ground is sketched after this list).

  • Right now we "force" everyone to fine-tune smolvla_base since that's the only supported workflow. For completed, fine-tuned models we are shipping inference snippets, but users who really want to re-fine-tune those models can still do so, as noted by @Wauplin.
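A possible middle ground for the install snippet, keeping the required ffmpeg conda step in front of the pip one-liner (a hypothetical snippet body, not the string merged in this PR):

// Hypothetical install snippet — keeps the conda ffmpeg step that the bare one-liner skips.
const installSnippet = `conda install ffmpeg=7.1.1 -c conda-forge
pip install "git+https://github.com/huggingface/lerobot.git#egg=lerobot[smolvla]"`;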

Let me know if I can improve this PR further given the constraints I described above :)

P.S. One downside of user-uploaded, fine-tuned models is that they’re tied to specific datasets and don’t generalize well, so they can’t be freely reused across different setups/users.

@Wauplin (Contributor) commented Jun 25, 2025

Let's merge it "as-is" then and we can reevaluate later. It looks like the most important thing for now is to have correct snippets for https://huggingface.co/lerobot/smolvla_base + a "Use with LeRobot" button on all the others.

@Wauplin merged commit 5831b52 into huggingface:main Jun 25, 2025
1 of 4 checks passed