Integrating Hugging Face Hub #198
Hello,
I would even add automatic download to enjoy if the model is not present (with user prompt that can be bypassed for testing)
Looks good but I would keep the api of the zoo if possible or at least allow it. Also, do you save the entire folder or only the last model? (a folder can contain any checkpoints and the best model too)
not sure if it makes sense for training (you usually know only late if you want to save the model to the hub or not depending on the results) but for enjoy yes it totally makes sense ;)
I would go for option one as it is consistent with how you train the models, what do you think?
import importlib
import os
import sys
from huggingface_hub import HfApi, HfFolder, Repository
please add this to the requirements
also missing: updating the README
Hey Antonin, thanks for your feedback,
I suppose you mean for the rl-trained-agents repository; that is possible. For instance if I type: And if the model is not in the user's rl-trained-agents folder, I'll retrieve it (repo_id = sb3/ppo-CartPole-v1) from the hub. But for custom trained agents, that would imply extra arguments (repo_id), so I'm not sure it's something you'd agree with. Or we can have a user prompt as you mentioned (not the final one, this is quite dirty, it's just for illustration purposes)
The problem is where to bypass it, because we have 2 checks:
What do you mean? Not having a save_to_hub.py and saving directly in train.py?
That's the question, what do you think is the best? In terms of usage
Repo_id (does not exist, it's an example): ThomasSimonini/SpaceInvadersNoFrameskip-v4
=> In this case, I need to generate a folder with exp_id = 1 when downloading it, since that's the way enjoy.py works.
yes
I would do it only for "official" pre-trained agents; your download script already covers custom models.
I think would make more sense (and only if
I meant allowing to use
it should create it but with
If there are no restrictions on space for the Hugging Face Hub, it should save everything, but we can probably add an option to filter what is saved (for instance, do not save checkpoints or do not save the best model).
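To make the filtering option concrete, here is a rough sketch of how the files to upload could be selected. It is only an illustration: the function name `files_to_upload` and its two flags are hypothetical, and the file naming (checkpoints as `rl_model_<n>_steps.zip`, best model as `best_model.zip`) is an assumption about how the zoo's callbacks name their outputs, not something stated in this thread.

```python
from pathlib import Path
from typing import List


def files_to_upload(
    folder: str,
    include_checkpoints: bool = True,
    include_best_model: bool = True,
) -> List[str]:
    """Return relative paths of the files under `folder` that should be pushed."""
    root = Path(folder)
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        name = path.name
        # Assumption: periodic checkpoints are named rl_model_<n>_steps.zip
        is_checkpoint = name.startswith("rl_model_") and name.endswith("_steps.zip")
        # Assumption: the best model found during evaluation is best_model.zip
        is_best_model = name == "best_model.zip"
        if is_checkpoint and not include_checkpoints:
            continue
        if is_best_model and not include_best_model:
            continue
        selected.append(path.relative_to(root).as_posix())
    return selected
```

With both flags left at their defaults, everything in the experiment folder is kept, which matches the "save everything" behaviour suggested above.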
Hey Antonin 👋, so for the first step in the integration of RL-Zoo, we can automatically retrieve an "official" model from the hub. As you mentioned, the idea is that when you run:
And this model is not in the trained-agents/ folder, it is automatically retrieved from the hub:
Line 113:
if not found:
    local_dir = f"rl-trained-agents/{algo}/{env_id}_1"
    clone_from = f"TestSB3/{algo}-{env_id}"
    repo = Repository(local_dir, clone_from)  # Clones a Hugging Face repo and places it into the local_dir path
About the model card, here's a simple version, wdyt? (There will be a video widget on the right showing the agent performing.)
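For discussion, the retrieval step could be wrapped in a small helper like the sketch below. It reuses the folder layout and the `TestSB3/{algo}-{env_id}` naming from the snippet above; the names `ensure_model` and `clone_from_hub`, and the `no_prompt` / `ask` parameters (for the bypassable user prompt suggested earlier), are hypothetical. The hub call is injected so the logic can be exercised without network access.

```python
from pathlib import Path
from typing import Callable


def clone_from_hub(local_dir: str, repo_id: str) -> None:
    """Clone a Hub repo into local_dir with the Repository class used above."""
    # Imported lazily so the rest of the sketch works without huggingface_hub.
    from huggingface_hub import Repository

    Repository(local_dir, clone_from=repo_id)


def ensure_model(
    algo: str,
    env_id: str,
    folder: str = "rl-trained-agents",
    org: str = "TestSB3",
    clone_fn: Callable[[str, str], None] = clone_from_hub,
    ask: Callable[[str], str] = input,
    no_prompt: bool = False,
) -> bool:
    """Return True if the model folder is available locally (possibly after download)."""
    local_dir = Path(folder) / algo / f"{env_id}_1"
    if local_dir.is_dir():
        return True  # already present, nothing to download
    repo_id = f"{org}/{algo}-{env_id}"
    if not no_prompt:
        # Ask for confirmation; tests can pass no_prompt=True to skip this.
        answer = ask(f"{local_dir} not found. Download {repo_id} from the hub? [y/N] ")
        if answer.strip().lower() not in ("y", "yes"):
            return False
    clone_fn(str(local_dir), repo_id)
    return True
```

Newer huggingface_hub releases favor HTTP-based helpers such as `snapshot_download` over `Repository`, so `clone_from_hub` may need to be swapped depending on the installed version.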
Hey team 👋,
Description
We are working on the integration of the Hugging Face Hub with rl-baselines-zoo.
The Hugging Face Hub works as a central place where anyone can share and explore saved models.
The idea is to allow people to share and save their models on the hub.
The PR is a draft for now because there are some points to discuss and code to change given the decisions + some bugs I need to resolve. So not all the elements of the checklist are done.
What our implementation does for now:
load_from_hf.py: downloads the model from the Hugging Face Hub and places it in the correct folder (rl-saved-agents by default).
save_to_hub.py: uploads the model from rl-baselines-zoo to the Hugging Face Hub.
Case 1: I load an SB3 saved model to test it
Case 2: I train a model and push it to the hub (some git-related bugs for now that I need to resolve)
Question 1: --load_from_hf and --save_to_hf args
Instead of calling save_to_hub.py and load_from_hf.py as described above, we could add --load_from_hf and --save_to_hf parameters with a subcommand register, as we did with AllenNLP.
https://github.com/allenai/allennlp/blob/82b1f4f80899c7eab31fa1ee0949be2eb12fc184/allennlp/commands/push_to_hf.py#L16
This way a user will just need to do something like this:
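As a rough sketch of the flag-based variant only (not the subcommand register used in AllenNLP), the two arguments could be declared with argparse as below. The flag names come from the question above; `build_parser`, the `--algo` / `--env` defaults and the repo id in the usage note are hypothetical and not taken from the zoo's actual scripts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the two hub-related flags discussed above."""
    parser = argparse.ArgumentParser(description="Hypothetical hub flags sketch")
    parser.add_argument("--algo", type=str, default="ppo", help="RL algorithm")
    parser.add_argument("--env", type=str, default="CartPole-v1", help="Environment id")
    # Each flag takes the Hub repo id to download from / upload to.
    parser.add_argument("--load_from_hf", metavar="REPO_ID", default=None,
                        help="Download the model from this Hub repo before running")
    parser.add_argument("--save_to_hf", metavar="REPO_ID", default=None,
                        help="Upload the trained model to this Hub repo afterwards")
    return parser
```

For example, `build_parser().parse_args(["--save_to_hf", "user/ppo-CartPole-v1"])` leaves `load_from_hf` as `None` and sets `save_to_hf` to the given repo id, so the calling script can branch on whichever flag is set.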
Question 2: Repos names
For the repos from rl-trained-agents, we wanted to know your preferences for the names of the repos on the Hugging Face Hub.
Motivation and Context
Allow people to easily share their saved models using the Hugging Face Hub, which lets you host and share your models for free.
Types of changes
Checklist:
make format (required)
make check-codestyle and make lint (required)
make pytest and make type both pass. (required)
Note: we are using a maximum length of 127 characters per line