New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] HF Hub Integration #171
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @natolambert , really nice implementation ! :)
I left a comment about the log
directory. I think it should be deleted from the Hub before uploading otherwise you can end up with a messy foler. Other than that, looks good 👍
if repo_logdir.exists(): | ||
shutil.rmtree(repo_logdir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here repo_logdir
is a tmp dir (at least when called from package_to_hub
.
So deleting it locally will not delete it on the Hub. A possibility is to use delete_folder
to delete the logs folder on the Hub before uploading the new one.
mbrl/util/huggingface.py
Outdated
) | ||
print(msg) | ||
|
||
repo_url = HfApi().create_repo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) for consistency with upload_folder
calls, I would import create_repo
directly from huggingface_hub
.
If not called with any parameter, using HfApi().
is not needed.
mbrl/util/huggingface.py
Outdated
:param token | ||
""" | ||
|
||
repo_url = HfApi().create_repo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) same as above about HfApi
WIP Colab to showcase this workflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @natolambert! I left a bunch of comments, but mostly pedantic stuff about formatting and docstrings. I'm not too familiar with HF Hub, so most of the rest LGTM. However, I noticed that package_hub
only receives a Model
, even though we also have algorithms that require a trained Agent
(e.g., MBPO, and eventually Dreamer). This seems like a potential issue?
Also, is the expectation that should be able to download saved model/agents from the hub and run them on our example scripts?
from ..models import Model | ||
|
||
|
||
def get_model_id(org_name: str, env_id: str, tag: str = None, algo_name: str = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing type annotation for return value.
Construct a name for the model on the hub. | ||
|
||
Args: | ||
org_name (str): HF organization to upload the model to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you line break the docstrings so that they are always less than 88 chars long? These seem longer that that, not sure. Do note that black
won't edit comments.
|
||
Args: | ||
org_name (str): HF organization to upload the model to. | ||
env_id: environment name for the data (often from Gym registration). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
env_id (str)
tag (str, optional)
algo_name (str, optional)
|
||
# Step 2: Get data (JSON containing infos) and read it | ||
# TODO update this for cfg / yaml | ||
with open(Path.joinpath(local_path, unzipped_model_folder, "data")) as json_file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not possible to do local_path / unzipped_model_folder / "data"
?
with open(Path.joinpath(local_path, unzipped_model_folder, "data")) as json_file: | ||
data = json.load(json_file) | ||
# Add system_info elements to our JSON | ||
# data["system_info"] = stable_baselines3.get_system_info(print_info=False)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete commented out line.
This method does the complete pipeline: | ||
- It generates the model card TODO(NOL) | ||
- It pushes everything to the hub | ||
:param model: trained model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing type info docstrings argument.
commit_message: str, | ||
is_deterministic: bool = True, | ||
token: Optional[str] = None, | ||
logs=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing type annotations for logs
, mean_reward
, and std_reward
(e.g., mean_reward: Optional[float] = None
).
# Step 1: Save the model | ||
model.save(tmpdirname / model_name) | ||
|
||
# Retrieve VecNormalize wrapper if it exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this commented out section using during debugging? Should we remove?
token: Optional[str] = None, | ||
): | ||
""" | ||
Upload a model to Hugging Face Hub (this is for updating an existing repo!). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing type info and periods in docstring.
def load_from_hub(repo_id: str, filename: str) -> str: | ||
""" | ||
Download a model from Hugging Face Hub. | ||
:param repo_id: id of the model repository from the Hugging Face Hub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing type info and periods in docstring.
Working towards closing #169
Things to do (roughly):