Support listing Hugging Face model info #4619

Merged
titu1994 merged 6 commits into NVIDIA:main from add_hf_filter_support on Jul 27, 2022

Conversation

titu1994
Collaborator

@titu1994 titu1994 commented Jul 26, 2022

Signed-off-by: smajumdar <smajumdar@nvidia.com>

What does this PR do?

Adds new APIs to filter and obtain NeMo Hugging Face models via code.

Collection: [Core]

Changelog

  • Adds two methods, get_hf_model_filter() and search_huggingface_models(model_filter=None), to ModelPT.
  • The first method returns a basic ModelFilter object and can be overridden by the user. Its result is used as the default for the second method when no filter is provided.
  • Adds support for metadata attributes on the filter that modify the resulting list of ModelInfo.
  • list_available_models_on_hf() (renamed to search_huggingface_models() later in this PR) lists the NeMo models that satisfy the filter.

Usage

from nemo.core import ModelPT

# You can replace <DomainSubclass> with any subclass of ModelPT.
# Get default ModelFilter
filt = <DomainSubclass>.get_hf_model_filter()

# Make any modifications to the filter as necessary
filt.language = [...]
filt.task = ...
filt.tags = [...]

# Add any metadata to the filter as needed
filt.limit_results = 5

# Obtain model info
model_infos = <DomainSubclass>.search_huggingface_models(model_filter=filt)

# Browse through cards and select an appropriate one
card = model_infos[0]

# Restore model using `modelId` of the card.
model = ModelPT.from_pretrained(card.modelId)
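
The fallback behaviour described in the changelog (use the class default filter when none is passed, then apply metadata such as limit_results to the result list) can be sketched without NeMo or network access. This is only an illustration under stated assumptions: SketchFilter, SketchModelInfo, SketchModel, FAKE_HUB and the model IDs are all hypothetical stand-ins, not the actual NeMo or Hugging Face Hub classes, and the real matching logic may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchFilter:
    """Hypothetical stand-in for a Hub ModelFilter; not the real class."""
    library: str = "nemo"
    tags: List[str] = field(default_factory=list)
    limit_results: Optional[int] = None  # metadata attribute, as in the usage above


@dataclass
class SketchModelInfo:
    """Hypothetical stand-in for a ModelInfo card."""
    modelId: str
    library: str
    tags: List[str]


# Made-up catalogue so the sketch runs offline.
FAKE_HUB = [
    SketchModelInfo("org/asr_model_a", "nemo", ["asr"]),
    SketchModelInfo("org/asr_model_b", "nemo", ["asr"]),
    SketchModelInfo("org/tts_model", "nemo", ["tts"]),
    SketchModelInfo("org/other_model", "other", ["asr"]),
]


class SketchModel:
    @classmethod
    def get_hf_model_filter(cls) -> SketchFilter:
        # Subclasses may override this to return a narrower default.
        return SketchFilter()

    @classmethod
    def search_huggingface_models(
        cls, model_filter: Optional[SketchFilter] = None
    ) -> List[SketchModelInfo]:
        # Fall back to the class default when no filter is given.
        if model_filter is None:
            model_filter = cls.get_hf_model_filter()
        results = [
            info
            for info in FAKE_HUB
            if info.library == model_filter.library
            and all(tag in info.tags for tag in model_filter.tags)
        ]
        # Metadata attributes post-process the result list.
        if model_filter.limit_results is not None:
            results = results[: model_filter.limit_results]
        return results
```

Calling SketchModel.search_huggingface_models() with no argument uses the default filter, while setting tags and limit_results on a filter obtained from get_hf_model_filter() narrows and truncates the list.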

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: smajumdar <smajumdar@nvidia.com>
Member

@okuchaiev okuchaiev left a comment


Let's extend the list_available_models method instead to accept an optional ModelFilter instance. By default it should be None and should list both NGC and HF models (even if duplicated). Then make list_available_models_on_hf private and call it under the hood.

Signed-off-by: smajumdar <smajumdar@nvidia.com>
@ericharper
Collaborator

Let's extend the list_available_models method instead to accept an optional ModelFilter instance. By default it should be None and should list both NGC and HF models (even if duplicated). Then make list_available_models_on_hf private and call it under the hood.

I agree with this. We don't want users to have to use two different methods to list models.
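
For readers following the thread, the single-entry-point design suggested above might look roughly like the sketch below. It is an assumption-laden illustration, not code from the PR: SketchBase, the model names, and the use of a plain string in place of a ModelFilter are all hypothetical simplifications, and the Hub lookup is replaced by a local dictionary.

```python
from typing import List, Optional


class SketchBase:
    # Stand-in for the hard-coded NGC entries; names are made up.
    _NGC_MODELS = ["ngc_model_a", "ngc_model_b"]

    @classmethod
    def _list_available_models_on_hf(cls, model_filter: Optional[str] = None) -> List[str]:
        # Placeholder for the Hub query; a real version would need network access.
        hub = {"asr": ["hf_model_a"], "tts": ["hf_model_b"]}
        if model_filter is None:
            return [name for names in hub.values() for name in names]
        return list(hub.get(model_filter, []))

    @classmethod
    def list_available_models(cls, model_filter: Optional[str] = None) -> List[str]:
        # Single entry point: NGC entries first, then Hub entries, duplicates kept.
        return list(cls._NGC_MODELS) + cls._list_available_models_on_hf(model_filter)
```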

@titu1994
Collaborator Author

titu1994 commented Jul 27, 2022

Reasons stated below: it would require updating every single class that implements list_available_models, which is impractical.

The problem is that the method is overridden by dozens of subclasses, and each of those subclasses would need to be updated to support the new argument.

Another issue is that this method will fail when used offline with no internet access, whereas the original NGC one has hard-coded URLs, so it will still work.
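
The signature concern can be illustrated with a minimal sketch. Base, LegacySubclass and call_with_filter are hypothetical names, and the filter is simplified to a string; the point is only that an override written against the old zero-argument signature rejects the new keyword, while its hard-coded list keeps working without any network call.

```python
from typing import List, Optional


class Base:
    @classmethod
    def list_available_models(cls, model_filter: Optional[str] = None) -> List[str]:
        # Base class updated to accept the new optional argument.
        return []


class LegacySubclass(Base):
    # Existing override written before the new argument was introduced.
    @classmethod
    def list_available_models(cls) -> List[str]:
        return ["hard_coded_ngc_model"]


def call_with_filter(model_cls) -> str:
    """Report whether a class accepts the new model_filter keyword."""
    try:
        model_cls.list_available_models(model_filter="asr")
        return "ok"
    except TypeError:
        return "TypeError"
```

Here call_with_filter(Base) succeeds, but call_with_filter(LegacySubclass) hits a TypeError until that subclass is updated, which is the per-subclass maintenance cost raised in the comment.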

okuchaiev previously approved these changes Jul 27, 2022
Signed-off-by: smajumdar <smajumdar@nvidia.com>
@titu1994
Collaborator Author

Updated the PR with a different name for the method to avoid confusion with list_available_models.

Collaborator

@ericharper ericharper left a comment


LGTM. Thanks for the changes!

@titu1994 titu1994 merged commit 90ad5af into NVIDIA:main Jul 27, 2022
@titu1994 titu1994 deleted the add_hf_filter_support branch July 27, 2022 22:12
Davood-M pushed a commit to Davood-M/NeMo that referenced this pull request Aug 9, 2022
* Support listing Hugging Face model info

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Add documentation about usage

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Add documentation about usage

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Update name of method, support list of model filters

Signed-off-by: smajumdar <smajumdar@nvidia.com>

* Improve docstring

Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
piraka9011 pushed a commit to piraka9011/NeMo that referenced this pull request Aug 25, 2022
Signed-off-by: Anas Abou Allaban <aabouallaban@pm.me>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Hainan Xu <hainanx@nvidia.com>