Support listing Hugging Face model info #4619

titu1994 · 2022-07-26T23:02:02Z

Signed-off-by: smajumdar <smajumdar@nvidia.com>

What does this PR do ?

Add new APIs to filter and retrieve NeMo Hugging Face models via code.

Collection: [Core]

Changelog

Adds two methods, get_hf_model_filter() and search_huggingface_models(model_filter=None), to ModelPT.
The first method returns a basic ModelFilter object and can be overridden by the user. It is used as the default for the second method if no filter is provided.
Adds support for metadata attributes on the filter to modify the resulting list of ModelInfo.
search_huggingface_models() lists the NeMo models that satisfy the filter.

Usage

from nemo.core import ModelPT

# You can replace <DomainSubclass> with any subclass of ModelPT.
# Get default ModelFilter
filt = <DomainSubclass>.get_hf_model_filter()

# Make any modifications to the filter as necessary
filt.language = [...]
filt.task = ...
filt.tags = [...]

# Add any metadata to the filter as needed
filt.limit_results = 5

# Obtain model info
model_infos = <DomainSubclass>.search_huggingface_models(model_filter=filt)

# Browse through cards and select an appropriate one
card = model_infos[0]

# Restore model using `modelId` of the card.
model = ModelPT.from_pretrained(card.modelId)
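
For readers who want to see how the default filter and the metadata attribute interact, here is a minimal offline sketch. It assumes nothing about the real implementation: SketchFilter, SketchModelInfo, SketchModelPT, SketchASRModel, and the CATALOG entries are hypothetical stand-ins, and the Hub lookup is replaced by an in-memory list. Only the method names get_hf_model_filter and search_huggingface_models and the limit_results attribute come from this PR.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchFilter:
    """Hypothetical stand-in for a Hub model filter (not the real ModelFilter)."""
    library: str = "nemo"
    tags: List[str] = field(default_factory=list)
    limit_results: Optional[int] = None  # metadata attribute, as in the usage above


@dataclass
class SketchModelInfo:
    """Hypothetical stand-in for a Hub ModelInfo entry."""
    modelId: str
    library: str
    tags: List[str]


# Hypothetical in-memory catalog used instead of a network call.
CATALOG = [
    SketchModelInfo("org/asr-model-a", "nemo", ["asr"]),
    SketchModelInfo("org/asr-model-b", "nemo", ["asr"]),
    SketchModelInfo("org/tts-model", "nemo", ["tts"]),
    SketchModelInfo("org/other-model", "other", ["asr"]),
]


class SketchModelPT:
    @classmethod
    def get_hf_model_filter(cls) -> SketchFilter:
        # Basic filter; subclasses may override this to narrow the default.
        return SketchFilter()

    @classmethod
    def search_huggingface_models(cls, model_filter: Optional[SketchFilter] = None):
        # Fall back to the class default when no filter is provided.
        if model_filter is None:
            model_filter = cls.get_hf_model_filter()
        results = [
            info
            for info in CATALOG
            if info.library == model_filter.library
            and all(tag in info.tags for tag in model_filter.tags)
        ]
        # The metadata attribute trims the result list after matching.
        if model_filter.limit_results is not None:
            results = results[: model_filter.limit_results]
        return results


class SketchASRModel(SketchModelPT):
    @classmethod
    def get_hf_model_filter(cls) -> SketchFilter:
        # Domain subclass narrows the default filter to its own tag.
        filt = super().get_hf_model_filter()
        filt.tags = ["asr"]
        return filt
```

Calling SketchASRModel.search_huggingface_models() with no argument uses the subclass default, while passing a filter with limit_results set trims the list, which mirrors the flow in the usage snippet.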

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

okuchaiev

Let's instead extend the list_available_models method to accept an optional ModelFilter instance. It should default to None and list both NGC and HF models (even if duplicated). Then make list_available_models_on_hf private and call it under the hood.

ericharper · 2022-07-27T19:05:37Z

> Let's instead extend the list_available_models method to accept an optional ModelFilter instance. It should default to None and list both NGC and HF models (even if duplicated). Then make list_available_models_on_hf private and call it under the hood.

I agree with this. We don't want users to have to use two different methods to list models.

titu1994 · 2022-07-27T19:10:51Z

Reasons stated below: it would require updating every single class that implements list_available_models, which is impractical.

The problem is that the method is overridden by dozens of classes, and each of those subclasses would need to be updated to support the new argument.

Another issue is that this method will fail when used in an offline environment with no internet access, whereas the original NGC listing has hard-coded URLs, so it will still work.
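
To make the signature concern concrete, here is a small sketch. The classes Base and LegacySubclass and the helper try_list are hypothetical and only illustrate the general Python behaviour: once the base method gains a new keyword argument, any subclass that overrides it with the old signature raises a TypeError when the caller passes that argument.

```python
class Base:
    @classmethod
    def list_available_models(cls, model_filter=None):
        # Updated base signature that accepts an optional filter.
        return ["base-model"]


class LegacySubclass(Base):
    @classmethod
    def list_available_models(cls):
        # Override written before the new argument existed.
        return ["legacy-model"]


def try_list(model_cls, model_filter):
    """Call the listing method with a filter and report whether it worked."""
    try:
        return True, model_cls.list_available_models(model_filter=model_filter)
    except TypeError as err:
        # The override does not accept model_filter, so the call fails.
        return False, str(err)
```

Here try_list(Base, object()) succeeds, while try_list(LegacySubclass, object()) reports a failure, so every such override would have to be edited before the new argument could be used uniformly.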
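
The offline concern can be sketched the same way. Everything below is hypothetical (the HARD_CODED_MODELS table, its placeholder URL, and the offline_fetch stub), and it only contrasts a listing that reads a table compiled into the code with one that depends on a network call.

```python
# Hypothetical table standing in for URLs baked into the source code.
HARD_CODED_MODELS = {
    "example_model": "https://example.invalid/example_model.nemo",
}


def list_hard_coded_models():
    # Needs no network access: the names come from the table above.
    return sorted(HARD_CODED_MODELS)


def search_remote_models(fetch):
    # Depends entirely on the remote call succeeding.
    return fetch()


def offline_fetch():
    # Stub that simulates a machine without internet access.
    raise ConnectionError("no network available")
```

With these stubs, list_hard_coded_models() still returns its entries, while search_remote_models(offline_fetch) raises ConnectionError, which is the behavioural difference that argues for keeping the two methods separate.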

titu1994 · 2022-07-27T20:21:48Z

Updated the PR with a different name for the method to avoid confusion with list_available_models.

ericharper

LGTM. Thanks for the changes!

* Support listing Hugging Face model info
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add documentation about usage
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add documentation about usage
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Update name of method, support list of model filters
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Improve docstring
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* Support listing Hugging Face model info
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add documentation about usage
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add documentation about usage
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Update name of method, support list of model filters
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Improve docstring
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: Anas Abou Allaban <aabouallaban@pm.me>

* Support listing Hugging Face model info
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add documentation about usage
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add documentation about usage
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Update name of method, support list of model filters
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Improve docstring
  Signed-off-by: smajumdar <smajumdar@nvidia.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>

titu1994 added 2 commits July 26, 2022 15:52

Support listing Hugging Face model info

2df9140

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Add documentation about usage

438d2e4

Signed-off-by: smajumdar <smajumdar@nvidia.com>

titu1994 requested review from ericharper and okuchaiev July 26, 2022 23:12

okuchaiev requested changes Jul 26, 2022

Add documentation about usage

07070cd

Signed-off-by: smajumdar <smajumdar@nvidia.com>

okuchaiev previously approved these changes Jul 27, 2022

Update name of method, support list of model filters

5c6cf33

Signed-off-by: smajumdar <smajumdar@nvidia.com>

titu1994 dismissed okuchaiev’s stale review via 5c6cf33 July 27, 2022 20:21

titu1994 added 2 commits July 27, 2022 13:21

Merge branch 'main' into add_hf_filter_support

8892b1d

Improve docstring

14b82af

Signed-off-by: smajumdar <smajumdar@nvidia.com>

ericharper approved these changes Jul 27, 2022

titu1994 merged commit 90ad5af into NVIDIA:main Jul 27, 2022

titu1994 deleted the add_hf_filter_support branch July 27, 2022 22:12
