
ModelHubMixin: more metadata + arbitrary config types + proper guide #2230

Merged
Wauplin merged 9 commits into main from mixin-guide-and-few-improvements
Apr 26, 2024

Conversation

Wauplin
Contributor

@Wauplin Wauplin commented Apr 17, 2024

This PR adds several improvements to the ModelHubMixin class, in particular:

  1. Possibility to define more metadata for a library: license, license_link, license_name, pipeline_tag, languages.
  2. Possibility to define a custom model card template. This is useful if someone wants to add a citation section to all model cards, for instance. It's also possible to generate the model card dynamically, for example to store metrics at runtime.
  3. Possibility to handle arbitrary input types. This is very useful when integrating with libraries that instantiate models with custom values. It typically came up for VoiceCraft (which expects an argparse.Namespace) and MichelAngelo (which expects an OmegaConf object, see #2212). It is currently possible to work around this (see VoiceCraft#90), but the solution is quite hacky and unintuitive. Instead of manually adding support for more types in huggingface_hub, this PR adds a way to define a custom encoder/decoder for a type. For example, the VoiceCraft integration would become:
from argparse import Namespace

from torch import nn
from huggingface_hub import PyTorchModelHubMixin

class VoiceCraft(
    nn.Module,
    PyTorchModelHubMixin,  # inherit from mixin
    coders={  # class keyword argument mapping a type to an (encoder, decoder) pair
        Namespace: (
            lambda x: vars(x),  # Encoder: how to convert a `Namespace` to a valid jsonable value?
            lambda data: Namespace(**data),  # Decoder: how to reconstruct a `Namespace` from a dictionary?
        )
    },
):
    def __init__(self, args: Namespace):  # annotate `args`
        super().__init__()
        self.pattern = args.pattern
        self.hidden_size = args.hidden_size
        ...
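
To make the encoder/decoder idea concrete, here is a minimal standard-library sketch of how a mixin could receive `coders` as a class keyword argument and round-trip a `Namespace` through JSON. `ToyHubMixin`, `encode_config`, `decode_config`, and the sample values are hypothetical names invented for this illustration; this is not huggingface_hub's implementation, only the general mechanism under those assumptions.

```python
import json
from argparse import Namespace


class ToyHubMixin:
    """Hypothetical stand-in for a hub mixin (illustration only)."""

    _coders = {}

    def __init_subclass__(cls, coders=None, **kwargs):
        # Class keyword arguments such as `coders=...` arrive here at class creation.
        super().__init_subclass__(**kwargs)
        cls._coders = dict(coders or {})

    def encode_config(self, **init_kwargs):
        """Serialize init arguments to JSON, applying a registered encoder when a type matches."""
        encoded = {}
        for key, value in init_kwargs.items():
            for type_, (encoder, _decoder) in self._coders.items():
                if isinstance(value, type_):
                    value = encoder(value)
                    break
            encoded[key] = value
        return json.dumps(encoded)

    @classmethod
    def decode_config(cls, raw, annotations):
        """Rebuild init arguments from JSON, picking the decoder from each argument's annotation."""
        decoded = {}
        for key, value in json.loads(raw).items():
            coder = cls._coders.get(annotations.get(key))
            decoded[key] = coder[1](value) if coder else value
        return decoded


class ToyModel(
    ToyHubMixin,
    coders={Namespace: (lambda x: vars(x), lambda data: Namespace(**data))},
):
    def __init__(self, args: Namespace, depth: int = 2):
        self.pattern = args.pattern
        self.hidden_size = args.hidden_size
        self.depth = depth


# Placeholder values, chosen only for the example.
args = Namespace(pattern="delay", hidden_size=512)
raw = ToyModel(args).encode_config(args=args, depth=2)
restored = ToyModel.decode_config(raw, ToyModel.__init__.__annotations__)
model = ToyModel(**restored)
```

The decoder is selected from the type annotation of the `__init__` argument, which is why the example in the PR description asks to annotate `args`.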

In addition to this, I have updated the Integration guide to explain in detail how to use the advanced features of the mixin (metadata, model card template, config, custom encoders, etc.).

cc @NielsRogge @not-lain

EDIT: link to updated guide + package reference.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@not-lain
Contributor

can we move the logic outside the inheritance?
basically have encode and decode methods

class A(nn.Module, PyTorchModelHubMixin, meta):
    def __init__(self, ...):
        (...)

    # `is_jsonable` and `jsonable_types` are placeholders for this proposal
    def encode(self, *args, **kwargs):
        """logic for configuring _hub_mixin_config"""
        for key, value in kwargs.items():
            if is_jsonable(value):
                self._hub_mixin_config[key] = value
            else:
                warnings.warn(
                    f"the parameter {key} has a value of type {type(value)} and could not be added to _hub_mixin_config. "
                    f'We advise you to create your own "configure_hub_config" method; current jsonable types are {jsonable_types}. '
                    "If you choose to make any changes to the encode method, we encourage you to update the decode method too."
                )

    @classmethod
    def decode(cls, config):
        """logic for decoding"""
        try:
            return cls(**config)
        except Exception as e:
            raise RuntimeError(
                f"model could not be initialized; this is either because the model expects a parameter that is not one of {jsonable_types}, "
                "or because the init method reads a file from a local directory and that file does not currently exist. "
                'To fix this error, we advise you to create your own "decode" method.'
            ) from e
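
The proposal above can be made runnable with only the standard library. The sketch below is one possible reading of it, not code from the PR: `is_jsonable` is a hypothetical helper implemented here with `json.dumps`, `EncodeDecodeSketch` is an invented class name, and `encode` returns the config instead of mutating `self` so that it can be called before the instance is fully built.

```python
import json
import warnings


def is_jsonable(value):
    """Return True if `value` can be serialized by json.dumps (hypothetical helper)."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


class EncodeDecodeSketch:
    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self._hub_mixin_config = self.encode(**kwargs)

    @staticmethod
    def encode(**kwargs):
        """Keep jsonable values; warn about and skip everything else."""
        config = {}
        for key, value in kwargs.items():
            if is_jsonable(value):
                config[key] = value
            else:
                warnings.warn(
                    f"Parameter {key!r} of type {type(value).__name__} is not jsonable "
                    "and was skipped; override `encode` (and `decode`) to handle it."
                )
        return config

    @classmethod
    def decode(cls, config):
        """Rebuild an instance from a stored config."""
        try:
            return cls(**config)
        except TypeError as e:
            raise ValueError(
                "Model could not be initialized from the stored config; "
                "override `decode` to handle custom arguments."
            ) from e


# Usage: the non-jsonable `device` argument triggers a warning and is dropped.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    original = EncodeDecodeSketch(hidden_size=8, device=object())

clone = EncodeDecodeSketch.decode(original._hub_mixin_config)
```

One drawback this sketch makes visible: a skipped argument is silently absent when the model is reloaded, which is the situation custom per-type coders are meant to avoid.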

and since there's way more metadata now, can we do it like this:

class A(nn.Module, PyTorchModelHubMixin):
    meta1 = v1
    meta2 = v2
    def __init__(self, ...):
        (...)

accessing meta later is as easy as A.meta1 or getattr(A, "meta1", None) ...

@Wauplin
Contributor Author

Wauplin commented Apr 17, 2024

can we move the logic outside the inheritance?
basically have encode and decode methods

and since there's way more metadata now, can we do it like this:

Is there a benefit in doing so? Attributes can conflict with existing methods or properties of the class we are inheriting from, so I would prefer to avoid them.
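
The conflict mentioned here can be shown with a small standard-library sketch. All names (`Base`, `AttrStyle`, `KwargMixin`, `KwargStyle`, `_hub_mixin_tags`) are hypothetical and chosen for illustration; `Base.tags` stands in for any method or property that a parent class such as `nn.Module` might already define.

```python
class Base:
    """Stands in for a parent class that already defines a `tags` method."""

    def tags(self):
        return ["from-base"]


class AttrStyle(Base):
    # Metadata as a class attribute: this silently replaces Base.tags.
    tags = ["vision"]


class KwargMixin:
    def __init_subclass__(cls, tags=None, **kwargs):
        # Metadata as a class keyword argument: stored under a private name,
        # so nothing inherited from other parents is overwritten.
        super().__init_subclass__(**kwargs)
        cls._hub_mixin_tags = list(tags or [])


class KwargStyle(Base, KwargMixin, tags=["vision"]):
    pass


attr_obj = AttrStyle()
kwarg_obj = KwargStyle()
```

With the attribute style, `attr_obj.tags` is now a plain list and the inherited method is gone; with the keyword style, `kwarg_obj.tags()` still works and the metadata lives in `KwargStyle._hub_mixin_tags`.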

Member

@stevhliu stevhliu left a comment


Nice additions to the ModelHubMixin! 🔥

docs/source/en/guides/integrations.md Outdated Show resolved Hide resolved
Wauplin and others added 2 commits April 18, 2024 10:15
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@Wauplin
Contributor Author

Wauplin commented Apr 18, 2024

As always, thanks for your very welcome comments @stevhliu! ❤️ I addressed them all.

Member

@LysandreJik LysandreJik left a comment


Cool! Thanks for your work, @Wauplin

@Wauplin
Contributor Author

Wauplin commented Apr 26, 2024

Thanks for the review! Will merge it :)

@Wauplin Wauplin merged commit 61b156a into main Apr 26, 2024
15 of 16 checks passed
@Wauplin Wauplin deleted the mixin-guide-and-few-improvements branch April 26, 2024 07:08

5 participants