Models

The base classes [PreTrainedModel], [TFPreTrainedModel], and [FlaxPreTrainedModel] implement the common methods for loading and saving a model, either from a local file or directory or from a pretrained model configuration provided by the library (downloaded from Hugging Face's AWS S3 repository).
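As a minimal sketch of the local save/load round trip, the snippet below builds a tiny, randomly initialised `BertModel` (so nothing is downloaded), saves it with `save_pretrained`, and reloads it with `from_pretrained`. The choice of BERT and every configuration size are assumptions made purely for illustration, and the PyTorch backend is assumed to be installed.

```python
import tempfile

import torch
from transformers import BertConfig, BertModel

# Tiny configuration with random weights; the sizes are illustrative
# assumptions, not values recommended by the library.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)
model = BertModel(config)

with tempfile.TemporaryDirectory() as save_dir:
    # Writes the configuration and the weights into the directory.
    model.save_pretrained(save_dir)
    # Rebuilds the model from the files that were just written.
    reloaded = BertModel.from_pretrained(save_dir)

original_state = model.state_dict()
reloaded_state = reloaded.state_dict()

# The reloaded model should expose the same tensors with the same values.
same_keys = original_state.keys() == reloaded_state.keys()
weights_match = same_keys and all(
    torch.equal(original_state[name], reloaded_state[name])
    for name in original_state
)
print(weights_match)
```

Passing a model identifier instead of a directory to `from_pretrained` would fetch a pretrained checkpoint rather than read local files; that path needs network access and is not exercised here.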

[PreTrainedModel] and [TFPreTrainedModel] also implement a few methods which are common among all the models to:

- resize the input token embeddings when new tokens are added to the vocabulary
- prune the attention heads of the model.
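The two operations listed above can be sketched on a small PyTorch model as follows. This is an illustration under assumptions: the model is a randomly initialised `BertModel` with made-up sizes, the five extra tokens and the pruned head are arbitrary, and the attribute path used to inspect the pruned layer is specific to the BERT implementation.

```python
from transformers import BertConfig, BertModel

# Tiny random model; two heads of width 16 each (32 / 2).
# All sizes are illustrative assumptions.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)
model = BertModel(config)

# Grow the input embedding matrix as if 5 tokens had been added
# to the tokenizer vocabulary (100 -> 105 rows).
model.resize_token_embeddings(105)
new_vocab_size = model.get_input_embeddings().num_embeddings

# Remove head 0 from layer 0. The argument maps a layer index
# to the list of head indices to prune in that layer.
model.prune_heads({0: [0]})

# With one of two heads removed, the query projection of layer 0
# keeps only one head's worth of output features.
query_width = model.encoder.layer[0].attention.self.query.out_features

print(new_vocab_size, query_width)
```

In practice the new size passed to `resize_token_embeddings` is usually the length of the updated tokenizer rather than a hard-coded number.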

The other methods common to every model are defined in [~modeling_utils.ModuleUtilsMixin] (for the PyTorch models) and [~modeling_tf_utils.TFModelUtilsMixin] (for the TensorFlow models). Text generation methods are defined in [~generation.GenerationMixin] (for the PyTorch models), [~generation.TFGenerationMixin] (for the TensorFlow models), and [~generation.FlaxGenerationMixin] (for the Flax/JAX models).
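To make the role of the mixins concrete, here is a hedged sketch that calls one utility method (`num_parameters`) and one generation method (`generate`) on a PyTorch model. The model is a randomly initialised `GPT2LMHeadModel` with assumed sizes and assumed special token ids, so the generated ids are meaningless; only the mechanics are shown.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random causal language model. Sizes and special token ids
# are illustrative assumptions.
config = GPT2Config(
    vocab_size=100,
    n_positions=32,
    n_embd=32,
    n_layer=2,
    n_head=2,
    bos_token_id=0,
    eos_token_id=1,
)
model = GPT2LMHeadModel(config).eval()

# Utility method shared by PyTorch models: count the parameters.
param_count = model.num_parameters()
manual_count = sum(p.numel() for p in model.parameters())

# Generation method: greedy decoding of at most 5 new tokens from a
# 3-token prompt. Decoding may stop early if the EOS id is produced.
prompt_ids = torch.tensor([[5, 6, 7]])
generated = model.generate(
    prompt_ids,
    attention_mask=torch.ones_like(prompt_ids),
    max_new_tokens=5,
    do_sample=False,
    pad_token_id=0,
)

print(param_count, tuple(generated.shape))
```

The returned tensor starts with the prompt ids, followed by the newly generated ids.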

PreTrainedModel

[[autodoc]] PreTrainedModel
    - push_to_hub
    - all

ModuleUtilsMixin

[[autodoc]] modeling_utils.ModuleUtilsMixin

TFPreTrainedModel

[[autodoc]] TFPreTrainedModel
    - push_to_hub
    - all

TFModelUtilsMixin

[[autodoc]] modeling_tf_utils.TFModelUtilsMixin

FlaxPreTrainedModel

[[autodoc]] FlaxPreTrainedModel
    - push_to_hub
    - all

Pushing to the Hub

[[autodoc]] utils.PushToHubMixin

Sharded checkpoints

[[autodoc]] modeling_utils.load_sharded_checkpoint
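A possible use of `load_sharded_checkpoint` is sketched below, assuming a PyTorch model and a library version that exposes the function in `transformers.modeling_utils`. The tiny `BertModel`, its sizes, and the deliberately small `max_shard_size` of `"20KB"` are assumptions chosen only to force the checkpoint to split into several shards.

```python
import os
import tempfile

import torch
from transformers import BertConfig, BertModel
from transformers.modeling_utils import load_sharded_checkpoint

# Tiny random model; sizes are illustrative assumptions.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)
model = BertModel(config)

with tempfile.TemporaryDirectory() as ckpt_dir:
    # A very small shard limit splits the weights across several files
    # and writes an index file that maps each weight to its shard.
    model.save_pretrained(ckpt_dir, max_shard_size="20KB")
    saved_files = sorted(os.listdir(ckpt_dir))

    # Load the shards into a freshly initialised model of the same shape.
    fresh_model = BertModel(config)
    load_sharded_checkpoint(fresh_model, ckpt_dir)

index_files = [name for name in saved_files if name.endswith(".index.json")]

source_state = model.state_dict()
loaded_state = fresh_model.state_dict()
shards_restored = all(
    torch.equal(source_state[name], loaded_state[name]) for name in source_state
)
print(index_files, shards_restored)
```

`from_pretrained` also reads sharded checkpoints directly, so calling `load_sharded_checkpoint` by hand is mainly useful when the model object already exists.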
