The Transformers Integration makes :ref:`transformer-based classification <libraries/transformers_integration:Transformer-based Classification>` and :ref:`sentence transformer finetuning <libraries/transformers_integration:Sentence Transformer Finetuning>` usable in small-text. It relies on the :doc:`Pytorch Integration <pytorch_integration>`, which is a prerequisite.
.. note::
   Some implementations make use of :ref:`optional dependencies <install:Optional Dependencies>`.
Overview
--------
Before you can use the transformers integration, :ref:`make sure the required dependencies have been installed <installation-transformers>`.
With the integration you will have access to the following additional components:
================  ==========================================================================================
Components        Resources
================  ==========================================================================================
Datasets          :ref:`TransformersDataset <api-transformers-dataset>`
Classifiers       :ref:`TransformerBasedClassification <api-classifiers-transformer-based-classification>`
Query Strategies  (See :doc:`Query Strategies </components/query_strategies>`)
================  ==========================================================================================
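For a quick impression of how these components interact, the following sketch builds a ``TransformersDataset`` from raw texts. It assumes a recent small-text version that provides the ``from_arrays()`` factory method; the model name, data, and labels are placeholders:

.. code-block:: python

    import numpy as np

    from transformers import AutoTokenizer
    from small_text import TransformersDataset

    # Tokenizer matching the transformer model you intend to use
    tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

    # Illustrative data; replace with your own texts and integer labels
    texts = ['this is certainly exciting', 'this is rather boring']
    labels = np.array([1, 0])

    # from_arrays() tokenizes the texts and wraps them as a small-text dataset
    train = TransformersDataset.from_arrays(texts,
                                            labels,
                                            tokenizer,
                                            max_length=60,
                                            target_labels=np.arange(2))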
This integration is tailored to the transformers library. Since models (and their corresponding tokenizers) can vary considerably, however, not all models are usable with small-text classifiers. To help you find a suitable model, the following lists a subset of compatible models which you can use as a starting point:
================  =====================================================
Size              Models
================  =====================================================
< 1B Parameters   BERT, T5, DistilRoBERTa, DistilBERT, ELECTRA, BioGPT
================  =====================================================
English Models
~~~~~~~~~~~~~~
- BERT models: bert-base-uncased, bert-large-uncased
- T5: t5-small, t5-base, t5-large
- DistilRoBERTa: distilroberta-base
- DistilBERT: distilbert-base-uncased, distilbert-base-cased
- ELECTRA: google/electra-base-discriminator, google/electra-small-discriminator
- BioGPT: microsoft/biogpt
This list is not exhaustive. Let us know if you have tested other models that belong on these lists.
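To use one of the models above, pass its Hugging Face Hub identifier via ``TransformerModelArguments``. A minimal sketch follows; the model name and number of classes are placeholders:

.. code-block:: python

    from small_text import TransformerBasedClassification, TransformerModelArguments

    # Any model identifier from the lists above can be substituted here
    transformer_model = TransformerModelArguments('distilroberta-base')

    # num_classes must match the number of classes in your dataset
    clf = TransformerBasedClassification(transformer_model, num_classes=2)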
Layer-specific fine-tuning can be enabled by setting :py:class:`~small_text.integrations.transformers.classifiers.classification.FineTuningArguments` during the construction of :py:class:`~small_text.integrations.transformers.classifiers.classification.TransformerBasedClassification`. With this, you can enable layerwise gradient decay and gradual unfreezing:
- Layerwise gradient decay: the lower a layer is located in the network, the smaller its learning rate.
- Gradual unfreezing: lower layers are frozen at the start of training and are gradually unfrozen with each epoch.
See [HR18]_ for more details on these methods.
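The following sketch enables both techniques. The parameter names (``base_lr``, ``layerwise_gradient_decay``, ``gradual_unfreezing``) and the chosen values are assumptions based on a recent small-text version; consult the API reference for the exact signature:

.. code-block:: python

    from small_text import (
        FineTuningArguments,
        TransformerBasedClassification,
        TransformerModelArguments
    )

    transformer_model = TransformerModelArguments('bert-base-uncased')

    # Assumed parameters: a base learning rate, a per-layer decay factor
    # applied from top to bottom, and a gradual unfreezing setting
    fine_tuning_arguments = FineTuningArguments(base_lr=2e-5,
                                                layerwise_gradient_decay=0.975,
                                                gradual_unfreezing=2)

    clf = TransformerBasedClassification(transformer_model,
                                         num_classes=2,
                                         fine_tuning_arguments=fine_tuning_arguments)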
Transformer-based Classification
--------------------------------

An example of transformer-based classification is provided in :file:`examples/examplecode/transformers_multiclass_classification.py`:

.. literalinclude:: ../../examples/examplecode/transformers_multiclass_classification.py
   :language: python
Sentence Transformer Finetuning
-------------------------------

An example of sentence transformer finetuning (using SetFit) is provided in :file:`examples/examplecode/setfit_multiclass_classification.py`:

.. literalinclude:: ../../examples/examplecode/setfit_multiclass_classification.py
   :language: python