The Transformers Integration makes :ref:`transformer-based classification <libraries/transformers_integration:Transformer-based Classification>` and :ref:`sentence transformer finetuning <libraries/transformers_integration:Sentence Transformer Finetuning>` usable in small-text. It relies on the :doc:`Pytorch Integration <pytorch_integration>`, which is a prerequisite.
.. note::
   Some implementations make use of :ref:`optional dependencies <install:Optional Dependencies>`.
Overview
--------
Before you can use the transformers integration, :ref:`make sure the required dependencies have been installed <installation-transformers>`.
With the integration you will have access to the following additional components:
================  ==========================================================================================
Components        Resources
================  ==========================================================================================
Datasets          :ref:`TransformersDataset <api-transformers-dataset>`
Classifiers       :ref:`TransformerBasedClassification <api-classifiers-transformer-based-classification>`
Query Strategies  (See :doc:`Query Strategies </components/query_strategies>`)
================  ==========================================================================================
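For a quick impression of how these components interact, the following sketch builds a ``TransformersDataset`` from raw texts. It assumes a recent small-text version that provides the ``from_arrays()`` factory method; the model name, data, and labels are placeholders:

.. code-block:: python

    import numpy as np

    from transformers import AutoTokenizer
    from small_text import TransformersDataset

    # Tokenizer matching the transformer model you intend to use
    tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

    # Illustrative data; replace with your own texts and integer labels
    texts = ['this is certainly exciting', 'this is rather boring']
    labels = np.array([1, 0])

    # from_arrays() tokenizes the texts and wraps them as a small-text dataset
    train = TransformersDataset.from_arrays(texts,
                                            labels,
                                            tokenizer,
                                            max_length=60,
                                            target_labels=np.arange(2))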
This integration is tailored to the transformers library. Since models (and their corresponding tokenizers) can vary considerably, however, not all models are usable with small-text classifiers. To help you find a suitable model, the following lists a subset of compatible models which you can use as a starting point:
================  =====================================================
Size              Models
================  =====================================================
< 1B Parameters   BERT, T5, DistilRoBERTa, DistilBERT, ELECTRA, BioGPT
================  =====================================================
English Models
~~~~~~~~~~~~~~
- BERT models: bert-base-uncased, bert-large-uncased
- T5: t5-small, t5-base, t5-large
- DistilRoBERTa: distilroberta-base
- DistilBERT: distilbert-base-uncased, distilbert-base-cased
- ELECTRA: google/electra-base-discriminator, google/electra-small-discriminator
- BioGPT: microsoft/biogpt
This list is not exhaustive. Let us know if you have tested other models that belong on these lists.
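To use one of the models above, pass its Hugging Face Hub identifier via ``TransformerModelArguments``. A minimal sketch follows; the model name and number of classes are placeholders:

.. code-block:: python

    from small_text import TransformerBasedClassification, TransformerModelArguments

    # Any model identifier from the lists above can be substituted here
    transformer_model = TransformerModelArguments('distilroberta-base')

    # num_classes must match the number of classes in your dataset
    clf = TransformerBasedClassification(transformer_model, num_classes=2)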
Layer-specific fine-tuning can be enabled by setting :py:class:`~small_text.integrations.transformers.classifiers.classification.FineTuningArguments` during the construction of :py:class:`~small_text.integrations.transformers.classifiers.classification.TransformerBasedClassification`. With this, you can enable layerwise gradient decay and gradual unfreezing:
- Layerwise gradient decay: the lower a layer is located in the network, the smaller its learning rate.
- Gradual unfreezing: lower layers are frozen at the start of training and are gradually unfrozen with each epoch.
See [HR18]_ for more details on these methods.
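The following sketch enables both techniques. The parameter names (``base_lr``, ``layerwise_gradient_decay``, ``gradual_unfreezing``) and the chosen values are assumptions based on a recent small-text version; consult the API reference for the exact signature:

.. code-block:: python

    from small_text import (
        FineTuningArguments,
        TransformerBasedClassification,
        TransformerModelArguments
    )

    transformer_model = TransformerModelArguments('bert-base-uncased')

    # Assumed parameters: a base learning rate, a per-layer decay factor
    # applied from top to bottom, and a gradual unfreezing setting
    fine_tuning_arguments = FineTuningArguments(base_lr=2e-5,
                                                layerwise_gradient_decay=0.975,
                                                gradual_unfreezing=2)

    clf = TransformerBasedClassification(transformer_model,
                                         num_classes=2,
                                         fine_tuning_arguments=fine_tuning_arguments)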
Transformer-based Classification
--------------------------------

An example of transformer-based classification is provided in :file:`examples/examplecode/transformers_multiclass_classification.py`:

.. literalinclude:: ../../examples/examplecode/transformers_multiclass_classification.py
   :language: python
Sentence Transformer Finetuning
-------------------------------

An example of sentence transformer finetuning (using SetFit) is provided in :file:`examples/examplecode/setfit_multiclass_classification.py`:

.. literalinclude:: ../../examples/examplecode/setfit_multiclass_classification.py
   :language: python