# SetFit for Text Classification

Welcome to SetFit for Text Classification! In this short notebook we'll see just how easy it is to do few-shot text classification with SetFit.

To start, we install SetFit, which also installs all of the dependencies you will need:

In [None]:
!pip install setfit

We then import the relevant functions (for data loading and the loss function) along with our handy SetFitModel and SetFitTrainer classes.

In [None]:
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss

from setfit import SetFitModel, SetFitTrainer

We start by loading a dataset to work with. In this example we'll load the "emotion" dataset, which is already available on the Hugging Face Hub.

In [None]:
# Load a dataset
dataset = load_dataset("emotion")

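Let's now select a small training subset and keep the full test set for evaluation. The intent is N examples per class, starting with N=8. Note that the cell below simply keeps the first 16 rows (8 * 2) of the shuffled training split, so it does not guarantee 8 examples for each of the six classes in "emotion".

In [None]:
# Select a few-shot training subset: the first 16 (8 * 2) shuffled examples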
train_ds = dataset["train"].shuffle(seed=42).select(range(8 * 2))
test_ds = dataset["test"]
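
If you want exactly N examples for every class, one option is to pick the row indices per label yourself. The following is a minimal sketch in plain Python, not part of SetFit; the helper name `sample_indices_per_class` and the toy labels are hypothetical and only serve as an illustration.

```python
import random
from collections import defaultdict


def sample_indices_per_class(labels, n_per_class, seed=42):
    """Return row indices containing up to n_per_class examples per label."""
    # Group row indices by their label
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    chosen = []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)  # random pick within each class
        chosen.extend(idxs[:n_per_class])
    return chosen


# Toy labels standing in for the label column of a training split
toy_labels = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
picked = sample_indices_per_class(toy_labels, n_per_class=2)
print(sorted(toy_labels[i] for i in picked))  # [0, 0, 1, 1, 2, 2]
```

With the dataset loaded above, the same helper could be applied as `dataset["train"].select(sample_indices_per_class(dataset["train"]["label"], 8))`, assuming the label column is named "label" as it is in "emotion".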

Next, load a Sentence Transformer model, which is also available from the Hugging Face Hub. In this example we download paraphrase-mpnet-base-v2. Loading it through the SetFitModel class takes care of the SetFit-specific setup of the downloaded model for you!

In [None]:
# Load SetFit model from Hub
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

Now we're ready to train! The SetFitTrainer class we have provided makes this easy. Pass it (1) the model, (2) the train and evaluation sets we created above, (3) the loss class we imported, (4) the batch size, and (5) the number of epochs and iterations.

In [None]:
# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_ds,
    eval_dataset=test_ds,
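    loss_class=CosineSimilarityLoss,
    batch_size=16,
    num_epochs=1,  # Number of epochs to use for contrastive learning
    num_iterations=20,  # Number of text pairs to generate for contrastive learning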
)
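
For some intuition about `num_iterations`: SetFit fine-tunes the Sentence Transformer on pairs of sentences built from the few labeled examples, where pairs from the same class should be similar and pairs from different classes should not. The block below is a simplified sketch of that idea in plain Python, not the library's actual implementation. The function `generate_pairs` and the toy data are hypothetical, and the sketch assumes one same-class and one different-class pair per example per iteration.

```python
import random


def generate_pairs(texts, labels, num_iterations, seed=0):
    """Build (text_a, text_b, target) triples for contrastive training.

    target is 1.0 for a same-class pair and 0.0 for a different-class pair.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            same = [j for j, y in enumerate(labels) if y == label and j != i]
            diff = [j for j, y in enumerate(labels) if y != label]
            if same:  # positive pair: another example with the same label
                pairs.append((text, texts[rng.choice(same)], 1.0))
            if diff:  # negative pair: an example with a different label
                pairs.append((text, texts[rng.choice(diff)], 0.0))
    return pairs


# 16 toy examples, 8 per class, mirroring the size of the few-shot subset
toy_texts = [f"example {i}" for i in range(16)]
toy_targets = [i % 2 for i in range(16)]
pairs = generate_pairs(toy_texts, toy_targets, num_iterations=20)
print(len(pairs))  # 640 = 2 pairs * 16 examples * 20 iterations
```

Under these assumptions, a handful of labeled examples expands into hundreds of training pairs, which is why a larger `num_iterations` lengthens training even when the labeled set stays tiny.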

In [None]:
# Show which GPUs are visible to this process (prints an empty line if the variable is unset)
!echo $CUDA_VISIBLE_DEVICES

In [None]:
# Train the model
trainer.train()

Now you can check your model's performance with the trainer's evaluate method, which scores the model on the eval_dataset passed to the trainer.

In [None]:
metrics = trainer.evaluate()
metrics
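
As a rough guide to reading the result, here is a small sketch of what an accuracy-style metric computes: the fraction of predictions that match the reference labels. It assumes the reported metric is accuracy; the `accuracy` helper and the toy values are hypothetical and are not SetFit's own code.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy predictions and gold labels: 3 of the 4 match
toy_metrics = {"accuracy": accuracy([0, 1, 1, 3], [0, 1, 2, 3])}
print(toy_metrics)  # {'accuracy': 0.75}
```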

If you'd like to upload your newly trained few-shot SetFit model to the Hugging Face Hub, first log in via the CLI...

In [None]:
!huggingface-cli login

... then push to hub! Success!

In [None]:
# Push model to the Hub
trainer.push_to_hub("my-awesome-setfit-model-2")

With our implementation of SetFit, downloading a model, training it, and uploading it back to the Hub can be done in just a few lines of code :)