# Image classification

In this tutorial, we will show a standard workflow for an image classification task using transformers, sentence-transformers, and Argilla.

We will follow these steps:

* Configure the Argilla dataset
* Add initial model suggestions
* Add vectors for image search
* Evaluate with Argilla
* Train your model
* Update the suggestions with the new model

## Getting started

### Deploy the Argilla server

If you have already deployed Argilla, you can skip this step. Otherwise, you can quickly deploy Argilla following [this guide](../getting_started/quickstart.md).

### Set up the environment

To complete this tutorial, you need to install the Argilla SDK and a few third-party libraries via `pip`.

In [None]:
!pip install argilla

In [None]:
!pip install "sentence-transformers>=3" "transformers==4.40.2"

Let's make the required imports:

In [39]:
import argilla as rg

import base64
import io
from datasets import load_dataset, Dataset, Image
from sentence_transformers import SentenceTransformer

You also need to connect to the Argilla server using the `api_url` and `api_key`.

In [4]:
# Replace api_url with your url if using Docker
# Replace api_key if you configured a custom API key
# Uncomment the last line and set your HF_TOKEN if your space is private
client = rg.Argilla(
    # api_url="https://[your-owner-name]-[your_space_name].hf.space",
    # api_url=
    api_key="argilla.apikey",
    # headers={"Authorization": f"Bearer {HF_TOKEN}"}
)
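If you prefer not to hard-code credentials in the notebook, one option is to read them from environment variables. The sketch below is only an illustration: the helper name `connection_kwargs` and the variable names `ARGILLA_API_URL`, `ARGILLA_API_KEY`, and `HF_TOKEN` are assumptions chosen for this example, so adapt them to your own deployment.

```python
import os


def connection_kwargs(env=None):
    """Build keyword arguments for rg.Argilla from environment variables.

    Hypothetical helper: the variable names are an assumption of this sketch.
    """
    env = os.environ if env is None else env
    # Fall back to the API key used in the cell above
    kwargs = {"api_key": env.get("ARGILLA_API_KEY", "argilla.apikey")}
    if env.get("ARGILLA_API_URL"):
        kwargs["api_url"] = env["ARGILLA_API_URL"]
    if env.get("HF_TOKEN"):
        # Only needed when the Hugging Face Space is private
        kwargs["headers"] = {"Authorization": f"Bearer {env['HF_TOKEN']}"}
    return kwargs


# Usage (requires a running server): client = rg.Argilla(**connection_kwargs())
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to check without touching your real environment.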

## Configure and create the Argilla dataset

Now, we need to configure the dataset. In the settings, we can specify the guidelines, fields, and questions. If needed, you can also add metadata and vectors. However, for our use case, we just need an image field and a label question.

!!! note
    Check this [how-to guide](../how_to_guides/dataset.md) to know more about configuring and creating a dataset.

In [28]:
labels = [str(x) for x in range(10)]

settings = rg.Settings(
    guidelines="The goal of this task is to classify a given image of a handwritten digit into one of 10 classes representing integer values from 0 to 9, inclusively.",
    fields=[
        rg.ImageField(
            name="image",
            title="An image of a handwritten digit.",
        ),
    ],
    questions=[
        rg.LabelQuestion(
            name="image_label",
            title="What digit do you see on the image?",
            labels=labels,
        )
    ]
)

Let's create the dataset with the name and the defined settings:

In [29]:
dataset = rg.Dataset(
    name="image_classification_dataset",
    settings=settings,
)
dataset.create()



Dataset(id=UUID('706d637a-9216-4c96-b28c-b273d6f1680f') inserted_at=datetime.datetime(2024, 8, 13, 15, 18, 28, 53048) updated_at=datetime.datetime(2024, 8, 13, 15, 18, 28, 221716) name='image_classification_datasets' status='ready' guidelines='The goal of this task is to classify a given image of a handwritten digit into one of 10 classes representing integer values from 0 to 9, inclusively.' allow_extra_metadata=False distribution=OverlapTaskDistributionModel(strategy='overlap', min_submitted=1) workspace_id=UUID('fcdac196-1f8c-4634-bbc3-d6e490fcb481') last_activity_at=datetime.datetime(2024, 8, 13, 15, 18, 28, 221716))

## Add records

Although we have created the dataset, it still has no records to annotate (you can check this in the UI). We will use the `ylecun/mnist` dataset from [the Hugging Face Hub](https://huggingface.co/datasets/ylecun/mnist). Specifically, we will use the first `100` examples of the `train` split.

!!! tip
    When working with a Hugging Face dataset, you can set `Image(decode=False)` to get [public image URLs](https://huggingface.co/docs/datasets/en/image_load#local-files). However, whether this works depends on the dataset.

In [40]:
hf_dataset = load_dataset("ylecun/mnist", split="train[:100]")
hf_dataset

Dataset({
    features: ['image', 'label'],
    num_rows: 100
})

Let's have a look at the first image in the dataset.

In [41]:
hf_dataset[0]

{'image': <PIL.PngImagePlugin.PngImageFile image mode=L size=28x28>,
 'label': 5}

As we can see, the image is a 28x28 grayscale image. To use it in Argilla, we need to convert it to a base64-encoded data URI.

In [43]:
def pil_to_data_uri(batch):
    data_uri = []
    for image in batch["image"]:
        buffered = io.BytesIO()
        image.save(buffered, format="PNG")
        img_str = base64.b64encode(buffered.getvalue()).decode()
        data_uri.append(f"data:image/png;base64,{img_str}")
    batch["image_data_uri"] = data_uri
    return batch

hf_dataset = hf_dataset.map(pil_to_data_uri, batched=True)
hf_dataset[0]

Map: 100%|██████████| 100/100 [00:00<00:00, 1203.15 examples/s]


{'image': <PIL.PngImagePlugin.PngImageFile image mode=L size=28x28>,
 'label': 5,
 'image_data_uri': ''}
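To make the data URI format more concrete, here is a minimal standard-library sketch of the round trip between raw bytes and a data URI. The helper names `to_data_uri` and `from_data_uri` are hypothetical, and the PNG signature bytes only stand in for a real image.

```python
import base64


def to_data_uri(raw_bytes, mime_type="image/png"):
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(data_uri):
    """Split a base64 data URI back into its MIME type and raw bytes."""
    header, encoded = data_uri.split(",", 1)
    mime_type = header[len("data:"):].split(";")[0]
    return mime_type, base64.b64decode(encoded)


# The eight-byte PNG signature serves as stand-in image content
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)
mime_type, decoded = from_data_uri(uri)
```

Decoding the URI should return exactly the bytes that were encoded, which is a quick way to check that a conversion function did not corrupt the image.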

We can easily add them to the dataset using `log` and a mapping, where we indicate that the column `image_data_uri` holds the data that should be added to the field `image`.

In [48]:
dataset.records.log(records=hf_dataset, mapping={"image_data_uri": "image"})

RecordsIngestionError: Argilla SDK error: RecordsIngestionError: Mapped attribute is not a valid dataset attribute: image.

### Add initial model suggestions

The next step is to add suggestions to the dataset. This will make things easier and faster for the annotation team. Suggestions will appear as preselected options, so annotators will only need to correct them. In our case, we will generate them using a zero-shot SetFit model. However, you can use a framework or technique of your choice.

We will start by defining an example training set with the required labels: the digits `0` to `9`. Using `get_templated_dataset` will create sentences from the default template: "This sentence is {label}."

In [11]:
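# SetFit is not part of the install cell above; install it with `pip install setfit`
from setfit import SetFitModel, Trainer, get_templated_dataset, sample_dataset

zero_ds = get_templated_dataset(
    candidate_labels=labels,
    sample_size=8,
)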
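To see what a templated dataset looks like, the following plain-Python sketch builds the same kind of rows by hand. It illustrates the idea only and is not SetFit's implementation: the function name `templated_examples`, the `{}` placeholder, and the row layout with `text` and `label` keys are assumptions of this example.

```python
def templated_examples(candidate_labels, sample_size=2, template="This sentence is {}."):
    """Create `sample_size` templated sentences per label, tagged with the label index."""
    examples = []
    for label_id, label_name in enumerate(candidate_labels):
        for _ in range(sample_size):
            examples.append({"text": template.format(label_name), "label": label_id})
    return examples


digit_labels = [str(x) for x in range(10)]
examples = templated_examples(digit_labels, sample_size=2)
print(examples[0])
```

Each label contributes the same number of rows, so the resulting training set is balanced by construction.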

Now, we will prepare a function to train the SetFit model.

!!! note
    For further customization, you can check the [SetFit documentation](https://huggingface.co/docs/setfit/reference/main).

In [12]:
def train_model(model_name, dataset):
    model = SetFitModel.from_pretrained(model_name)

    trainer = Trainer(
        model=model,
        train_dataset=dataset,
    )

    trainer.train()

    return model

Let's train the model. We will use `TaylorAI/bge-micro-v2`, available in the [Hugging Face Hub](https://huggingface.co/TaylorAI/bge-micro-v2).

In [None]:
model = train_model(model_name="TaylorAI/bge-micro-v2", dataset=zero_ds)

You can save the model locally or push it to the Hub, and then load it from there.

In [14]:
# Save and load locally
# model.save_pretrained("image_classification_model")
# model = SetFitModel.from_pretrained("image_classification_model")

# Push and load in HF
# model.push_to_hub("[username]/image_classification_model")
# model = SetFitModel.from_pretrained("[username]/image_classification_model")

It's time to make the predictions! We will define a function that uses the `predict` method to get the suggested label. The model will infer the label from the input it receives.

In [14]:
def predict(model, input, labels):
    model.labels = labels

    prediction = model.predict([input])

    return prediction[0]

To update the records, we will need to retrieve them from the server and update them with the new suggestions. The `id` will always need to be provided as it is the records' identifier to update a record and avoid creating a new one.

In [None]:
data = dataset.records.to_list(flatten=True)
updated_data = [
    {
        # The key must match the question name defined in the settings
        "image_label": predict(model, sample["image"], labels),
        "id": sample["id"],
    }
    for sample in data
]
dataset.records.log(records=updated_data)
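The update payload is just a list of dictionaries, each pairing a record `id` with a value for the question. The sketch below builds such a payload with a stand-in predictor, so the logic can be checked without a server or a trained model. The helper name `build_suggestion_updates`, the sample records, and the constant predictor are all hypothetical.

```python
def build_suggestion_updates(records, predict_fn, field_name="image", question_name="image_label"):
    """Pair each record id with the label predicted for its field value."""
    updates = []
    for record in records:
        updates.append(
            {
                question_name: predict_fn(record[field_name]),
                # The id tells the server to update the record instead of creating one
                "id": record["id"],
            }
        )
    return updates


# Made-up records that mimic the flattened structure used in this tutorial
fake_records = [
    {"id": "rec-1", "image": "data:image/png;base64,AAAA"},
    {"id": "rec-2", "image": "data:image/png;base64,BBBB"},
]
# Stand-in predictor that always suggests "0"
updates = build_suggestion_updates(fake_records, predict_fn=lambda image: "0")
```

Keeping the predictor as a parameter means the same helper works with any model that maps a field value to one of the configured labels.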

Voilà! We have added the suggestions to the dataset, and they will appear in the UI marked with a ✨. 

## Evaluate with Argilla

Now, we can start the annotation process. Just open the dataset in the Argilla UI and start annotating the records. If the suggestions are correct, you can just click on `Submit`. Otherwise, you can select the correct label.

!!! note
    Check this [how-to guide](../how_to_guides/annotate.md) to know more about annotating in the UI.
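Once some records have been submitted, one simple way to gauge the initial suggestions is to compare them with the annotators' answers. The sketch below assumes the flattened records expose the keys `image_label.suggestion` and `image_label.responses`; treat these key names, the helper `suggestion_accuracy`, and the sample rows as assumptions to verify against your own export.

```python
def suggestion_accuracy(records, question_name="image_label"):
    """Share of records whose first response matches the suggestion.

    Records without a suggestion or without a response are skipped.
    Returns None when there is nothing to compare.
    """
    suggestion_key = f"{question_name}.suggestion"
    responses_key = f"{question_name}.responses"
    compared = 0
    correct = 0
    for record in records:
        suggestion = record.get(suggestion_key)
        responses = record.get(responses_key) or []
        if suggestion is None or not responses:
            continue
        compared += 1
        correct += int(responses[0] == suggestion)
    return correct / compared if compared else None


# Made-up rows: one match, one mismatch, one record without a response
rows = [
    {"image_label.suggestion": "5", "image_label.responses": ["5"]},
    {"image_label.suggestion": "3", "image_label.responses": ["8"]},
    {"image_label.suggestion": "1", "image_label.responses": None},
]
accuracy = suggestion_accuracy(rows)
```

A low value suggests the annotators are correcting most suggestions, which is a signal that retraining on the submitted labels is worthwhile.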

## Train your model

After the annotation, we will have a robust dataset to train the main model. In our case, we will fine-tune using SetFit. However, you can select the one that best fits your requirements. So, let's start by retrieving the annotated records.

!!! note
    Check this [how-to guide](../how_to_guides/query.md) to know more about filtering and querying in Argilla.

In [16]:
dataset = client.datasets("image_classification_dataset")

In [18]:
status_filter = rg.Query(filter=rg.Filter(("response.status", "==", "submitted")))

submitted = dataset.records(status_filter).to_list(flatten=True)

As we have a single response per record, we can retrieve the selected label straightforwardly and create the training set with 8 samples per label. We selected 8 samples per label to have a balanced dataset for few-shot learning.

In [None]:
train_records = [
    {
        # Field and question names must match the dataset settings
        "text": r["image"],
        "label": r["image_label.responses"][0],
    }
    for r in submitted
]
train_dataset = Dataset.from_list(train_records)
train_dataset = sample_dataset(train_dataset, label_column="label", num_samples=8)
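The idea behind sampling a fixed number of examples per label can be sketched in plain Python. This is an illustration of balanced sampling under stated assumptions, not SetFit's `sample_dataset` implementation: the helper name `sample_per_label`, the fixed seed, and the sample rows are hypothetical.

```python
import random
from collections import defaultdict


def sample_per_label(records, label_key="label", num_samples=8, seed=42):
    """Keep at most `num_samples` randomly chosen records for each label."""
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    sampled = []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        sampled.extend(group[:num_samples])
    return sampled


# Made-up records: 20 examples of "0" and only 3 examples of "1"
rows = [{"text": f"a{i}", "label": "0"} for i in range(20)]
rows += [{"text": f"b{i}", "label": "1"} for i in range(3)]
balanced = sample_per_label(rows, num_samples=8)
```

Labels with fewer than `num_samples` annotated records contribute everything they have, so the result is only fully balanced once each digit has at least eight submitted examples.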

We can train the model using our previous function, but this time with a high-quality human-annotated training set.

In [None]:
model = train_model(model_name="TaylorAI/bge-micro-v2", dataset=train_dataset)

Because the training data was of better quality, we can expect a better model. So, we can update the remaining non-annotated records with the new model's suggestions.

In [None]:
data = dataset.records.to_list(flatten=True)
updated_data = [
    {
        # The key must match the question name defined in the settings
        "image_label": predict(model, sample["image"], labels),
        "id": sample["id"],
    }
    for sample in data
]
dataset.records.log(records=updated_data)
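The cell above logs suggestions for every record in the dataset. If you only want to update records that have not been annotated yet, one option is to exclude the submitted ones by `id` before predicting. This is a minimal sketch: the helper name `select_unannotated` and the sample data are hypothetical.

```python
def select_unannotated(all_records, submitted_records):
    """Return the records whose id does not appear among the submitted ones."""
    submitted_ids = {record["id"] for record in submitted_records}
    return [record for record in all_records if record["id"] not in submitted_ids]


# Made-up records standing in for `data` and `submitted`
all_rows = [{"id": "rec-1"}, {"id": "rec-2"}, {"id": "rec-3"}]
submitted_rows = [{"id": "rec-2"}]
remaining = select_unannotated(all_rows, submitted_rows)
```

Filtering first also avoids spending inference time on records whose labels have already been confirmed by an annotator.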

## Conclusions

In this tutorial, we presented an end-to-end example of an image classification task. This serves as the base, but it can be performed iteratively and seamlessly integrated into your workflow to ensure high-quality curation of your data and improved results.

We started by configuring the dataset, adding records, and training a zero-shot SetFit model, as an example, to add suggestions. After the annotation process, we trained a new model with the annotated data and updated the remaining records with the new suggestions.