## Introduction

Predictive models are cheap, predictable, scalable, and easy to use, which is why they remain widely used across many industries. However, the quality of their predictions depends on the quality of the training data, which can be hard to obtain. To make labelling easier, Argilla provides a vector-based approach that lets you label data in bulk. On top of that, few-shot learning lets you fine-tune a model with only a small amount of labelled data. The goal of this tutorial is to show you how to use Argilla to label data in bulk, and then use that data with few-shot learning to fine-tune a model.

In this tutorial, you will learn to:

- Publish a dataset
- Add records to a dataset
- Update records with vectors
- Use bulk labeling to label data
- Fine-tune a text classification model
- Add model suggestions to a dataset

??? note "used libraries"
    - [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) is the Python library that will be used to obtain our vectors. It is a library that allows you to use pre-trained transformer models to encode text into vectors.
    - [SetFit](https://github.com/huggingface/setfit) is the Python library that will be used to fine-tune our model. It is a library that allows you to fine-tune a model with a smaller amount of labelled data.

## Setup

### Installation

We assume that you've already installed and deployed Argilla. If you haven't, please follow the instructions in the [installation documentation](/argilla-python/getting_started/installation/).

To install the required libraries, run the following command:

In [1]:
!pip install sentence-transformers setfit


### Imports


In [None]:
import argilla_sdk as rg
from sentence_transformers import SentenceTransformer
from setfit import SetFitModel, SetFitTrainer

## Application

### Publish dataset

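We will start by publishing a dataset, which will store the data that we label in bulk. The first step is to define its settings: a text field holding the text to classify, a label question with three options, and a vector field for the embeddings.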


In [None]:
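# Settings describe the dataset schema: what annotators see and what they answer.
settings = rg.Settings(
    fields=[
        # The text shown to annotators
        rg.TextField(
            name='text',
            required=True,
            description='Text to be classified',
            use_markdown=False
        )
    ],
    questions=[
        # Single-label classification question; the options are passed as `labels`
        rg.LabelQuestion(
            name='label',
            description='Label of the text',
            labels=['positive', 'negative', 'neutral']
        )
    ],
    vectors=[
        # Vector representation of the text; `dimensions` must match the
        # output size of the embedding model
        rg.VectorField(
            name='vector',
            dimensions=768
        )
    ]
)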
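
The `dimensions=768` value in the vector settings has to agree with the output size of whichever embedding model produces the vectors later on, and the tutorial does not yet say which model that is. As a minimal sketch, the hypothetical helper below (`validate_vector` is an illustrative name, not part of Argilla or Sentence Transformers) checks a vector's length before it is attached to a record:

```python
from typing import List, Sequence

# Assumption: mirrors the `dimensions` value used in the vector settings.
EXPECTED_DIMENSIONS = 768


def validate_vector(vector: Sequence[float], expected: int = EXPECTED_DIMENSIONS) -> List[float]:
    """Return the vector as a list of floats, or raise if its length is wrong."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")
    return [float(x) for x in vector]


# Toy input standing in for a real embedding.
checked = validate_vector([0.0] * 768)
print(len(checked))
```

Failing early on a mismatch is usually easier to debug than a rejected upload, for example when a 384-dimensional model is swapped in while the settings still declare 768.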

### Add records to a dataset

### Update records with vectors

### Use bulk labeling to label data
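
The vector-based bulk labelling mentioned in the introduction rests on similarity search: pick one record, find the records whose vectors lie close to it, and give them all the same label in one action. In Argilla this is done through the tool itself; the plain-Python sketch below only illustrates the idea. The helper names, the `0.9` threshold, and the toy 3-dimensional vectors are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def bulk_label(records: List[Dict], seed_index: int, label: str, threshold: float = 0.9) -> List[int]:
    """Assign `label` to every unlabelled record whose vector is close to the seed's."""
    seed_vector = records[seed_index]["vector"]
    labelled = []
    for i, record in enumerate(records):
        if record.get("label") is not None:
            continue  # never overwrite an existing label
        if cosine_similarity(seed_vector, record["vector"]) >= threshold:
            record["label"] = label
            labelled.append(i)
    return labelled


# Toy 3-dimensional vectors, made up for illustration only.
records = [
    {"text": "I love it", "vector": [0.9, 0.1, 0.0], "label": None},
    {"text": "Really great", "vector": [0.8, 0.2, 0.1], "label": None},
    {"text": "Terrible", "vector": [0.0, 0.1, 0.9], "label": None},
]
labelled_indices = bulk_label(records, seed_index=0, label="positive")
print(labelled_indices)
```

The threshold trades speed for accuracy: a lower value labels more records per action but pulls in more borderline cases, so reviewing the matches before confirming the label remains worthwhile.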

### Fine-tune a text classification model
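
Few-shot fine-tuning needs only a handful of labelled examples per class, so a common preparation step is to draw a small, balanced sample from the labelled records. The sketch below shows one way to do that in plain Python. The `sample_few_shot` helper, the record layout, and the example texts are hypothetical, and the result would still need converting into the training format SetFit expects.

```python
import random
from collections import defaultdict
from typing import Dict, List


def sample_few_shot(records: List[Dict], per_label: int, seed: int = 42) -> List[Dict]:
    """Pick up to `per_label` labelled records for each label."""
    by_label = defaultdict(list)
    for record in records:
        if record.get("label") is not None:  # skip unlabelled records
            by_label[record["label"]].append(record)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        sample.extend(rng.sample(group, min(per_label, len(group))))
    return sample


# Made-up labelled records for illustration only.
labelled_records = [
    {"text": "I love it", "label": "positive"},
    {"text": "Really great", "label": "positive"},
    {"text": "Works well", "label": "positive"},
    {"text": "Terrible", "label": "negative"},
    {"text": "It arrived on Monday", "label": "neutral"},
    {"text": "The box is blue", "label": "neutral"},
    {"text": "Not reviewed yet", "label": None},
]
few_shot = sample_few_shot(labelled_records, per_label=2)
print(len(few_shot))
```

Capping each class at the same size keeps a dominant label from crowding out the others, which matters more than usual when the whole training set is only a few examples.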

### Add model suggestion to a dataset
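
A model suggestion pairs a predicted label with a confidence score so that annotators can accept or correct it. As a rough sketch, the hypothetical helper below turns per-label probabilities into a suggestion-like dictionary and skips predictions the model is unsure about. The dictionary keys, the `my-model` agent name, and the `0.5` cut-off are assumptions for illustration, not a confirmed Argilla schema; only the question name `label` comes from the dataset settings.

```python
from typing import Dict, Optional


def to_suggestion(
    probabilities: Dict[str, float],
    min_score: float = 0.5,
    agent: str = "my-model",
) -> Optional[Dict]:
    """Turn per-label probabilities into a suggestion-like dict, or None if unsure."""
    # Pick the label with the highest probability.
    label, score = max(probabilities.items(), key=lambda item: item[1])
    if score < min_score:
        return None  # too uncertain to be worth suggesting
    return {"question_name": "label", "value": label, "score": score, "agent": agent}


# Made-up probabilities standing in for real model output.
suggestion = to_suggestion({"positive": 0.7, "negative": 0.2, "neutral": 0.1})
print(suggestion)
```

Dropping low-confidence predictions keeps weak guesses from biasing annotators; the cut-off is a tuning choice that depends on how well calibrated the model is.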

## Conclusion