[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lenguajenatural-ai/autotransformers/blob/master/notebooks/classification/train_emotion_classification.ipynb)

# Emotion Classification with autotransformers

In this notebook we use `autotransformers` to train text classification models for emotion detection.

We first import the modules we need. If you are running this notebook in Google Colab, uncomment the cell below and run it before importing, so that `autotransformers` is installed.

We import `DatasetConfig`, the class that configures how datasets are managed inside `AutoTrainer`, and `AutoTrainer` itself, which runs the experiments. We also need `ModelConfig` to define the models to train, and `ResultsPlotter` to plot the experiment results.
Additionally, we import `hp_space_base`, the default hyperparameter space for base-sized models.

In [None]:
# !pip install git+https://github.com/lenguajenatural-ai/autotransformers.git 

In [None]:
from autotransformers import DatasetConfig, ModelConfig, AutoTrainer, ResultsPlotter
from autotransformers.default_param_spaces import hp_space_base

## Configure the dataset

The next step is to define the fixed train args, which will be the `transformers.TrainingArguments` passed to `transformers.Trainer` inside `autotransformers.AutoTrainer`. For a full list of arguments check [TrainingArguments documentation](https://huggingface.co/docs/transformers/en/main_classes/trainer#transformers.TrainingArguments). `DatasetConfig` expects these arguments in dictionary format.

To save time, we set `max_steps` to 1, which overrides `num_train_epochs`; in a real setting we would need to define these arguments differently. However, that is out of scope for this tutorial. To learn how to work with Transformers and how to configure the training arguments, please see the Hugging Face NLP Course.

In [None]:
fixed_train_args = {
    "evaluation_strategy": "steps",
    "num_train_epochs": 10,
    "do_train": True,
    "do_eval": True,
    "logging_strategy": "steps",
    "eval_steps": 1,
    "save_steps": 1,
    "logging_steps": 1,
    "save_strategy": "steps",
    "save_total_limit": 2,
    "seed": 69,
    "fp16": False,
    "load_best_model_at_end": True,
    "per_device_eval_batch_size": 16,
    "max_steps": 1,
}
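With `load_best_model_at_end` enabled, `transformers` expects the evaluation and save strategies to match and, for step-based strategies, `save_steps` to be a multiple of `eval_steps`. The sketch below checks a training-args dictionary for that before any training starts. `check_checkpoint_alignment` and `example_args` are hypothetical names written for this illustration; they are not part of `transformers` or `autotransformers`.

```python
def check_checkpoint_alignment(args: dict) -> list:
    """Return a list of problems found in a training-args dictionary."""
    problems = []
    if not args.get("load_best_model_at_end", False):
        # Nothing to check when the best checkpoint is not reloaded.
        return problems

    # Newer transformers releases name this key "eval_strategy";
    # older ones use "evaluation_strategy", as in this notebook.
    eval_strategy = args.get("eval_strategy", args.get("evaluation_strategy", "no"))
    save_strategy = args.get("save_strategy", "steps")

    if eval_strategy != save_strategy:
        problems.append(
            f"eval strategy ({eval_strategy!r}) and save strategy "
            f"({save_strategy!r}) should match"
        )
    elif eval_strategy == "steps":
        eval_steps = args.get("eval_steps")
        save_steps = args.get("save_steps")
        if eval_steps and save_steps and save_steps % eval_steps != 0:
            problems.append(
                f"save_steps ({save_steps}) should be a multiple of "
                f"eval_steps ({eval_steps})"
            )
    return problems


# A reduced copy of the arguments used in this notebook (illustration only).
example_args = {
    "evaluation_strategy": "steps",
    "save_strategy": "steps",
    "eval_steps": 1,
    "save_steps": 1,
    "load_best_model_at_end": True,
}
print(check_checkpoint_alignment(example_args))
```

An empty list means the dictionary passes these two checks; it does not validate the remaining arguments.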

Next, we define the parameters that are common to every classification dataset we may want to train on. We call this dictionary `default_args_dataset`. It configures the direction of the optimization process (for hyperparameter search), the metric to optimize, whether to retrain models at the end, and the `fixed_train_args`, which are common to all datasets.

In [None]:
default_args_dataset = {
    "seed": 44,
    "direction_optimize": "maximize",
    "metric_optimize": "eval_f1-score",
    "retrain_at_end": False,
    "fixed_training_args": fixed_train_args,
}
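`direction_optimize` and `metric_optimize` together determine which trial of the hyperparameter search counts as the best one. The following is a rough sketch of that selection idea only, not the library's actual implementation. `pick_best_trial` is a hypothetical helper and the metric values are made up for illustration.

```python
def pick_best_trial(trials, metric, direction):
    """Pick the trial with the best value of `metric` for the given direction."""
    if direction not in ("maximize", "minimize"):
        raise ValueError(f"Unknown direction: {direction!r}")
    chooser = max if direction == "maximize" else min
    return chooser(trials, key=lambda trial: trial[metric])


# Made-up trial results, for illustration only.
example_trials = [
    {"trial": 0, "eval_f1-score": 0.61, "eval_loss": 0.95},
    {"trial": 1, "eval_f1-score": 0.68, "eval_loss": 0.88},
    {"trial": 2, "eval_f1-score": 0.64, "eval_loss": 0.80},
]

best_by_f1 = pick_best_trial(example_trials, "eval_f1-score", "maximize")
best_by_loss = pick_best_trial(example_trials, "eval_loss", "minimize")
print(best_by_f1["trial"], best_by_loss["trial"])
```

A score such as F1 pairs with `"maximize"`, whereas a loss pairs with `"minimize"`; mixing them up would select the worst trial instead of the best.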

Now we create a configuration that contains all of the above plus some dataset-specific parameters. We call it `tweet_eval_config`, as it will become the `DatasetConfig` of the TweetEval dataset. It defines:

- the dataset name;
- the alias (in this case the same as the dataset name);
- the task (classification in this case);
- the text field, which is the field in the dataset that contains the source texts;
- the label column, which is the field in the dataset that contains the labels;
- `hf_load_kwargs`, a dictionary with the keyword arguments needed to load the dataset with Hugging Face `datasets` (`load_dataset(**hf_load_kwargs)`).

In [None]:
tweet_eval_config = default_args_dataset.copy()
tweet_eval_config.update(
    {
        "dataset_name": "tweeteval",
        "alias": "tweeteval",
        "task": "classification",
        "text_field": "text",
        "label_col": "label",
        "hf_load_kwargs": {"path": "tweet_eval", "name": "emotion"}
    }
)
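Note that `dict.copy()` makes a shallow copy, so every configuration derived from `default_args_dataset` this way shares the same nested `fixed_training_args` dictionary. That is harmless in this notebook, but if one dataset later needs different training arguments, a deep copy avoids silently changing the others. A minimal illustration with stand-in dictionaries (the names `defaults`, `shallow`, and `deep` are illustrative only):

```python
import copy

defaults = {"seed": 44, "fixed_training_args": {"max_steps": 1}}

shallow = defaults.copy()
deep = copy.deepcopy(defaults)

# The shallow copy shares the nested dictionary with the original,
# so this change is visible through `defaults` as well.
shallow["fixed_training_args"]["max_steps"] = 100
print(defaults["fixed_training_args"]["max_steps"])  # 100

# The deep copy owns its nested dictionary, so `defaults` is unaffected.
deep["fixed_training_args"]["max_steps"] = 500
print(defaults["fixed_training_args"]["max_steps"])  # 100
```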

When we have the full dataset configuration ready, we can easily create the `DatasetConfig`:

In [None]:
tweet_eval_config = DatasetConfig(**tweet_eval_config)

## Configure Models

Once the dataset configuration is ready, we only need to configure the models. In this case, we are going to use BERTIN, which is a Spanish RoBERTa model. We use the default `hp_space_base` as the hyperparameter space, and set the number of trials to 1. In a real setting, please set the number of trials to at least 20 to obtain reliable results; with too few trials, the hyperparameter search adds little value.

Additionally, we create two more model configs, to show how we can later compare the results of several models.

**Note that we are using Spanish models for an English task. This does not matter here, because this notebook is for learning purposes only and we are not trying to train models that perform well on this task. However, please make sure you choose models that fit your task when using `autotransformers` in real projects.**

In [None]:
bertin_config = ModelConfig(
    name="bertin-project/bertin-roberta-base-spanish",
    save_name="bertin",
    hp_space=hp_space_base,
    n_trials=1,
)
beto_config = ModelConfig(
    name="dccuchile/bert-base-spanish-wwm-cased",
    save_name="beto",
    hp_space=hp_space_base,
    n_trials=1,
)
albert_config = ModelConfig(
    name="CenIA/albert-tiny-spanish",
    save_name="albert",
    hp_space=hp_space_base,
    n_trials=1,
)
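To give an idea of what a hyperparameter space can look like, here is a minimal sketch. It assumes that the space is a function receiving an Optuna-style trial object and returning a dictionary of training arguments, which is the convention used by hyperparameter search in `transformers`. The function `custom_hp_space`, its ranges, and the `RandomTrial` stand-in are all hypothetical; they are not the contents of `hp_space_base`. The stand-in only mimics two Optuna trial methods so that the sketch runs without Optuna installed.

```python
import math
import random


class RandomTrial:
    """Tiny stand-in for an Optuna trial (illustration only)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def suggest_float(self, name, low, high, log=False):
        if log:
            # Sample uniformly in log space, as a log-scaled range would.
            return math.exp(self._rng.uniform(math.log(low), math.log(high)))
        return self._rng.uniform(low, high)

    def suggest_categorical(self, name, choices):
        return self._rng.choice(choices)


def custom_hp_space(trial):
    """Hypothetical search space; the ranges are illustrative, not library defaults."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 7e-5, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16]
        ),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
    }


sampled = custom_hp_space(RandomTrial(seed=0))
print(sampled)
```

Under the assumption above, such a function could be passed as `hp_space` in place of `hp_space_base`; check the `autotransformers` documentation before relying on this.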

## Create AutoTrainer

We can now create `AutoTrainer`. For that, we will use the model configs and the dataset config we have just created. We will additionally define a metrics dir, where metrics will be saved after training.

In [None]:
autotrainer = AutoTrainer(
    model_configs=[bertin_config, beto_config, albert_config],
    dataset_configs=[tweet_eval_config],
    metrics_dir="tweeteval_metrics",
)

## Train!

Now we can train the selected models on the TweetEval dataset by calling the `AutoTrainer` object. We also print the results, which are expected to be poor here because we trained for a single step.

In [None]:
results = autotrainer()
print(results)

## Plot the Results

Once the models have trained, we might want to see a comparison of their performance. `ResultsPlotter` can be helpful in this respect, as we see in the next cells. It automatically adds an average row, which shows the average performance of the models over the different datasets. This is useful when comparing multiple models over multiple datasets.

In [None]:
plotter = ResultsPlotter(
    metrics_dir=autotrainer.metrics_dir,
    model_names=[model_config.save_name for model_config in autotrainer.model_configs],
    dataset_to_task_map={
        dataset_config.alias: dataset_config.task
        for dataset_config in autotrainer.dataset_configs
    },
)

In [None]:
ax = plotter.plot_metrics()
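The average row can be pictured as a plain mean over datasets for each model. The sketch below uses made-up scores and a hypothetical `add_average` helper; it illustrates the idea only, and `ResultsPlotter` may compute its average differently.

```python
from statistics import mean

# Made-up scores for illustration only; these are NOT results from this notebook.
example_scores = {
    "model_a": {"dataset_1": 0.80, "dataset_2": 0.60},
    "model_b": {"dataset_1": 0.70, "dataset_2": 0.90},
}


def add_average(scores):
    """Return a copy of `scores` with an extra "average" entry per model."""
    return {
        model: {**per_dataset, "average": mean(per_dataset.values())}
        for model, per_dataset in scores.items()
    }


with_average = add_average(example_scores)
for model, per_dataset in with_average.items():
    print(model, round(per_dataset["average"], 3))
```

With a single dataset, as in this notebook, the average simply equals that dataset's score; it becomes informative once several datasets are configured.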