[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pln-fing-udelar/retuyt-inco-huhu-2023/finetuning-robertuito/run_finetuning_robertuito.ipynb)

# Fine-tuning Robertuito for the HUHU Shared Task

This notebook shows **how to run** the fine-tuned versions of [pysentimiento/robertuito-base-uncased](https://huggingface.co/pysentimiento/robertuito-base-uncased) created for the HUHU Shared Task at IberLEF 2023. All models are available on [Hugging Face](https://huggingface.co/).

To follow the training process we used for these models, please refer to the notebook *train_finetuning_robertuito.ipynb*.

The Shared Task consisted of the following three subtasks, as described on [the official website](https://sites.google.com/view/huhuatiberlef23/huhu).

### Subtask 1: HUrtful HUmour Detection

The first subtask consists of determining whether a prejudicial tweet is intended to be humorous. Participants have to distinguish between tweets that express prejudice through humour and tweets that express prejudice without using humour.

### Subtask 2A: Prejudice Target Detection

Given the following minority groups: women and feminists, the LGBTIQ community, immigrants and racially discriminated people, and overweight people, participants are asked to identify the groups targeted in each tweet, as a multilabel classification task.

### Subtask 2B: Degree of Prejudice Prediction

The third subtask consists of predicting, on a continuous scale from 1 to 5, how prejudicial the message is on average among minority groups.

In [None]:
#@title Install transformers package

!pip install transformers

In [9]:
#@title Remove verbosity (optional)
from transformers.utils import logging

logging.set_verbosity(logging.CRITICAL)

## Subtask 1: HUrtful HUmour Detection

Model in Hugging Face: [pln-fing-udelar/robertuito-HUHU-task1](https://huggingface.co/pln-fing-udelar/robertuito-HUHU-task1)

In [13]:
#@title Run model
from transformers import pipeline

tweet = "El español es un idioma muy hablado en el mundo." #@param {"type":"string"}

classifier = pipeline(model="pln-fing-udelar/robertuito-HUHU-task1")
prediction = classifier(tweet)

print(f"Label: {prediction[0]['label']}, score: {prediction[0]['score']}")

Label: NON-HUMOROUS, score: 0.5713184475898743
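
A pipeline object can also be called with a list of strings, in which case it returns one prediction per input. The sketch below shows one way to pair each tweet with its label and score. The helper `classify_tweets` and the stand-in `fake_classifier` are hypothetical names introduced here, and the stand-in returns made-up values only so that the example runs without downloading a model; in the notebook, the `classifier` created above would be passed instead.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Dict[str, object]


def classify_tweets(
    classifier: Callable[[List[str]], List[Prediction]],
    tweets: List[str],
) -> List[Tuple[str, str, float]]:
    """Pair each tweet with the label and score of its prediction."""
    predictions = classifier(tweets)
    return [
        (tweet, str(pred["label"]), float(pred["score"]))
        for tweet, pred in zip(tweets, predictions)
    ]


def fake_classifier(texts: List[str]) -> List[Prediction]:
    # Stand-in with made-up values, shaped like the pipeline output
    # (one {"label": ..., "score": ...} dict per input text).
    return [{"label": "NON-HUMOROUS", "score": 0.5} for _ in texts]


rows = classify_tweets(fake_classifier, ["primer tweet", "segundo tweet"])
for tweet, label, score in rows:
    print(f"{tweet!r}: {label} ({score:.2f})")
```

Building the pipeline once and reusing it for all tweets avoids reloading the model for every input.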


## Subtask 2A: Prejudice Target Detection

Models in Hugging Face:

*   Prejudice-woman: [pln-fing-udelar/robertuito-HUHU-task2a-group1](https://huggingface.co/pln-fing-udelar/robertuito-HUHU-task2a-group1)
*   Prejudice-LGBTIQ: [pln-fing-udelar/robertuito-HUHU-task2a-group2](https://huggingface.co/pln-fing-udelar/robertuito-HUHU-task2a-group2)
*   Prejudice-inmigrant-race: [pln-fing-udelar/robertuito-HUHU-task2a-group3](https://huggingface.co/pln-fing-udelar/robertuito-HUHU-task2a-group3)
*   Prejudice-overweight: [pln-fing-udelar/robertuito-HUHU-task2a-group4](https://huggingface.co/pln-fing-udelar/robertuito-HUHU-task2a-group4)

In [11]:
#@title Run models
from transformers import pipeline

tweet = "El español es un idioma muy hablado en el mundo." #@param {"type":"string"}

# One binary classifier per target group (group1 to group4).
for group in range(1, 5):
  classifier = pipeline(model=f"pln-fing-udelar/robertuito-HUHU-task2a-group{group}")
  prediction = classifier(tweet)
  print(f"Label: {prediction[0]['label']}, score: {prediction[0]['score']}")

Label: NO-PREJUDICE-WOMAN, score: 0.993057370185852
Label: NO-PREJUDICE-LGBTIQ, score: 0.9904792904853821
Label: PREJUDICE-INMIGRANT-RACE, score: 0.9952902793884277
Label: NO-PREJUDICE-OVERWEIGHT, score: 0.9981707334518433
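
Since each of the four models gives a binary decision for one group, the multilabel answer for a tweet can be assembled by keeping only the positive labels. The following is a minimal sketch of that step, not necessarily how the submitted system combined its outputs. It assumes that a label starting with `NO-` means the group is not targeted, as the label names printed above suggest; `targeted_groups` is a hypothetical helper name.

```python
from typing import Dict, List

# Assumption: negative labels start with "NO-", as in the output above.
NEGATIVE_PREFIX = "NO-"


def targeted_groups(predictions: List[Dict[str, object]]) -> List[str]:
    """Return the positive labels among per-group binary predictions."""
    return [
        str(pred["label"])
        for pred in predictions
        if not str(pred["label"]).startswith(NEGATIVE_PREFIX)
    ]


# Top predictions copied from the output above (scores rounded).
example = [
    {"label": "NO-PREJUDICE-WOMAN", "score": 0.9931},
    {"label": "NO-PREJUDICE-LGBTIQ", "score": 0.9905},
    {"label": "PREJUDICE-INMIGRANT-RACE", "score": 0.9953},
    {"label": "NO-PREJUDICE-OVERWEIGHT", "score": 0.9982},
]

print(targeted_groups(example))  # ['PREJUDICE-INMIGRANT-RACE']
```

An empty list means that none of the four models flagged its group for the tweet.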


## Subtask 2B: Degree of Prejudice Prediction

Model in Hugging Face: [pln-fing-udelar/robertuito-HUHU-task2b](https://huggingface.co/pln-fing-udelar/robertuito-HUHU-task2b)

Since this is a regression task, the head used for transfer learning is a classification head with a single output label, and no activation function should be applied to its output value. In the transformers pipeline, this is specified by setting the parameter *function_to_apply* to "none".

In [15]:
#@title Run model
from transformers import pipeline

tweet = "El español es un idioma muy hablado en el mundo." #@param {"type":"string"}

classifier = pipeline(model="pln-fing-udelar/robertuito-HUHU-task2b", function_to_apply="none")
prediction = classifier(tweet)

print(f"Degree of prejudice: {prediction[0]['score']}")

Degree of prejudice: 3.222658634185791
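
To see why the raw output matters here, the sketch below applies a sigmoid to the value printed above. It assumes the behaviour described in the transformers pipeline documentation, where a sigmoid is the default for single-label models; the actual default may depend on the library version and the model configuration. The `sigmoid` function is a plain Python illustration, not a call into the library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, which maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


raw_output = 3.222658634185791  # value printed by the cell above
squashed = sigmoid(raw_output)

# The squashed value falls in (0, 1), so it no longer reads as a
# degree of prejudice on the 1 to 5 scale of the subtask.
print(f"raw: {raw_output:.3f}, after sigmoid: {squashed:.3f}")
```

With the sigmoid applied, the score would be about 0.96 instead of 3.22, which is why the raw output is used for this subtask.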
