# How to use fine-tuned models

Transformer models such as BERT and RoBERTa can be fine-tuned for downstream tasks. The Hugging Face Model Hub lists many such models, each trained for a specific task, which you can download and apply directly to NLP problems. Here we show two examples of fine-tuned models built on xlm-roberta. Because the underlying language model is cross-lingual, the fine-tuned models can also be applied to the roughly 100 languages that xlm-roberta covers.

## Sentiment

We search the Hugging Face Model Hub for a model fine-tuned for sentiment classification, e.g.:

https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment

In [4]:
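from transformers import pipeline
# Hub identifier of the fine-tuned sentiment model; the tokenizer is loaded from the same repository
model_path = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
# Build a sentiment classification pipeline (downloads the model on first use)
sentiment_task = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path)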

Some weights of XLMRobertaModel were not initialized from the model checkpoint at cardiffnlp/twitter-xlm-roberta-base-sentiment and are newly initialized: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [7]:
# English input
print(sentiment_task("What an awful movie!"))
# Dutch input ("What a worthless movie!"), handled by the same model
print(sentiment_task("Wat een waardeloze film!"))

[{'label': 'Negative', 'score': 0.9273000359535217}]
[{'label': 'Negative', 'score': 0.8501133322715759}]
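
The pipeline also accepts a list of texts and returns one `{'label', 'score'}` dict per input, which makes it easy to post-process predictions. The sketch below is one possible way to do that, not part of the original notebook: `label_texts` and `replay_classifier` are hypothetical helper names, and `replay_classifier` merely replays the scores printed above so that the example runs without downloading the model. In practice you would pass `sentiment_task` instead.

```python
from typing import Callable, Dict, List, Tuple


def label_texts(
    classify: Callable[[List[str]], List[Dict]],
    texts: List[str],
    min_score: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Pair each text with its predicted label, or "Uncertain" below min_score."""
    results = classify(texts)  # one {'label', 'score'} dict per input text
    labelled = []
    for text, result in zip(texts, results):
        label = result["label"] if result["score"] >= min_score else "Uncertain"
        labelled.append((text, label, round(result["score"], 3)))
    return labelled


def replay_classifier(texts: List[str]) -> List[Dict]:
    """Hypothetical stand-in for sentiment_task: replays the scores shown above."""
    recorded = {
        "What an awful movie!": {"label": "Negative", "score": 0.9273000359535217},
        "Wat een waardeloze film!": {"label": "Negative", "score": 0.8501133322715759},
    }
    return [recorded[text] for text in texts]


rows = label_texts(
    replay_classifier,
    ["What an awful movie!", "Wat een waardeloze film!"],
    min_score=0.9,  # assumed threshold, chosen only for illustration
)
for row in rows:
    print(row)
```

With the illustrative threshold of 0.9, the English sentence keeps its `Negative` label while the Dutch one is marked `Uncertain`, because its score of about 0.85 falls below the cut-off. A suitable threshold depends on the application.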


## Named Entity Recognition

The same approach works for named entity recognition. Here we use a fine-tuned model from the Model Hub that is also based on xlm-roberta:

https://huggingface.co/Davlan/xlm-roberta-base-ner-hrl

In [8]:
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
# Hub identifier of the fine-tuned NER model
model_name = "Davlan/xlm-roberta-base-ner-hrl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
# The "ner" pipeline returns one labelled entry per subword token that belongs to an entity
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "Nader Jokhadar had given Syria the lead with a well-struck header in the seventh minute."
ner_results = nlp(example)
print(ner_results)

[{'word': '▁Na', 'score': 0.9998415112495422, 'entity': 'B-PER', 'index': 1, 'start': 0, 'end': 2}, {'word': 'der', 'score': 0.880562961101532, 'entity': 'I-PER', 'index': 2, 'start': 2, 'end': 5}, {'word': '▁Jo', 'score': 0.9998159408569336, 'entity': 'I-PER', 'index': 3, 'start': 5, 'end': 8}, {'word': 'kha', 'score': 0.9998022317886353, 'entity': 'I-PER', 'index': 4, 'start': 8, 'end': 11}, {'word': 'dar', 'score': 0.9997529983520508, 'entity': 'I-PER', 'index': 5, 'start': 11, 'end': 14}, {'word': '▁Syria', 'score': 0.9996248483657837, 'entity': 'B-LOC', 'index': 8, 'start': 24, 'end': 30}]
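
The output above is per subword token: the name "Nader Jokhadar" is spread over five pieces tagged `B-PER` and `I-PER`. The following is a minimal sketch of how such pieces could be merged back into whole entities using the `start` and `end` character offsets. `merge_entities` is a hypothetical helper, not part of transformers, and the token list is copied from the output above (scores omitted). Recent versions of transformers also offer an `aggregation_strategy` argument on the `ner` pipeline that performs a similar grouping.

```python
from typing import Dict, List


def merge_entities(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge B-/I- tagged subword tokens into entity spans via character offsets."""
    entities: List[Dict] = []
    for tok in tokens:
        prefix, _, etype = tok["entity"].partition("-")  # "B-PER" -> ("B", "-", "PER")
        current = entities[-1] if entities else None
        continues = (
            prefix == "I"
            and current is not None
            and current["type"] == etype
            and tok["start"] == current["end"]  # piece is directly adjacent
        )
        if continues:
            current["end"] = tok["end"]
        else:
            entities.append({"type": etype, "start": tok["start"], "end": tok["end"]})
    for ent in entities:
        # Offsets in the output above include the leading space, so strip it
        ent["text"] = text[ent["start"]:ent["end"]].strip()
    return entities


text = "Nader Jokhadar had given Syria the lead with a well-struck header in the seventh minute."
# Entity tags and offsets copied from the pipeline output above
tokens = [
    {"word": "▁Na", "entity": "B-PER", "start": 0, "end": 2},
    {"word": "der", "entity": "I-PER", "start": 2, "end": 5},
    {"word": "▁Jo", "entity": "I-PER", "start": 5, "end": 8},
    {"word": "kha", "entity": "I-PER", "start": 8, "end": 11},
    {"word": "dar", "entity": "I-PER", "start": 11, "end": 14},
    {"word": "▁Syria", "entity": "B-LOC", "start": 24, "end": 30},
]

merged = merge_entities(text, tokens)
for ent in merged:
    print(ent["type"], ent["text"])
```

For this sentence the sketch yields one `PER` entity ("Nader Jokhadar") and one `LOC` entity ("Syria"). It assumes tokens arrive in sentence order, as they do in the pipeline output.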


In [9]:
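# Dutch example: "Mark Rutte announces that the VVD will tax tech companies such as Google, Facebook and Apple more heavily."
example = "Mark Rutte kondigt aan dat de VVD tech bedrijven zoals Google, Facebook en Apple zwaarder gaat belasten."
ner_results = nlp(example)
print(ner_results)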

[{'word': '▁Mark', 'score': 0.9998753070831299, 'entity': 'B-PER', 'index': 1, 'start': 0, 'end': 4}, {'word': '▁Rut', 'score': 0.9998551607131958, 'entity': 'I-PER', 'index': 2, 'start': 4, 'end': 8}, {'word': 'te', 'score': 0.9998762011528015, 'entity': 'I-PER', 'index': 3, 'start': 8, 'end': 10}, {'word': '▁VVD', 'score': 0.9991850256919861, 'entity': 'B-ORG', 'index': 9, 'start': 29, 'end': 33}, {'word': '▁Google', 'score': 0.999866247177124, 'entity': 'B-ORG', 'index': 13, 'start': 54, 'end': 61}, {'word': '▁Facebook', 'score': 0.9998590350151062, 'entity': 'B-ORG', 'index': 15, 'start': 62, 'end': 71}, {'word': '▁Apple', 'score': 0.999846875667572, 'entity': 'B-ORG', 'index': 17, 'start': 74, 'end': 80}]
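
Since every entity in this output starts with a `B-` tag, counting those tags gives a quick tally of entities per type. The snippet below is a small sketch under that assumption. The tag list is copied from the output above, reduced to the `word` and `entity` fields.

```python
from collections import Counter

# Words and entity tags copied from the pipeline output above
tokens = [
    {"word": "▁Mark", "entity": "B-PER"},
    {"word": "▁Rut", "entity": "I-PER"},
    {"word": "te", "entity": "I-PER"},
    {"word": "▁VVD", "entity": "B-ORG"},
    {"word": "▁Google", "entity": "B-ORG"},
    {"word": "▁Facebook", "entity": "B-ORG"},
    {"word": "▁Apple", "entity": "B-ORG"},
]

# Each "B-" tag marks the first piece of a new entity; "I-" pieces continue one
counts = Counter(
    tok["entity"][2:] for tok in tokens if tok["entity"].startswith("B-")
)
print(dict(counts))
```

Here the tally is one person and four organisations. If a model ever emits an `I-` tag without a preceding `B-` tag, that entity would be missed by this simple count.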
