# Working with Transformer Models: HuggingFace Library

*Machine Learning Foundations 3, Prof. Dr. M. Kurpicz-Briki, Bachelor in Data Engineering, Bern University of Applied Sciences*

This notebook contains selected examples, adapted and commented. For the full tutorial, see: [https://huggingface.co/learn/nlp-course/chapter1/3](https://huggingface.co/learn/nlp-course/chapter1/3)

## Sentiment Analysis Pipeline

Sentiment analysis is a classic problem in Natural Language Processing. The HuggingFace library supports it out of the box.

In [None]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis") # uses default model, as model is not defined.
classifier("I've been waiting for a HuggingFace course all my life.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.

[{'label': 'POSITIVE', 'score': 0.9771266579627991}]
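
The warning in the output above advises against relying on the default model. Below is a minimal sketch of pinning both the checkpoint and its revision, using the model name and revision hash that the warning itself reports. The helper name `build_pinned_classifier` is made up for illustration, and the import is deferred so that the constants can be inspected without loading the library.

```python
# Model name and revision copied from the warning printed by the default pipeline.
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
REVISION = "714eb0f"


def build_pinned_classifier():
    """Hypothetical helper: build the sentiment pipeline with an explicit model and revision."""
    from transformers import pipeline  # imported lazily; the model is downloaded on first use

    return pipeline("sentiment-analysis", model=MODEL_ID, revision=REVISION)
```

Calling `build_pinned_classifier()` should return a pipeline that can be used in the same way as `classifier` above, with the difference that a later change of the library default no longer affects the results.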

In [None]:
from transformers import pipeline

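sentiment_analyzer = pipeline("sentiment-analysis", model="distilbert-base-uncased") # Distilled version of the BERT model
# A distilled model is a smaller, faster version of a larger model, trained to mimic its behavior while retaining most of its performance.
# Note: this checkpoint is not fine-tuned for sentiment analysis, so its prediction is not meaningful (see the warning in the output).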

text = "I love using Hugging Face Transformers! It's amazing."

sentiment = sentiment_analyzer(text)
print(sentiment)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



[{'label': 'LABEL_1', 'score': 0.502373993396759}]
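
A score near 0.50 with the generic label `LABEL_1` is consistent with the warning above: the classification head of `distilbert-base-uncased` is newly initialized and has not been trained for sentiment. As a rough illustration, a softmax over two nearly equal logits gives almost a coin flip. The logit values below are invented for this sketch and are not taken from the model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = [x - max(logits) for x in logits]  # subtract the max for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Invented logits, assumed here to resemble an untrained head: almost identical values.
untrained_logits = [0.00, 0.01]
# Invented logits, assumed here to resemble a fine-tuned head: clearly separated values.
finetuned_logits = [-2.0, 2.0]

print(softmax(untrained_logits))  # both probabilities are close to 0.5
print(softmax(finetuned_logits))  # the second class clearly dominates
```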


## Zero-Shot Classification

Zero-shot classification uses a pre-trained language model to assign inputs to categories without any task-specific training examples. Alternatively, the model can be fine-tuned on labeled data for the classification task, for example to obtain more domain-specific results.

In [None]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.



{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445994257926941, 0.11197380721569061, 0.04342673346400261]}
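
The default model here, `facebook/bart-large-mnli`, is a natural language inference (NLI) model. Roughly, each candidate label is turned into a hypothesis such as "This example is education." and the model scores how strongly the input entails it; the scores are then normalized across the labels. The sketch below illustrates only that last step, with invented entailment logits standing in for real model outputs. The function name `rank_labels` is hypothetical.

```python
import math


def rank_labels(sequence, entailment_logits, template="This example is {}."):
    """Rank candidate labels by a softmax over their (given) entailment logits."""
    labels = list(entailment_logits)
    # One hypothesis per label; a real NLI model would score (sequence, hypothesis) pairs.
    hypotheses = [template.format(label) for label in labels]
    exps = [math.exp(entailment_logits[label]) for label in labels]
    total = sum(exps)
    scored = sorted(
        zip(labels, (e / total for e in exps)), key=lambda pair: pair[1], reverse=True
    )
    return {
        "sequence": sequence,
        "hypotheses": hypotheses,  # extra field, included for illustration only
        "labels": [label for label, _ in scored],
        "scores": [score for _, score in scored],
    }


# Invented logits for illustration only; they are not produced by any model.
result = rank_labels(
    "This is a course about the Transformers library",
    {"education": 2.0, "politics": -1.0, "business": 0.0},
)
print(result["hypotheses"])
print(result["labels"])  # labels ordered from most to least likely
```

Because the scores are normalized across the candidate labels, they sum to 1, which matches the three scores in the output above.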

## Text Generation

One example of text generation is the auto-completion of a sentence.

In [None]:
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2") # smaller, distilled version of the GPT-2 model
generator(
    "In this course, we will teach you how to",
    max_length=30,  # in this run overridden by max_new_tokens=256 (see the warning in the output)
    num_return_sequences=2,
)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'In this course, we will teach you how to do some basic and effective actions to help you become more and more comfortable with being a better person.”\n\n\n\nThis course will give you some insights into a variety of different things that you can do to help you become better. This course will teach you how to do some basic and effective actions to help you become more and more comfortable with being a better person.\nIn part I will explain how to do some basic and effective actions to help you become more and more comfortable with being a better person. I will explain how to do some basic and effective actions to help you become more and more comfortable with being a better person.'},
 {'generated_text': 'In this course, we will teach you how to be a leader and a leader, so you will learn how to be so successful in your own organization.\n\n\n*I do not recommend working on an organization that is completely unproductive.\n*I recommend working with organizations that
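
Text generation is autoregressive: the model repeatedly predicts a next token and appends it to the input. The toy sketch below mimics this loop with a hand-written lookup table instead of a neural network, and shows the role of a `max_new_tokens`-style limit. The table and the function name `greedy_complete` are invented for illustration.

```python
# Invented "model": maps the last word to candidate next words with probabilities.
NEXT_WORD = {
    "how": {"to": 0.9, "we": 0.1},
    "to": {"build": 0.6, "be": 0.4},
    "build": {"models": 0.7, "apps": 0.3},
    "models": {".": 1.0},
}


def greedy_complete(prompt, max_new_tokens=5):
    """Extend the prompt word by word, always choosing the most likely next word."""
    words = prompt.split()
    for _ in range(max_new_tokens):
        candidates = NEXT_WORD.get(words[-1])
        if not candidates:  # no known continuation: stop early
            break
        words.append(max(candidates, key=candidates.get))
    return " ".join(words)


print(greedy_complete("we will teach you how"))
print(greedy_complete("we will teach you how", max_new_tokens=2))
```

Always taking the most likely word is only one decoding strategy. Requesting several different sequences, as in the cell above, typically relies on sampling from the predicted distribution instead.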

## Text Summarization with the T5 Model

To learn more about the T5 model, refer to https://huggingface.co/docs/transformers/model_doc/t5

In [None]:
from transformers import pipeline

# Load T5 model for summarization
summarizer = pipeline("summarization", model="t5-small")

# Input text to summarize
text = """
Artificial intelligence (AI) refers to the simulation of human intelligence in machines
that are programmed to think and act like humans. The term may also be applied to any
machine that exhibits traits associated with a human mind such as learning and problem-solving.
"""

# Generate a summary
summary = summarizer(text, max_length=50, min_length=25, do_sample=False)
print(summary[0]['summary_text'])
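
T5 treats every task as text-to-text and distinguishes tasks by a short prefix in the input, such as `summarize: ` or `translate English to German: `. The pipeline above is expected to take care of this for the summarization task; the sketch below only illustrates what such a prefixed input looks like. The helper `with_task_prefix` and the dictionary keys are hypothetical.

```python
# Task prefixes in the style used by T5; the helper itself is a hypothetical illustration.
T5_PREFIXES = {
    "summarization": "summarize: ",
    "translation_en_to_de": "translate English to German: ",
}


def with_task_prefix(task, text):
    """Prepend the T5-style task prefix and collapse whitespace in the input text."""
    return T5_PREFIXES[task] + " ".join(text.split())


text = """
Artificial intelligence (AI) refers to the simulation of human intelligence in machines
that are programmed to think and act like humans.
"""
prefixed = with_task_prefix("summarization", text)
print(prefixed)
```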

## Mask Filling

Different models can be used to predict a missing word in a sentence. With `top_k=2`, the pipeline returns the two most likely words for the masked position.

In [None]:
from transformers import pipeline

# Example 1
unmasker = pipeline("fill-mask") # defaults to distilbert/distilroberta-base
unmasker("This course will teach you all about <mask> models.", top_k=2)

In [None]:
# Example 2
unmasker = pipeline("fill-mask", model="bert-base-uncased")
# Not all models can be used for all tasks, e.g., T5 cannot be used for fill-mask.
# Masking token can be different, e.g. [MASK] for BERT.
unmasker("This course will teach you all about [MASK] models.", top_k=2)
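
Because the mask token differs between models, a prompt written for one model does not work unchanged with another. One way to avoid hard-coding the token is to keep a placeholder in the prompt and substitute it per model; with a loaded pipeline the token is available as `unmasker.tokenizer.mask_token`. The sketch below uses a small hand-written table instead, so that it runs without downloading anything. The `{mask}` placeholder and the helper name `build_prompt` are assumptions for illustration.

```python
# Mask tokens of the two models used in the examples above.
MASK_TOKENS = {
    "distilbert/distilroberta-base": "<mask>",
    "bert-base-uncased": "[MASK]",
}


def build_prompt(template, model_name):
    """Replace the {mask} placeholder with the mask token of the given model."""
    return template.format(mask=MASK_TOKENS[model_name])


template = "This course will teach you all about {mask} models."
for name in MASK_TOKENS:
    print(name, "->", build_prompt(template, name))
```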

## Question Answering
Transformer models can be used for question answering in several ways. Closed questions can be answered with the zero-shot approach seen before. Another option is the question-answering pipeline, where a context is provided alongside the question to supply specific information in addition to what the model itself has learned. Chain of Thought prompting can also help, but it typically works only with larger models.

In [None]:
from transformers import pipeline

# Example 1: Answering Questions with Context
question_answerer = pipeline("question-answering") # defaults to distilbert/distilbert-base-cased-distilled-squad
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)

In [None]:
from transformers import pipeline

# Example 2: Compare different models
qa_model1 = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
qa_model2 = pipeline("question-answering", model="bert-large-uncased-whole-word-masking-finetuned-squad")

context = """
The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.
It is one of the most recognizable structures in the world and a global cultural icon of France.
The tower was completed in 1889 and named after the engineer Gustave Eiffel, whose company designed and built the tower.
"""

question = "Where is the Eiffel Tower located?"

answer1 = qa_model1(question=question, context=context)
answer2 = qa_model2(question=question, context=context)

print("Answer from DistilBERT (qa_model1):")
print(f"Answer: {answer1['answer']}, Confidence: {answer1['score']:.2f}\n")

print("Answer from BERT (qa_model2):")
print(f"Answer: {answer2['answer']}, Confidence: {answer2['score']:.2f}")
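
The question-answering models used here are extractive: they do not generate new text but select a span of the context. Besides `answer` and `score`, the pipeline result also contains the character offsets `start` and `end`. The sketch below uses a hand-built result dictionary to show how the answer relates to the context. The answer span and the score are invented and are not the output of either model.

```python
context = (
    "The Eiffel Tower is a wrought-iron lattice tower "
    "on the Champ de Mars in Paris, France."
)

# Hand-built stand-in for a pipeline result; the span and the score are invented.
answer_text = "Champ de Mars in Paris, France"
start = context.index(answer_text)
fake_result = {
    "answer": answer_text,
    "score": 0.5,
    "start": start,
    "end": start + len(answer_text),
}

# For an extractive model, the answer is exactly this slice of the context.
span = context[fake_result["start"]:fake_result["end"]]
print(span)
print(span == fake_result["answer"])
```

This also explains why such a model cannot answer a question whose answer does not appear literally in the context.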


In [None]:
# Example 3: Chain of Thought Prompting
from transformers import pipeline

reasoning_model = pipeline("text2text-generation", model="google/flan-t5-large") # Chain of Thought prompting typically works only with larger models

question = "John has 3 apples. Sarah gives him 2 more. Then he eats 1 apple. How many apples does he have now?"
cot_prompt = f"Answer the following question step by step:\n{question}"

response = reasoning_model(cot_prompt, max_length=100, do_sample=False)
print(response[0]['generated_text'])
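
The expected reasoning for this question is short enough to write down, which makes it easy to check the model's output by hand: 3 + 2 = 5 apples after Sarah's gift, and 5 - 1 = 4 after John eats one. The same steps as a small sketch in code:

```python
# Each step mirrors one sentence of the question.
apples = 3   # John has 3 apples
apples += 2  # Sarah gives him 2 more
apples -= 1  # he eats 1 apple
expected_answer = apples

print(f"Expected final answer: {expected_answer}")
```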

## Beware of Bias in Transformer Models
As seen before for word embeddings, societal stereotypes can also be reflected in transformer-based models. Sometimes they are directly visible; sometimes they influence the results implicitly.

In [None]:
from transformers import pipeline

# Bias in Machine Translation

translatorDE = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
translatorFR = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")


# Translate English to French
en_to_fr = translatorFR("Here is the professor, there is the secretary.", src_lang="en", tgt_lang="fr")
print(en_to_fr[0]["translation_text"])

# Translate English to German
en_to_de = translatorDE("Here is the professor, there is the secretary.", src_lang="en", tgt_lang="de")
print(en_to_de[0]["translation_text"])


Voici le professeur, il y a la secrétaire.
Hier ist der Professor, da ist die Sekretärin.

The English sentence does not specify any gender, yet both translations render "secretary" in the feminine form (la secrétaire, die Sekretärin) and "professor" in the masculine form (le professeur, der Professor).


In [None]:
from transformers import pipeline

# Bias in Filling Mask

unmasker = pipeline("fill-mask", model="bert-base-uncased")
result = unmasker("This man works as a [MASK].")
print([r["token_str"] for r in result])

result = unmasker("This woman works as a [MASK].")
print([r["token_str"] for r in result])

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).



['carpenter', 'lawyer', 'farmer', 'businessman', 'doctor']
['nurse', 'maid', 'teacher', 'waitress', 'prostitute']
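
The two lists above can be compared directly. This sketch copies the printed predictions into Python lists and checks whether any occupation is suggested for both sentences:

```python
# Predictions copied from the output above.
man_jobs = ["carpenter", "lawyer", "farmer", "businessman", "doctor"]
woman_jobs = ["nurse", "maid", "teacher", "waitress", "prostitute"]

shared = sorted(set(man_jobs) & set(woman_jobs))
print("Suggested for both:", shared)
print("Only for 'man':", sorted(set(man_jobs) - set(woman_jobs)))
print("Only for 'woman':", sorted(set(woman_jobs) - set(man_jobs)))
```

In this run the top five predictions share no occupation at all, although the two prompts differ by a single word.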
