# How to use Transformers?
* Install the Transformers, Datasets, and Evaluate libraries to run this notebook.

In [None]:
# Skip this cell if you have already installed the requirements from requirements.txt in your virtual environment
# %pip install datasets evaluate transformers[sentencepiece] torch # VSCode
!pip install datasets evaluate transformers[sentencepiece] torch # Colab

In [1]:
import torch
print(torch.__version__)
print(torch.cuda.is_available())  # Should return True if GPU is available

2.6.0+cpu
False


### 1. Sentiment analysis pipeline

In [None]:

"""
If you omit the `model` argument, the pipeline selects a default pretrained model that has been fine-tuned
for sentiment analysis in English. Here we pass a model name explicitly instead.
The model is downloaded and cached when you create the classifier object.
If you rerun the command, the cached model is used and nothing needs to be downloaded again.
"""

"""
There are three main steps involved when you pass some text to a pipeline:

1. The text is preprocessed into a format the model can understand.
2. The preprocessed inputs are passed to the model.
3. The predictions of the model are post-processed, so you can make sense of them.
"""

from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="cardiffnlp/twitter-roberta-base-sentiment-latest") # Try other models from the Models section of the Hugging Face Hub
classifier("My name is Pejman, and I'm excited to share this notebook, which explores the fascinating topics of Transformers and the Evolution of AI!")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


[{'label': 'positive', 'score': 0.9863893389701843}]
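
The third step listed above, post-processing, can be sketched in plain Python. This is a minimal illustration rather than the pipeline's actual implementation: the logits are made-up numbers, and the label order (`negative`, `neutral`, `positive`) is an assumption for the example. With a real model, the mapping would be read from `model.config.id2label`.

```python
import math

# Hypothetical raw model outputs (logits) for one sentence; real values come from the model.
logits = [-1.2, 0.3, 3.1]
# Assumed label order, for illustration only.
labels = ["negative", "neutral", "positive"]


def softmax(values):
    """Convert logits into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the highest probability
prediction = {"label": labels[best], "score": probs[best]}
print(prediction)
```

The resulting dictionary has the same shape as the pipeline output shown above: one label and its probability.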

In [None]:
# from transformers import pipeline # Not necessary in every cell

# The pipeline also accepts a list of texts and returns one prediction per text.
classifier(
    ["I love learning.", 
    "University of Liechtenstein supports innovation.", 
    "AI is constantly evolving.", 
    "I do not enjoy procrastination."]
)


[{'label': 'positive', 'score': 0.969894289970398},
 {'label': 'positive', 'score': 0.840086817741394},
 {'label': 'positive', 'score': 0.5534008145332336},
 {'label': 'negative', 'score': 0.9037972092628479}]
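
Because the pipeline returns one prediction per input, in the same order, the results can be paired back with the texts. The sketch below reuses the output shown above and flags low-confidence predictions; the 0.6 threshold is an arbitrary value chosen for illustration.

```python
texts = [
    "I love learning.",
    "University of Liechtenstein supports innovation.",
    "AI is constantly evolving.",
    "I do not enjoy procrastination.",
]

# Predictions copied from the cell output above.
predictions = [
    {"label": "positive", "score": 0.969894289970398},
    {"label": "positive", "score": 0.840086817741394},
    {"label": "positive", "score": 0.5534008145332336},
    {"label": "negative", "score": 0.9037972092628479},
]

THRESHOLD = 0.6  # arbitrary cutoff, for illustration only

for text, pred in zip(texts, predictions):
    print(f"{pred['label']:<9} {pred['score']:.3f}  {text}")

# Texts whose top label the model is not very sure about.
uncertain = [text for text, pred in zip(texts, predictions) if pred["score"] < THRESHOLD]
print(uncertain)
```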

### 2. Zero-shot classification

> Next, we tackle a more challenging task: classifying texts that haven’t been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the zero-shot-classification pipeline is very powerful: it lets you specify which labels to use for the classification, so you don’t have to rely on the labels of the pretrained model. You’ve already seen how a model can classify a sentence as positive or negative, but this pipeline can classify text using any other set of labels you like.

In [None]:
# from transformers import pipeline # Not necessary in every cell

"""
This pipeline is called zero-shot because you don’t need to fine-tune the model on your data to use it.
It can directly return probability scores for any list of labels you want.
"""

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library.",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Device set to use cpu


{'sequence': 'This is a course about the Transformers library.',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8719875812530518, 0.09406551718711853, 0.0339469276368618]}
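
The result is a dictionary in which `labels` and `scores` are parallel lists, ordered from most to least likely. A small sketch of how to read it, reusing the output above:

```python
# Result copied from the cell output above.
result = {
    "sequence": "This is a course about the Transformers library.",
    "labels": ["education", "business", "politics"],
    "scores": [0.8719875812530518, 0.09406551718711853, 0.0339469276368618],
}

# Pair each candidate label with its score.
scores_by_label = dict(zip(result["labels"], result["scores"]))

# The first entry is the most likely label.
top_label = result["labels"][0]
print(top_label, round(scores_by_label[top_label], 3))

# In this output the scores form a probability distribution over the candidate labels.
print(round(sum(result["scores"]), 3))
```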

### 3. Text generation

* This is similar to the predictive text feature that is found on many phones. Text generation involves randomness, so it’s normal if you don’t get the same results as shown below.

In [None]:
# from transformers import pipeline # Not necessary in every cell

generator = pipeline("text-generation")
generator("Students of the University of Liechtenstein")

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Students of the University of Liechtenstein are invited to the meeting for further discussion.\n\nProud of the work we accomplished, the opportunity and support we were awarded, the gratitude we give to you, the generosity I feel, the gratitude'}]
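
The randomness mentioned above comes from sampling the next token from a probability distribution. The toy sketch below illustrates the idea with Python's `random` module; the candidate words and their probabilities are invented for the example, whereas a real model scores its whole vocabulary at every step.

```python
import random

# Hypothetical next-word candidates and probabilities, for illustration only.
candidates = ["are", "can", "will", "have"]
probabilities = [0.4, 0.3, 0.2, 0.1]


def sample_sequence(seed, length=5):
    """Draw `length` words at random, weighted by their probabilities."""
    rng = random.Random(seed)  # a dedicated generator makes the draws repeatable
    return [rng.choices(candidates, weights=probabilities, k=1)[0] for _ in range(length)]


run_a = sample_sequence(seed=42)
run_b = sample_sequence(seed=42)
print(run_a)
print(run_a == run_b)  # the same seed gives the same draws
```

With the real pipeline, calling `transformers.set_seed(...)` before generating serves a similar purpose when you need repeatable outputs.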

### Exercise: ✏️ Try it out! Use the `num_return_sequences` and `max_length` arguments to generate two sequences of at most 20 tokens each (`max_length` counts tokens, including the prompt, not words).

In [None]:
# Solution
# from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this deep learning course, we will learn how to",
    max_length=20,
    num_return_sequences=2,
)

Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this deep learning course, we will learn how to solve the neural networks to identify them in the'},
 {'generated_text': 'In this deep learning course, we will learn how to perform neural nets on people who have no brain'}]
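
Note that `max_length` is measured in tokens and includes the prompt, so it does not translate directly into a word count. The sketch below counts whitespace-separated words in the two outputs above; splitting on whitespace is only a rough proxy for the model's tokenization.

```python
prompt = "In this deep learning course, we will learn how to"

# Generated texts copied from the cell output above.
generated = [
    "In this deep learning course, we will learn how to solve the neural networks to identify them in the",
    "In this deep learning course, we will learn how to perform neural nets on people who have no brain",
]

prompt_words = len(prompt.split())
word_counts = [len(text.split()) for text in generated]

for total in word_counts:
    # Words in the full text versus words the model added after the prompt.
    print(f"total words: {total}, new words: {total - prompt_words}")
```

Each output contains fewer than 20 words because punctuation and sub-word pieces also count as tokens.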

### 4. Mask Filling
* The idea of this task is to fill in the blanks in a given text.

In [None]:
# from transformers import pipeline

"""
The top_k argument controls how many candidates are displayed.
Note that here the model fills in the special <mask> word, which is often referred to as a mask token.
Other mask-filling models might use a different mask token, so it’s always good to verify the proper mask word
when exploring other models.
One way to check it is to look at the mask word used in the inference widget on the model's Hub page.
"""

unmasker = pipeline("fill-mask")
unmasker("Pejman and Benjamin help students to <mask> about AI", top_k=2)

No model was supplied, defaulted to distilbert/distilroberta-base and revision fb53ab8 (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the check

[{'score': 0.8917235732078552,
  'token': 1532,
  'token_str': ' learn',
  'sequence': 'Pejman and Benjamin help students to learn about AI'},
 {'score': 0.03247198462486267,
  'token': 206,
  'token_str': ' think',
  'sequence': 'Pejman and Benjamin help students to think about AI'}]
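
Since the mask token differs between models, it helps to keep the sentence as a template and fill in the token per model. The sketch below hard-codes the two mask tokens that appear in this notebook (`<mask>` for `distilbert/distilroberta-base`, `[MASK]` for `bert-base-uncased`); the helper name `build_masked_prompt` is made up for this example. With a loaded pipeline, the token can instead be read from `unmasker.tokenizer.mask_token`.

```python
# Mask tokens for the two fill-mask models used in this notebook.
MASK_TOKENS = {
    "distilbert/distilroberta-base": "<mask>",
    "bert-base-uncased": "[MASK]",
}


def build_masked_prompt(template, model_name):
    """Insert the model-specific mask token into a template containing {mask}."""
    return template.format(mask=MASK_TOKENS[model_name])


template = "Pejman and Benjamin help students to {mask} about AI"
roberta_prompt = build_masked_prompt(template, "distilbert/distilroberta-base")
bert_prompt = build_masked_prompt(template, "bert-base-uncased")
print(roberta_prompt)
print(bert_prompt)
```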

### 5. Named entity recognition (NER)
* Named entity recognition (NER) is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations.

In [None]:
# from transformers import pipeline

"""
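Passing grouped_entities=True when creating the pipeline tells it to regroup the parts of the
sentence that correspond to the same entity. Recent Transformers releases deprecate this option
in favor of aggregation_strategy="simple", so a deprecation warning may appear.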
"""

ner = pipeline("ner", grouped_entities=True)
ner("My name is Pejman, an AI engineer, and I'm working at the University of Liechtenstein.")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT

[{'entity_group': 'PER',
  'score': np.float32(0.99953675),
  'word': 'Pejman',
  'start': 11,
  'end': 17},
 {'entity_group': 'ORG',
  'score': np.float32(0.96328616),
  'word': 'AI',
  'start': 22,
  'end': 24},
 {'entity_group': 'ORG',
  'score': np.float32(0.883817),
  'word': 'University of Liechtenstein',
  'start': 58,
  'end': 85}]
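
The grouped output is a list of dictionaries, one per entity. A short sketch of how it might be reorganized by entity type, reusing the output above with the scores written as plain floats:

```python
from collections import defaultdict

# Entities copied from the cell output above (np.float32 scores written as floats).
entities = [
    {"entity_group": "PER", "score": 0.99953675, "word": "Pejman", "start": 11, "end": 17},
    {"entity_group": "ORG", "score": 0.96328616, "word": "AI", "start": 22, "end": 24},
    {"entity_group": "ORG", "score": 0.883817, "word": "University of Liechtenstein", "start": 58, "end": 85},
]

# Collect the entity words under their entity type.
grouped = defaultdict(list)
for entity in entities:
    grouped[entity["entity_group"]].append(entity["word"])
by_type = dict(grouped)
print(by_type)
```

Here "AI" is tagged as an organization, which is arguably a model error, so it is worth inspecting the results rather than trusting high scores alone.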

### 6. Question answering
* The question-answering pipeline answers questions using information from a given context.


In [None]:
# from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="What is my job?",
    context="My name is Pejman, an AI engineer, and I'm working at the University of Liechtenstein.",
)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Device set to use cpu


{'score': 0.7282426953315735, 'start': 22, 'end': 33, 'answer': 'AI engineer'}
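
This pipeline extracts its answer from the context instead of generating new text: `start` and `end` are character offsets into the context string. The sketch below checks this against the output above.

```python
context = "My name is Pejman, an AI engineer, and I'm working at the University of Liechtenstein."

# Result copied from the cell output above.
result = {"score": 0.7282426953315735, "start": 22, "end": 33, "answer": "AI engineer"}

# Slicing the context with the returned offsets recovers the answer.
span = context[result["start"]:result["end"]]
print(span)
print(span == result["answer"])
```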

### 7. Summarization 
* Summarization is the task of reducing a text into a shorter text while keeping all (or most) of the important aspects referenced in the text.

In [29]:
# from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
Chapter 1: Introduction to AI
Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think, learn, and problem-solve. AI systems are designed to mimic human cognitive functions such as perception, decision-making, and reasoning. The technology has evolved rapidly over the years, from simple algorithms to complex neural networks. Today, AI plays a significant role in various industries, ranging from healthcare and finance to transportation and entertainment, improving efficiency, accuracy, and decision-making processes.

Chapter 2: Types of AI
AI can be categorized into two main types: narrow AI and general AI. Narrow AI, also known as weak AI, is designed to perform a specific task, such as language translation, image recognition, or voice assistants like Siri. In contrast, general AI, or strong AI, aims to replicate human intelligence across a wide range of tasks, enabling the machine to perform any intellectual task that a human can do. While narrow AI has already been widely implemented, general AI is still a theoretical concept and has not yet been realized.

Chapter 3: Ethical Considerations in AI
As AI continues to advance, ethical considerations become increasingly important. Issues such as privacy, bias, job displacement, and the potential for misuse of AI technology raise significant concerns. Ensuring that AI systems are developed and used responsibly is crucial for minimizing risks and promoting fairness. Governments, organizations, and researchers are actively working on establishing guidelines and frameworks to address these concerns and ensure that AI benefits society as a whole.


"""
)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Device set to use cpu


[{'summary_text': ' Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think, learn, and problem-solve . AI systems are designed to mimic human cognitive functions such as perception, decision-making, and reasoning . The technology has evolved rapidly over the years, from simple algorithms to complex neural networks .'}]
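
The summary above contains a leading space and spaces before the periods. If cleaner text is needed for display, a small post-processing step such as the following sketch could be applied; it uses the first two sentences of the summary above as input.

```python
import re

# First two sentences of the summary shown above, copied verbatim.
summary_excerpt = (
    " Artificial Intelligence (AI) refers to the simulation of human intelligence in machines "
    "that are programmed to think, learn, and problem-solve . AI systems are designed to mimic "
    "human cognitive functions such as perception, decision-making, and reasoning ."
)

# Remove whitespace that precedes punctuation, then trim both ends.
clean = re.sub(r"\s+([.,;:!?])", r"\1", summary_excerpt).strip()
print(clean)
```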

### 8. Translation

In [31]:
# from transformers import pipeline

"""
Model Description:
Developed by: Language Technology Research Group at the University of Helsinki
Model Type: Translation
Language(s):
Source Language: Chinese
Target Language: English
License: CC-BY-4.0

"""
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")
translator("我的名字是Pejman")

Device set to use cpu


[{'translation_text': 'My name is Pejman.'}]
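
The model name used above follows the pattern `Helsinki-NLP/opus-mt-{source}-{target}`. The helper below, whose name `opus_mt_model_id` is made up for this example, builds such a name from two language codes. A checkpoint is not published for every language pair, so verify on the Hugging Face Hub that a name exists before passing it to `pipeline`.

```python
def opus_mt_model_id(source_lang, target_lang):
    """Build a model name following the Helsinki-NLP OPUS-MT naming pattern."""
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"


model_id = opus_mt_model_id("zh", "en")
print(model_id)
```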

In [32]:
# Bias example
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
result = unmasker("This man works as a [MASK].")
print([r["token_str"] for r in result])

result = unmasker("This woman works as a [MASK].")
print([r["token_str"] for r in result])

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cpu


['carpenter', 'lawyer', 'farmer', 'businessman', 'doctor']
['nurse', 'maid', 'teacher', 'waitress', 'prostitute']
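
One simple way to quantify the difference between the two outputs above is to compare them as sets. The sketch below reuses the printed lists:

```python
# Lists copied from the cell output above.
man_jobs = ["carpenter", "lawyer", "farmer", "businessman", "doctor"]
woman_jobs = ["nurse", "maid", "teacher", "waitress", "prostitute"]

# Occupations the model suggests for both prompts.
shared = set(man_jobs) & set(woman_jobs)
print(shared)
print(len(shared))
```

The two top-5 lists have no occupation in common, which shows how strongly the model's predictions depend on the gendered word in the prompt.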
