# Transformers, what can they do?

In [13]:
from transformers import pipeline
import warnings
warnings.filterwarnings('ignore')

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f.
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9598049521446228}]
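The warning above recommends naming the model and revision explicitly. A minimal sketch of doing so, reusing the model name and revision hash printed in the log. The helper name `build_pinned_classifier` is hypothetical, and the import is deferred so the snippet can be loaded without downloading anything:

```python
# Model name and revision copied from the warning printed by the cell above.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_pinned_classifier():
    """Hypothetical helper: build the pipeline with an explicit model and revision."""
    # Deferred import: needs the transformers package and network access on first use.
    from transformers import pipeline

    return pipeline(PINNED["task"], model=PINNED["model"], revision=PINNED["revision"])
```

Pinning both values should keep results reproducible even if the library's default checkpoint for the task changes later.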

In [14]:
classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)

[{'label': 'POSITIVE', 'score': 0.9598049521446228},
 {'label': 'NEGATIVE', 'score': 0.9994558691978455}]

In [15]:
classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision d7645e1.
Using a pipeline without specifying a model name and revision in production is not recommended.


{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445994257926941, 0.11197380721569061, 0.04342673346400261]}
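The zero-shot result pairs each candidate label with a score at the same list position. As a small sketch of reading it programmatically (the dictionary is copied from the output above; `top_label` is a hypothetical helper name):

```python
# Result dictionary copied from the zero-shot output above.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445994257926941, 0.11197380721569061, 0.04342673346400261],
}


def top_label(res):
    """Return the (label, score) pair with the highest score."""
    return max(zip(res["labels"], res["scores"]), key=lambda pair: pair[1])


label, score = top_label(result)
print(f"{label}: {score:.3f}")  # education: 0.845
```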

ASSIGNMENT NR. 1

In [16]:
# Look for Assignment Nr.1

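generator = pipeline("text-generation")
# Note: `max_length` counts tokens (prompt included), not words. In this run the
# default `max_new_tokens` (256) took precedence, as the warning in the output
# reports, so the generated texts are much longer than 15 tokens.
results = generator(
    "In this course, we will teach you how to",
    max_length=15,
    num_return_sequences=2
)
# Print each generated sequence with a 1-based index.
for i, res in enumerate(results):
    print(f"Sentence {i+1}: {res['generated_text']}")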

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d.
Using a pipeline without specifying a model name and revision in production is not recommended.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=15) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Sentence 1: In this course, we will teach you how to find a place to live in the world and how to build your own community.
Sentence 2: In this course, we will teach you how to control your eyes. We will teach you how to look at objects. We will teach you how to touch your body and how to breathe. We will teach you how to talk to others. We will teach you how to think. We will teach you how to eat. We will teach you how to look at the moon and the stars. You will learn to control your brain. You will learn to control your body. We will teach you how to read your mind. You will learn to control your mind. We will teach you to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn 
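The assignment asks for 15 words, while `max_length` and `max_new_tokens` both count tokens, and the log shows `max_new_tokens` (=256) taking precedence here. One possible way to enforce a word limit afterwards is to trim the generated text on whitespace. This is only a sketch: `first_n_words` is a hypothetical helper, not part of the library, and the sentence is copied from the output above.

```python
def first_n_words(text, n):
    """Hypothetical helper: keep only the first n whitespace-separated words."""
    return " ".join(text.split()[:n])


# First generated sentence, copied from the output above.
sentence = (
    "In this course, we will teach you how to find a place to live "
    "in the world and how to build your own community."
)

short = first_n_words(sentence, 15)
print(short)
```

Passing `max_new_tokens` explicitly in the generator call would instead limit the number of generated tokens, which is not the same thing as a number of words.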

ASSIGNMENT NR. 2

In [17]:
# Look for Assignment Nr.2

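generator = pipeline("text-generation", model="distilgpt2")
# Store the new output. Without this assignment the loop re-prints the stale
# `results` from the previous cell, which is what the output shown for this cell
# reflects (it repeats the gpt2 sentences and predates this fix).
results = generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)
for res in results:
    print(res["generated_text"])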

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


In this course, we will teach you how to find a place to live in the world and how to build your own community.
In this course, we will teach you how to control your eyes. We will teach you how to look at objects. We will teach you how to touch your body and how to breathe. We will teach you how to talk to others. We will teach you how to think. We will teach you how to eat. We will teach you how to look at the moon and the stars. You will learn to control your brain. You will learn to control your body. We will teach you how to read your mind. You will learn to control your mind. We will teach you to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. You will learn to control your body. Yo

ASSIGNMENT NR. 3

In [18]:
# Look for Assignment Nr.3

unmasker = pipeline("fill-mask", model="bert-base-cased")
results = unmasker("This course will teach you all about [MASK] models.", top_k=2)
for res in results:
    print(f"Prediction: {res['sequence']} (score: {res['score']:.4f})")

BertForMaskedLM LOAD REPORT from: bert-base-cased
Key                         | Status
----------------------------+-----------
cls.seq_relationship.weight | UNEXPECTED
bert.pooler.dense.weight    | UNEXPECTED
bert.pooler.dense.bias      | UNEXPECTED
cls.seq_relationship.bias   | UNEXPECTED

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.


Prediction: This course will teach you all about role models. (score: 0.2596)
Prediction: This course will teach you all about the models. (score: 0.0943)


ASSIGNMENT NR. 4

In [19]:
# Part-of-speech (POS) tagging is a token-classification task where the model assigns a
# grammatical category such as noun, verb, or pronoun to each word of the input text.

# Look for Assignment Nr.4

pos_tagger = pipeline("token-classification", model="vblagoje/bert-english-uncased-finetuned-pos", aggregation_strategy="simple")
result = pos_tagger("My name is Sylvain and I work at Hugging Face in Brooklyn.")
for token in result:
    print(f"{token['word']}: {token['entity_group']}")

BertForTokenClassification LOAD REPORT from: vblagoje/bert-english-uncased-finetuned-pos
Key                          | Status
-----------------------------+-----------
bert.embeddings.position_ids | UNEXPECTED
bert.pooler.dense.weight     | UNEXPECTED
bert.pooler.dense.bias       | UNEXPECTED

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.


my: PRON
name: NOUN
is: AUX
sylvain: PROPN
and: CCONJ
i: PRON
work: VERB
at: ADP
hugging face: PROPN
in: ADP
brooklyn: PROPN
.: PUNCT
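To see what the tagger found at a glance, the word/tag pairs can be grouped by tag. A minimal sketch using the pairs printed above (copied by hand; `group_by_tag` is a hypothetical helper name):

```python
from collections import defaultdict

# (word, tag) pairs copied from the POS tagger output above.
tagged = [
    ("my", "PRON"), ("name", "NOUN"), ("is", "AUX"), ("sylvain", "PROPN"),
    ("and", "CCONJ"), ("i", "PRON"), ("work", "VERB"), ("at", "ADP"),
    ("hugging face", "PROPN"), ("in", "ADP"), ("brooklyn", "PROPN"), (".", "PUNCT"),
]


def group_by_tag(pairs):
    """Collect words under their part-of-speech tag, preserving input order."""
    groups = defaultdict(list)
    for word, tag in pairs:
        groups[tag].append(word)
    return dict(groups)


groups = group_by_tag(tagged)
print(groups["PROPN"])  # ['sylvain', 'hugging face', 'brooklyn']
```

The proper nouns (`PROPN`) line up with the person, organization, and location in the sentence, although a POS tagger does not say which is which.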


In [20]:
question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5.
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.6949766278266907, 'start': 33, 'end': 45, 'answer': 'Hugging Face'}
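The `start` and `end` values in the result are character offsets into the context, so the answer can be recovered by slicing. A short check using the values from the output above:

```python
# Context and prediction copied from the question-answering cell above.
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"
prediction = {
    "score": 0.6949766278266907,
    "start": 33,
    "end": 45,
    "answer": "Hugging Face",
}

# `end` is exclusive, so a plain slice returns the extracted answer span.
span = context[prediction["start"]:prediction["end"]]
print(span)  # Hugging Face
```

This illustrates that the question-answering pipeline extracts a span from the provided context and does not generate new text.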

In [21]:
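# Note: the original course cell uses the "summarization" task. With "text-generation"
# (GPT-2 by default, see the log in the output) the model continues the input text
# instead of summarizing it.
summarizer = pipeline("text-generation")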
summarizer(
    """
    America has changed dramatically during recent years. Not only has the number of
    graduates in traditional engineering disciplines such as mechanical, civil,
    electrical, chemical, and aeronautical engineering declined, but in most of
    the premier American universities engineering curricula now concentrate on
    and encourage largely the study of engineering science. As a result, there
    are declining offerings in engineering subjects dealing with infrastructure,
    the environment, and related issues, and greater concentration on high
    technology subjects, largely supporting increasingly complex scientific
    developments. While the latter is important, it should not be at the expense
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other
    industrial countries in Europe and Asia, continue to encourage and advance
    the teaching of engineering. Both China and India, respectively, graduate
    six and eight times as many traditional engineers as does the United States.
    Other industrial countries at minimum maintain their output, while America
    suffers an increasingly serious decline in the number of engineering graduates
    and a lack of well-educated engineers.
"""
)

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d.
Using a pipeline without specifying a model name and revision in production is not recommended.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': '\n    America has changed dramatically during recent years. Not only has the number of\n    graduates in traditional engineering disciplines such as mechanical, civil,\n    electrical, chemical, and aeronautical engineering declined, but in most of\n    the premier American universities engineering curricula now concentrate on\n    and encourage largely the study of engineering science. As a result, there\n    are declining offerings in engineering subjects dealing with infrastructure,\n    the environment, and related issues, and greater concentration on high\n    technology subjects, largely supporting increasingly complex scientific\n    developments. While the latter is important, it should not be at the expense\n    of more traditional engineering.\n\n    Rapidly developing economies such as China and India, as well as other\n    industrial countries in Europe and Asia, continue to encourage and advance\n    the teaching of engineering. Both China and India, r

In [22]:
# Look for Assignment Nr.5
translator = pipeline("translation_fr_to_en", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")

KeyError: 'translation'

I could not complete Assignment 5 because the previous code cell raised errors, including the `KeyError: 'translation'` shown above.
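One possible workaround, assuming the `KeyError` arises because the installed `transformers` version does not register a translation task for `pipeline` (this was not verified here), is to load the checkpoint directly and call `generate`. The sketch below is untested against this environment; `translate_fr_to_en` is a hypothetical helper name, and the imports are deferred because they need `transformers`, `sentencepiece`, and network access.

```python
def translate_fr_to_en(text, model_name="Helsinki-NLP/opus-mt-fr-en"):
    """Hypothetical helper: translate text by calling the seq2seq model directly."""
    # Deferred import so that defining the function has no side effects.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Tokenize the input, generate target-language token ids, and decode them.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate_fr_to_en("Ce cours est produit par Hugging Face.")` would download the checkpoint on first use. Other language pairs would need a different `model_name` from the Hub.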

## Assignment

1. Use the num_return_sequences and max_length arguments to generate two sentences of 15 words each.
2. Use the filters to find a text generation model for another language. Feel free to play with the widget and use it in a pipeline!
3. Search for the bert-base-cased model on the Hub and identify its mask word in the Inference API widget. What does this model predict for the sentence in our pipeline example above?
4. Search the Model Hub for a model able to do part-of-speech tagging (usually abbreviated as POS) in English. What does this model predict for the sentence in the example above?
5. Search for translation models in other languages and try to translate the previous sentence into a few different languages.

## License
Source: https://huggingface.co/learn/llm-course/chapter1/3