# Transformers, what can they do?

This notebook walks through examples of the tasks you can run with the `pipeline` function from the Hugging Face Transformers library.

Install the Transformers and Datasets libraries to run this notebook.

In [60]:
!pip install datasets "transformers[sentencepiece]"



In [61]:
# Sentiment Analysis

In [62]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

[{'label': 'POSITIVE', 'score': 0.9598047137260437}]

In [63]:
classifier([
    "I've been waiting for a HuggingFace course my whole life.", 
    "I hate this so much!",
    "I am indifferent about starting these new Huggingface courses.",
    "I am so glad that Huggingface has launched these new educational courses."
])

[{'label': 'POSITIVE', 'score': 0.9598047137260437},
 {'label': 'NEGATIVE', 'score': 0.9994558095932007},
 {'label': 'NEGATIVE', 'score': 0.999576210975647},
 {'label': 'POSITIVE', 'score': 0.9977303147315979}]
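The output above contains only the labels POSITIVE and NEGATIVE, so the sentence about being "indifferent" is still forced into one of the two classes, here NEGATIVE with a very high score. A minimal sketch for lining up each input with its prediction makes that easier to spot. It reuses the results copied from the cell above rather than loading a model, and `pair_predictions` is a hypothetical helper, not part of Transformers.

```python
# Inputs and predictions copied from the cell above; no model is loaded here.
sentences = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
    "I am indifferent about starting these new Huggingface courses.",
    "I am so glad that Huggingface has launched these new educational courses.",
]
predictions = [
    {"label": "POSITIVE", "score": 0.9598047137260437},
    {"label": "NEGATIVE", "score": 0.9994558095932007},
    {"label": "NEGATIVE", "score": 0.999576210975647},
    {"label": "POSITIVE", "score": 0.9977303147315979},
]


def pair_predictions(sentences, predictions):
    """Return (sentence, label, rounded score) tuples in input order."""
    return [
        (text, pred["label"], round(pred["score"], 4))
        for text, pred in zip(sentences, predictions)
    ]


rows = pair_predictions(sentences, predictions)
for text, label, score in rows:
    print(f"{label:<8} {score:.4f}  {text}")
```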

In [64]:
# Zero-Shot Text Sequence Classification

In [65]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business", "machine learning", "artificial intelligence"]
)

{'labels': ['education',
  'artificial intelligence',
  'business',
  'machine learning',
  'politics'],
 'scores': [0.6956349611282349,
  0.10013897716999054,
  0.09222657978534698,
  0.07623127847909927,
  0.03576822578907013],
 'sequence': 'This is a course about the Transformers library'}
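In the result above, `labels` and `scores` are parallel lists sorted from most to least likely, and the scores add up to roughly 1. The sketch below pairs them up and picks the top label. It works on the dictionary copied from the output above, so it assumes that output shape and does not call the model.

```python
# Result copied from the zero-shot cell above.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": [
        "education",
        "artificial intelligence",
        "business",
        "machine learning",
        "politics",
    ],
    "scores": [
        0.6956349611282349,
        0.10013897716999054,
        0.09222657978534698,
        0.07623127847909927,
        0.03576822578907013,
    ],
}

# Pair each candidate label with its score; the order is already descending.
ranked = list(zip(result["labels"], result["scores"]))
top_label, top_score = ranked[0]
total = sum(result["scores"])

print(f"top label: {top_label} ({top_score:.3f})")
print(f"sum of scores: {total:.6f}")
```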

In [66]:
# Text Generation

In [67]:
from transformers import pipeline

generator = pipeline("text-generation")
generator("In this course, we will teach you how to")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "In this course, we will teach you how to use Microsoft's online security service to build your own self-protecting, self-rewarding, self-reliant and resilient applications. You will also understand what it takes to survive; learn"}]

In [68]:
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=15,
    num_return_sequences=2,
)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this course, we will teach you how to use a standard, open'},
 {'generated_text': 'In this course, we will teach you how to use common techniques without using'}]

In [69]:
# Fill-Mask

In [70]:
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-cased")
unmasker("This course will teach you all about [MASK] models.", top_k=4)

Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


[{'score': 0.25963109731674194,
  'sequence': 'This course will teach you all about role models.',
  'token': 1648,
  'token_str': 'role'},
 {'score': 0.09427270293235779,
  'sequence': 'This course will teach you all about the models.',
  'token': 1103,
  'token_str': 'the'},
 {'score': 0.033867526799440384,
  'sequence': 'This course will teach you all about fashion models.',
  'token': 4633,
  'token_str': 'fashion'},
 {'score': 0.025944111868739128,
  'sequence': 'This course will teach you all about life models.',
  'token': 1297,
  'token_str': 'life'}]
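Each prediction above carries the filled-in `sequence`, the predicted `token_str`, and a `score`. As a rough sketch, the sequence can be rebuilt by substituting the token into the prompt, and summing the scores shows how much probability the top four candidates cover. The data is copied from the output above (the `token` ids are omitted for brevity), and the variable names are illustrative.

```python
# Prompt and predictions copied from the fill-mask cell above.
template = "This course will teach you all about [MASK] models."
predictions = [
    {"score": 0.25963109731674194, "token_str": "role",
     "sequence": "This course will teach you all about role models."},
    {"score": 0.09427270293235779, "token_str": "the",
     "sequence": "This course will teach you all about the models."},
    {"score": 0.033867526799440384, "token_str": "fashion",
     "sequence": "This course will teach you all about fashion models."},
    {"score": 0.025944111868739128, "token_str": "life",
     "sequence": "This course will teach you all about life models."},
]

# Substitute each predicted token into the prompt.
rebuilt = [template.replace("[MASK]", p["token_str"]) for p in predictions]

# Probability mass covered by the returned candidates.
covered = sum(p["score"] for p in predictions)

for p in predictions:
    print(f"{p['token_str']:<8} {p['score']:.3f}")
print(f"covered by top {len(predictions)}: {covered:.3f}")
```

The four candidates together cover well under half of the probability mass, which suggests the model is not confident about any single completion for this prompt.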

In [71]:
# Named Entity Recognition

In [72]:
from transformers import pipeline

ner = pipeline("ner", aggregation_strategy="simple")
ner("My name is Jacob and I work at Sightly who uses Huggingface Transformers to develop machine learning workflows.")


[{'end': 16,
  'entity_group': 'PER',
  'score': 0.9992159,
  'start': 11,
  'word': 'Jacob'},
 {'end': 38,
  'entity_group': 'ORG',
  'score': 0.98637885,
  'start': 31,
  'word': 'Sightly'},
 {'end': 72,
  'entity_group': 'ORG',
  'score': 0.95767474,
  'start': 48,
  'word': 'Huggingface Transformers'}]
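The `start` and `end` values in the output above are character offsets into the input string, so each entity can be recovered by slicing. The sketch below groups the recovered spans by entity type. It uses the entities copied from the output above (scores omitted) and does not run the model.

```python
from collections import defaultdict

# Input text and entities copied from the NER cell above.
text = (
    "My name is Jacob and I work at Sightly who uses Huggingface "
    "Transformers to develop machine learning workflows."
)
entities = [
    {"entity_group": "PER", "start": 11, "end": 16, "word": "Jacob"},
    {"entity_group": "ORG", "start": 31, "end": 38, "word": "Sightly"},
    {"entity_group": "ORG", "start": 48, "end": 72,
     "word": "Huggingface Transformers"},
]

# Slice the original text with the offsets and group the spans by type.
by_type = defaultdict(list)
for ent in entities:
    span = text[ent["start"]:ent["end"]]
    by_type[ent["entity_group"]].append(span)

for group, spans in by_type.items():
    print(group, spans)
```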

In [73]:
# Part of Speech Tagging

In [74]:
from transformers import pipeline

pos_tagger = pipeline(
    "ner",
    model="mrm8488/mobilebert-finetuned-pos",
    tokenizer="mrm8488/mobilebert-finetuned-pos",
    aggregation_strategy="simple",
)
pos_tagger("My name is Jacob and I work at Sightly who uses Huggingface Transformers to develop machine learning workflows.")

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


[{'end': 2,
  'entity_group': 'PRP',
  'score': 0.8868122,
  'start': 0,
  'word': 'my'},
 {'end': 7,
  'entity_group': 'NN',
  'score': 0.98060846,
  'start': 3,
  'word': 'name'},
 {'end': 10,
  'entity_group': 'VBZ',
  'score': 0.99738294,
  'start': 8,
  'word': 'is'},
 {'end': 16,
  'entity_group': 'NNP',
  'score': 0.98590136,
  'start': 11,
  'word': 'jacob'},
 {'end': 20,
  'entity_group': 'CC',
  'score': 0.99994904,
  'start': 17,
  'word': 'and'},
 {'end': 22,
  'entity_group': 'PRP',
  'score': 0.7972169,
  'start': 21,
  'word': 'i'},
 {'end': 27,
  'entity_group': 'VBP',
  'score': 0.58215654,
  'start': 23,
  'word': 'work'},
 {'end': 30,
  'entity_group': 'IN',
  'score': 0.98807603,
  'start': 28,
  'word': 'at'},
 {'end': 36,
  'entity_group': 'JJ',
  'score': 0.85107315,
  'start': 31,
  'word': 'sight'},
 {'end': 38,
  'entity_group': 'NNP',
  'score': 0.3509983,
  'start': 36,
  'word': '##ly'},
 {'end': 42,
  'entity_group': 'WP',
  'score': 0.9984299,
  'start': 
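In the output above, "Sightly" is split into the WordPiece tokens `sight` and `##ly`, each with its own tag. A minimal sketch of stitching such pieces back into whole words follows. It uses three entries copied from the output above (scores omitted), `merge_wordpieces` is a hypothetical helper, and keeping the tag of the first piece is an assumption made for illustration. The pipeline's own `aggregation_strategy` option also offers word-level strategies such as `"first"` that may be a better fit than post-processing by hand.

```python
# Three entries copied from the tagging output above (scores omitted).
pieces = [
    {"word": "at", "entity_group": "IN", "start": 28, "end": 30},
    {"word": "sight", "entity_group": "JJ", "start": 31, "end": 36},
    {"word": "##ly", "entity_group": "NNP", "start": 36, "end": 38},
]


def merge_wordpieces(pieces):
    """Merge '##' continuation pieces into the preceding word.

    Assumption: the merged word keeps the tag of its first piece.
    """
    merged = []
    for piece in pieces:
        if piece["word"].startswith("##") and merged:
            previous = merged[-1]
            previous["word"] += piece["word"][2:]  # drop the '##' prefix
            previous["end"] = piece["end"]  # extend the character span
        else:
            merged.append(dict(piece))  # copy so the input stays untouched
    return merged


words = merge_wordpieces(pieces)
for word in words:
    print(word)
```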

In [75]:
# Question Answering

In [76]:
from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="What does Sightly use to develop machine learning workflows?",
    context="My name is Jacob and I work at Sightly who uses Huggingface Transformers to develop machine learning workflows."
)

{'answer': 'Huggingface Transformers',
 'end': 72,
 'score': 0.9880443215370178,
 'start': 48}
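As with named entities, `start` and `end` in the answer above are character offsets into the context. The sketch below checks that slicing the context reproduces the answer and adds a simple confidence cut-off. `answer_or_none` and its default threshold of 0.5 are illustrative assumptions, not part of the library.

```python
# Context and result copied from the question-answering cell above.
context = (
    "My name is Jacob and I work at Sightly who uses Huggingface "
    "Transformers to develop machine learning workflows."
)
result = {
    "answer": "Huggingface Transformers",
    "start": 48,
    "end": 72,
    "score": 0.9880443215370178,
}


def answer_or_none(result, min_score=0.5):
    """Return the answer text, or None when the score is below min_score."""
    return result["answer"] if result["score"] >= min_score else None


# The offsets point back into the context string.
span = context[result["start"]:result["end"]]
print(span)
print(answer_or_none(result))
```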

In [77]:
# Summarization

In [78]:
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer("""
    America has changed dramatically during recent years. Not only has the number of 
    graduates in traditional engineering disciplines such as mechanical, civil, 
    electrical, chemical, and aeronautical engineering declined, but in most of 
    the premier American universities engineering curricula now concentrate on 
    and encourage largely the study of engineering science. As a result, there 
    are declining offerings in engineering subjects dealing with infrastructure, 
    the environment, and related issues, and greater concentration on high 
    technology subjects, largely supporting increasingly complex scientific 
    developments. While the latter is important, it should not be at the expense 
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other 
    industrial countries in Europe and Asia, continue to encourage and advance 
    the teaching of engineering. Both China and India, respectively, graduate 
    six and eight times as many traditional engineers as does the United States. 
    Other industrial countries at minimum maintain their output, while America 
    suffers an increasingly serious decline in the number of engineering graduates 
    and a lack of well-educated engineers.
""")

[{'summary_text': ' America has changed dramatically during recent years . The number of engineering graduates in the U.S. has declined in traditional engineering disciplines such as mechanical, civil,    electrical, chemical, and aeronautical engineering . Rapidly developing economies such as China and India continue to encourage and advance the teaching of engineering .'}]

In [79]:
# Text Translation

In [80]:
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")
translator("This course was created by the great folks over at Huggingfae in Brooklyn, New York.")

[{'translation_text': 'Este curso fue creado por la gran gente de Huggingfae en Brooklyn, Nueva York.'}]