# Exercises - 1: Inference with pipeline

In this notebook, you will solve example cases from Chapter 1 of the book **"Natural Language Processing with Transformers: Building Language Applications with Hugging Face"** by Tunstall, von Werra, and Wolf.

You will gain first experience with pretrained models for specific tasks through the *pipelines* API of the Hugging Face Transformers library, which lets you run inference at a very high level of abstraction.

In each exercise, you will first define a *pipeline* for a specific task, using a pretrained model from the Hugging Face models library, and then apply the pipeline to an example input text to generate an output.

## Resources:
- [pipeline documentation](https://huggingface.co/docs/transformers/main_classes/pipelines)
- [Hugging Face models library](https://huggingface.co/models)


## The input text:

In [1]:
text = """A miracle is taking place as you read these lines: the squiggles on this page
are trans‐ forming into words and concepts and emotions as they navigate their way
through your cortex. My thoughts from November 2021 have now successfully invaded your
brain. If they manage to catch your attention and survive long enough in this harsh and
highly competitive environment, they may have a chance to reproduce again as you share
these thoughts with others. Thanks to language, thoughts have become air‐borne and
highly contagious brain germs—and no vaccine is coming.

Luckily, most brain germs are harmless,1 and a few are wonderfully useful. In fact,
humanity’s brain germs constitute two of our most precious treasures: knowledge and
culture. Much as we can’t digest properly without healthy gut bacteria, we cannot think
properly without healthy brain germs. Most of your thoughts are not actually yours: they
arose and grew and evolved in many other brains before they infected you. So if we want
to build intelligent machines, we will need to find a way to infect them too.

The good news is that another miracle has been unfolding over the last few years:
several breakthroughs in deep learning have given birth to powerful language models.
Since you are reading this book, you have probably seen some astonishing demos ofthese
language models, such as GPT-3, which given a short prompt such as “a frog meets a
crocodile” can write a whole story. Although it’s not quite Shakespeare yet, it’s
ometimes hard to believe that these texts were written by an artificial neural net‐
work. In fact, GitHub’s Copilot system is helping me write these lines: you’ll never now
how much I really wrote."""

In [2]:
from transformers import pipeline
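
Several tasks below ask you to process the input text paragraph by paragraph. The following is a minimal sketch of one way to split it, assuming that paragraphs are separated by a blank line, as in the `text` string above. The helper name `split_paragraphs` is hypothetical and not part of the notebook.

```python
def split_paragraphs(text: str) -> list:
    """Split text on blank lines and join each paragraph's wrapped lines with spaces."""
    paragraphs = []
    for block in text.split("\n\n"):
        # Drop empty lines and surrounding whitespace inside the block.
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))
    return paragraphs


sample = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph."
print(split_paragraphs(sample))
```

Note that character offsets reported by a pipeline then refer to the individual paragraph string, not to the full `text`.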

## Task 1: Text classification

1. Define a classifier pipeline using the `"distilbert-base-uncased-finetuned-sst-2-english"` model.

2. Apply the pipeline to each paragraph of the input text to extract sentiments.

[{'label': 'POSITIVE', 'score': 0.9735191464424133}]
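
One possible way to approach this task is sketched below. The helper names `build_classifier` and `classify_paragraphs` are hypothetical, and the `"text-classification"` task name is an assumption based on the task title. Building the pipeline downloads the model on first use, so the import and the model loading are kept inside a function.

```python
def build_classifier():
    """Create the sentiment classifier (downloads the model on first use)."""
    # Imported here so the helper below can be used with any callable,
    # without loading transformers or a model.
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )


def classify_paragraphs(classifier, paragraphs):
    """Return one dict per paragraph: its index plus the predicted label and score."""
    # Calling the classifier with a list of strings yields one result per string.
    results = classifier(paragraphs)
    return [{"paragraph": i, **result} for i, result in enumerate(results)]


# Usage (assumes `text` is the input string defined above):
# classifier = build_classifier()
# for row in classify_paragraphs(classifier, text.split("\n\n")):
#     print(row)
```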

## Task 2: Named entity recognition
1. Define a named entity recognition (NER) pipeline using the `"ner"` task and the `"dslim/bert-base-NER-uncased"` model. Use the same value for `aggregation_strategy` as in the slides.

2. Apply the pipeline to each paragraph of the input text.

[{'entity_group': 'MISC',
  'score': 0.7362914,
  'word': 'humanity',
  'start': 647,
  'end': 655},
 {'entity_group': 'MISC',
  'score': 0.8491372,
  'word': 'gpt',
  'start': 1354,
  'end': 1357},
 {'entity_group': 'MISC',
  'score': 0.8317224,
  'word': '3',
  'start': 1358,
  'end': 1359},
 {'entity_group': 'MISC',
  'score': 0.9769497,
  'word': 'shakespeare',
  'start': 1472,
  'end': 1483},
 {'entity_group': 'ORG',
  'score': 0.9260996,
  'word': 'github',
  'start': 1593,
  'end': 1599}]
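
A minimal sketch of one way to set this up follows. The value `aggregation_strategy="simple"` is an assumption, since the slides are not reproduced here; the `entity_group` keys in the output above only indicate that some aggregation is enabled. The helper names `build_ner_tagger` and `tag_paragraphs` and the `min_score` parameter are hypothetical.

```python
def build_ner_tagger():
    """Create the NER pipeline (downloads the model on first use)."""
    # Imported here so tag_paragraphs can be used without loading a model.
    from transformers import pipeline

    return pipeline(
        "ner",
        model="dslim/bert-base-NER-uncased",
        aggregation_strategy="simple",  # assumed value, check the slides
    )


def tag_paragraphs(tagger, paragraphs, min_score=0.0):
    """Collect (paragraph index, entity group, word, score) tuples."""
    rows = []
    for i, paragraph in enumerate(paragraphs):
        for entity in tagger(paragraph):
            # Keep only entities at or above the confidence threshold.
            if entity["score"] >= min_score:
                rows.append(
                    (i, entity["entity_group"], entity["word"], float(entity["score"]))
                )
    return rows


# Usage (assumes `text` is the input string defined above):
# tagger = build_ner_tagger()
# for row in tag_paragraphs(tagger, text.split("\n\n")):
#     print(row)
```

The `start` and `end` offsets in the output above are relative to the full `text` (for example, `start: 647` falls in the second paragraph), so that output appears to come from a single call on the whole string. When you tag paragraph by paragraph, offsets restart at zero for each paragraph.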

## Task 3: Question answering
1. Define a question answering pipeline using the `"question-answering"` task and the `"distilbert-base-cased-distilled-squad"` model.

2. Apply the pipeline to the input text to retrieve answers from it.

[{'score': 0.027284299954771996,
  'start': 999,
  'end': 1025,
  'answer': 'build intelligent machines'},
 {'score': 0.047199103981256485,
  'start': 1593,
  'end': 1616,
  'answer': 'GitHub’s Copilot system'}]
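
The sketch below shows one way to ask several questions against the same context. The helper names `build_reader` and `answer_questions` are hypothetical, and the questions in the usage comment are made-up examples, not necessarily the ones that produced the output above.

```python
def build_reader():
    """Create the question answering pipeline (downloads the model on first use)."""
    # Imported here so answer_questions can be used without loading a model.
    from transformers import pipeline

    return pipeline(
        "question-answering",
        model="distilbert-base-cased-distilled-squad",
    )


def answer_questions(reader, questions, context):
    """Ask each question against the same context; return (question, answer, score)."""
    answers = []
    for question in questions:
        # For a single question the pipeline returns one dict with
        # 'score', 'start', 'end' and 'answer'.
        result = reader(question=question, context=context)
        answers.append((question, result["answer"], float(result["score"])))
    return answers


# Usage (assumes `text` is the input string defined above; example questions only):
# reader = build_reader()
# questions = ["What do we want to build?", "What is helping to write these lines?"]
# for row in answer_questions(reader, questions, text):
#     print(row)
```

Low scores, such as those in the output above, suggest that the model is not very confident in the extracted span, so it is worth reading the answers critically.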

## Task 4: Summarization
1. Define a summarization pipeline using the `"summarization"` task and the `"sshleifer/distilbart-cnn-6-6"` model. Use the same additional arguments as in the lecture slides.

2. Apply the pipeline to each paragraph of the input text to get a summary of the text.

[{'summary_text': ' A miracle is taking place as you read these lines: the squiggles on this page are transforming into words and concepts and emotions as they navigate their way through your cortex. Thanks to language, thoughts have'}]
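
A minimal sketch of a per-paragraph summarizer follows. The helper names `build_summarizer` and `summarize_paragraphs` are hypothetical. Because the additional arguments from the lecture slides are not reproduced here, the helper simply forwards any keyword arguments to the pipeline, and the values in the usage comment are placeholders.

```python
def build_summarizer():
    """Create the summarization pipeline (downloads the model on first use)."""
    # Imported here so summarize_paragraphs can be used without loading a model.
    from transformers import pipeline

    return pipeline("summarization", model="sshleifer/distilbart-cnn-6-6")


def summarize_paragraphs(summarizer, paragraphs, **generate_kwargs):
    """Summarize each paragraph separately and return the summary strings."""
    summaries = []
    for paragraph in paragraphs:
        # For one input string the pipeline returns a list holding a single
        # {'summary_text': ...} dict.
        output = summarizer(paragraph, **generate_kwargs)
        summaries.append(output[0]["summary_text"].strip())
    return summaries


# Usage (assumes `text` is the input string defined above; the keyword
# arguments are placeholder values, substitute the ones from the slides):
# summarizer = build_summarizer()
# for summary in summarize_paragraphs(
#     summarizer, text.split("\n\n"), max_length=45, clean_up_tokenization_spaces=True
# ):
#     print(summary)
```

A small `max_length` can cut a summary off mid-sentence, as in the output above, which ends with "thoughts have".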

## Task 5: Translation
1. Define a pipeline to translate the input text from English to German, using an appropriate model from the Hugging Face models library.

2. Apply the pipeline to the example input text to translate the content to German.
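
One candidate from the models library is `Helsinki-NLP/opus-mt-en-de`. This choice is an assumption, since the notebook does not name the model it used, and other English-to-German models would work with the same pattern. The helper names `build_translator` and `translate` are hypothetical.

```python
def build_translator(model_name="Helsinki-NLP/opus-mt-en-de"):
    """Create an English-to-German translation pipeline (downloads the model on first use)."""
    # Imported here so translate can be used without loading a model.
    from transformers import pipeline

    return pipeline("translation_en_to_de", model=model_name)


def translate(translator, text, **generate_kwargs):
    """Translate one string and return the translated text."""
    # For one input string the pipeline returns a list holding a single
    # {'translation_text': ...} dict.
    output = translator(text, **generate_kwargs)
    return output[0]["translation_text"]


# Usage (assumes `text` is the input string defined above):
# translator = build_translator()
# print(translate(translator, text))
```

Translation models have a maximum input length, so for texts much longer than this example it may be necessary to translate paragraph by paragraph.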

Ein Wunder findet statt, wie Sie diese Zeilen lesen: die Squiggles auf dieser Seite transformieren sich in Worte und Konzepte und Emotionen, wie sie ihren Weg durch Ihren Cortex navigieren. Meine Gedanken von November 2021 haben nun erfolgreich in Ihr Gehirn eingedrungen. Wenn sie es schaffen, Ihre Aufmerksamkeit zu fangen und lange genug in diesem harten und stark wettbewerbsfähigen Umfeld überleben, können sie eine Chance haben, wieder zu reproduzieren, wie Sie diese Gedanken mit anderen teilen. Dank der Sprache, Gedanken sind Luft getragen und hoch ansteckende Gehirnkeime geworden – und kein Impfstoff kommt. Zum Glück, die meisten Gehirnkeime sind harmlos,1 und ein paar sind wunderbar nützlich. In der Tat, die Menschheit, Gehirnkeime bilden zwei unserer wertvollsten Schätze: Wissen und Kultur. So viel wie wir nicht richtig verdauen können ohne gesunde Darmbakterien, können wir nicht richtig denken, ohne gesunde Gehirnkeime. Die meisten Ihrer Gedanken sind nicht wirklich Ihre: sie en