## Pipelines

* The most basic object in the Transformers library.
* Connects a model with its necessary preprocessing and postprocessing steps.
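
The three stages named above (preprocessing, model, postprocessing) can be pictured with a toy sketch in plain Python. Everything here is hypothetical: the word weights, the two labels, and the function names are invented for illustration and are not how the library implements a pipeline.

```python
import math

# Hypothetical toy vocabulary and weights, invented purely for illustration.
WORD_WEIGHTS = {"love": 2.0, "great": 1.5, "hate": -2.0, "boring": -1.5}
LABELS = ["NEGATIVE", "POSITIVE"]


def preprocess(text):
    """Preprocessing: turn raw text into lowercase word tokens."""
    return [word.strip(".,!?") for word in text.lower().split()]


def toy_model(tokens):
    """'Model': map tokens to one raw score (logit) per label."""
    total = sum(WORD_WEIGHTS.get(token, 0.0) for token in tokens)
    return [-total, total]  # [NEGATIVE logit, POSITIVE logit]


def postprocess(logits):
    """Postprocessing: softmax the logits and report the best label."""
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return [{"label": LABELS[best], "score": probs[best]}]


def toy_pipeline(text):
    """Chain the three stages into a single call."""
    return postprocess(toy_model(preprocess(text)))


result = toy_pipeline("I love this great course!")
print(result)
```

A real pipeline swaps in a tokenizer, a pretrained network, and task-specific decoding, but the call shape (text in, list of label/score dictionaries out) matches the example that follows.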

```py
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")
```

The call logs a warning: no model was supplied, so the pipeline defaulted to `distilbert-base-uncased-finetuned-sst-2-english` at revision `af0f99b` (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english). Using a pipeline without specifying a model name and revision in production is not recommended.

Output would be
```py
[{'label': 'POSITIVE', 'score': 0.9598048329353333}]
```

* By default, the pipeline selects a pretrained model that has been fine-tuned for sentiment analysis in English.
* The model is downloaded and cached when the pipeline is first created, so later runs reuse the cached copy.
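
Since the warning above advises against relying on the default in production, one option is to pass the model name and revision explicitly. The sketch below reuses the name and revision printed in that warning; `PINNED_MODEL` and `build_pinned_classifier` are hypothetical names, and the short revision hash is assumed to resolve on the Hub as it does for the default.

```python
# Model name and revision copied from the warning shown above.
PINNED_MODEL = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_pinned_classifier():
    """Create the classifier with an explicit model and revision.

    The import lives inside the function so that defining it does not
    require `transformers` to be installed or trigger any download.
    """
    from transformers import pipeline

    return pipeline(**PINNED_MODEL)


# Usage (downloads the model on first call, then reuses the cache):
# classifier = build_pinned_classifier()
# classifier("I've been waiting for a HuggingFace course my whole life.")
```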

Hugging Face's currently available pipelines include:

* feature-extraction (get the vector representation of a text)
* fill-mask
* ner (named entity recognition)
* question-answering
* sentiment-analysis
* summarization
* text-generation
* translation
* zero-shot-classification

### Zero-shot Classification

* Allows us to specify which labels to use for classification, so we do not have to rely on the labels the pretrained model was trained with.

```py
classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)
```

Output would be
```py

{
    'sequence': 'This is a course about the Transformers library',
    'labels': ['education', 'business', 'politics'],
    'scores': [0.8445963859558105, 0.111976258456707, 0.043427448719739914]
}

```
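
A small sketch of how the result above might be consumed. It assumes the default single-label mode, where the labels come back sorted by descending score and the scores sum to roughly 1; `top_label` is a hypothetical helper, not part of the library.

```python
# Result copied from the output above.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445963859558105, 0.111976258456707, 0.043427448719739914],
}


def top_label(zero_shot_result):
    """Return the (label, score) pair with the highest score."""
    pairs = zip(zero_shot_result["labels"], zero_shot_result["scores"])
    return max(pairs, key=lambda pair: pair[1])


label, score = top_label(result)
print(f"{label}: {score:.1%}")
print(f"sum of scores: {sum(result['scores']):.4f}")
```

Note that the order of `labels` in the output follows the scores, not the order in which `candidate_labels` was passed.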


### Text Generation

* Given a prompt, auto-complete it by generating the remaining text.

```py
generator = pipeline("text-generation")
generator("In this course, we will teach you how to")
```

Output would be 
```py
[{'generated_text': 'In this course, we will teach you how to understand and use data flow and data interchange when handling user data. We will be working with one or more of the most commonly used data flows — data flows of various types, as seen by the HTTP'}]
```


### Model from Hub 

* Any model from the Model Hub that supports the task can be used in a pipeline by passing its identifier as `model`.


```py
generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,  # total length of the output text, prompt included
    num_return_sequences=2,  # number of different sequences to generate
)
```

Output would be
```py

[{'generated_text': 'In this course, we will teach you how to manipulate the world and move your mental and physical capabilities to your advantage.'},
 {'generated_text': 'In this course, we will teach you how to become an expert and practice realtime, and with a hands on experience on both realtime and real'}]

```
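
Each `generated_text` above repeats the prompt before the new text. If only the continuation is needed, it can be sliced off afterwards. This is a minimal sketch over the outputs shown above; `continuation` is a hypothetical helper.

```python
prompt = "In this course, we will teach you how to"

# Outputs copied from the example above.
outputs = [
    {"generated_text": "In this course, we will teach you how to manipulate the world and move your mental and physical capabilities to your advantage."},
    {"generated_text": "In this course, we will teach you how to become an expert and practice realtime, and with a hands on experience on both realtime and real"},
]


def continuation(prompt, generated_text):
    """Return only the newly generated part, without the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text  # Prompt not echoed; return the text unchanged.


for item in outputs:
    print(repr(continuation(prompt, item["generated_text"])))
```

Generation is sampled, so rerunning the pipeline will usually produce different text than the outputs copied here. The text-generation pipeline also accepts a `return_full_text` argument that should make this post-processing unnecessary; check the documentation of your installed version.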

### Inference API

* Models on the Hub can be tested directly in the browser using the Inference API.

### Mask Filling

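* Fill in the blanks in a given text.
* `top_k` controls how many candidates are returned. The mask token (`<mask>` here) depends on the model.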

```py
unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)
```

Output would be

```py

[{'sequence': 'This course will teach you all about mathematical models.',
  'score': 0.19619831442832947,
  'token': 30412,
  'token_str': ' mathematical'},
 {'sequence': 'This course will teach you all about computational models.',
  'score': 0.04052725434303284,
  'token': 38163,
  'token_str': ' computational'}]
```
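
The predictions above can be reduced to a list of candidate words. In this output `token_str` carries a leading space, which appears to come from how the default model's tokenizer encodes word boundaries, so the sketch strips it. `candidate_words` is a hypothetical helper.

```python
# Predictions copied from the output above.
predictions = [
    {"sequence": "This course will teach you all about mathematical models.",
     "score": 0.19619831442832947, "token": 30412, "token_str": " mathematical"},
    {"sequence": "This course will teach you all about computational models.",
     "score": 0.04052725434303284, "token": 38163, "token_str": " computational"},
]


def candidate_words(fill_mask_predictions):
    """Return (word, score) pairs with surrounding whitespace removed."""
    return [(p["token_str"].strip(), p["score"]) for p in fill_mask_predictions]


for word, score in candidate_words(predictions):
    print(f"{word:<15} {score:.3f}")
```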


### Named Entity Recognition

* Find which parts of the input text correspond to entities such as persons, organizations, or locations.

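```py
# grouped_entities=True merges the sub-word tokens of one entity ("Hugging" + "Face").
# Newer transformers releases express this as aggregation_strategy="simple".
ner = pipeline("ner", grouped_entities=True)
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")
```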

Output would be

```py
[{'entity_group': 'PER', 'score': 0.99816, 'word': 'Sylvain', 'start': 11, 'end': 18}, 
 {'entity_group': 'ORG', 'score': 0.97960, 'word': 'Hugging Face', 'start': 33, 'end': 45}, 
 {'entity_group': 'LOC', 'score': 0.99321, 'word': 'Brooklyn', 'start': 49, 'end': 57}
]
```
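
The `start` and `end` fields above are character offsets into the input string, with `end` exclusive, so each entity can be recovered by slicing. A short check over the output shown above (`by_type` is a name chosen for this sketch):

```python
text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied from the output above.
entities = [
    {"entity_group": "PER", "score": 0.99816, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.97960, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.99321, "word": "Brooklyn", "start": 49, "end": 57},
]

by_type = {}
for entity in entities:
    span = text[entity["start"]:entity["end"]]
    # The slice should reproduce the reported word exactly.
    assert span == entity["word"]
    by_type[entity["entity_group"]] = span

print(by_type)
```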

### Question Answering

* Given a context and a question, extract the answer from the context. The answer is a span of the context, not newly generated text.

```py
question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)
```

Output would be 
```py
{'score': 0.6385916471481323, 'start': 33, 'end': 45, 'answer': 'Hugging Face'}
```
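
Because the answer is reported with `start` and `end` offsets into the context, it can be highlighted in place. A minimal sketch using the result above; `highlight_answer` and its `marker` parameter are hypothetical.

```python
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"

# Result copied from the output above.
result = {"score": 0.6385916471481323, "start": 33, "end": 45, "answer": "Hugging Face"}


def highlight_answer(context, qa_result, marker="**"):
    """Wrap the answer span in `marker` using the reported offsets."""
    start, end = qa_result["start"], qa_result["end"]
    return context[:start] + marker + context[start:end] + marker + context[end:]


highlighted = highlight_answer(context, result)
print(highlighted)
```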

### Summarization

* Reduce a text to a shorter version while keeping its most important points.



```py
summarizer = pipeline("summarization")
summarizer(
    """
    America has changed dramatically during recent years. Not only has the number of
    graduates in traditional engineering disciplines such as mechanical, civil,
    electrical, chemical, and aeronautical engineering declined, but in most of
    the premier American universities engineering curricula now concentrate on
    and encourage largely the study of engineering science. As a result, there
    are declining offerings in engineering subjects dealing with infrastructure,
    the environment, and related issues, and greater concentration on high
    technology subjects, largely supporting increasingly complex scientific
    developments. While the latter is important, it should not be at the expense
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other
    industrial countries in Europe and Asia, continue to encourage and advance
    the teaching of engineering. Both China and India, respectively, graduate
    six and eight times as many traditional engineers as does the United States.
    Other industrial countries at minimum maintain their output, while America
    suffers an increasingly serious decline in the number of engineering graduates
    and a lack of well-educated engineers.
"""
)
```

Output would be

```py

[{'summary_text': ' America has changed dramatically during recent years . The number of engineering graduates in the U.S. has declined in traditional engineering disciplines such as mechanical, civil, electrical, chemical, and aeronautical engineering . Rapidly developing economies such as China and India, as well as other industrial countries in Europe and Asia, continue to encourage and advance engineering .'}]
```

### Translation

* Translate text from one language to another. The language pair is set by the chosen model, here French to English.

```py
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")
```

Output would be

```py
[{'translation_text': 'This course is produced by Hugging Face.'}]
```
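
The model name used above follows an `opus-mt-<source>-<target>` pattern, so other language pairs can be tried by changing the two codes. The helpers below are hypothetical, and not every pair has a published model, so the resulting identifier should be checked on the Hub before use.

```python
def opus_mt_model_id(source_lang, target_lang):
    """Build a Hub identifier following the 'opus-mt-<src>-<tgt>' pattern."""
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"


def build_translator(source_lang, target_lang):
    """Create a translation pipeline for the given language pair.

    The import is deferred so that defining the function needs no download.
    """
    from transformers import pipeline

    return pipeline("translation", model=opus_mt_model_id(source_lang, target_lang))


print(opus_mt_model_id("fr", "en"))

# Usage (downloads the model on first call):
# translator = build_translator("fr", "en")
# translator("Ce cours est produit par Hugging Face.")
```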