#### The pipeline function returns an end-to-end object that performs an NLP task on one or several texts. It supports most common NLP tasks out of the box.
#### The pipeline consists of three stages
<div>
<img src="image/pipeline1.png" width="800"/>
</div>

#### The first task to try with the pipeline API is ***sentiment analysis***, which classifies texts as positive or negative

In [62]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("I have been waiting for a HuggingFace course my whole life.")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'label': 'POSITIVE', 'score': 0.9433631896972656}]


#### Multiple texts can be passed as a list to the object returned by a pipeline, which then returns one result per text

In [63]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier(["I have been waiting for a HuggingFace course my whole life.",
                    "I hate it so much!"])
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'label': 'POSITIVE', 'score': 0.9433631896972656}, {'label': 'NEGATIVE', 'score': 0.9995473027229309}]


#### The ***zero-shot-classification*** pipeline lets you select the labels for classification

In [64]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "This is a course about the Transfomers library.",
    candidate_labels=["education", "politics", "bussiness"]
    )
print(result)

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


{'sequence': 'This is a course about the Transfomers library.', 'labels': ['education', 'bussiness', 'politics'], 'scores': [0.8097766041755676, 0.14343445003032684, 0.04678897187113762]}


#### The ***text-generation*** pipeline uses an input prompt to generate text 

In [65]:
from transformers import pipeline

generator = pipeline("text-generation")
result = generator("In this course, we will teach you how to",
                   pad_token_id=generator.tokenizer.eos_token_id
                   )
print(result)

No model was supplied, defaulted to openai-community/gpt2 and revision 6c0e608 (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'generated_text': 'In this course, we will teach you how to build and install the most versatile Bluetooth® Smart Bluetooth devices. We will also use your mobile telephone to communicate with your music and music apps using your mobile phone. As you know, it is well known'}]


#### For each task, you can search the model hub for various models to use in the pipeline: [HuggingFace model hub](https://huggingface.co/models)

#### Here is another ***text generation*** pipeline, using the ***distilgpt2*** model

In [66]:
from transformers import pipeline

generator = pipeline("text-generation",
                     model="distilgpt2"
                     )
result = generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)
print(result)


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'In this course, we will teach you how to apply to all topics on the subject in this class. As a result, you will gain experience through'}, {'generated_text': 'In this course, we will teach you how to manipulate the computer and use different algorithms and techniques. The lessons we will teach you to apply you to'}]


#### The ***fill-mask*** pipeline predicts missing words in a sentence

In [67]:
from transformers import pipeline

unmasker = pipeline("fill-mask")
result = unmasker(
    "This course will teach you all about <mask> models.",
    top_k=2
    )
print(result)

No model was supplied, defaulted to distilbert/distilroberta-base and revision ec58a5b (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `P

[{'score': 0.19198395311832428, 'token': 30412, 'token_str': ' mathematical', 'sequence': 'This course will teach you all about mathematical models.'}, {'score': 0.04209199175238609, 'token': 38163, 'token_str': ' computational', 'sequence': 'This course will teach you all about computational models.'}]


#### The ***NER*** pipeline identifies entities such as persons, organizations or locations in a sentence.

In [68]:
from transformers import pipeline

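# aggregation_strategy="simple" groups sub-tokens of the same entity (replaces the deprecated grouped_entities=True)
ner = pipeline("ner", aggregation_strategy="simple")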
result = ner("My name is Qiyao Xue and I am studying the HuggingFace transformers library.")
print(result)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Hardware accelerator e.g. GPU is

[{'entity_group': 'PER', 'score': 0.9850975, 'word': 'Qiyao Xue', 'start': 11, 'end': 20}, {'entity_group': 'ORG', 'score': 0.9732388, 'word': 'HuggingFace', 'start': 43, 'end': 54}]




#### The ***question-answering*** pipeline extracts answers to a question from a given context

In [69]:
from transformers import pipeline

QA = pipeline("question-answering")
result = QA(
    question="Where do I work?",
    context="My name is QiyaoXue and I work at Pittsburgh"
)
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


{'score': 0.9962271451950073, 'start': 34, 'end': 44, 'answer': 'Pittsburgh'}


#### The ***summarization*** pipeline creates summaries of long texts

In [70]:
from transformers import pipeline

summarizer = pipeline("summarization")
result = summarizer("""Hugging Face, Inc. is an American company incorporated under the Delaware General Corporation Law[1] and based in New York City that develops computation tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets and showcase their work.""",
                    max_length=18
                    )
print(result)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Your min_length=56 must be inferior than your max_length=18.


[{'summary_text': ' Hugging Face, Inc. develops computation tools for building applications using machine learning'}]


#### The ***translation*** pipeline translates text from one language to another

In [71]:
from transformers import pipeline

translator = pipeline("translation",
                      model="Helsinki-NLP/opus-mt-fr-en"
                      )
result = translator("Ce cours est produit par HuggingFace.")
print(result)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


[{'translation_text': 'This course is produced by HuggingFace.'}]


#### The pipeline consists of three stages
<div>
<img src="image/pipeline1.png" width="800"/>
</div>

## Stage1: Tokenization
<div>
<img src="image/pipeline2.png" width='800'/>
</div>

#### The ***AutoTokenizer*** class can load the tokenizer for any checkpoint

In [72]:
from transformers import AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

raw_inputs = [
    "I have been waiting for a HuggingFace course my whole life.",
    "I hate it so much"
]

inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="pt")
# padding=True: the two sentences differ in length, so the shorter one is padded so that both fit in one rectangular tensor
# truncation=True: any sentence longer than the maximum length the model can handle is truncated
# return_tensors="pt": the tokenizer returns PyTorch tensors
# the attention_mask in the output marks where padding was applied, so that the model does not attend to it
print(inputs)

{'input_ids': tensor([[  101,  1045,  2031,  2042,  3403,  2005,  1037, 17662, 12172,  2607,
          2026,  2878,  2166,  1012,   102],
        [  101,  1045,  5223,  2009,  2061,  2172,   102,     0,     0,     0,
             0,     0,     0,     0,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]])}
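
As a rough illustration of what `padding=True` does, the pure-Python sketch below pads the two ID lists printed above to a common length and builds the matching attention mask. The helper name `pad_batch` and its default `pad_id=0` are assumptions made for this sketch (0 is the padding ID visible in the output above); the real tokenizer handles this internally.

```python
def pad_batch(sequences, pad_id=0):
    """Pad every ID list to the longest length and build a 0/1 attention mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Token IDs copied from the tokenizer output above, with the padding removed
batch = [
    [101, 1045, 2031, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012, 102],
    [101, 1045, 5223, 2009, 2061, 2172, 102],
]
input_ids, attention_mask = pad_batch(batch)
print(input_ids[1])
print(attention_mask[1])
```

The second row ends up with eight padding IDs and eight zeros in its mask, which matches the tensors shown above.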


## Stage2: Model

#### The ***AutoModel*** class loads a model without its pretraining head

In [73]:
from transformers import AutoModel

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModel.from_pretrained(checkpoint)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
# the output [2, 15, 768] refers to [batch size, sequence length, hidden size]

torch.Size([2, 15, 768])


#### Each ***AutoModelForXXX*** class loads a model suitable for a specific task

In [74]:
from transformers import AutoModelForSequenceClassification

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
outputs = model(**inputs)
print(outputs.logits)
# logits are the raw, unnormalized scores produced by the final classification layer

tensor([[-1.3782,  1.4346],
        [ 4.3257, -3.4977]], grad_fn=<AddmmBackward0>)


## Stage3: Postprocessing

#### To go from logits to probabilities, we apply a softmax layer

In [75]:
import torch
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions.detach())

tensor([[5.6637e-02, 9.4336e-01],
        [9.9960e-01, 4.0010e-04]])


In [76]:
id2label = model.config.id2label
print(id2label)
print(f"sentence1: {id2label[torch.argmax(predictions[0]).item()]}, sentence2: {id2label[torch.argmax(predictions[1]).item()]}")

{0: 'NEGATIVE', 1: 'POSITIVE'}
sentence1: POSITIVE, sentence2: NEGATIVE
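
To make the postprocessing arithmetic concrete, here is a minimal sketch that redoes it without PyTorch: a hand-written softmax followed by an argmax and a label lookup. The `softmax` helper is written for this sketch only; the logits and the label mapping are copied from the outputs above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits and label mapping copied from the outputs above
logits = [[-1.3782, 1.4346], [4.3257, -3.4977]]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

for row in logits:
    probs = softmax(row)
    # argmax: index of the highest probability
    best = max(range(len(probs)), key=probs.__getitem__)
    print(id2label[best], round(probs[best], 4))
```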


## Inside the token classification pipeline

#### The token classification pipeline assigns a label to each token in the sentence, indicating whether it belongs to a person, an organization or a location

In [77]:
from transformers import pipeline

token_classifier = pipeline("token-classification")
result = token_classifier("My name is QiyaoXue and I am from China")
print(result)


No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Hardware accelerator e.g. GPU is

[{'entity': 'I-PER', 'score': 0.99662614, 'index': 4, 'word': 'Qi', 'start': 11, 'end': 13}, {'entity': 'I-PER', 'score': 0.9735996, 'index': 5, 'word': '##ya', 'start': 13, 'end': 15}, {'entity': 'I-PER', 'score': 0.94115335, 'index': 6, 'word': '##o', 'start': 15, 'end': 16}, {'entity': 'I-PER', 'score': 0.9444561, 'index': 7, 'word': '##X', 'start': 16, 'end': 17}, {'entity': 'I-PER', 'score': 0.9376296, 'index': 8, 'word': '##ue', 'start': 17, 'end': 19}, {'entity': 'I-LOC', 'score': 0.99976164, 'index': 13, 'word': 'China', 'start': 34, 'end': 39}]


#### It can also group together tokens corresponding to the same entity

In [78]:
token_classifier = pipeline("token-classification", aggregation_strategy="simple")
agg_result = token_classifier("My name is QiyaoXue and I am from China")
print(agg_result)

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Hardware accelerator e.g. GPU is

[{'entity_group': 'PER', 'score': 0.9586929, 'word': 'QiyaoXue', 'start': 11, 'end': 19}, {'entity_group': 'LOC', 'score': 0.99976164, 'word': 'China', 'start': 34, 'end': 39}]
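
The grouped `PER` score printed above is consistent with a plain average of the five sub-token scores from the ungrouped output. The short check below only reproduces that arithmetic; it makes no claim about how other aggregation strategies compute their scores.

```python
# Sub-token scores for 'Qi', '##ya', '##o', '##X', '##ue', copied from the ungrouped output
token_scores = [0.99662614, 0.9735996, 0.94115335, 0.9444561, 0.9376296]

# Average the scores of all tokens that belong to the same entity
group_score = sum(token_scores) / len(token_scores)
print(round(group_score, 7))
```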


#### Tokenization and model

In [79]:
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_checkpoint = "dbmdz/bert-large-cased-finetuned-conll03-english"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForTokenClassification.from_pretrained(model_checkpoint)

example = "My name is Qiyao Xue and I am a Chinese."
inputs = tokenizer(example, return_tensors="pt")
outputs = model(**inputs)

print(inputs["input_ids"].shape)
print(outputs.logits.shape)


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


torch.Size([1, 16])
torch.Size([1, 16, 9])


#### Get the classification result

In [80]:
import torch
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]
predictions = probabilities.argmax(dim=-1).tolist()
print(f"id2label{model.config.id2label}")
print(f"prediction id:{predictions}\nprediction label:{[model.config.id2label[predict] for predict in predictions]}")

id2label{0: 'O', 1: 'B-MISC', 2: 'I-MISC', 3: 'B-PER', 4: 'I-PER', 5: 'B-ORG', 6: 'I-ORG', 7: 'B-LOC', 8: 'I-LOC'}
prediction id:[0, 0, 0, 0, 4, 4, 4, 4, 4, 0, 0, 0, 0, 2, 0, 0]
prediction label:['O', 'O', 'O', 'O', 'I-PER', 'I-PER', 'I-PER', 'I-PER', 'I-PER', 'O', 'O', 'O', 'O', 'I-MISC', 'O', 'O']


#### The start and end character positions can be found by setting ***return_offsets_mapping=True*** when passing the text to the tokenizer. Each returned (start, end) pair is a half-open interval: start is inclusive and end is exclusive

In [81]:
results = []
input_with_offsets = tokenizer(example, return_offsets_mapping=True)
print(input_with_offsets)
tokens = input_with_offsets.tokens()
offsets = input_with_offsets["offset_mapping"]
zero_label = model.config.id2label[0]

for idx, pred in enumerate(predictions):
    label = model.config.id2label[pred]
    if label != zero_label:
        start, end = offsets[idx]
        results.append({"entity": label, "score": probabilities[idx][pred].item(), "word": tokens[idx], "start": start, "end": end})
print(results)


{'input_ids': [101, 1422, 1271, 1110, 24357, 2315, 1186, 17584, 1162, 1105, 146, 1821, 170, 1922, 119, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'offset_mapping': [(0, 0), (0, 2), (3, 7), (8, 10), (11, 13), (13, 15), (15, 16), (17, 19), (19, 20), (21, 24), (25, 26), (27, 29), (30, 31), (32, 39), (39, 40), (0, 0)]}
[{'entity': 'I-PER', 'score': 0.9963464140892029, 'word': 'Qi', 'start': 11, 'end': 13}, {'entity': 'I-PER', 'score': 0.9570150375366211, 'word': '##ya', 'start': 13, 'end': 15}, {'entity': 'I-PER', 'score': 0.9778035283088684, 'word': '##o', 'start': 15, 'end': 16}, {'entity': 'I-PER', 'score': 0.9940474033355713, 'word': 'Xu', 'start': 17, 'end': 19}, {'entity': 'I-PER', 'score': 0.9704844951629639, 'word': '##e', 'start': 19, 'end': 20}, {'entity': 'I-MISC', 'score': 0.9973861575126648, 'word': 'Chinese', 'start': 32, 'end': 39}]
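
Because the offsets are half-open intervals, they can be used directly as Python slice bounds. The sketch below slices the example sentence with the offsets copied from the results above; the variable name `entity_offsets` is introduced only for this illustration.

```python
example = "My name is Qiyao Xue and I am a Chinese."

# (start, end) pairs copied from the entity results above; end is exclusive
entity_offsets = [(11, 13), (13, 15), (15, 16), (17, 19), (19, 20), (32, 39)]

pieces = [example[start:end] for start, end in entity_offsets]
print(pieces)

# A multi-token span is recovered by slicing from the first token's start
# to the last token's end.
person = example[entity_offsets[0][0]:entity_offsets[4][1]]
print(person)
```

Slicing the original text rather than joining the tokens also avoids the `##` sub-word markers and keeps the original spacing.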


#### There are generally two labeling schemes for token classification
* use the B-XXX label at the beginning of every new entity
* use the B-XXX label only to separate two adjacent entities of the same type
<div><img src="image/pipeline3.png" width=800></div>
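
The difference between the two schemes can be sketched with a small tagging function. These schemes are commonly called IOB2 (first bullet) and IOB1 (second bullet); the function `tag_tokens`, the sentence and the entity spans below are all hypothetical and exist only to show how the tags differ.

```python
def tag_tokens(num_tokens, entities, scheme="IOB2"):
    """Tag tokens from (start, end, type) spans, where end is exclusive.

    IOB2: every entity starts with B-.
    IOB1: B- is used only when an entity directly follows another of the same type.
    """
    tags = ["O"] * num_tokens
    prev_end, prev_type = None, None
    for start, end, ent_type in sorted(entities):
        adjacent_same = prev_end == start and prev_type == ent_type
        use_b = scheme == "IOB2" or adjacent_same
        tags[start] = f"{'B' if use_b else 'I'}-{ent_type}"
        for i in range(start + 1, end):
            tags[i] = f"I-{ent_type}"
        prev_end, prev_type = end, ent_type
    return tags


# Hypothetical example: two adjacent PER entities followed by a LOC
tokens = ["Alex", "Smith", "Maria", "Lopez", "visited", "Paris"]
entities = [(0, 2, "PER"), (2, 4, "PER"), (5, 6, "LOC")]

iob2 = tag_tokens(len(tokens), entities, "IOB2")
iob1 = tag_tokens(len(tokens), entities, "IOB1")
print(iob2)
print(iob1)
```

The predictions shown earlier, where the first token of a name is tagged `I-PER` rather than `B-PER`, look like the second scheme.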

In [83]:
label_map = model.config.id2label
results = []
idx = 0
while idx < len(predictions):
    label = label_map[predictions[idx]]
    if label == zero_label:
        idx += 1
        continue
    # strip the B- or I- prefix to get the entity type
    label = label[2:]
    # the entity starts at this token, whether it is tagged B-XXX or I-XXX
    start, end = offsets[idx]
    idx += 1
    # extend the span over the following I-XXX tokens of the same type
    while idx < len(predictions) and label_map[predictions[idx]] == f"I-{label}":
        _, end = offsets[idx]
        idx += 1
    word = example[start:end]
    results.append({"entity_group": label, "word": word, "start": start, "end": end})
print(results)

[{'entity_group': 'PER', 'word': 'Qiyao Xue', 'start': 11, 'end': 20}, {'entity_group': 'MISC', 'word': 'Chinese', 'start': 32, 'end': 39}]


## Inside question answering pipeline
#### The question-answering pipeline finds the answer to a question in a given context

In [84]:
from transformers import pipeline

question_answer = pipeline("question-answering")
context = "HuggingFace Transformers is backed by the three most popular learning libraries - Jax, Pytorch and TensorFlow - with between them. It's straightforward to train your models with one before loading them forinference with the other"
question ="Which deep learning libraries back HuggingFace Transformers"
question_answer(question=question, context=context)

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.
Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


{'score': 0.9557097554206848,
 'start': 82,
 'end': 109,
 'answer': 'Jax, Pytorch and TensorFlow'}

#### Tokenization and model
#### The model has two outputs, the start logits and the end logits, which score every input token as a possible start or end of the answer
<div><img src="image/pipeline4.png" width=800></div> 

In [85]:
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

model_ckpt = "distilbert-base-cased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
model = AutoModelForQuestionAnswering.from_pretrained(model_ckpt)

inputs = tokenizer(question, context, return_tensors="pt")
outputs = model(**inputs)
print(outputs)

start_logits = outputs.start_logits
end_logits = outputs.end_logits
print(inputs["input_ids"].shape, start_logits.shape , end_logits.shape)

QuestionAnsweringModelOutput(loss=None, start_logits=tensor([[-5.2959, -5.6157, -3.8885, -5.3484, -7.1676, -6.8139, -4.4925, -6.9972,
         -6.4198, -7.3189, -5.1918, -5.6344, -0.8541, -5.1717, -4.5820, -5.7030,
         -2.0335, -3.0432, -1.3448, -2.5769,  2.0164,  3.1982, -1.6996, -2.3744,
         -1.8704, -2.4196,  0.7976,  9.3653, -2.6236,  5.8951, -2.6435, -3.6129,
         -2.6053, -2.7659,  0.3541, -2.3827, -2.3757, -1.8303, -3.5220, -2.1199,
         -2.3484, -3.6579, -3.9880, -2.0975, -6.1880, -7.0002, -4.0363, -6.3267,
         -3.7176, -4.3637, -3.4225, -7.2126, -5.4035, -6.3427, -4.4348, -6.9884,
         -5.9979, -7.8049, -7.3074, -8.1919, -6.9520, -7.7807, -5.6345]],
       grad_fn=<CloneBackward0>), end_logits=tensor([[-2.8224, -7.4711, -6.3334, -5.0852, -5.1913, -8.1811, -8.0520, -6.4976,
         -7.3729, -4.2701, -4.4396, -5.2550, -5.9191, -4.8538, -5.5134, -2.0127,
         -1.1316, -5.3485, -4.3896, -6.0071, -5.1289,  1.7767, -5.2148, -2.1351,
         -2.1775, 

#### Before applying the softmax, we mask the logits outside of the context. The sequence ID of each token can be checked by calling ***sequence_ids()*** on the tokenized result
<img src="image/pipeline5.png" width=800>

In [88]:
import torch
sequence_ids = inputs.sequence_ids()
print(sequence_ids)
mask = [ i != 1 for i in sequence_ids]
# unmask the [CLS] token
mask[0] = False
mask = torch.tensor(mask).unsqueeze(dim=0)

start_logits[mask] = -10000
end_logits[mask] = -10000

start_probabilites = torch.nn.functional.softmax(start_logits, dim=-1).squeeze()
end_probabilites = torch.nn.functional.softmax(end_logits, dim=-1).squeeze()
print(start_probabilites, "\n", end_probabilites)


[None, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, None, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, None]
tensor([4.1497e-07, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
        3.5241e-05, 4.6986e-07, 8.4738e-07, 2.7620e-07, 1.0836e-05, 3.9477e-06,
        2.1576e-05, 6.2934e-06, 6.2191e-04, 2.0275e-03, 1.5132e-05, 7.7056e-06,
        1.2756e-05, 7.3650e-06, 1.8382e-04, 9.6675e-01, 6.0064e-06, 3.0075e-02,
        5.8876e-06, 2.2334e-06, 6.1173e-06, 5.2094e-06, 1.1797e-04, 7.6419e-06,
        7.6957e-06, 1.3277e-05, 2.4459e-06, 9.9395e-06, 7.9090e-06, 2.1350e-06,
        1.5347e-06, 1.0164e-05, 1.7005e-07, 7.5482e-08, 1.4624e-06, 1.4803e-07,
        2.0112e-06, 1.0541e-06, 2.7017e-06, 6.1038e-08, 3.7264e-07, 1.4568e-07,
        9.8175e-07, 7.6381e-08, 2.0566e-07, 3.3757e-08, 5.5521e-08, 2.2924e-08,
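
The effect of the masking step can be seen on a toy input. The sketch below is a pure-Python approximation under stated assumptions: the helper `masked_softmax`, the eight-token layout and the logit values are all hypothetical, and only the idea (replace unwanted logits with a large negative value, then apply the softmax) follows the cell above.

```python
import math


def masked_softmax(logits, keep, fill=-10000.0):
    """Softmax after replacing the logits at positions where keep is False."""
    masked = [x if k else fill for x, k in zip(logits, keep)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical layout: [CLS], two question tokens, [SEP], three context tokens, [SEP]
sequence_ids = [None, 0, 0, None, 1, 1, 1, None]
logits = [-5.0, 2.0, 1.0, -3.0, 0.5, 4.0, 1.5, -4.0]

# Keep the context tokens (sequence ID 1) and the [CLS] token at position 0
keep = [sid == 1 for sid in sequence_ids]
keep[0] = True

probs = masked_softmax(logits, keep)
print([round(p, 4) for p in probs])
```

The question tokens and the separators receive a probability of zero even though one question token had a fairly high raw logit, so they can never be selected as an answer boundary.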
 

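#### An answer is a pair of start and end positions. Its score is the product of the start and end probabilities, and only pairs where the start does not come after the end are valid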
<div><img src="image/pipeline6.png" width=800></div>

In [91]:
scores = start_probabilites.unsqueeze(dim=-1) * end_probabilites.unsqueeze(dim=0)
scores = scores.triu()
print(scores)

tensor([[2.5375e-12, 0.0000e+00, 0.0000e+00,  ..., 1.0826e-13, 6.6022e-12,
         0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        ...,
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 2.0666e-14, 1.2603e-12,
         0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 5.5022e-13,
         0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00]], grad_fn=<TriuBackward0>)
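
The same score matrix can be built by hand for a small case. In this sketch, `span_scores` is a hypothetical helper and the probabilities are made up; entry `[i][j]` is the product of the start probability at `i` and the end probability at `j`, and entries with `j < i` are set to zero, which is what the upper-triangular step does above.

```python
def span_scores(start_probs, end_probs):
    """scores[i][j] = start_probs[i] * end_probs[j], zeroed where j < i."""
    n = len(start_probs)
    return [
        [start_probs[i] * end_probs[j] if j >= i else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical probabilities over four token positions
start_probs = [0.1, 0.7, 0.15, 0.05]
end_probs = [0.05, 0.1, 0.25, 0.6]

scores = span_scores(start_probs, end_probs)
for row in scores:
    print([round(s, 3) for s in row])
```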


#### After finding the possible answer with the best score, we use the offset mappings to find the corresponding answer in the context

In [98]:
score = scores.max().item()
max_index = scores.argmax().item()
start_index = max_index // scores.shape[1]
end_index = max_index % scores.shape[1]

input_with_offsets = tokenizer(question, context, return_offsets_mapping=True)
offsets = input_with_offsets["offset_mapping"]
print(len(input_with_offsets["input_ids"]), len(offsets))

start_char, _ = offsets[start_index]
_, end_char = offsets[end_index]
answer = context[start_char:end_char]

print(f"answer: '{answer}', score: {score}")


63 63
answer: 'Jax, Pytorch and TensorFlow', score: 0.9557098746299744
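
The index arithmetic in the last cell can be checked on a toy matrix. Everything in this sketch (the 3x3 scores, the short context and its offsets) is hypothetical; it only shows how a flat argmax index is turned back into a (start, end) token pair and then into a character span.

```python
# Hypothetical 3x3 upper-triangular score matrix (rows: start token, columns: end token)
scores = [
    [0.01, 0.02, 0.03],
    [0.00, 0.10, 0.60],
    [0.00, 0.00, 0.05],
]
num_cols = len(scores[0])

# Flatten row by row, then recover (row, column) from the flat index of the maximum
flat = [s for row in scores for s in row]
max_index = max(range(len(flat)), key=flat.__getitem__)
start_index, end_index = divmod(max_index, num_cols)

# Hypothetical character offsets for the three tokens of a toy context
context = "Jax and TensorFlow"
offsets = [(0, 3), (4, 7), (8, 18)]

# The answer runs from the start of the first token to the end of the last one
start_char, _ = offsets[start_index]
_, end_char = offsets[end_index]
answer = context[start_char:end_char]
print(start_index, end_index, repr(answer))
```

`divmod(max_index, num_cols)` gives the same pair as the separate `//` and `%` operations used above.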
