#Pre-Trained Models with Pipelines
In this tutorial, we show how to run inference with pre-trained models from the *transformers* library in a very convenient way: using *pipelines*.

Various pipelines are available for different tasks: token classification (e.g. NER), text classification, question answering, summarization, text generation, etc.

Have fun!

In [2]:
!pip install transformers

Collecting transformers
  Downloading https://files.pythonhosted.org/packages/27/3c/91ed8f5c4e7ef3227b4119200fc0ed4b4fd965b1f0172021c25701087825/transformers-3.0.2-py3-none-any.whl (769kB)
Collecting tokenizers==0.8.1.rc1
  Downloading https://files.pythonhosted.org/packages/40/d0/30d5f8d221a0ed981a186c8eb986ce1c94e3a6e87f994eae9f4aa5250217/tokenizers-0.8.1rc1-cp36-cp36m-manylinux1_x86_64.whl (3.0MB)
Collecting sentencepiece!=0.1.92
  Downloading https://files.pythonhosted.org/packages/d4/a4/d0a884c4300004a78cca907a6ff9a5e9fe4f090f5d95ab341c53d28cbc58/sentencepiece-0.1.91-cp36-cp36m-manylinux1_x86_64.whl (1.1MB)
Collecting sacremoses
  Downloading https://files.pythonhosted.org/packages/7d/34/09d19aff26edcc8eb2a01bed8e98f13a1537005d31e95233fd48216eed10/sacremoses-0.0.43.tar.gz

#1. Feature Extraction
There's a convenient pipeline for feature extraction. However, its output is the last hidden layer only. If you want a different layer, you have to take the manual approach we used in Tutorial 1.

In [3]:
import numpy as np
from transformers import AutoTokenizer, AutoModel, pipeline

model = AutoModel.from_pretrained('bert-base-uncased')
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
nlp = pipeline('feature-extraction', model=model, tokenizer=tokenizer)

In [11]:
features = nlp('Do you like cookies ?')
features = np.squeeze(features)
print(features.shape)

(7, 768)


Remember our earlier exercise measuring similarity between sentences? Let's try it here.

In [13]:
import torch
from scipy.spatial.distance import cosine

In [19]:
sents = ["What's the time now in Singapore?",
         "What is the weather in Seattle today?",
         "Apple is looking at buying the U.K. startup for $1 billion."]

vec0 = torch.tensor(np.squeeze(nlp(sents[0])))
sent0 = torch.mean(vec0, dim=0)
sent0.size()

vec1 = torch.tensor(np.squeeze(nlp(sents[1])))
sent1 = torch.mean(vec1, dim=0)

vec2 = torch.tensor(np.squeeze(nlp(sents[2])))
sent2 = torch.mean(vec2, dim=0)

In [20]:
sim_01 = 1 - cosine(sent0, sent1)

sim_02 = 1 - cosine(sent0, sent2)

print('Vector similarity for example 0 & 1:  %.2f' % sim_01)
print('Vector similarity for example 0 & 2:  %.2f' % sim_02)

Vector similarity for example 0 & 1:  0.78
Vector similarity for example 0 & 2:  0.55
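
The two steps used above (averaging the token vectors into one sentence vector, then comparing sentence vectors by cosine similarity) can be sketched without any model. This is a minimal pure-Python illustration; the tiny 3-dimensional vectors and the helper names `mean_pool` and `cosine_similarity` are made up for the example and stand in for the 768-dimensional BERT features.

```python
import math

def mean_pool(token_vectors):
    """Average a list of token vectors into a single sentence vector."""
    n_tokens = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n_tokens for d in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical token features: one row per token, three dimensions each.
sent_a = [[1.0, 0.0, 1.0], [3.0, 2.0, 1.0]]
sent_b = [[2.0, 1.0, 1.0]]
sent_c = [[0.0, 1.0, -1.0]]

vec_a, vec_b, vec_c = mean_pool(sent_a), mean_pool(sent_b), mean_pool(sent_c)
print("a vs b: %.2f" % cosine_similarity(vec_a, vec_b))
print("a vs c: %.2f" % cosine_similarity(vec_a, vec_c))
```

Note that `scipy.spatial.distance.cosine` returns a distance, which is why the cell above computes `1 - cosine(...)` to get a similarity.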


#2. Sentiment Classification
Initialize the pipeline with the keyword "sentiment-analysis" to load a model that has been fine-tuned for sentiment classification. By default, the model downloaded for this pipeline is "distilbert-base-uncased-finetuned-sst-2-english". It uses the DistilBERT architecture and has been fine-tuned on the SST-2 dataset for sentiment analysis.

The result returned includes the sentiment label and score.

In [23]:
#using fine-tuned models
from transformers import pipeline

#for sentiment classification
nlp = pipeline("sentiment-analysis")

result = nlp("I hate you")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")

result = nlp("I love you")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")

result = nlp("This article is terribly good")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")

result = nlp("This dress is pretty ugly")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")


label: NEGATIVE, with score: 0.9991
label: POSITIVE, with score: 0.9999
label: POSITIVE, with score: 0.9999
label: NEGATIVE, with score: 0.9998


#3. Sequence Classification
This task classifies a pair of sentences A and B into predefined classes, for example whether B is a paraphrase of A. Here we use a model fine-tuned on the GLUE MRPC dataset (the Microsoft Research Paraphrase Corpus).

In [25]:
#====sequence classification=========
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased-finetuned-mrpc")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased-finetuned-mrpc")

classes = ["not paraphrase", "is paraphrase"]

sequence_0 = "The company HuggingFace is based in New York City"
sequence_1 = "Apples are especially bad for your health"
sequence_2 = "HuggingFace's headquarters are situated in Manhattan"

# Should be paraphrase
paraphrase = tokenizer(sequence_0, sequence_2, return_tensors="pt")
paraphrase_classification_logits = model(**paraphrase)[0]
paraphrase_results = torch.softmax(paraphrase_classification_logits, dim=1).tolist()[0]


In [26]:
for i in range(len(classes)):
    print(f"{classes[i]}: {int(round(paraphrase_results[i] * 100))}%")

not paraphrase: 10%
is paraphrase: 90%


In [27]:
# Should not be paraphrase
not_paraphrase = tokenizer(sequence_0, sequence_1, return_tensors="pt")
not_paraphrase_classification_logits = model(**not_paraphrase)[0]
not_paraphrase_results = torch.softmax(not_paraphrase_classification_logits, dim=1).tolist()[0]

for i in range(len(classes)):
    print(f"{classes[i]}: {int(round(not_paraphrase_results[i] * 100))}%")


not paraphrase: 94%
is paraphrase: 6%


In [28]:
print(not_paraphrase_results)
not_paraphrase_classification_logits

[0.94038325548172, 0.059616751968860626]


tensor([[ 0.5386, -2.2197]], grad_fn=<AddmmBackward>)
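
The probabilities printed above are just the softmax of these two logits. The following is a minimal pure-Python sketch of that step, using the rounded logits shown in the tensor output, so the result only matches the printed probabilities to about three decimal places. The `softmax` helper is written out here for illustration; the cell above uses `torch.softmax`.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["not paraphrase", "is paraphrase"]
logits = [0.5386, -2.2197]  # rounded values copied from the output above

probs = softmax(logits)
for name, prob in zip(classes, probs):
    print(f"{name}: {prob:.4f}")
```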

#4. Question Answering (Extractive)
This is the task of identifying the segment of text in "context" that best answers the given "question". The pipeline uses a model fine-tuned on SQuAD.

In [29]:
#====Extractive question answering
from transformers import pipeline

nlp = pipeline("question-answering")

context = r"""
Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the examples/question-answering/run_squad.py script.
"""

result = nlp(question="What is extractive question answering?", context=context)
print(f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}")



Answer: 'the task of extracting an answer from a text given a question.', score: 0.6186, start: 34, end: 96


In [31]:
result = nlp(question="What is a good example of a question answering dataset?", context=context)
print(result['answer'], result['score'])
result = nlp(question="What do you need if you want to finetune a model?", context=context)
print(result['answer'], result['score'])

SQuAD dataset, 0.5039560434349002
leverage the examples/question-answering/run_squad.py script. 0.3603504267141131


If you want to use a specific model and ask several questions about the same text, here is example code.

In [None]:
#====QA with multiple questions====
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")

text = r"""
🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose
architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural
Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
TensorFlow 2.0 and PyTorch.
"""

questions = [
    "How many pretrained models are available in 🤗 Transformers?",
    "What does 🤗 Transformers provide?",
    "🤗 Transformers provides interoperability between which frameworks?",
]


In [None]:
for question in questions:
    inputs = tokenizer(question, text, add_special_tokens=True, return_tensors="pt")
    input_ids = inputs["input_ids"].tolist()[0]

    text_tokens = tokenizer.convert_ids_to_tokens(input_ids)
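    outputs = model(**inputs)
    # Index the outputs so this works whether the model returns a tuple or an output object
    answer_start_scores, answer_end_scores = outputs[0], outputs[1]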

    answer_start = torch.argmax(
        answer_start_scores
    )  # Get the most likely beginning of answer with the argmax of the score
    answer_end = torch.argmax(answer_end_scores) + 1  # Get the most likely end of answer with the argmax of the score

    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))

    print(f"Question: {question}")
    print(f"Answer: {answer}")
    

Question: How many pretrained models are available in 🤗 Transformers?
Answer: over 32 +
Question: What does 🤗 Transformers provide?
Answer: general - purpose architectures
Question: 🤗 Transformers provides interoperability between which frameworks?
Answer: tensorflow 2 . 0 and pytorch


#5. Fill in the Blank ([MASK])
Masked language modeling lets the model perform this cloze task: filling in the blank using context from both the left and the right.

In [34]:
#=======Masked Language Modelling============
from transformers import pipeline
from transformers import AutoModelWithLMHead, AutoTokenizer
import torch
from pprint import pprint

nlp = pipeline("fill-mask")

pprint(nlp(f"HuggingFace is creating a {nlp.tokenizer.mask_token} that the community uses to solve NLP tasks."))


Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at distilroberta-base and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


[{'score': 0.17927460372447968,
  'sequence': '<s>HuggingFace is creating a tool that the community uses to '
              'solve NLP tasks.</s>',
  'token': 3944,
  'token_str': 'Ġtool'},
 {'score': 0.1134939044713974,
  'sequence': '<s>HuggingFace is creating a framework that the community uses '
              'to solve NLP tasks.</s>',
  'token': 7208,
  'token_str': 'Ġframework'},
 {'score': 0.05243545398116112,
  'sequence': '<s>HuggingFace is creating a library that the community uses to '
              'solve NLP tasks.</s>',
  'token': 5560,
  'token_str': 'Ġlibrary'},
 {'score': 0.03493543714284897,
  'sequence': '<s>HuggingFace is creating a database that the community uses '
              'to solve NLP tasks.</s>',
  'token': 8503,
  'token_str': 'Ġdatabase'},
 {'score': 0.02860247902572155,
  'sequence': '<s>HuggingFace is creating a prototype that the community uses '
              'to solve NLP tasks.</s>',
  'token': 17715,
  'token_str': 'Ġprototype'}]


In [35]:
# To see the details, use a specific model
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")
model = AutoModelWithLMHead.from_pretrained("distilbert-base-cased")
sequence = f"Distilled models are smaller than the models they mimic. Using them instead of the large versions would help {tokenizer.mask_token} our carbon footprint."
input_ids = tokenizer.encode(sequence, return_tensors="pt")
# Position of the mask token within the sequence
mask_token_index = torch.where(input_ids == tokenizer.mask_token_id)[1]
# Logits over the vocabulary for every position: (batch, sequence length, vocabulary size)
token_logits = model(input_ids)[0]
token_logits.size()

torch.Size([1, 30, 28996])

In [36]:
#get the logits for the masked token
mask_token_logits = token_logits[0, mask_token_index, :]
top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist()

for token in top_5_tokens:
    print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint.
Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint.
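
The top-5 selection above boils down to sorting the vocabulary by the logit at the masked position and keeping the best few entries. Here is a minimal pure-Python sketch of that idea. The six-word vocabulary, the logit values, and the `top_k_indices` helper are made up for illustration; the real model scores 28,996 vocabulary entries and the cell above uses `torch.topk`.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Hypothetical vocabulary and logits at the [MASK] position (made-up numbers).
vocab = ["reduce", "increase", "decrease", "offset", "improve", "banana"]
mask_logits = [9.1, 7.4, 7.2, 6.0, 5.8, -3.0]

template = "Using them would help [MASK] our carbon footprint."
top_words = [vocab[i] for i in top_k_indices(mask_logits, 3)]
for word in top_words:
    print(template.replace("[MASK]", word))
```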


#6. Text Generation
Models trained for the classic language modeling task (also known as causal language modeling) can be used for text generation. In this pipeline, GPT-2 is used by default.
Let's try it.

In [37]:
#====text generation=====
from transformers import pipeline
text_generator = pipeline("text-generation")


Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [48]:
# Generating by greedy search
text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=False)
print(text[0]['generated_text'])

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


As far as I am concerned, I will be the first to admit that I am not a fan of the idea of a "free market." I think that the idea of a free market is a bit of a stretch. I think that the idea of a free market is a bit of a stretch. I think that the idea of a free market is a bit of a stretch. I think that the idea of a free market is a bit of a stretch. I think that the idea of a


In [40]:
# Sampling: pick the next token at random according to its conditional probability distribution
text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=True)
print(text[0]['generated_text'])

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


As far as I am concerned, I will have no business with your company."

"Then tell me that your company is taking a break."

"Well, as long as I have you, I'm not going to bother."

The two laughed softly in the corner as they walked back to the car to check on their surroundings.

In the middle of the night, they continued to get on.

They began eating while talking outside. After all, everyone ate


In [46]:
# Beam search keeps the num_beams most probable partial sequences at each step,
# so a sequence that starts with a less likely token still has a chance.
text = text_generator("As far as I am concerned, I will", max_length=100, num_beams=5)
print(text[0]['generated_text'])

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


As far as I am concerned, I will not be making any changes to the game any time soon.

If you like what you see, let me know what you think in the comments below.

If you like what you see, let me know what you think in the comments below.

If you like what you see, let me know what you think in the comments below.

If you like what you see, let me know what you think in the comments below
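
To see why beam search can find a more probable sequence than greedy search, here is a toy sketch in pure Python. The `TOY_LM` table and its probabilities are invented for illustration, and the functions are a simplified picture of the idea, not the library's implementation of `num_beams`.

```python
import math

# Hypothetical toy "language model": prefix (tuple of tokens) -> next-token probabilities.
TOY_LM = {
    (): {"the": 0.5, "a": 0.4, "an": 0.1},
    ("the",): {"cat": 0.4, "dog": 0.3, "end": 0.3},
    ("a",): {"cat": 0.9, "dog": 0.1},
    ("an",): {"owl": 1.0},
}

def greedy(steps):
    """Always take the single most probable next token."""
    seq, logp = (), 0.0
    for _ in range(steps):
        probs = TOY_LM[seq]
        token = max(probs, key=probs.get)
        seq, logp = seq + (token,), logp + math.log(probs[token])
    return seq, math.exp(logp)

def beam_search(steps, num_beams):
    """Keep the num_beams most probable partial sequences at every step."""
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = [
            (seq + (token,), logp + math.log(p))
            for seq, logp in beams
            for token, p in TOY_LM[seq].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    seq, logp = beams[0]
    return seq, math.exp(logp)

print("greedy:", greedy(2))
print("beam  :", beam_search(2, num_beams=2))
```

Greedy search commits to "the" because it is the most likely first token, while beam search also keeps "a" alive and discovers that "a cat" is more probable overall.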


In [49]:
# no_repeat_ngram_size=2 forbids any 2-gram from appearing twice, which stops the annoying repetition.
text = text_generator("As far as I am concerned, I will", max_length=100, num_beams=5, no_repeat_ngram_size=2)
print(text[0]['generated_text'])

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


As far as I am concerned, I will be the only one who is going to be able to do it.

"It's not like I can't do something. It's just that I don't want to. I'm not ready for it."

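The effect of `no_repeat_ngram_size` can be pictured as a ban list computed before each new token is chosen: any token that would complete an n-gram already present in the text is excluded. The sketch below is a simplified pure-Python illustration that works on whitespace-separated words, whereas the library works on token ids; the `banned_next_tokens` helper is hypothetical.

```python
def banned_next_tokens(tokens, n):
    """Tokens that would complete an n-gram already present in `tokens`."""
    if len(tokens) < n:
        return set()  # no complete n-gram exists yet
    prefix = tokens[len(tokens) - (n - 1):]  # the last n-1 tokens generated so far
    banned = set()
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n - 1] == prefix:
            banned.add(tokens[i + n - 1])
    return banned

generated = "I think that the idea of a free market is a".split()

# With n=2, "a free" has already been used, so "free" may not follow "a" again.
print(banned_next_tokens(generated, 2))
# With n=3, no earlier 3-gram starts with "is a", so nothing is banned.
print(banned_next_tokens(generated, 3))
```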

In [52]:
# Top-K sampling: sample the next token only from the 20 most probable candidates
text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=True, top_k=20)
print(text[0]['generated_text'])

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


As far as I am concerned, I will not get involved in the politics of Brexit. I have not had a chance to ask this question. But the fact that many MPs, politicians, academics, and even the European Parliament have been forced to support the idea of withdrawing from the EU means that I can be of little comfort to those who want to have this debate resolved in a meaningful way. This is not a question of politics, or of policy, or even of the issues. It is a
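
The difference between greedy search, plain sampling, and Top-K sampling comes down to how the next token is chosen from the model's probability distribution. The sketch below illustrates this with a made-up six-word distribution in pure Python; the helper names and numbers are assumptions for the example, not the library's code.

```python
import random

# Hypothetical next-token distribution (made-up numbers that sum to 1).
probs = {"be": 0.40, "not": 0.25, "have": 0.15,
         "never": 0.10, "banana": 0.06, "xylophone": 0.04}

def greedy_pick(probs):
    """Greedy search: always the most probable token."""
    return max(probs, key=probs.get)

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

def sample(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
filtered = top_k_filter(probs, k=3)
draws = [sample(filtered, rng) for _ in range(10)]

print("greedy :", greedy_pick(probs))
print("top-3  :", filtered)
print("samples:", draws)
```

With `k=3`, unlikely tokens such as "banana" can never be drawn, which is how Top-K sampling keeps the variety of sampling while cutting off the low-probability tail.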


In [60]:
# now let's try giving longer starting text
text = text_generator("In the wake of Michelle Obama's blunt warning to the country Monday night on what a second term for Donald Trump would mean, the President was asked about the former first lady's speech at the Democratic National Convention.", 
                      max_length=200, do_sample=True, top_k=20, no_repeat_ngram_size=3)
print(text[0]['generated_text'])

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence



"I mean, it was something that Hillary had to explain to her, how did they come up with so many different things? And she was like, 'I think Hillary is going to lead.' I was like 'what?' And she said 'Well, it's not something in between, I think her message to the whole country is the same, how do you know what a new generation, how you know who you're voting for?' And of course that was the first speech of the Clinton campaign, and that was her campaign for president. It was all just very much on autopilot on the part of that group of people. It's just very far from the truth that we're seeing now," he told host Candy Crowley.



#7. Text Summarization
This task condenses a long text or article into a shorter one. By default, the pipeline uses a BART model that was fine-tuned on the CNN / Daily Mail dataset.

In [65]:
#=====summarization

summarizer = pipeline("summarization")



In [69]:
ARTICLE = """Democrats formally nominated Joe Biden for president on Tuesday (Aug 18), with elder statesmen and rising stars promising he would  repair a pandemic-devastated America and end the chaos of Republican President Donald Trump.
The convention's second night, under the theme "Leadership Matters", aimed to make the case that Biden would represent a return to normalcy.
"At a time like this, the Oval Office should be a command centre," former US President Bill Clinton said in a prerecorded video. 
"Instead, it's a storm centre. There's only chaos. Just one thing never changes - his determination to deny responsibility and shift the blame."
With the four-day convention largely virtual due to the coronavirus, delegates from around the country cast votes remotely to confirm Biden as the nominee.
In clips from around the country, Democrats of all stripes explained why they were supporting Biden while putting their own state-specific spin on the proceedings, from a calamari appetiser in Rhode Island to a herd of cattle in Montana.
Following his home state of Delaware, which went last in his honor, Biden appeared live for the first time at a Delaware school, where his wife, Jill, was set to deliver the night's headline address later in the evening.
"Thank you very, very much from the bottom of my heart," said Biden, who will deliver his acceptance speech on Thursday. "It means the world to me and my family."
Democratic presidential candidate and former Vice President Joe Biden and running mate Senator Kamala Harris are seen on screen at virtual 2020 Democratic Convention hosted from Milwaukee, Wisconsin.
The programme started by showcasing some of the party's rising politicians. But rather than a single keynote speech that could be a star-making turn, as it was for then-state Senator Barack Obama in 2004, the programme featured 17 stars in a video address, including Stacey Abrams, the one-time Georgia gubernatorial nominee whom Biden once considered for a running mate.
"America faces a triple threat: A public health catastrophe, and economic collapse and a reckoning with racial justice and inequality," Abrams said. 
"So our choice is clear: A steady experienced public servant who can lead us out of this crisis just like he's done before, or a man who only knows how to deny and distract."
As they did on Monday's opening night, Democrats featured a handful of Republicans who have crossed party lines to praise Biden, 77, over Trump, 74, ahead of the Nov 3 election.
Cindy McCain, widow of Republican Senator John McCain, was scheduled to appear in a video talking about her husband's long friendship with Biden, according to a preview posted online. Trump clashed with McCain, who was the Republican nominee for president in 2008, and the president criticised McCain even after his 2018 death.
Republican former Secretary of State Colin Powell, a retired four-star general who endorsed Biden in June, was one of several national security officials due to speak on the Democrat's behalf.
"Our country needs a commander in chief who takes care of our troops in the same way he would his own family," he said. 
“He will trust our diplomats and our intelligence community, not the flattery of dictators and despots. He will make it his job to know when anyone dares to threaten us. He will stand up to our adversaries with strength and experience. They will know he means business.”
Democratic former Secretary of State John Kerry said of Trump: "When this president goes overseas, it isn’t a goodwill mission, it’s a blooper reel. He breaks up with our allies and writes love letters to dictators. America deserves a president who is looked up to, not laughed at."
Biden's vice presidential pick, Senator Kamala Harris, will headline Wednesday night's programme along with Obama.
Without the cheering crowds at the in-person gathering originally planned for Milwaukee, Wisconsin, TV viewership on Monday was down from 2016. But an additional 10.2 million people watched on digital platforms, the Biden campaign said, for a total audience of nearly 30 million.
Aiming to draw attention away from Biden, Trump, trailing in opinion polls, held a campaign rally in Arizona, a hotly contested battleground state that can swing to either party and play a decisive role in the election.
The convention was being held amid worries about the safety of in-person voting. Democrats have pushed mail-in ballots as an alternative and pressured the head of the US Postal Service, a top Trump donor, to suspend cost cuts that delayed mail deliveries. 
Bowing to that pressure, Postmaster General Louis DeJoy put off the cost-cutting measures until after the election.
"""
print(summarizer(ARTICLE, max_length=150, min_length=50, do_sample=False))

[{'summary_text': " Democrats formally nominate Joe Biden for president on Tuesday (Aug 18), with elder statesmen and rising stars promising he would repair a pandemic-devastated America and end the chaos of Republican President Donald Trump . Biden appeared live for the first time at a Delaware school, where his wife, Jill, was set to deliver the night's headline address later in the evening . Biden's vice presidential pick, Senator Kamala Harris, will headline Wednesday night's programme along with Obama ."}]


We can also use T5 (here the "t5-base" checkpoint) for the summarization task.

In [70]:
from transformers import AutoModelWithLMHead, AutoTokenizer
model = AutoModelWithLMHead.from_pretrained("t5-base")
tokenizer = AutoTokenizer.from_pretrained("t5-base")




Some weights of T5ForConditionalGeneration were not initialized from the model checkpoint at t5-base and are newly initialized: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [73]:
# T5 uses a max_length of 512 so we cut the article to 512 tokens.
inputs = tokenizer.encode("summarize: " + ARTICLE, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(inputs, max_length=150, min_length=50, length_penalty=2.0, num_beams=4, early_stopping=True)
print(outputs)

tensor([[    0,     8,  8346,    31,     7,   511,   706,     6,   365,     8,
          3800,    96,  2796,     9,   588,  2009, 19888,     7,  1686,     3,
          8287,    12,   143,     8,   495,    24,  2106,   537,   133,  4221,
             3,     9,  1205,    12,  1389,    75,    63,     3,     5,     3,
         22878,    45,   300,     8,   684,  4061, 11839, 20081,    12,  3606,
          2106,   537,    38,     8, 21077,     3,     5,     8,   662,    18,
          1135,  8346,    47,     3,  6974,  4291,   788,    12,     8,  4301,
           106,     9, 18095,     3,     5]])


In [74]:
print(tokenizer.decode(outputs[0]))

the convention's second night, under the theme "Leadership Matters", aimed to make the case that Biden would represent a return to normalcy. delegates from around the country cast votes remotely to confirm Biden as the nominee. the four-day convention was largely virtual due to the coronavirus.


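#8. Machine Translation
The "translation_en_to_fr" pipeline translates English text into French. As the log output below shows, it loads the t5-base checkpoint.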

In [None]:
#=====translation===
from transformers import pipeline
translator = pipeline("translation_en_to_fr")
print(translator("Hugging Face is a technology company based in New York and Paris", max_length=40))

Some weights of T5ForConditionalGeneration were not initialized from the model checkpoint at t5-base and are newly initialized: ['encoder.embed_tokens.weight', 'decoder.embed_tokens.weight', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


[{'translation_text': 'Hugging Face est une entreprise technologique basée à New York et à Paris.'}]


#Reference
Transformers documentation: https://huggingface.co/transformers/index.html