#Pre-Trained Models with Pipelines (Part II) 

In this tutorial, we illustrate a convenient way to run inference with pre-trained models from the *transformers* library: *pipelines*.

Various pipelines are available for different tasks: token classification (e.g. NER), text classification, question answering, summarization, text generation, etc.

Have fun!



In [None]:
!pip install transformers


#1. Text Generation
Models trained on the classic language modeling task (also known as causal language modeling) can be used for text generation. This pipeline uses GPT-2 by default.

Let's try it. 

In [None]:
from transformers import pipeline
text_generator = pipeline("text-generation")


## 1.1 Generating using Greedy Search

In [None]:
text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=False)
print(text[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As far as I am concerned, I will be the first to admit that I am not a fan of the idea of a "free market." I think that the idea of a free market is a bit of a stretch. I think that the idea of a free market is a bit of a stretch. I think that the idea of a free market is a bit of a stretch. I think that the idea of a free market is a bit of a stretch. I think that the idea of a
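
The loop in the output above is typical of greedy search: at every step the single most probable next word is chosen, so once the model returns to a word it has already produced, it follows the same path again. Here is a minimal sketch of the idea, assuming a tiny hand-made next-word table (`NEXT_WORD_PROBS` is hypothetical and has nothing to do with GPT-2's real probabilities):

```python
# Hypothetical toy table: P(next word | current word). Not GPT-2 output.
NEXT_WORD_PROBS = {
    "I": {"think": 0.6, "will": 0.4},
    "think": {"that": 0.9, "so": 0.1},
    "that": {"I": 0.7, "it": 0.3},
    "will": {"go": 1.0},
}


def greedy_decode(start, max_new_tokens):
    """Always append the most probable next word (greedy search)."""
    tokens = [start]
    for _ in range(max_new_tokens):
        dist = NEXT_WORD_PROBS.get(tokens[-1])
        if not dist:  # no known continuation: stop early
            break
        tokens.append(max(dist, key=dist.get))
    return tokens


print(" ".join(greedy_decode("I", 8)))
# I think that I think that I think that
```

Because the choice is deterministic, the same prompt always produces the same text, and the lower-probability words ("will", "so", "it") never get a chance.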


## 1.2 Sampling the next word at random according to its conditional probability distribution

In [None]:

text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=True)
print(text[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As far as I am concerned, I will leave out my old one. In fact, there's this one I can find at any bookstore or coffeehouse. As for the price of the book, it's actually really fair considering that it might take as long as five minutes. The price is much less than a lot of other books being made (if you want to spend a single dollar), I'm sure… but so far as I can tell, it's only about 150 units, and probably
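
With `do_sample=True`, the next word is drawn at random, weighted by its probability, instead of always taking the most probable one. The sketch below shows only that drawing step on a hypothetical four-word distribution (the words and numbers are made up for illustration, not taken from GPT-2):

```python
import random

# Hypothetical conditional distribution P(next word | "... I will").
NEXT_WORD_PROBS = {"be": 0.5, "leave": 0.2, "never": 0.2, "continue": 0.1}


def sample_next_word(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_next_word(NEXT_WORD_PROBS, rng) for _ in range(1000)]
for word in NEXT_WORD_PROBS:
    # Empirical frequencies should be close to the table above.
    print(word, draws.count(word) / len(draws))
```

Every word with non-zero probability can be picked, which makes the text more varied but also lets unlikely words slip in.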


## 1.3 Using beam search, other high-probability sequences get a chance, too. Try different numbers of beams.

In [None]:

text = text_generator("As far as I am concerned, I will", max_length=100, num_beams=5)
print(text[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As far as I am concerned, I will continue to do the best I can to make sure that this is the case.

I will continue to do the best I can to make sure that this is the case.

I will continue to do the best I can to make sure that this is the case.

I will continue to do the best I can to make sure that this is the case.

I will continue to do the best I can to make sure that
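
Beam search keeps the `num_beams` best partial sequences at every step instead of just one, so a sequence that starts with a slightly less probable word can still win overall. Here is a minimal sketch on a hypothetical toy table (the probabilities are invented for illustration; the real pipeline scores token sequences with GPT-2 and handles finished hypotheses, which this sketch does not):

```python
import math

# Hypothetical toy table: P(next word | current word).
NEXT_WORD_PROBS = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.9, "drives": 0.1},
}


def beam_search(start, steps, num_beams):
    """Return the best (tokens, log-probability) found with `num_beams` beams."""
    beams = [([start], 0.0)]  # each beam: (tokens so far, cumulative log-prob)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            for word, p in NEXT_WORD_PROBS.get(tokens[-1], {}).items():
                candidates.append((tokens + [word], score + math.log(p)))
        if not candidates:  # nothing can be extended any further
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]  # keep only the best hypotheses
    return beams[0]


greedy_tokens, greedy_score = beam_search("The", steps=2, num_beams=1)
beam_tokens, beam_score = beam_search("The", steps=2, num_beams=2)
print(greedy_tokens, round(math.exp(greedy_score), 2))
print(beam_tokens, round(math.exp(beam_score), 2))
```

With one beam the search is greedy and commits to "nice" (0.5 × 0.4 = 0.2); with two beams it also keeps "dog" alive and finds the more probable "The dog has" (0.4 × 0.9 = 0.36).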


## 1.4 Stopping the annoying repetition. Try different n-gram sizes.

In [None]:

text = text_generator("As far as I am concerned, I will", max_length=100, num_beams=5, no_repeat_ngram_size=3)
print(text[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As far as I am concerned, I will be the first to say that I have no idea what is going on here, but I do know that it is very, very serious.

I think it's important to understand that, if you're going to talk about something as serious as this, you're talking about something that has been going on for a very long time. It's not something that's going to stop anytime soon, and it's not going to end anytime soon.
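
`no_repeat_ngram_size=3` means that no sequence of three tokens may appear twice in the generated text. One way to picture it: before each step, look at the last two tokens and forbid every token that would complete a 3-gram already seen. The sketch below works on plain words and uses a hypothetical helper name (`banned_next_tokens`); the library applies the same idea to token ids inside `generate`.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return the tokens that would repeat an n-gram already present in `tokens`."""
    if len(tokens) < ngram_size:
        return set()  # no complete n-gram exists yet
    # The last (n - 1) tokens form the start of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (ngram_size - 1):])
    banned = set()
    for i in range(len(tokens) - ngram_size + 1):
        if tuple(tokens[i:i + ngram_size - 1]) == prefix:
            banned.add(tokens[i + ngram_size - 1])
    return banned


history = "I think that the idea of a free market is a bit of a".split()
print(banned_next_tokens(history, ngram_size=3))
```

Here the text ends in "of a", and "of a free" has already occurred, so "free" is banned as the next word. Smaller n-gram sizes block more continuations; larger ones allow more repetition.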



## 1.5 Sampling can help avoid boredom. Let's try Top-K sampling

In [None]:

text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=True, top_k=10)
print(text[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As far as I am concerned, I will never be able to say this in my own country.

The American people deserve better than that, especially if Donald Trump is elected president. He doesn't deserve anything less and he won't win.

He has done everything he can to make America great again, and he is the only one who understands that.

He has a chance to do that. He will win.

If he wins this election, he will be President
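
With `top_k=10`, only the 10 most probable next tokens are kept at each step; their probabilities are rescaled to sum to 1 and the next token is sampled from that reduced set. Here is a minimal sketch of the filtering step, assuming a hypothetical five-word distribution and `k=2` to keep the example small:

```python
def top_k_filter(probs, k):
    """Keep the k most probable words and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {word: p / total for word, p in top}


# Hypothetical next-word distribution, for illustration only.
probs = {"be": 0.4, "never": 0.3, "leave": 0.15, "continue": 0.1, "zebra": 0.05}
filtered = top_k_filter(probs, k=2)
print(filtered)
```

The unlikely tail ("zebra") can no longer be sampled, which tends to make the text more coherent than unrestricted sampling. The cutoff is fixed, however: the same number of words is kept whether the distribution is flat or sharply peaked.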


## 1.6 And Top-P (nucleus) sampling

In [None]:
text = text_generator("As far as I am concerned, I will", max_length=100, do_sample=True, top_p=0.9)
print(text[0]['generated_text'])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


As far as I am concerned, I will be writing a post about this as soon as I have time.

How do you get into the world of sports analytics?

Let me start by saying that you do not need to be an expert, just know what you are doing. You are just doing it because you do not want to lose anyone.

As long as you are able to keep yourself updated on every single game you play, you can see if you have any kind
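
With `top_p=0.9`, the candidate set is not a fixed size: the most probable tokens are added one by one until their cumulative probability reaches 0.9, and sampling happens within that set after renormalization. Here is a minimal sketch of the filtering step on two hypothetical distributions (the words and numbers are made up):

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top words whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for word, prob in ranked:
        kept.append((word, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {word: prob / total for word, prob in kept}


# Hypothetical distributions: one fairly flat, one sharply peaked.
flat = {"be": 0.4, "never": 0.3, "leave": 0.15, "continue": 0.1, "zebra": 0.05}
peaked = {"the": 0.95, "a": 0.03, "an": 0.02}
print(top_p_filter(flat, p=0.9))
print(top_p_filter(peaked, p=0.9))
```

For the flat distribution four words survive, while for the peaked one only "the" does. This adaptive cutoff is the main difference from Top-K sampling.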


#2. Text Summarization
The summarization pipeline condenses a long text or article into a shorter one. By default, it uses a BART model that was fine-tuned on the CNN / Daily Mail data set.

In [None]:
#=====summarization
from transformers import pipeline
summarizer = pipeline("summarization")


In [None]:
ARTICLE = """Democrats formally nominated Joe Biden for president on Tuesday (Aug 18), with elder statesmen and rising stars promising he would  repair a pandemic-devastated America and end the chaos of Republican President Donald Trump.
The convention's second night, under the theme "Leadership Matters", aimed to make the case that Biden would represent a return to normalcy.
"At a time like this, the Oval Office should be a command centre," former US President Bill Clinton said in a prerecorded video. 
"Instead, it's a storm centre. There's only chaos. Just one thing never changes - his determination to deny responsibility and shift the blame."
With the four-day convention largely virtual due to the coronavirus, delegates from around the country cast votes remotely to confirm Biden as the nominee.
In clips from around the country, Democrats of all stripes explained why they were supporting Biden while putting their own state-specific spin on the proceedings, from a calamari appetiser in Rhode Island to a herd of cattle in Montana.
Following his home state of Delaware, which went last in his honor, Biden appeared live for the first time at a Delaware school, where his wife, Jill, was set to deliver the night's headline address later in the evening.
"Thank you very, very much from the bottom of my heart," said Biden, who will deliver his acceptance speech on Thursday. "It means the world to me and my family."
Democratic presidential candidate and former Vice President Joe Biden and running mate Senator Kamala Harris are seen on screen at virtual 2020 Democratic Convention hosted from Milwaukee, Wisconsin.
The programme started by showcasing some of the party's rising politicians. But rather than a single keynote speech that could be a star-making turn, as it was for then-state Senator Barack Obama in 2004, the programme featured 17 stars in a video address, including Stacey Abrams, the one-time Georgia gubernatorial nominee whom Biden once considered for a running mate.
"America faces a triple threat: A public health catastrophe, and economic collapse and a reckoning with racial justice and inequality," Abrams said. 
"So our choice is clear: A steady experienced public servant who can lead us out of this crisis just like he's done before, or a man who only knows how to deny and distract."
As they did on Monday's opening night, Democrats featured a handful of Republicans who have crossed party lines to praise Biden, 77, over Trump, 74, ahead of the Nov 3 election.
Cindy McCain, widow of Republican Senator John McCain, was scheduled to appear in a video talking about her husband's long friendship with Biden, according to a preview posted online. Trump clashed with McCain, who was the Republican nominee for president in 2008, and the president criticised McCain even after his 2018 death.
Republican former Secretary of State Colin Powell, a retired four-star general who endorsed Biden in June, was one of several national security officials due to speak on the Democrat's behalf.
"Our country needs a commander in chief who takes care of our troops in the same way he would his own family," he said. 
“He will trust our diplomats and our intelligence community, not the flattery of dictators and despots. He will make it his job to know when anyone dares to threaten us. He will stand up to our adversaries with strength and experience. They will know he means business.”
Democratic former Secretary of State John Kerry said of Trump: "When this president goes overseas, it isn’t a goodwill mission, it’s a blooper reel. He breaks up with our allies and writes love letters to dictators. America deserves a president who is looked up to, not laughed at."
Biden's vice presidential pick, Senator Kamala Harris, will headline Wednesday night's programme along with Obama.
Without the cheering crowds at the in-person gathering originally planned for Milwaukee, Wisconsin, TV viewership on Monday was down from 2016. But an additional 10.2 million people watched on digital platforms, the Biden campaign said, for a total audience of nearly 30 million.
Aiming to draw attention away from Biden, Trump, trailing in opinion polls, held a campaign rally in Arizona, a hotly contested battleground state that can swing to either party and play a decisive role in the election.
The convention was being held amid worries about the safety of in-person voting. Democrats have pushed mail-in ballots as an alternative and pressured the head of the US Postal Service, a top Trump donor, to suspend cost cuts that delayed mail deliveries. 
Bowing to that pressure, Postmaster General Louis DeJoy put off the cost-cutting measures until after the election.
"""


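In [None]:
# Optional alternative input. Running this cell overwrites the news article above;
# the summary printed below was generated from the news article.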
ARTICLE = """On July 21  2013  Employee #1  with Perez Farms  was using an injection pump  to irrigate and fertilize a field. The hose developed a pin hole  and  fertilizer was sprayed into Employee #1's eyes. The fertilizer contained Urea  and sulfuric acid (pH of 1). Employee #1 was wearing safety glasses but not  goggles. The initial spray blew off Employee #1's glasses from his face and  then hit Employee #1's eyes. Employee #1 was hospitalized. He did not suffer  permanent damage to the eyes  but his eyes did require unspecified treatment. 
"""

In [None]:
print(summarizer(ARTICLE, max_length=100, min_length=20, do_sample=False))

[{'summary_text': " Democrats formally nominate Joe Biden for president on Tuesday (Aug 18), with elder statesmen and rising stars promising he would repair a pandemic-devastated America and end the chaos of Republican President Donald Trump . Biden appeared live for the first time at a Delaware school, where his wife, Jill, was set to deliver the night's headline address later in the evening . Biden's vice presidential pick, Senator Kamala Harris, will headline Wednesday night's programme along with Obama ."}]


We can also use T5 (here the "t5-base" checkpoint) for the summarization task.

In [None]:
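from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# AutoModelWithLMHead is deprecated; T5 is an encoder-decoder model, so load it with AutoModelForSeq2SeqLM.
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
tokenizer = AutoTokenizer.from_pretrained("t5-base")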




In [None]:
ARTICLE = """Hotels in Mumbai and other Indian cities are to train their staff to spot signs of sex trafficking such as frequent requests for bed linen changes or a "Do not disturb" sign left on the door for days on end. The group behind the initiative is also developing a mobile phone app - Rescue Me - which hotel staff can use to alert local police and senior anti-trafficking officers if they see suspicious behavior. "Hotels are breeding grounds for human trade," said Sanee Awsarmmel, chairman of the alumni group of Maharashtra State Institute of Hotel Management and Catering Technology. "(We) have hospitality professionals working in hotels across the country. We are committed to this cause."The initiative, spearheaded by the alumni group and backed by the Maharashtra state government, comes amid growing international recognition that hotels have a key role to play in fighting modern day slavery. MAHARASHTRA MAJOR DESTINATION FOR TRAFFICKED GIRLS Maharashtra, of which Mumbai is the capital, is a major destination for trafficked girls who are lured from poor states and nearby countries on the promise of jobs, but then sold into the sex trade or domestic servitude. 
With rising property prices, some traditional red light districts like those in Mumbai have started to disappear pushing the sex trade underground into private lodges and hotels, which makes it hard for police to monitor.Awsarmmel said hotels would be told about 50 signs that staff needed to watch out for.These include requests for rooms with a view of the car park which are favored by traffickers as they allow them to vet clients for signs of trouble and check out their cars to gauge how much to charge.Awsarmmel said hotel staff often noticed strange behavior such as a girl's reticence during the check-in process or her dependence on the person accompanying her to answer questions and provide her proof of identity.But in most cases, staff ignore these signs or have no idea what to do, he told the Thomson Reuters Foundation.RESCUE ME APP The Rescue Me app - to be launched in a couple of months - will have a text feature where hotel staff can fill in details including room numbers to send an alert to police.Human trafficking is the world's fastest growing criminal enterprise worth an estimated $150 billion a year, according to the International Labor Organization, which says nearly 21 million people globally are victims of forced labor and trafficking.Last year, major hotel groups, including the Hilton and Shiva Hotels, pledged to examine their supply chains for forced labor, and train staff how to spot and report signs of trafficking.Earlier this year, Mexico City also launched an initiative to train hotel staff about trafficking.Vijaya Rahatkar, chairwoman of the Maharashtra State Women's Commission, said the initiative would have an impact beyond the state as the alumni group had contact with about a million small hotels across India.The group is also developing a training module on trafficking for hotel staff and hospitality students which could be used across the country.ALSO READFYI | Legal revenge: Child sex trafficking survivors get 'School of Justice' to 
fight their own battlesMumbai: Woman DJ arrested in high-profile sex racket case
"""

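In [None]:
# Optional alternative input. Running this cell overwrites the article above;
# the T5 output shown below was generated from the article about hotels in Mumbai.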
ARTICLE = """At approximately 7:00 a.m.  on September 17  2008  Employee #1  a forklift  driver for the Sweetener Products Company  was working at the railroad dock in  the warehouse. The company converts granulated sugar into liquid sugar  products. Employee #1's duties included off-loading railcars of granulated  sugar. He was walking near the railroad tracks to get two support tubes that  are placed under the loading ramp for additional support while forklifts are  unloading rail cars. A coworker was in the warehouse  lowering the ramp so  they could off-load a rail car. To lower the ramp  the employees must push the  ramp until it leans forward. The employees then push a button on the wall   adjacent to the dock door  to activate the hydraulic system that controls the  ramp plate. However  when the coworker pushed on the ramp plate  it fell   striking Employee #1 on the back of his head and neck. He was transported to  USC Medical Center  where doctors performed a MRI and determined that he was  able to be released. He told his treating physician that he was in intense  pain and unable to walk. When the physician informed him that he was going to  be sent home  his wife informed the physician that he was covered by private  insurance through Kaiser. A nurse from USC verified his coverage and made  arrangements to have Employee #1 transported to Kaiser Sunset. He had  sustained fractures to the back of his neck. He underwent surgery on September  20  2008  and was hospitalized for nine days. 
"""

In [None]:
# T5 uses a max_length of 512 so we cut the article to 512 tokens.
inputs = tokenizer.encode("summarize: " + ARTICLE, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(inputs, max_length=50, min_length=5, repetition_penalty=2.5, length_penalty=1.0, num_beams=2, early_stopping=True)
print(outputs)

tensor([[    0,  1595,   871,    56,    36,  4252,    12,  2140,  3957,    13,
             3,     7,   994,  2117,  1765,    16, 15810,    11,   119,  3119,
             3,     5,     8,  6121,    19,    92,  2421,     3,     9,  1156,
           951,  1120,     3,    18, 22175,  1212,     3,     5,    96, 21015,
             7,    33, 18995,  9808,    21,   936,  1668,   976,   845,    46]])


In [None]:
print(tokenizer.decode(outputs[0]))

<pad> hotel staff will be trained to spot signs of sex trafficking in Mumbai and other cities. the initiative is also developing a mobile phone app - Rescue Me. "hotels are breeding grounds for human trade," says an


#3. Machine Translation

In [None]:
#=====translation===
from transformers import pipeline
translator = pipeline("translation_en_to_fr")
print(translator("Hugging Face is a technology company based in New York and Paris", max_length=40))

[{'translation_text': 'Hugging Face est une entreprise technologique basée à New York et à Paris.'}]


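#4. Conversation
The conversational pipeline keeps the dialogue history in `Conversation` objects, so follow-up questions can be answered in context.

In [None]: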
conv = pipeline("conversational")


In [None]:
from transformers import Conversation
conversation_1 = Conversation("Going to the movies tonight - any suggestions?")
conversation_2 = Conversation("What's the last book you have read?")
conv([conversation_1, conversation_2])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[Conversation id: 94313601-8104-4d35-a724-68f646f57b90 
 user >> Going to the movies tonight - any suggestions? 
 bot >> The Big Lebowski ,
 Conversation id: f1de448f-f971-4c75-bb75-46d8d380e7b6 
 user >> What's the last book you have read? 
 bot >> The Last Question ]

In [None]:
conversation_1.add_user_input("Is it an action movie?")
conversation_2.add_user_input("What is the genre of this book?")

conv([conversation_1, conversation_2])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[Conversation id: 94313601-8104-4d35-a724-68f646f57b90 
 user >> Going to the movies tonight - any suggestions? 
 bot >> The Big Lebowski 
 user >> Is it an action movie? 
 bot >> It's a comedy. , Conversation id: f1de448f-f971-4c75-bb75-46d8d380e7b6 
 user >> What's the last book you have read? 
 bot >> The Last Question 
 user >> What is the genre of this book? 
 bot >> I'm not sure, but I think it's fantasy. ]

#Reference
Transformers documentation: https://huggingface.co/transformers/index.html