# NLP Transformers Pipelines

This notebook demonstrates the use cases supported by the [*transformers*](https://huggingface.co/transformers/) library. More specifically, it uses the high-level [*Pipelines*](https://huggingface.co/transformers/main_classes/pipelines.html) API, which wraps task-specific pretrained models.

In [6]:
import json

from transformers import pipeline

## Loading the Pipelines

Keep in mind that each of the pipelines below downloads a large pretrained language model (over 250 MB). Load only the ones you need.
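
Since each pipeline triggers a sizeable download, one option is to defer construction until a pipeline is first requested. The following is a minimal sketch of that idea, not something the library prescribes: `make_lazy_loader` is a hypothetical helper, and a stand-in factory replaces `transformers.pipeline` so the snippet runs without downloading anything.

```python
from functools import lru_cache


def make_lazy_loader(factory):
    """Wrap a pipeline factory so each task is built at most once, on first use.

    `factory` is any callable taking a task name; in this notebook it would be
    `transformers.pipeline` (make_lazy_loader itself is a hypothetical helper).
    """
    @lru_cache(maxsize=None)
    def load(task):
        return factory(task)

    return load


# Stand-in factory for illustration only; it records which tasks were built.
built = []


def fake_factory(task):
    built.append(task)
    return f"<pipeline for {task}>"


load = make_lazy_loader(fake_factory)
load("sentiment-analysis")
load("sentiment-analysis")  # served from the cache, not built a second time
load("fill-mask")
print(built)
```

With the real library, `load = make_lazy_loader(pipeline)` followed by `load('summarization')` would build, and therefore download, only the summarization model.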

In [8]:
sentiment_analysis_pipeline = pipeline('sentiment-analysis')

In [9]:
question_answering_pipeline = pipeline('question-answering')

In [10]:
summarization_pipeline = pipeline('summarization')

In [11]:
fill_mask_pipeline = pipeline('fill-mask')

## Sentiment Analysis

Determine the sentiment expressed by a given text.

In [6]:
text = "The quality of the food was outstanding and the service was really helpful"
print(sentiment_analysis_pipeline(text))

[{'label': 'POSITIVE', 'score': 0.9998406767845154}]


In [10]:
text = "The movie was plain and boring, and the cinema too crowded"
print(sentiment_analysis_pipeline(text))

[{'label': 'NEGATIVE', 'score': 0.9998052716255188}]
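
The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. As a rough sketch, the two fields can be folded into a single signed value, which is convenient for sorting or thresholding. The helper name `signed_score` is hypothetical, and the sample data is copied from the outputs above.

```python
def signed_score(result):
    """Map a {'label', 'score'} dict to a value in [-1, 1].

    Assumes the label is either 'POSITIVE' or 'NEGATIVE', as in the outputs
    shown in this notebook; any other label raises a KeyError.
    """
    signs = {"POSITIVE": 1.0, "NEGATIVE": -1.0}
    return signs[result["label"]] * result["score"]


# Outputs copied from the two cells above.
outputs = [
    [{"label": "POSITIVE", "score": 0.9998406767845154}],
    [{"label": "NEGATIVE", "score": 0.9998052716255188}],
]

for output in outputs:
    # Each pipeline call on a single string returns a one-element list.
    print(round(signed_score(output[0]), 4))
```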


## Questions & Answers

Given a context passage and a question, the pipeline returns the answer together with a confidence score and the start and end positions of the answer within the passage.

In [14]:
text = """The great pyramid of Giza has a height of 138.8 meters. It's also one of the Seven Wonders of the Ancient World"""

In [16]:
print(question_answering_pipeline(question="How tall is the pyramid of Giza", context=text))
print(question_answering_pipeline(question="Is the pyramid of Giza one of the Ancient World Wonders?", context=text))

{'score': 0.9495661311375265, 'start': 42, 'end': 55, 'answer': '138.8 meters.'}
{'score': 0.7056748149489778, 'start': 56, 'end': 110, 'answer': "It's also one of the Seven Wonders of the Ancient World"}
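
In the first result above, `start` and `end` behave like character offsets into the context, so slicing the context reproduces the answer. The sketch below checks this for that result only; `span_matches` is a hypothetical helper, and treating the offsets as an end-exclusive Python slice is an assumption that may not hold for every result or library version.

```python
def span_matches(context, result):
    """Return True if context[start:end] equals the reported answer."""
    return context[result["start"]:result["end"]] == result["answer"]


context = (
    "The great pyramid of Giza has a height of 138.8 meters. "
    "It's also one of the Seven Wonders of the Ancient World"
)

# First result copied from the output above.
first = {"score": 0.9495661311375265, "start": 42, "end": 55, "answer": "138.8 meters."}

print(context[first["start"]:first["end"]])
print(span_matches(context, first))
```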


## Text Summarization

Summarize a given text. The `max_length` and `min_length` arguments bound the length of the generated summary, measured in tokens.

In [64]:
text = """Three of the biggest AI vendors pledged to stop providing face 
    recognition services to police — but other companies continue to serve 
    the law-enforcement market.
    What’s new: Amid protests over police killings of unarmed Black people 
    in the U.S., Amazon imposed a one year moratorium on licensing its 
    Rekognition technology to police departments, and Microsoft announced 
    a similar hiatus. Both said they would re-enter the market if the 
    government imposed limits on police use of the technology. IBM exited 
    the face recognition market altogether.
    Demand, meet supply: The big AI companies are highly visible, but 
    most law enforcement agencies get the technology from lesser-known 
    firms, the Wall Street Journal reported.
    
    Clearview AI has 2,400 police customers in the U.S. and Canada.
    NEC licenses face recognition to 20 law enforcement agencies. 
    Ayonix, iOmniscient, and Herta Security each serve a handful of 
    U.S. law enforcement agencies. The French company Idemia works 
    with the New York Police Dept., the U.S. State Dept., and the U.S. 
    Transportation Safety Administration as well as the European and 
    Australian governments.
    Why it matters: Concern over fairness in law 
    enforcement has renewed worries that unfettered use of face recognition 
    leads to miscarriages of justice. Research spearheaded by MIT Media Lab 
    researcher Joy Buolamwini showed that commercially available systems 
    consistently misclassified women and people with darker complexions. A 
    study by the American Civil Liberties Union found that Amazon’s system 
    erroneously matched mugshots with the faces of 28 members of the U.S. 
    Congress. Some police departments have misused the technology in ways 
    that experts say could lead to mistaken arrests.
    We’re thinking: It's great to see the big AI providers exercising 
    responsibility. Now we need prudent regulation and auditing mechanisms 
    geared to protect civil rights and support social justice."""

In [65]:
print(json.dumps(summarization_pipeline(text, max_length=130, min_length=30), indent=2))

[
  {
    "summary_text": "Amazon, Microsoft, and IBM have pledged to stop providing face recognition services to police. But other companies continue to serve the law-enforcement market. Concerns that unfettered use of face recognition leads to miscarriages of justice."
  }
]
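
The triple-quoted input above carries its line breaks and indentation into the text handed to the model. Whether that affects the summary depends on the tokenizer, so the following is only a precaution one might take: a minimal sketch that collapses runs of whitespace before summarizing. `normalize_whitespace` is a hypothetical helper, not part of the library.

```python
def normalize_whitespace(text):
    """Collapse every run of spaces, tabs, and newlines into one space."""
    return " ".join(text.split())


# Shortened sample with the same kind of indentation as the input above.
sample = """Three of the biggest AI vendors pledged to stop providing face
    recognition services to police."""

print(normalize_whitespace(sample))
```

The cleaned string could then be passed on as `summarization_pipeline(normalize_whitespace(text), max_length=130, min_length=30)`.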


## Fill Mask

This pipeline predicts the most likely tokens for a masked position in a given sentence.

In [7]:
text = f"The Fitzerald's Steak House has one of the best {fill_mask_pipeline.tokenizer.mask_token} in town."
print(json.dumps(fill_mask_pipeline(text), indent=2))

[
  {
    "sequence": "<s> The Fitzerald's Steak House has one of the best burgers in town.</s>",
    "score": 0.1277015060186386,
    "token": 22551
  },
  {
    "sequence": "<s> The Fitzerald's Steak House has one of the best restaurants in town.</s>",
    "score": 0.10874764621257782,
    "token": 4329
  },
  {
    "sequence": "<s> The Fitzerald's Steak House has one of the best ribs in town.</s>",
    "score": 0.06571755558252335,
    "token": 21443
  },
  {
    "sequence": "<s> The Fitzerald's Steak House has one of the best sandwiches in town.</s>",
    "score": 0.05932178720831871,
    "token": 19072
  },
  {
    "sequence": "<s> The Fitzerald's Steak House has one of the best wines in town.</s>",
    "score": 0.04994919151067734,
    "token": 13921
  }
]
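
Each prediction contains the full sentence wrapped in the model's special tokens, plus a score and a token id. To list just the candidate sentences, the markers can be stripped. This is a minimal sketch, assuming the `<s>` and `</s>` markers seen in the output above (other models use different special tokens); `clean_sequence` is a hypothetical helper.

```python
def clean_sequence(sequence, bos="<s>", eos="</s>"):
    """Strip the leading and trailing special tokens from a predicted sequence."""
    if sequence.startswith(bos):
        sequence = sequence[len(bos):]
    if sequence.endswith(eos):
        sequence = sequence[:-len(eos)]
    return sequence.strip()


# Two predictions copied from the output above.
predictions = [
    {"sequence": "<s> The Fitzerald's Steak House has one of the best burgers in town.</s>",
     "score": 0.1277015060186386, "token": 22551},
    {"sequence": "<s> The Fitzerald's Steak House has one of the best restaurants in town.</s>",
     "score": 0.10874764621257782, "token": 4329},
]

# Print candidates from most to least likely.
for p in sorted(predictions, key=lambda p: p["score"], reverse=True):
    print(f"{p['score']:.3f}  {clean_sequence(p['sequence'])}")
```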
