<a href="https://colab.research.google.com/github/Matonice/30-Days-of-Transformer/blob/main/What_can_transformers_do.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install transformers
!pip install sentencepiece

In [None]:
from transformers import pipeline

##**Text Classification**
Predicting the sentiment of a text

In [None]:
classifier = pipeline("sentiment-analysis")
classifier("I'm so excited to get started with this 30 days of transformers challenge")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.997882068157196}]
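
The warning in the output recommends naming the model and revision explicitly. Below is a minimal sketch of that, reusing the model name and revision reported in the log message. The names `PINNED_SENTIMENT` and `load_sentiment_classifier` are illustrative helpers, not part of the library.

```python
# Model name and revision copied from the log message printed by the default pipeline.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def load_sentiment_classifier(config=PINNED_SENTIMENT):
    """Build the pipeline with an explicit model and revision.

    Calling this downloads the model weights on first use.
    """
    # Imported lazily so that defining the helper does not load the library.
    from transformers import pipeline

    return pipeline(config["task"], model=config["model"], revision=config["revision"])
```

Calling `load_sentiment_classifier()` should return a classifier that behaves like the default one, without depending on whatever default the installed library version picks.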

##**Zero-Shot Classification**
Classifying text that the model was never trained to label; you only need to supply the candidate labels

In [None]:
text = """The best and worst U.S. cities for air quality
The federal government’s Clean Air Act has helped reduce dangerous pollutants from transportation, power plants and manufacturing. However, paired with a changing climate, the American Lung Association believes it's becoming harder to protect human health.
By Shirin Ali | April 22, 2022

Story at a glance
The American Lung Association (ALA) estimates over 137 million Americans are living in places with unhealthy levels of air pollution. 
Fine particles and ozone pollution are considered the two most widespread and dangerous air pollutants. 
California was given failing air pollution grades by the ALA. 
More than 137 million Americans are living in places with failing grades for unhealthy levels of air pollution, according to the American Lung Association (ALA). 

The ALA created three separate lists highlighting the 25 worst cities for ozone pollution, short-term exposure particle pollution and year-round particle pollution. The ALA defines fine particles and ozone as the two most widespread and dangerous air pollutants. The federal government’s Clean Air Act has helped reduce dangerous pollutants from transportation, power plants and manufacturing, however paired with a changing climate, the ALA believes it’s becoming harder to protect human health. 

Starting with ozone air pollution, a type of pollutant that makes it harder to breathe for millions of Americans, often called smog. It comes out of tailpipes in cars, smokestacks and other sources. 

Ozone pollution, or smog, comes from cars, smokestacks and other sources and makes it harder to breathe for millions of Americans. Four different cities in California take the top spots in ALA’s 25 most polluted cities by ozone: 
"""
classifier = pipeline("zero-shot-classification")
labels = ['education', 'technology', 'politics', 'health', 'business', 'world', 'sports', 'entertainment']
classifier(
    text,
    candidate_labels=labels,
)

No model was supplied, defaulted to facebook/bart-large-mnli and revision c626438 (https://huggingface.co/facebook/bart-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'sequence': "The best and worst U.S. cities for air quality\nThe federal government’s Clean Air Act has helped reduce dangerous pollutants from transportation, power plants and manufacturing. However, paired with a changing climate, the American Lung Association believes it's becoming harder to protect human health.\nBy Shirin Ali | April 22, 2022\n\nStory at a glance\nThe American Lung Association (ALA) estimates over 137 million Americans are living in places with unhealthy levels of air pollution. \nFine particles and ozone pollution are considered the two most widespread and dangerous air pollutants. \nCalifornia was given failing air pollution grades by the ALA. \nMore than 137 million Americans are living in places with failing grades for unhealthy levels of air pollution, according to the American Lung Association (ALA). \n\nThe ALA created three separate lists highlighting the 25 worst cities for ozone pollution, short-term exposure particle pollution and year-round particle p
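
The printed result is cut off before its remaining keys. Besides `sequence`, the zero-shot pipeline returns two parallel lists, `labels` and `scores`. The sketch below shows one way to read them. `top_labels` is a hypothetical helper, and the numbers in `example_result` are placeholders for illustration, not the model's scores for the article.

```python
def top_labels(result, k=3):
    """Return the k highest-scoring (label, score) pairs from a zero-shot result dict."""
    pairs = sorted(
        zip(result["labels"], result["scores"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return pairs[:k]


# Placeholder values for illustration only; NOT real model output.
example_result = {
    "sequence": "...",
    "labels": ["health", "politics", "world"],
    "scores": [0.6, 0.3, 0.1],
}

print(top_labels(example_result, k=2))
```

By default the scores are normalized across the candidate labels. If several labels can apply at once, the pipeline call also accepts `multi_label=True`, which scores each label independently.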

##**Text Generation**
Autocompleting a text from a given prompt

In [None]:
generator = pipeline("text-generation", max_length=50, num_return_sequences=2)
generator("In the next 30 day, i'm going to learn about the transformers models")


No model was supplied, defaulted to gpt2 and revision 6c0e608 (https://huggingface.co/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "In the next 30 day, i'm going to learn about the transformers models for 3D printing on plastic, making new polymers as a result of these transformations, and the next two batches of plastic and polymers made by others. Then I"},
 {'generated_text': "In the next 30 day, i'm going to learn about the transformers models in 5 minute videos over the next 100 days. That's when there's going to be a big discussion about what the future looks like in this space. So here are"}]
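
Each `generated_text` starts with the prompt itself, followed by the model's continuation. The sketch below separates the two, using the first sequence from the output. `continuations` is a hypothetical helper name.

```python
def continuations(prompt, outputs):
    """Return only the newly generated text for each output, dropping the echoed prompt."""
    results = []
    for item in outputs:
        text = item["generated_text"]
        # The pipeline echoes the prompt by default, so strip it when present.
        results.append(text[len(prompt):] if text.startswith(prompt) else text)
    return results


prompt = "In the next 30 day, i'm going to learn about the transformers models"
# First generated sequence, copied from the notebook output.
outputs = [
    {
        "generated_text": prompt
        + " for 3D printing on plastic, making new polymers as a result of these"
        + " transformations, and the next two batches of plastic and polymers made"
        + " by others. Then I"
    }
]

print(continuations(prompt, outputs))
```

The text-generation pipeline also accepts `return_full_text=False`, which should give the same effect without post-processing. Because generation with this model is sampled, rerunning the cell will likely produce different continuations.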

##**Mask Filling**
Filling in the blank (the masked token) in a given text

In [None]:
unmasker = pipeline("fill-mask")
unmasker("In the next 30 <mask>, I'm going to learn about the transformer models ", top_k=2)

No model was supplied, defaulted to distilroberta-base and revision ec58a5b (https://huggingface.co/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'score': 0.4613357484340668,
  'token': 360,
  'token_str': ' days',
  'sequence': "In the next 30 days, I'm going to learn about the transformer models "},
 {'score': 0.2611152231693268,
  'token': 728,
  'token_str': ' minutes',
  'sequence': "In the next 30 minutes, I'm going to learn about the transformer models "}]
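
The mask token depends on the model: `distilroberta-base` uses `<mask>`, while BERT-style models use `[MASK]`. The sketch below keeps the prompt independent of that choice and condenses the predictions. `masked_prompt` and `summarize_predictions` are hypothetical helper names.

```python
def masked_prompt(template, mask_token):
    """Insert the model-specific mask token into a template containing a {mask} slot."""
    return template.format(mask=mask_token)


def summarize_predictions(predictions):
    """Reduce fill-mask predictions to (word, rounded score) pairs."""
    return [(p["token_str"].strip(), round(p["score"], 3)) for p in predictions]


# "<mask>" is what distilroberta-base expects; BERT-style models use "[MASK]".
print(masked_prompt("In the next 30 {mask}, I'm going to learn about the transformer models ", "<mask>"))

# Predictions copied from the notebook output (the 'sequence' key is omitted for brevity).
predictions = [
    {"score": 0.4613357484340668, "token": 360, "token_str": " days"},
    {"score": 0.2611152231693268, "token": 728, "token_str": " minutes"},
]
print(summarize_predictions(predictions))
```

With a loaded pipeline, the token can be read from `unmasker.tokenizer.mask_token` rather than hard-coded.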

##**Named Entity Recognition**
Identifying the different entities (person, place, organization, ...) in a text

In [None]:
ner = pipeline("ner", aggregation_strategy="simple")
ner("My name is Abdulmatin, and i'm getting started with 30 days of transformers models challenge ")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision f2482bf (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'entity_group': 'PER',
  'score': 0.99932474,
  'word': 'Abdulmatin',
  'start': 11,
  'end': 21}]
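
The `start` and `end` values are character offsets into the input string, so the entity can be recovered by slicing. A small sketch using the output shown here, where `entity_spans` is a hypothetical helper name:

```python
def entity_spans(text, entities):
    """Pair each entity label with the substring its start/end offsets point at."""
    return [(e["entity_group"], text[e["start"]:e["end"]]) for e in entities]


text = "My name is Abdulmatin, and i'm getting started with 30 days of transformers models challenge "
# Entity copied from the notebook output.
entities = [
    {"entity_group": "PER", "score": 0.99932474, "word": "Abdulmatin", "start": 11, "end": 21}
]

print(entity_spans(text, entities))  # [('PER', 'Abdulmatin')]
```

Slicing the original text is useful when the `word` field has been altered by tokenization, for example lowercased or split into sub-word pieces.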

##**Question Answering**

Providing an answer to a question from a given context

In [None]:
question_answerer = pipeline("question-answering")
question_answerer(
    question="Who is getting started with the transformer challenge?",
    context="My name is Abdulmatin, and i'm getting started with 30 days of transformers models",
)

No model was supplied, defaulted to distilbert-base-cased-distilled-squad and revision 626af31 (https://huggingface.co/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


{'score': 0.9803182482719421, 'start': 11, 'end': 21, 'answer': 'Abdulmatin'}
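
With the default settings the pipeline returns some span of the context even when the context does not contain the answer, so the `score` is worth checking. A minimal sketch, where `answer_or_none` is a hypothetical helper and the 0.5 threshold is an arbitrary assumption:

```python
def answer_or_none(result, min_score=0.5):
    """Return the answer text if the model's score clears the threshold, else None."""
    return result["answer"] if result["score"] >= min_score else None


# Result copied from the notebook output.
result = {"score": 0.9803182482719421, "start": 11, "end": 21, "answer": "Abdulmatin"}

print(answer_or_none(result))        # Abdulmatin
print(answer_or_none(result, 0.99))  # None
```

A suitable threshold depends on the model and the data, so it is best chosen by inspecting scores on a few questions with known answers.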

##**Text Summarization**
Producing a short summary of a longer text while keeping its most important points

In [None]:
summarizer = pipeline("summarization")
summarizer("""
  The best and worst U.S. cities for air quality
The federal government’s Clean Air Act has helped reduce dangerous pollutants from transportation, power plants and manufacturing. However, paired with a changing climate, the American Lung Association believes it's becoming harder to protect human health.
By Shirin Ali | April 22, 2022

Story at a glance
The American Lung Association (ALA) estimates over 137 million Americans are living in places with unhealthy levels of air pollution. 
Fine particles and ozone pollution are considered the two most widespread and dangerous air pollutants. 
California was given failing air pollution grades by the ALA. 
More than 137 million Americans are living in places with failing grades for unhealthy levels of air pollution, according to the American Lung Association (ALA). 

The ALA created three separate lists highlighting the 25 worst cities for ozone pollution, short-term exposure particle pollution and year-round particle pollution. The ALA defines fine particles and ozone as the two most widespread and dangerous air pollutants. The federal government’s Clean Air Act has helped reduce dangerous pollutants from transportation, power plants and manufacturing, however paired with a changing climate, the ALA believes it’s becoming harder to protect human health. 

Starting with ozone air pollution, a type of pollutant that makes it harder to breathe for millions of Americans, often called smog. It comes out of tailpipes in cars, smokestacks and other sources. 

Ozone pollution, or smog, comes from cars, smokestacks and other sources and makes it harder to breathe for millions of Americans. Four different cities in California take the top spots in ALA’s 25 most polluted cities by ozone: 

Los Angeles- Long Beach, CA—an average of nearly 180 bad air days 
Bakersfield, CA – an average of 95 bad air days 
Visalia, CA – an average of nearly 92 bad air days 
Fresno-Madera-Hanford, CA – an average of 60 bad air days 
Phoenix- Mesa, AZ – an average of 39 bad air days 
There were also some 63.2 million people that lived in areas that earned an F for unhealthy spikes in particulate matter air pollution, which ALA defines as short-term pollution. That’s a type of pollution that can kill, as peaks in particle pollution can last from hours to days. It has many harmful effects, ranging from decreased lung function to heart attacks. ALA’s list of 25 cities polluted daily by particle matter was again dominated by California. 

Fresno-Madera-Hartford, CA – an average of 51 bad air days 
Bakersfield, CA – an average of 41 bad air days 
Fairbanks, AK – an average of 37 bad air days 
San Jose- San Francisco-Oakland, CA – an average of 27 bad air days 
Redding-Red Bluff, CA – an average of 24 bad air days 
There are also areas of the country impacting over 20.9 million people where year-round particle pollution levels are worse than the national air quality limit. Recent wildfires in the West caused year-round particle pollution to rise. Cities with high power plant emissions and local industrial sources also tend to have high amounts of year-round particle pollution.  
""")

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'summary_text': " The American Lung Association (ALA) estimates over 137 million Americans are living in places with unhealthy levels of air pollution . The federal government’s Clean Air Act has helped reduce dangerous pollutants from transportation, power plants and manufacturing, however paired with a changing climate, the ALA believes it's becoming harder to protect human health . The ALA created three separate lists highlighting the 25 worst cities for ozone pollution, short-term exposure particle pollution and year-round particle pollution ."}]
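
The summary comes back with a leading space and a space before each full stop. The sketch below tidies that up with a regular expression. `tidy_summary` is a hypothetical helper, and the sample is the first sentence of the output.

```python
import re


def tidy_summary(text):
    """Remove whitespace before punctuation and trim both ends of the text."""
    return re.sub(r"\s+([.,;:!?])", r"\1", text).strip()


# First sentence of the summary, copied from the notebook output.
raw = (
    " The American Lung Association (ALA) estimates over 137 million Americans"
    " are living in places with unhealthy levels of air pollution ."
)

print(tidy_summary(raw))
```

The summary length can be steered by passing `min_length` and `max_length` in the `summarizer(...)` call. Both are measured in tokens rather than words.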

##**Translation**
Translating a text from one language to another

In [None]:
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
translator("I'm getting started with the 30 days of transformers challenge")


[{'translation_text': 'Je commence avec les 30 jours du défi des transformateurs'}]
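
The model name used here encodes the language pair (`en` to `fr`). Assuming other pairs follow the same naming pattern, a model name can be built as sketched below. `opus_mt_model` is a hypothetical helper, and not every language pair is necessarily published under this pattern.

```python
def opus_mt_model(source_lang, target_lang):
    """Build a Helsinki-NLP OPUS-MT model name from two language codes."""
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"


print(opus_mt_model("en", "fr"))  # Helsinki-NLP/opus-mt-en-fr
```

The resulting string can be passed as the `model` argument of `pipeline("translation", ...)`. If the pair does not exist on the Hub, loading will fail.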