# This was followed by the introduction of several influential models, including:
June 2018: GPT, the first pretrained Transformer model, which was fine-tuned on various NLP tasks and obtained state-of-the-art results

October 2018: BERT, another large pretrained model, this one designed to produce better summaries of sentences (more on this in the next chapter!)

February 2019: GPT-2, an improved (and bigger) version of GPT that was not immediately publicly released due to ethical concerns

October 2019: DistilBERT, a distilled version of BERT that is 60% faster, 40% lighter in memory, and still retains 97% of BERT’s performance

October 2019: BART and T5, two large pretrained models using the same architecture as the original Transformer model (the first to do so)

May 2020: GPT-3, an even bigger version of GPT-2 that is able to perform well on a variety of tasks without the need for fine-tuning (called zero-shot learning)

# Text classification using a Transformer

In [2]:
# !pip install transformers

In [5]:
from transformers import pipeline

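# device=0 runs the model on the first GPU; use device=-1 (or omit the argument) to run on CPU
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli", device=0)
# Other models can be browsed at https://huggingface.co/models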
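
The `device = 0` argument above assumes a CUDA GPU is present. A minimal sketch of a CPU fallback, assuming PyTorch is the installed backend; `pick_device` is a hypothetical helper, not part of `transformers`:

```python
def pick_device():
    """Return 0 (first GPU) if CUDA is usable, otherwise -1 (CPU)."""
    try:
        import torch  # assumption: PyTorch is the backend in use
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(device)
```

The returned value could then be passed to the pipeline as `device=device` in place of the hard-coded `0`.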

In [6]:
sequence = "Many stores are have a lack of inventory due to supply shortages"
labels = ["energy", "retail", "politics", "economy"]

In [7]:
classifier(sequence, labels)

{'sequence': 'Many stores are have a lack of inventory due to supply shortages',
 'labels': ['retail', 'economy', 'energy', 'politics'],
 'scores': [0.8051833510398865,
  0.14998775720596313,
  0.03164863586425781,
  0.013180266134440899]}
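
The result is a plain dictionary, and in the output above the labels are listed from highest to lowest score. A small sketch of pulling out the top prediction, using the result copied from above; `top_label` is a hypothetical helper name:

```python
# Result copied verbatim from the output above.
result = {
    "sequence": "Many stores are have a lack of inventory due to supply shortages",
    "labels": ["retail", "economy", "energy", "politics"],
    "scores": [0.8051833510398865, 0.14998775720596313,
               0.03164863586425781, 0.013180266134440899],
}


def top_label(result):
    """Return (label, score) for the highest-scoring label."""
    # Look up the maximum explicitly rather than relying on the ordering.
    best = max(range(len(result["scores"])), key=lambda i: result["scores"][i])
    return result["labels"][best], result["scores"][best]


label, score = top_label(result)
print(f"{label}: {score:.3f}")  # retail: 0.805
```

In this default mode the four scores add up to roughly 1, so they can be read as a distribution over the candidate labels.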

In [8]:
sequences = [
    "An increase in travel demand is one of the causes of an oil price increase",
    "As expected, polls show that the party in power will lose seats in Congress during the midterm elections.",
    "The party in power may lose seats in the next election due to inflation and recession concerns",
]
classifier(sequences, labels)

[{'sequence': 'An increase in travel demand is one of the causes of an oil price increase',
  'labels': ['energy', 'economy', 'retail', 'politics'],
  'scores': [0.9332441687583923,
   0.04136708378791809,
   0.014958749525249004,
   0.010430028662085533]},
 {'sequence': 'As expected, polls show that the party in power will lose seats in Congress during the midterm elections.',
  'labels': ['politics', 'economy', 'retail', 'energy'],
  'scores': [0.9175485372543335,
   0.039845362305641174,
   0.02190743386745453,
   0.02069861814379692]},
 {'sequence': 'The party in power may lose seats in the next election due to inflation and recession concerns',
  'labels': ['economy', 'politics', 'energy', 'retail'],
  'scores': [0.5547062754631042,
   0.43135273456573486,
   0.008113222196698189,
   0.005827777087688446]}]
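
In the batch output above, the third sentence splits its score between "economy" and "politics". One way to flag such close calls is to compare the top two scores. This is a sketch only: the `MARGIN` cutoff of 0.2 and the `is_ambiguous` helper are hypothetical choices, not part of the pipeline.

```python
# Labels and scores copied from the batch output above (sequence text omitted).
results = [
    {"labels": ["energy", "economy", "retail", "politics"],
     "scores": [0.9332441687583923, 0.04136708378791809,
                0.014958749525249004, 0.010430028662085533]},
    {"labels": ["politics", "economy", "retail", "energy"],
     "scores": [0.9175485372543335, 0.039845362305641174,
                0.02190743386745453, 0.02069861814379692]},
    {"labels": ["economy", "politics", "energy", "retail"],
     "scores": [0.5547062754631042, 0.43135273456573486,
                0.008113222196698189, 0.005827777087688446]},
]

MARGIN = 0.2  # hypothetical cutoff; tune for your own data


def is_ambiguous(result, margin=MARGIN):
    """True when the best and second-best scores are closer than `margin`."""
    first, second = sorted(result["scores"], reverse=True)[:2]
    return (first - second) < margin


flags = [is_ambiguous(r) for r in results]
print(flags)  # [False, False, True]
```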

In [10]:
# multi_label=True scores each label independently, so the scores no longer sum to 1
classifier(sequences, labels, multi_label=True)

[{'sequence': 'An increase in travel demand is one of the causes of an oil price increase',
  'labels': ['energy', 'economy', 'retail', 'politics'],
  'scores': [0.7724307179450989,
   0.0009950794046744704,
   0.00019580473599489778,
   4.688703847932629e-05]},
 {'sequence': 'As expected, polls show that the party in power will lose seats in Congress during the midterm elections.',
  'labels': ['politics', 'economy', 'energy', 'retail'],
  'scores': [0.9895074963569641,
   0.4433147609233856,
   0.20555660128593445,
   0.08531258255243301]},
 {'sequence': 'The party in power may lose seats in the next election due to inflation and recession concerns',
  'labels': ['politics', 'economy', 'energy', 'retail'],
  'scores': [0.9844667315483093,
   0.9779922366142273,
   0.05363397300243378,
   0.018095433712005615]}]
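
The two calls differ in how the scores are normalized. The sketch below assumes the pipeline scores each label with an NLI model and works from the contradiction and entailment logits. The logit values are invented for illustration and are not taken from `facebook/bart-large-mnli`.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (contradiction, entailment) logits for one sentence, per label.
nli_logits = {
    "politics": (-2.0, 4.0),
    "economy": (-1.5, 3.5),
    "retail": (3.0, -2.0),
}
candidate_labels = list(nli_logits)

# Default mode: softmax over the entailment logits across all labels,
# so the labels compete and the scores sum to 1.
entailment = [nli_logits[name][1] for name in candidate_labels]
single = dict(zip(candidate_labels, softmax(entailment)))

# multi_label=True: softmax over (contradiction, entailment) for each label
# on its own, keeping the entailment probability.
multi = {name: softmax(list(nli_logits[name]))[1] for name in candidate_labels}

print(single)
print(multi)
```

With these made-up logits, "politics" and "economy" split the probability mass in the default mode but both score close to 1 with independent scoring, which mirrors the third sentence in the outputs above.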