
In [None]:
!pip install git+https://github.com/huggingface/transformers.git

In [None]:
from transformers import pipeline

In [None]:
classifier = pipeline("zero-shot-classification")

Some weights of the model checkpoint at facebook/bart-large-mnli were not used when initializing BartForSequenceClassification: ['model.encoder.version', 'model.decoder.version']
- This IS expected if you are initializing BartForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BartForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


We can use this pipeline by passing in a sequence and a list of candidate labels. By default, the pipeline assumes that exactly one of the candidate labels is true and returns one score per label, with the scores summing to 1.

In [None]:
sequence = "Who are you voting for in 2020?"
candidate_labels = ["politics", "public health", "economics"]

classifier(sequence, candidate_labels)

{'labels': ['politics', 'economics', 'public health'],
 'scores': [0.972518801689148, 0.01458414364606142, 0.012897025793790817],
 'sequence': 'Who are you voting for in 2020?'}
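
The result is a plain dictionary whose `labels` and `scores` lists are aligned and sorted from most to least likely. Here is a minimal sketch of reading it. It uses the output above copied in as a literal, so it runs without loading the model; the variable names are only illustrative.

```python
import math

# Output of the classifier call above, copied verbatim.
result = {
    "labels": ["politics", "economics", "public health"],
    "scores": [0.972518801689148, 0.01458414364606142, 0.012897025793790817],
    "sequence": "Who are you voting for in 2020?",
}

# Labels and scores are aligned, so zipping them gives (label, score) pairs
# already ordered by descending score.
ranked = list(zip(result["labels"], result["scores"]))
top_label, top_score = ranked[0]

# In the default single-label mode the scores sum to 1 (up to float rounding).
total = sum(result["scores"])
assert math.isclose(total, 1.0, abs_tol=1e-6)

for label, score in ranked:
    print(f"{label:>13}: {score:.3f}")
```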

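To do multi-label classification, where any number of the candidate labels may apply, pass `multi_class=True` (later releases of `transformers` rename this argument to `multi_label`). In this case the scores are independent: each falls between 0 and 1, and they no longer need to sum to 1.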

In [None]:
sequence = "Who are you voting for in 2020?"
candidate_labels = ["politics", "public health", "economics", "elections"]

classifier(sequence, candidate_labels, multi_class=True)

{'labels': ['politics', 'elections', 'public health', 'economics'],
 'scores': [0.972069501876831,
  0.967610776424408,
  0.03248710557818413,
  0.0061644683592021465],
 'sequence': 'Who are you voting for in 2020?'}
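
Because these scores are independent, a common way to turn them into a set of predicted labels is to keep every label above a cutoff. The sketch below uses the output above as a literal. The 0.5 threshold is an assumption for illustration, not something the pipeline applies itself.

```python
# Output of the multi_class=True call above, copied verbatim.
result = {
    "labels": ["politics", "elections", "public health", "economics"],
    "scores": [0.972069501876831, 0.967610776424408,
               0.03248710557818413, 0.0061644683592021465],
    "sequence": "Who are you voting for in 2020?",
}

THRESHOLD = 0.5  # assumed cutoff for illustration; tune it for your application

# Keep every label whose independent score clears the threshold.
selected = [
    label
    for label, score in zip(result["labels"], result["scores"])
    if score >= THRESHOLD
]
print(selected)
```

With this cutoff both `politics` and `elections` are kept, which the single-label mode could not express.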

Here's an example of sentiment classification: 

In [None]:
sequence = "I hated this movie. The acting sucked."
candidate_labels = ["positive", "negative"]

classifier(sequence, candidate_labels)

{'labels': ['negative', 'positive'],
 'scores': [0.9916268587112427, 0.00837317667901516],
 'sequence': 'I hated this movie. The acting sucked.'}

So how does this method work?

The underlying model is trained on the task of Natural Language Inference (NLI), which takes in two sequences and determines whether they contradict each other, entail each other, or neither.

This can be adapted to the task of zero-shot classification by treating the sequence that we want to classify as one NLI sequence (called the premise) and turning each candidate label into the other (the hypothesis). If the model predicts that the premise _entails_ the constructed hypothesis, then we can take that as a prediction that the label applies to the text. Check out [this blog post](https://joeddav.github.io/blog/2020/05/29/ZSL.html) for a more detailed explanation.
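
To make the idea concrete, here is a rough sketch of how NLI outputs could be turned into label scores. It does not call the model: the logits are made-up numbers, and the `(contradiction, neutral, entailment)` ordering is an assumption about the checkpoint's label layout. It illustrates the approach and is not the pipeline's actual source.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

premise = "Who are you voting for in 2020?"
candidate_labels = ["politics", "public health", "economics"]
template = "This example is {}."

# One (premise, hypothesis) pair per candidate label.
pairs = [(premise, template.format(label)) for label in candidate_labels]

# HYPOTHETICAL NLI logits per pair, ordered (contradiction, neutral, entailment).
# A real NLI model would produce these; the numbers here are invented.
logits = [
    (-2.0, 0.5, 3.0),   # politics
    (2.5, 0.3, -1.5),   # public health
    (2.0, 0.4, -1.0),   # economics
]

# Single-label mode: softmax over the entailment logits across all labels,
# so the scores compete with each other and sum to 1.
single_label_scores = softmax([row[2] for row in logits])

# Multi-label mode: for each label on its own, softmax over
# (contradiction, entailment) and keep the entailment probability.
multi_label_scores = [softmax([row[0], row[2]])[1] for row in logits]

for label, s, m in zip(candidate_labels, single_label_scores, multi_label_scores):
    print(f"{label:>13}: single={s:.3f} multi={m:.3f}")
```

The only difference between the two modes in this sketch is what each softmax is taken over: across labels in the first case, within a single label in the second.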

By default, the pipeline turns each label into a hypothesis with the template `This example is {}.`, where `{}` is replaced by the label. This works well in many settings, but you can also customize the template for your specific setting. Let's add another, more challenging review to our sentiment classification example above:

In [None]:
sequences = [
    "I hated this movie. The acting sucked.",
    "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."
]
candidate_labels = ["positive", "negative"]

classifier(sequences, candidate_labels)

[{'labels': ['negative', 'positive'],
  'scores': [0.9916267991065979, 0.008373182266950607],
  'sequence': 'I hated this movie. The acting sucked.'},
 {'labels': ['negative', 'positive'],
  'scores': [0.8148515820503235, 0.1851484179496765],
  'sequence': "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."}]

The second review is harder: the model labels it negative even though the reviewer enjoyed the movie overall. Let's see if we can improve the results by using a hypothesis template that is more specific to review sentiment analysis. Instead of the default `This example is {}.`, we'll use `The sentiment of this review is {}.`, where `{}` is replaced with the candidate class name.

In [None]:
sequences = [
    "I hated this movie. The acting sucked.",
    "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."
]
candidate_labels = ["positive", "negative"]
hypothesis_template = "The sentiment of this review is {}."

classifier(sequences, candidate_labels, hypothesis_template=hypothesis_template)

[{'labels': ['negative', 'positive'],
  'scores': [0.9890093207359314, 0.010990672744810581],
  'sequence': 'I hated this movie. The acting sucked.'},
 {'labels': ['positive', 'negative'],
  'scores': [0.9581228494644165, 0.0418771356344223],
  'sequence': "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."}]

With a more precise hypothesis template, the second review is now correctly classified as positive.
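
The size of the change is easiest to see by comparing the score given to `positive` for the second review under each template. This small sketch uses the two outputs above copied in as literals; `score_for` is a hypothetical helper, not part of the library.

```python
# Second-review results copied from the two outputs above.
default_run = {"labels": ["negative", "positive"],
               "scores": [0.8148515820503235, 0.1851484179496765]}
custom_run = {"labels": ["positive", "negative"],
              "scores": [0.9581228494644165, 0.0418771356344223]}

def score_for(result, label):
    """Return the score assigned to `label` in a pipeline result dict."""
    return result["scores"][result["labels"].index(label)]

before = score_for(default_run, "positive")
after = score_for(custom_run, "positive")
print(f"positive: {before:.3f} -> {after:.3f}")
```

Looking a label up by name, rather than by position, matters here because the pipeline sorts labels by score, so their order differs between the two runs.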

> Note that sentiment classification is used here just as an illustrative example. The [Hugging Face Model Hub](https://huggingface.co/models?filter=text-classification) has a number of models trained specifically on sentiment tasks which can be used instead.