In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import os
# Order GPUs by PCI bus ID and expose only GPU 0 to this notebook.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Zero Shot Learning Using Natural Language Inference

In this notebook, we demonstrate **zero-shot** topic classification. **Zero-Shot Learning (ZSL)** is the ability to solve a task without having received any training examples for that task. The `ZeroShotClassifier` class in *ktrain* performs topic classification with no training examples. The technique is based on **Natural Language Inference (NLI)**, as described in [this interesting blog post](https://joeddav.github.io/blog/2020/05/29/ZSL.html) by Joe Davison.

## STEP 1: Set Up the Zero Shot Classifier and Describe Topics

We first instantiate the zero-shot classifier and then describe the topic labels for our classifier as plain strings.

In [2]:
from ktrain import text 

In [3]:
zsl = text.ZeroShotClassifier()
topic_strings=['politics', 'elections', 'sports', 'films', 'television']

## STEP 2: Predict

There is no training involved here, since we are using **zero-shot learning**. We simply supply the document to classify and the `topic_strings` defined earlier. The `predict` method uses Natural Language Inference (NLI) to infer a probability for each topic. Each topic is scored independently, so the probabilities need not sum to 1 and a document can score highly on several topics at once.
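To make the NLI idea concrete, here is a minimal sketch of how such a classifier can score topics. It follows the general approach in the blog post linked earlier: each topic is turned into a hypothesis, an NLI model compares that hypothesis with the document, and the entailment score becomes the topic probability. The template string, the function names, and the `toy_nli_logits` stand-in (which replaces a real NLI model with a crude keyword check) are assumptions for illustration and are not taken from the *ktrain* source.

```python
import math

def build_hypotheses(topic_strings, template="This text is about {}."):
    """Turn each topic label into an NLI hypothesis (the template is an assumption)."""
    return [template.format(topic) for topic in topic_strings]

def entailment_probability(contradiction_logit, entailment_logit):
    """Two-way softmax over the contradiction and entailment logits."""
    m = max(contradiction_logit, entailment_logit)  # subtract the max for numerical stability
    exp_c = math.exp(contradiction_logit - m)
    exp_e = math.exp(entailment_logit - m)
    return exp_e / (exp_c + exp_e)

def toy_nli_logits(premise, hypothesis):
    """Hypothetical stand-in for a real NLI model.

    Returns (contradiction_logit, entailment_logit) based on whether the
    last word of the hypothesis appears in the premise. Only meant for
    single-word topics in this demonstration.
    """
    topic = hypothesis.rstrip(".").split()[-1].lower()
    return (-4.0, 4.0) if topic in premise.lower() else (4.0, -4.0)

def score_topics(doc, topic_strings, nli_logits_fn=toy_nli_logits):
    """Score every topic independently against the document."""
    scores = []
    for topic, hypothesis in zip(topic_strings, build_hypotheses(topic_strings)):
        c_logit, e_logit = nli_logits_fn(doc, hypothesis)
        scores.append((topic, entailment_probability(c_logit, e_logit)))
    return scores

scores = score_topics(
    "A documentary about sports on television.",
    ["sports", "television", "politics"],
)
print(scores)
```

Because each hypothesis gets its own two-way softmax, nothing forces the scores across topics to add up to 1. In a real setting, `nli_logits_fn` would wrap a pretrained NLI model rather than a keyword check.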

In [4]:
doc = 'I am extremely dissatisfied with the President and will definitely vote in 2020.'
zsl.predict(doc, topic_strings=topic_strings, include_labels=True)

[('politics', 0.9829113483428955),
 ('elections', 0.9880988001823425),
 ('sports', 0.00030677582253701985),
 ('films', 0.0008969294722191989),
 ('television', 0.00045271270209923387)]

As you can see, our model correctly assigned the highest probabilities to `politics` and `elections`, as the text supplied pertains to both these topics.

Let's try some other examples.

#### document about `television`

In [5]:
doc = 'What is your favorite sitcom of all time?'
zsl.predict(doc, topic_strings=topic_strings, include_labels=True)

[('politics', 0.0001159722960437648),
 ('elections', 0.00015142698248382658),
 ('sports', 0.00011554622324183583),
 ('films', 0.035863082855939865),
 ('television', 0.9755581617355347)]

#### document about both `politics` and `television`

In [6]:
doc = """
President Donald Trump's senior adviser and son-in-law, Jared Kushner, praised 
the administration's response to the coronavirus pandemic as a \"great success story\" on Wednesday -- 
less than a day after the number of confirmed coronavirus cases in the United States topped 1 million. 
Kushner painted a rosy picture for \"Fox and Friends\" Wednesday morning, 
saying that \"the federal government rose to the challenge and 
this is a great success story and I think that that's really what needs to be told.\"
"""
zsl.predict(doc, topic_strings=topic_strings, include_labels=True)

[('politics', 0.8382046818733215),
 ('elections', 0.009549508802592754),
 ('sports', 0.003681211732327938),
 ('films', 0.045103102922439575),
 ('television', 0.9293773174285889)]

#### document about `sports`, `television`, and `films`

In [7]:
doc = "The Last Dance is a 2020 American basketball documentary miniseries co-produced by ESPN Films and Netflix."
zsl.predict(doc, topic_strings=topic_strings, include_labels=True)

[('politics', 0.0003102553600911051),
 ('elections', 0.00048395441262982786),
 ('sports', 0.9848700761795044),
 ('films', 0.9717175364494324),
 ('television', 0.9505334496498108)]
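
The predictions are a plain list of `(label, probability)` tuples, so turning them into a final set of topics comes down to picking a cutoff. The sketch below shows one way to do that. The `select_topics` helper and the 0.5 threshold are hypothetical choices for illustration, not part of *ktrain*, and a suitable threshold depends on the application.

```python
def select_topics(predictions, threshold=0.5):
    """Return the labels whose probability is at least `threshold`, highest first."""
    kept = [(label, prob) for label, prob in predictions if prob >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in kept]

# Output of the last cell above, copied verbatim.
predictions = [
    ('politics', 0.0003102553600911051),
    ('elections', 0.00048395441262982786),
    ('sports', 0.9848700761795044),
    ('films', 0.9717175364494324),
    ('television', 0.9505334496498108),
]

selected = select_topics(predictions)
print(selected)  # ['sports', 'films', 'television']
```

Since several topics can clear the threshold at once, the helper returns a list rather than a single best label.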