# Zero-shot Classification

The main goal of zero-shot classification is to classify text without any labeled examples of the target classes. A zero-shot model can be applied to text from a domain it was not originally trained on, which makes this kind of model a good fit for broad, general topics.

In [None]:
#to check gpu usage
from GPUtil import showUtilization as gpu_usage
gpu_usage()  

In [None]:
%env CUDA_VISIBLE_DEVICES=3

In [None]:
#packages that need to be installed
!pip3 install transformers

In [None]:
#import needed packages
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer
import torch

Before looking further under the hood of the model, we will use Hugging Face's pipeline module to see how it works. The pipeline makes it easy to try out models, since only the essential inputs have to be supplied.

In [None]:
classifier= pipeline(task="zero-shot-classification", model="facebook/bart-large-mnli")

input_sequence = "I love traveling"
label_candidate = ['travel', 'cooking', 'entertainment', 'dancing', 'technology']
classifier(input_sequence, label_candidate)

The result above lists every candidate label with a score, sorted from most to least likely to be related to the input. This is the single-label version of zero-shot classification: the scores are normalized across the labels, so they sum to 1.
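To make the single-label scoring concrete, here is a minimal sketch of the normalization step, assuming the pipeline takes one entailment logit per candidate label and applies a softmax across them. The logit values and the helper name `softmax` are hypothetical and chosen only for illustration; they are not real model outputs.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical entailment logits, one per candidate label (illustrative numbers only).
entailment_logits = {"travel": 4.0, "cooking": -2.0, "dancing": -1.0}

# Single-label mode: normalize the entailment logits across all labels,
# so the labels compete with each other and the scores sum to 1.
scores = dict(zip(entailment_logits, softmax(list(entailment_logits.values()))))

# Sort from most to least likely, mirroring the order of the pipeline output.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Because the labels share one softmax, raising the score of one label necessarily lowers the others.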

Zero-shot classification can also be used for multi-label classification, where the model estimates the probability that each label applies to the statement or input. Each score is calculated independently rather than jointly, so the scores no longer have to sum to 1. This is tested below, still using the pipeline created above.

In [None]:
import pprint

text_piece = "I love traveling"
labels = ['travel', 'cooking', 'entertainment', 'dancing', 'technology']

predictions = classifier(text_piece, labels, multi_label=True)
pprint.pprint(predictions)

When comparing the single-label and multi-label results, "travel" is still the label that is most related to the statement. The multi-label version shows that "entertainment" can also be highly related to the statement, because each label is scored on its own.
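The independent scoring used in multi-label mode can be sketched as follows, assuming each label gets its own two-way softmax over a contradiction logit and an entailment logit. The logits and the helper name `entailment_probability` are hypothetical, for illustration only.

```python
import math

def entailment_probability(contradiction_logit, entailment_logit):
    """Two-way softmax over (contradiction, entailment); returns P(entailment)."""
    peak = max(contradiction_logit, entailment_logit)
    exp_c = math.exp(contradiction_logit - peak)
    exp_e = math.exp(entailment_logit - peak)
    return exp_e / (exp_c + exp_e)

# Hypothetical (contradiction, entailment) logits per label; illustrative only.
label_logits = {
    "travel": (-3.0, 4.0),
    "entertainment": (-0.5, 1.5),
    "cooking": (2.0, -2.0),
}

# Multi-label mode: every label is scored on its own, so several labels
# can receive a high score at the same time.
multi_label_scores = {
    label: entailment_probability(c, e) for label, (c, e) in label_logits.items()
}
print(multi_label_scores)
```

Since no normalization happens across labels, the scores can add up to more than 1.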

In [None]:
#select the pretrained model with AutoModelForSequenceClassification
model= AutoModelForSequenceClassification.from_pretrained('joeddav/xlm-roberta-large-xnli')

In [None]:
#configuration feature used to see how the model works and what the pretrained settings are set as
model.config
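One setting worth reading from the configuration is the mapping from output index to NLI class, which `model.config.id2label` exposes. The sketch below looks up the contradiction and entailment indices by name instead of hard-coding them. The helper `find_nli_indices` and the example mapping are assumptions for illustration; check the real mapping on the loaded model.

```python
def find_nli_indices(id2label):
    """Return (contradiction_index, entailment_index) from an id2label mapping."""
    by_name = {str(name).lower(): int(idx) for idx, name in id2label.items()}
    return by_name["contradiction"], by_name["entailment"]

# Assumed mapping for illustration; in the notebook, pass model.config.id2label instead.
example_id2label = {0: "contradiction", 1: "neutral", 2: "entailment"}
contradiction_idx, entailment_idx = find_nli_indices(example_id2label)
print(contradiction_idx, entailment_idx)
```

Looking the indices up by name guards against checkpoints that order the three classes differently.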

In [None]:
checkpoint= 'joeddav/xlm-roberta-large-xnli'

model= AutoModelForSequenceClassification.from_pretrained(checkpoint)
tokenizer= AutoTokenizer.from_pretrained(checkpoint)

#run on the GPU when one is available, otherwise fall back to the CPU
device= torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model= model.to(device)

premise= input('Please enter a statement in English or another language supported by the model: ')
#allow for input of multiple labels at once instead of repeatedly asking
#split on commas and strip whitespace so that "a,b" and "a, b" are both accepted
labels= [x.strip() for x in input("Please list the labels you would like to use for this classification. Use a comma to separate each label: ").split(',') if x.strip()]

output={}

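#gradients are not needed for inference
with torch.no_grad():
    for i in labels:
        #encode the premise and the label (used as the hypothesis) as one sequence pair
        x= tokenizer.encode(premise, i, return_tensors='pt', truncation='only_first')
        logits= model(x.to(device))[0]

        #keep the contradiction (0) and entailment (2) logits and drop neutral (1);
        #the index order can be checked in model.config
        entail_contradiction_logits= logits[:, [0,2]]
        probs= entail_contradiction_logits.softmax(dim=1)
        #the entailment probability is used as the score for this label
        output[i]= probs[:,1].item()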

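import matplotlib.pyplot as plt

#return results in order of probability of similarity
def sort_scores(d, reverse=False):
    return dict(sorted(d.items(), key= lambda x: x[1], reverse=reverse))

#draw the scores as a horizontal bar chart
def plot_scores(d, show=True):
    plt.barh(range(len(d)), list(d.values()), tick_label= list(d.keys()))
    if show:
        plt.show()

#ascending order puts the highest score at the top of the chart
output= sort_scores(output)

print(premise)
print('')
print(output)
plot_scores(output, True)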
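The loop above passes each raw label to the model as the hypothesis. The pipeline used earlier instead wraps each label in a sentence template before scoring it (its `hypothesis_template` parameter, which defaults to "This example is {}."). A minimal sketch of that step follows; the helper name `build_nli_pairs` is hypothetical.

```python
def build_nli_pairs(premise, labels, template="This example is {}."):
    """Pair the premise with one templated hypothesis per candidate label."""
    return [(premise, template.format(label)) for label in labels]

pairs = build_nli_pairs("I love traveling", ["travel", "cooking"])
for premise_text, hypothesis in pairs:
    print(premise_text, "->", hypothesis)
```

Each hypothesis could then be passed to `tokenizer.encode` in place of the bare label. Whether this improves the scores for a given checkpoint would need to be checked empirically.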