**Zero-shot classification** in NLP refers to the ability of a model to classify text into categories it has never seen during training. This is a powerful approach because traditional classification models typically require labeled data for each category. Zero-shot models, on the other hand, can generalize their understanding of language to assign text to new, unseen categories.

Zero-shot classification is especially useful in rapidly evolving domains, where the categories are constantly changing, or when labeled data is scarce. The model's ability to generalize makes it a valuable tool for many NLP applications.

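Zero-shot classification relies on pre-trained language models such as BART, RoBERTa, or GPT-style models. One approach embeds both the input text and the candidate labels as vectors and predicts the label with the highest similarity. Another, used by models fine-tuned on natural language inference (NLI) such as `facebook/bart-large-mnli`, turns each label into a hypothesis and scores how strongly the input text entails it. The process involves: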

1. **Pre-trained Language Model**: Models like BART, GPT-3, or RoBERTa are trained on large text corpora.
2. **Natural Language Prompts**: Each candidate class is phrased as a short natural-language prompt (for example, "This example is crime."), and the model scores how well the input text matches it.
3. **Generalization**: The model can classify new categories not seen during training.
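
The scoring step can be sketched without loading a model. The snippet below is a minimal illustration, not the pipeline's actual implementation: the function names are hypothetical, the logits are made-up numbers standing in for what an NLI model might emit, and the template string is assumed to match the default `hypothesis_template` of the Hugging Face pipeline. It shows how labels become hypotheses and how entailment logits are normalized, either across labels (single-label) or independently per label (multi-label).

```python
import math


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into a natural-language hypothesis."""
    return [template.format(label) for label in labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(entailment_logits):
    """Labels compete with each other, so the scores sum to 1."""
    return softmax(entailment_logits)


def multi_label_scores(contradiction_logits, entailment_logits):
    """Each label is scored on its own: entailment vs. contradiction."""
    return [
        softmax([c, e])[1]
        for c, e in zip(contradiction_logits, entailment_logits)
    ]


labels = ["crime", "fantasy", "history"]
hypotheses = build_hypotheses(labels)

# Hypothetical logits for one document (invented for illustration only).
entailment_logits = [2.0, -1.0, 0.5]
contradiction_logits = [-1.5, 2.5, 0.0]

single = single_label_scores(entailment_logits)
multi = multi_label_scores(contradiction_logits, entailment_logits)

# The predicted class is the label with the highest single-label score.
best_label = labels[single.index(max(single))]
```

In single-label mode a high score for one label necessarily lowers the others, while in multi-label mode several labels can score close to 1 at the same time.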

**Use Cases** include topic classification, sentiment analysis, and intent detection.

**Benefits**:
- No labeled data needed
- Flexible for new categories

**Challenges**:
- Potentially lower accuracy compared to task-specific models
- Results depend on the quality of the underlying pre-trained model

In [1]:
from transformers import pipeline
import pandas as pd



In [None]:

# %% classifier: NLI-based zero-shot pipeline (model weights are downloaded on first use)
task = "zero-shot-classification"
model = "facebook/bart-large-mnli"
classifier = pipeline(task=task, model=model)

# %% data prep
# first example: Raymond Chandler "The Big Sleep" (crime novel)
# second example: J.R.R. Tolkien "The Lord of the Rings" (fantasy novel)
# third example: Bill Bryson "A Short History of Nearly Everything" (popular science)
documents = ["It was about eleven o’clock in the morning, mid October, with the sun not shining and a look of hard wet rain in the clearness of the foothills. I was wearing my powder-blue suit, with dark blue shirt, tie and display handkerchief, black brogues, black wool socks with dark blue clocks on them. I was neat, clean, shaved and sober, and I didn’t care who knew it. I was everything the well-dressed private detective ought to be. I was calling on four million dollars.",
             "When Mr. Bilbo Baggins of Bag End announced that he would shortly be celebrating his eleventy-first birthday with a party of special magnificence, there was much talk and excitement in Hobbiton.",
             "Welcome. And congratulations. I am delighted that you could make it. Getting here wasn’t easy, I know. In fact, I suspect it was a little tougher than you realize. To begin with, for you to be here now trillions of drifting atoms had somehow to assemble in an intricate and curiously obliging manner to create you. It’s an arrangement so specialized and particular that it has never been tried before and will only exist this once."
             ]
# %% candidate labels
candidate_labels = ["crime", "fantasy", "history"]
# %% model inference
res = classifier(documents, candidate_labels=candidate_labels)
# %% visualise result
pd.DataFrame(res[1]).plot.bar(x='labels', y='scores', rot=0, title="The Lord of the Rings")
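# %% allow multiple labels per document (each label is scored independently)
# `multi_class` is deprecated in transformers; the parameter is now `multi_label`
classifier(documents[0], candidate_labels=candidate_labels, multi_label=True)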
 