## Extended Use of Hugging Face's Zero-Shot Pipeline

- In this notebook, we extend the previous notebook's zero-shot learning example by classifying custom sentences against custom labels.
- You will also see how multilingual transformer models can be used to perform tasks in many languages.

In [1]:
from transformers import pipeline

import pandas as pd

In [2]:
classifier = pipeline("zero-shot-classification", device=0) # to utilize GPU

No model was supplied, defaulted to FacebookAI/roberta-large-mnli and revision 130fb28 (https://huggingface.co/FacebookAI/roberta-large-mnli).
Using a pipeline without specifying a model name and revision in production is not recommended.
2026-02-09 21:05:17.759841: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M2 Pro
2026-02-09 21:05:17.759866: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 32.00 GB
2026-02-09 21:05:17.759871: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 10.67 GB
2026-02-09 21:05:17.759886: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2026-02-09 21:05:17.759896: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical Pluggabl

We can use this pipeline by passing in a sequence and a list of candidate labels. By default, the pipeline assumes that only one of the candidate labels is true and returns one score per label, with the scores adding up to 1.
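
To make the "scores add up to 1" behaviour concrete, here is a minimal sketch of a softmax taken over one entailment logit per candidate label, which is how the single-label mode is usually described. The logit values are invented for illustration and are not real model output.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical entailment logits, one per candidate label (not real model output).
entailment_logits = {"geography": 2.1, "delivery": 0.2}

scores = dict(zip(entailment_logits, softmax(list(entailment_logits.values()))))
for label, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {score:.3f}")
print("sum:", round(sum(scores.values()), 6))
```

Because the normalization runs across labels, adding or removing a candidate label changes every other label's score.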

In [3]:
def get_predictions_score(prediction):
    """Display the model's predictions for one sequence in tabular form."""
    pred_labels = prediction['labels']
    pred_scores = prediction['scores']
    seq = [prediction['sequence']]
    # Place sequence, labels and scores side by side; the sequence fills only the first row.
    return (
        pd.concat(
            [pd.DataFrame(seq), pd.DataFrame(pred_labels), pd.DataFrame(pred_scores)],
            axis=1, ignore_index=True,
        )
        .rename(columns={0: 'Sequence', 1: 'Labels', 2: 'Probability'})
        .set_index(['Sequence'])
    )
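
The helper reads three keys from the pipeline's result: `sequence`, `labels` and `scores`, where labels and scores are parallel lists ordered from highest to lowest score. As a small sketch of working with that structure directly, the hypothetical function `top_label` below (its name and `min_score` parameter are assumptions made for this example) picks the winning label only if it clears a threshold. The dictionary is a hand-written stand-in, not real model output.

```python
def top_label(prediction, min_score=0.5):
    """Return (label, score) for the best label, or None if it scores below min_score."""
    # Labels and scores are parallel lists, highest score first.
    label, score = prediction["labels"][0], prediction["scores"][0]
    return (label, score) if score >= min_score else None

# Hand-written stand-in for a pipeline result (illustrative values only).
example = {
    "sequence": "a made-up sentence",
    "labels": ["geography", "delivery"],
    "scores": [0.87, 0.13],
}
print(top_label(example))
print(top_label(example, min_score=0.9))
```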

In [4]:
sequence = "Amazon is the longest river in the world"
candidate_labels = ["geography",  "delivery"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| Amazon is the longest river in the world | geography | 0.870195 |
| | delivery | 0.129805 |


What if we change the capitalization? Here we change Amazon -> amazon. It doesn't make much difference here, but in some cases it will. <br> Try playing with spellings and adding or removing labels.

In [5]:
sequence = "amazon is the longest river in the world"
candidate_labels = ["geography",  "delivery"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| amazon is the longest river in the world | geography | 0.805672 |
| | delivery | 0.194328 |


In [7]:
sequence = "amazon is the longest river in the World"
candidate_labels = ["geography",  "delivery"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| amazon is the longest river in the World | geography | 0.768614 |
| | delivery | 0.231386 |


In [13]:
sequence = "amazon is the longest river in the world"
candidate_labels = ["geography", "nature", "water"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| amazon is the longest river in the world | water | 0.682985 |
| | nature | 0.165954 |
| | geography | 0.151062 |


In the example below, you'll see how well these models understand context, and what a slight spelling mistake in a label does to the scores. <br> Try changing the spelling and observe the results.

In [10]:
sequence = "are we going to Oktoberfest?"
candidate_labels = ["food", "Munich", "bear", "wine", "pretzel", "sausage"] ## What if you change bear (animal) -> beer (drink)

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| are we going to Oktoberfest? | wine | 0.489264 |
| | Munich | 0.160479 |
| | sausage | 0.132136 |
| | food | 0.091373 |
| | bear | 0.085948 |
| | pretzel | 0.040801 |


In [11]:
sequence = "are we going to Oktoberfest?"
candidate_labels = ["food", "Munich", "beer", "wine", "pretzel", "sausage"] ## "bear" (animal) is now "beer" (drink)

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| are we going to Oktoberfest? | beer | 0.531387 |
| | wine | 0.250834 |
| | Munich | 0.082274 |
| | sausage | 0.067743 |
| | food | 0.046845 |
| | pretzel | 0.020918 |


In [12]:
sequence = "Who are you voting for in 2020?"
candidate_labels = ["food", "public health", "plants", "fruits","america"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| Who are you voting for in 2020? | america | 0.39764 |
| | public health | 0.213483 |
| | plants | 0.137286 |
| | fruits | 0.133617 |
| | food | 0.117974 |


##### The predictions are poor as the labels are not related to the sequence. But there are ways to improve upon this. We can provide related target labels for the input sequence.


In [16]:
## Think about other labels which can improve the predictions
## HINT: Labels related to your text
sequence = "Who are you voting for in 2020?"
candidate_labels = ["elections","america","president","politics","republican","democrate"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| Who are you voting for in 2020? | politics | 0.503332 |
| | elections | 0.447219 |
| | president | 0.022313 |
| | america | 0.014836 |
| | democrate | 0.006287 |
| | republican | 0.006012 |


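To do multi-label classification, where more than one label can apply to the same text, simply pass `multi_label=True`. In this case, the scores are independent of each other: each still falls between 0 and 1, but they no longer need to add up to 1.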

In [17]:
sequence = "Who are you voting for in 2020?"
candidate_labels = ["politics", "public health", "economics", "elections"]

pred = classifier(sequence, candidate_labels, multi_label=True)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| Who are you voting for in 2020? | politics | 0.996493 |
| | elections | 0.993664 |
| | economics | 0.063439 |
| | public health | 0.054792 |
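
Why can two labels both score close to 1 here? A rough sketch of the idea, assuming each label is scored on its own by a softmax over just its contradiction and entailment logits, is shown below. The logit values are invented for illustration and are not real model output.

```python
import math

def entailment_probability(contradiction_logit, entailment_logit):
    """Softmax over the (contradiction, entailment) pair; return the entailment share."""
    peak = max(contradiction_logit, entailment_logit)
    e_con = math.exp(contradiction_logit - peak)
    e_ent = math.exp(entailment_logit - peak)
    return e_ent / (e_con + e_ent)

# Hypothetical (contradiction, entailment) logits per label, not real model output.
logits = {
    "politics": (-3.0, 2.5),
    "elections": (-2.5, 2.5),
    "economics": (1.5, -1.2),
}

# Each label is scored independently of the others.
multi_label_scores = {
    label: entailment_probability(con, ent) for label, (con, ent) in logits.items()
}
for label, score in multi_label_scores.items():
    print(f"{label}: {score:.3f}")
print("sum:", round(sum(multi_label_scores.values()), 3))
```

Since no normalization happens across labels, the sum of the scores carries no meaning in this mode; a per-label threshold is the usual way to decide which labels apply.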


#### Sentiment Classification

Here's an example of sentiment classification: 

In [18]:
sequence = "I hated this movie. The acting sucked."
candidate_labels = ["positive", "negative"]

pred = classifier(sequence, candidate_labels)
get_predictions_score(pred)

| Sequence | Labels | Probability |
|---|---|---|
| I hated this movie. The acting sucked. | negative | 0.995728 |
| | positive | 0.004272 |


So how does this method work?

The underlying model is trained on the task of Natural Language Inference (NLI), which takes in two sequences and determines whether they contradict each other, entail each other, or neither.

This can be adapted to the task of zero-shot classification by treating the sequence which we want to classify as one NLI sequence (called the premise) and turning a candidate label into the other (the hypothesis). If the model predicts that the constructed premise _entails_ the hypothesis, then we can take that as a prediction that the label applies to the text. Check out [this blog post](https://joeddav.github.io/blog/2020/05/29/ZSL.html) for a more detailed explanation.
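
The construction described above can be sketched in a few lines. The function name `build_nli_pairs` is hypothetical and only illustrates the idea; the pipeline does this internally, and the template string is an assumption you can swap out.

```python
def build_nli_pairs(sequence, candidate_labels, hypothesis_template="This example is {}."):
    """Pair the text (premise) with one hypothesis per candidate label."""
    return [(sequence, hypothesis_template.format(label)) for label in candidate_labels]

pairs = build_nli_pairs("Who are you voting for in 2020?", ["politics", "public health"])
for premise, hypothesis in pairs:
    print(f"premise: {premise!r} -> hypothesis: {hypothesis!r}")
```

Each pair is then fed to the NLI model, and the entailment score for a pair becomes the score for its label.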

By default, the pipeline turns labels into hypotheses with the template `This example is {}.`, where `{}` is replaced with the candidate label. This works well in many settings, but you can also customize it for your specific setting. Let's add another review to our above sentiment classification example that's a bit more challenging:

In [19]:
sequences = [
    "I hated this movie. The acting sucked.",
    "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."
]
candidate_labels = ["positive", "negative"]

classifier(sequences, candidate_labels)

[{'sequence': 'I hated this movie. The acting sucked.',
  'labels': ['negative', 'positive'],
  'scores': [0.9957284331321716, 0.004271592944860458]},
 {'sequence': "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it.",
  'labels': ['positive', 'negative'],
  'scores': [0.8578121662139893, 0.14218787848949432]}]

The second example is a bit harder. Let's see if we can improve the results by using a hypothesis template that is more specific to review sentiment analysis. Instead of the default, `This example is {}.`, we'll use `The sentiment of this review is {}.` (where `{}` is replaced with the candidate class name).

In [20]:
sequences = [
    "I hated this movie. The acting sucked.",
    "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it."
]
candidate_labels = ["positive", "negative"]
hypothesis_template = "The sentiment of this review is {}."

classifier(sequences, candidate_labels, hypothesis_template=hypothesis_template)

[{'sequence': 'I hated this movie. The acting sucked.',
  'labels': ['negative', 'positive'],
  'scores': [0.9941108822822571, 0.0058891079388558865]},
 {'sequence': "This movie didn't quite live up to my high expectations, but overall I still really enjoyed it.",
  'labels': ['positive', 'negative'],
  'scores': [0.9871302247047424, 0.01286973338574171]}]

With a more precise hypothesis template, the model classifies the second review much more confidently: its `positive` score rises from about 0.86 to about 0.99.

> Note that sentiment classification is used here just as an illustrative example. The [Hugging Face Model Hub](https://huggingface.co/models?filter=text-classification) has a number of models trained specifically on sentiment tasks which can be used instead.

#### Zero-shot classification in 100 languages



Interested in using the pipeline for languages other than English? There is a cross-lingual model built on top of XLM-RoBERTa which you can use by passing `model='joeddav/xlm-roberta-large-xnli'` when creating the pipeline:

In [26]:
classifier = pipeline("zero-shot-classification", model='joeddav/xlm-roberta-large-xnli', device=0)

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFXLMRobertaForSequenceClassification: ['roberta.embeddings.position_ids']
- This IS expected if you are initializing TFXLMRobertaForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFXLMRobertaForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFXLMRobertaForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFXLMRobertaForSequenceClassification for predictions without further training.


You can use it with any combination of languages. For example, let's classify a Russian sentence with English candidate labels:

In [27]:
sequence = "За кого вы голосуете в 2020 году?" # translation: "Who are you voting for in 2020?"
candidate_labels = ["Europe", "public health", "politics"]

classifier(sequence, candidate_labels)

{'sequence': 'За кого вы голосуете в 2020 году?',
 'labels': ['politics', 'Europe', 'public health'],
 'scores': [0.904848575592041, 0.05722159147262573, 0.03792988136410713]}

Now let's do the same but with the labels in French:



In [28]:
sequence = "За кого вы голосуете в 2020 году?" # translation: "Who are you voting for in 2020?"
candidate_labels = ["Europe", "santé publique", "politique"]

classifier(sequence, candidate_labels)

{'sequence': 'За кого вы голосуете в 2020 году?',
 'labels': ['politique', 'Europe', 'santé publique'],
 'scores': [0.9726150035858154, 0.01712874136865139, 0.010256249457597733]}

As we discussed in the last section, the default hypothesis template is the English `This example is {}.`. If you are working strictly within one language, it may be worthwhile to translate this to the language you are working with:

In [29]:
sequence = "¿A quién vas a votar en 2020?"
candidate_labels = ["Europa", "salud pública", "política"]
hypothesis_template = "Este ejemplo es {}."

classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template)

{'sequence': '¿A quién vas a votar en 2020?',
 'labels': ['política', 'Europa', 'salud pública'],
 'scores': [0.9109603762626648, 0.05954741686582565, 0.02949213795363903]}
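
If you switch between languages often, one option is to keep the translated templates in a small lookup. This is only a sketch: the names `HYPOTHESIS_TEMPLATES` and `template_for` are hypothetical, the Spanish entry is the one used above, and the French and German entries are plain translations added for illustration rather than templates tested with the model.

```python
# Hypothetical lookup of translated hypothesis templates.
HYPOTHESIS_TEMPLATES = {
    "en": "This example is {}.",
    "es": "Este ejemplo es {}.",
    "fr": "Cet exemple est {}.",      # illustrative translation, untested with the model
    "de": "Dieses Beispiel ist {}.",  # illustrative translation, untested with the model
}

def template_for(language_code, default="en"):
    """Return the template for a language code, falling back to the default language."""
    return HYPOTHESIS_TEMPLATES.get(language_code, HYPOTHESIS_TEMPLATES[default])

print(template_for("es").format("política"))
print(template_for("sw").format("siasa"))  # no Swahili entry, so the English template is used
```

The result can be passed straight to the pipeline, for example `classifier(sequence, candidate_labels, hypothesis_template=template_for("es"))`.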

The model is fine-tuned on XNLI, which includes 15 languages: Arabic, Bulgarian, Chinese, English, French, German, Greek, Hindi, Russian, Spanish, Swahili, Thai, Turkish, Urdu, and Vietnamese. The base model is trained on 85 more, so the model will work to some degree for any of the languages in the XLM-RoBERTa training corpus (see the full list in appendix A of the [XLM-RoBERTa paper](https://arxiv.org/abs/1911.02116)).

See the [model page](https://huggingface.co/joeddav/xlm-roberta-large-xnli) for more.

### Different Pipeline models

[Read here](https://huggingface.co/docs/transformers/main_classes/pipelines) about the different pipelines available from Hugging Face.

#### Text Generation

In [30]:
text_gen = pipeline("text-generation", model='gpt2') # no device is requested here; pass device=0, as above, to utilize the GPU

All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [31]:
prompt = "Data Science is cool"
text_gen(prompt, max_length=30, num_return_sequences=3)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "Data Science is cool right now because I'm pretty good at it. But it's hard to do much except get my hands on a few years on"},
 {'generated_text': 'Data Science is cool! And I would love to see more of your research involved in your project!\n\nFor more info'},
 {'generated_text': "Data Science is cool, but why wouldn't it be cool for anyone in the gaming community to see their games and videos disappear from YouTube? The game"}]

In [None]:
prompt = "Gambling is"
text_gen(prompt, max_length=30, num_return_sequences=3)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Gambling is a serious problem in the United States, with over 2.3 million reported gambling crimes each year and nearly $2 billion going to criminals'},
 {'generated_text': 'Gambling is such an integral part of modern sports, it is almost impossible to describe what is gambling like without having actually seen it. We all know'},
 {'generated_text': 'Gambling is the new $5 billion that allows wealthy investors to invest and control millions of dollars in Bitcoin.\n\nBitcoin is one of the most'}]

In [34]:
prompt = "crypto is"
text_gen(prompt, max_length=30, num_return_sequences=3)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'crypto is not going away.\n\nWith the exception of the very last bitcoin, which is already dead, the crypto revolution has made bitcoin more'},
 {'generated_text': 'crypto is getting more and more people to think about cryptocurrency, and I think people are starting to realize that I don\'t give a shit."\n'},
 {'generated_text': "crypto is a public protocol that was invented back in 1986 by some people. I've actually heard about it back then (including the original BNet"}]

You can play around with different starting sentences. Change the `max_length` argument if you want shorter or longer outputs; note that it counts tokens rather than words and includes the prompt.
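
Since `max_length` caps the prompt and the continuation together, the room left for new text shrinks as the prompt grows. The arithmetic can be sketched as below; the helper `new_token_budget` is hypothetical, and the prompt token counts are assumed values, because real counts depend on the tokenizer.

```python
def new_token_budget(prompt_token_count, max_length):
    """Tokens left for generation when max_length caps prompt + continuation."""
    return max(0, max_length - prompt_token_count)

# Assumed prompt lengths in tokens (illustrative; real counts depend on the tokenizer).
print(new_token_budget(prompt_token_count=4, max_length=30))
print(new_token_budget(prompt_token_count=40, max_length=30))
```

If you would rather set the length of the continuation directly, the generation API also accepts `max_new_tokens`, which does not count the prompt.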

#### Sentiment Analysis

The sentiment analysis example from earlier in this notebook can also be done using a dedicated sentiment analysis pipeline.

In [None]:
## Create a new sentiment-analysis pipeline and play with the examples in the new pipeline
## HINT: You don't need to provide labels to the sentiment analysis pipeline as it is trained for the same task

### OPTIONAL
#### You can create a Hugging Face account and an access token if you wish to create or push content to a repository (e.g., when training a model or modifying a model card) on Hugging Face.

- Create an account at https://huggingface.co/
- After logging in
    - go to Settings->Access Tokens
    - Create new token and give write permissions

- Run these commands
    - `brew install huggingface-cli`
    - `huggingface-cli login` and paste the access token from Hugging Face
    - **If it asks whether to add the token as a git credential, answer no**

    Reference: https://huggingface.co/docs/hub/security-tokens