## Natural Language Inference

Zero-shot classification can also be done using [Natural Language Inference (NLI)](https://nlpprogress.com/english/natural_language_inference.html), the task of determining whether a Hypothesis (الفرضية) is true, false, or undetermined given a Premise (المقدمة):

- True (**entailment**)
- False (**contradiction**)
- Undetermined (**neutral**)

### Logical Classifications (التصنيفات المنطقية)

There are three possible outputs:

- **Neutral (الإمكان):** The truth of the Premise **does not determine** the truth of the Hypothesis.
    - _Variation Example:_
        - **Premise:** "Two people are smiling at a party."
        - **Hypothesis:** "They are happy about the food."
        - **Arabic:** (شخصان يبتسمان في حفلة) $\leftarrow$ (إنهما سعيدان بالطعام).
- **Entailment (اللزوم):** If the Premise is true, the Hypothesis **must be true**.
    - _Variation Example:_
        - **Premise:** "A soccer player is running across the field."
        - **Hypothesis:** "A person is moving."
        - **Arabic:** (لاعب كرة قدم يركض عبر الملعب) $\leftarrow$ (شخص يتحرك).
- **Contradiction (التناقض):** If the Premise is true, the Hypothesis **must be false**.
    - _Variation Example:_
        - **Premise:** "A woman is inspecting a uniform."
        - **Hypothesis:** "The woman is sleeping."
        - **Arabic:** (شخص يتفحص زيًا رسميًا) $\leftarrow$ (شخص نائم).

Example:

| Premise | Hypothesis | Label |
| --- | --- | --- |
| A soccer game with multiple males playing. | Some men are playing a sport. | entailment |
| A man inspects the uniform of a figure in some East Asian country. | The man is sleeping. | contradiction |
| An older and younger man smiling. | Two men are smiling and laughing at the cats playing on the floor. | neutral |

NLI can perform zero-shot classification by recasting the task as a true/false question:

1. Take the text you want to classify (e.g., a movie review) and treat it as the "premise."
2. Craft a "hypothesis" for each candidate label, such as "This is a positive review."
3. The model checks whether the hypothesis follows from the premise (entailment) or contradicts it (contradiction).
    - If the premise entails the hypothesis, label the text positive.
    - If it contradicts the hypothesis, label it negative.

The model needs no training on the target labels for this; a model already trained on NLI data is reused as-is.
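The steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the entailment logits are made-up numbers standing in for a real NLI model's output, and the helper names (`build_pairs`, `rank_labels`) are hypothetical. The template string mirrors the default hypothesis template of the `transformers` zero-shot pipeline.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def build_pairs(premise, labels, template="This example is {}."):
    """Pair the text (premise) with one hypothesis per candidate label."""
    return [(premise, template.format(label)) for label in labels]


def rank_labels(labels, entailment_logits):
    """Single-label mode: softmax the entailment logits across all labels."""
    scores = softmax(entailment_logits)
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


review = "A moving, beautifully shot film."
labels = ["positive", "negative"]
pairs = build_pairs(review, labels)

# Made-up entailment logits, one per (premise, hypothesis) pair.
# A real NLI model would produce these values.
fake_entailment_logits = [3.2, -1.5]
ranking = rank_labels(labels, fake_entailment_logits)

print(pairs)
print(ranking)
```

The label whose hypothesis receives the highest entailment score wins, and the scores sum to 1 because the softmax runs across labels.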

Zero-, one-, and few-shot classification seem to be an emergent capability of large language models, appearing at model sizes of roughly 100M parameters and above. Effectiveness on these tasks seems to scale with model size: larger models (those with more trainable parameters or layers) generally do better.

- Approaches to NLI range from earlier symbolic and statistical methods to more recent deep learning methods.
- Benchmark datasets used for NLI include [SNLI](https://paperswithcode.com/dataset/snli), [MultiNLI](https://paperswithcode.com/dataset/multinli), [SciTail](https://paperswithcode.com/dataset/scitail), among others.
- You can get hands-on practice on the SNLI task by following this [d2l.ai chapter](https://d2l.ai/chapter_natural-language-processing-applications/natural-language-inference-and-dataset.html).

Let's grab an [NLI model from Hugging Face](https://huggingface.co/models?pipeline_tag=zero-shot-classification) and demonstrate how to use it for zero-shot classification:

In [None]:
%pip install -qU datasets transformers[sentencepiece]

In [None]:
from transformers import pipeline

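# BART-large fine-tuned on MultiNLI, used here as a zero-shot classifier
pipe = pipeline(
    "zero-shot-classification",  # state the task explicitly
    model="facebook/bart-large-mnli",
    device="cuda",  # use "cpu" if no GPU is available
)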

In [None]:
predictions = pipe("I have a problem with my iphone that needs to be resolved asap!",
    candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"],
)
predictions

Let's make it multi-label classification via `multi_label=True`, which scores each label independently:

In [None]:
predictions = pipe("I have a problem with my iphone that needs to be resolved asap!",
    candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"],
    multi_label=True
)
predictions
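To see why multi-label scores no longer sum to 1, here is a sketch of the per-label scoring: for each label, the neutral logit is dropped and a softmax is taken over the contradiction and entailment logits only. The logits below are made-up values, and `entailment_probability` is a hypothetical helper name, not a library function.

```python
import math


def entailment_probability(contradiction_logit, entailment_logit):
    """Softmax over [contradiction, entailment] only; neutral is ignored."""
    peak = max(contradiction_logit, entailment_logit)
    e_contra = math.exp(contradiction_logit - peak)
    e_entail = math.exp(entailment_logit - peak)
    return e_entail / (e_contra + e_entail)


# Made-up (contradiction, entailment) logits per candidate label.
fake_logits = {
    "urgent": (-2.0, 3.0),
    "phone": (-1.0, 2.5),
    "tablet": (2.0, -1.5),
}

# Each label is scored on its own, so several labels can score high at once.
multi_label_scores = {
    label: entailment_probability(contra, entail)
    for label, (contra, entail) in fake_logits.items()
}

for label, score in multi_label_scores.items():
    print(f"{label}: {score:.3f}")
```

In single-label mode, by contrast, the entailment logits compete in one softmax across all labels, so raising one label's score lowers the others.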

Let's try running this on the `rotten_tomatoes` dataset (movie reviews):

In [None]:
import pandas as pd
from datasets import load_dataset

tomatoes = load_dataset("rotten_tomatoes")

# Pandas for easier control
# tomatoes_train_df = pd.DataFrame(tomatoes["train"])
tomatoes_eval_df = pd.DataFrame(tomatoes["test"])

Classifying all 1,066 test examples took about 44 s in one run on a T4 GPU:

In [None]:
# Candidate labels
candidate_labels = [
    "very negative movie review",
    "very positive movie review",
]
# Map class index -> candidate label (0 = negative, 1 = positive)
candidate_labels_dict = dict(enumerate(candidate_labels))

# Create predictions
predictions = pipe(tomatoes_eval_df.text.tolist(), candidate_labels)

In [None]:
predictions
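To turn these predictions into an accuracy figure, one option is to map each top-ranked label back to a class index and compare it with the gold labels. The sketch below assumes the pipeline's output shape (a list of dicts whose `labels` are sorted by descending score) and uses a few made-up predictions so it runs without the model; `to_class_ids` and `accuracy` are hypothetical helper names.

```python
candidate_labels = [
    "very negative movie review",  # index 0, matches label 0 (negative)
    "very positive movie review",  # index 1, matches label 1 (positive)
]
label_to_id = {label: idx for idx, label in enumerate(candidate_labels)}


def to_class_ids(predictions, label_to_id):
    """Take the top-ranked label of each prediction and map it to a class id."""
    return [label_to_id[pred["labels"][0]] for pred in predictions]


def accuracy(predicted, gold):
    """Fraction of positions where the predicted id equals the gold id."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Made-up predictions in the same shape the pipeline returns.
neg, pos = candidate_labels
fake_predictions = [
    {"sequence": "a joy to watch", "labels": [pos, neg], "scores": [0.97, 0.03]},
    {"sequence": "dull and overlong", "labels": [neg, pos], "scores": [0.91, 0.09]},
    {"sequence": "not bad at all", "labels": [neg, pos], "scores": [0.60, 0.40]},
]
fake_gold = [1, 0, 1]

predicted_ids = to_class_ids(fake_predictions, label_to_id)
print(predicted_ids)
print(accuracy(predicted_ids, fake_gold))
```

With the real objects, the same helpers would be called as `to_class_ids(predictions, label_to_id)` and compared against `tomatoes_eval_df.label.tolist()`. This relies on the candidate labels being listed in the same order as the dataset's class ids.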

## Exercise

Identify the user intent of each message using NLI (the expected intent is shown beneath each message):

- `"Hello, I want to get me a Laptop how much does it cost?"`
    - `"BUY Laptop"`
- `"I am very frustrated with your service, and I wanna cancel right now!"`
    - `"CANCEL Subscription"`
- `"Here I bought this Keyboard, but it is not working. I want to get my money back!"`
    - `"REFUND Transaction"`

In [None]:
# YOUR CODE HERE

In [None]:
predictions = pipe(
    "Hello, I want to get me a Laptop how much does it cost?",
    candidate_labels=[
        "BUY Laptop",
        "CANCEL Subscription",
        "REFUND Transaction",
    ],
    multi_label=True
)
predictions
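To cover all three messages, the call can be wrapped in a small helper that returns the top-ranked intent, or `None` when the best score is too low. This is a sketch under assumptions: `detect_intent` and the `min_score` threshold are hypothetical, and `fake_classifier` is a keyword-based stand-in that only mimics the pipeline's output shape so the example runs without the model.

```python
def detect_intent(classifier, utterance, intents, min_score=0.5):
    """Return the top-ranked intent, or None if its score is below min_score."""
    result = classifier(utterance, candidate_labels=intents)
    top_label, top_score = result["labels"][0], result["scores"][0]
    return top_label if top_score >= min_score else None


def fake_classifier(utterance, candidate_labels):
    """Stand-in for the real pipeline: scores labels by keyword, not by NLI."""
    keywords = {
        "BUY Laptop": "cost",
        "CANCEL Subscription": "cancel",
        "REFUND Transaction": "money back",
    }
    scored = [
        (label, 0.9 if keywords[label] in utterance.lower() else 0.1)
        for label in candidate_labels
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return {
        "sequence": utterance,
        "labels": [label for label, _ in scored],
        "scores": [score for _, score in scored],
    }


intents = ["BUY Laptop", "CANCEL Subscription", "REFUND Transaction"]
utterances = [
    "Hello, I want to get me a Laptop how much does it cost?",
    "I am very frustrated with your service, and I wanna cancel right now!",
    "Here I bought this Keyboard, but it is not working. I want to get my money back!",
]

detected = [detect_intent(fake_classifier, text, intents) for text in utterances]
for text, intent in zip(utterances, detected):
    print(f"{intent!r} <- {text}")
```

Passing the real `pipe` in place of `fake_classifier` would run the same logic against the NLI model; the threshold is worth tuning, since a message may match none of the intents.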