# Zero-Shot Classification

In this notebook, we use the `zero-shot-classification` pipeline from the Hugging Face Transformers library to predict the intent of each sentence in a dataset. We then compare the predicted intents with the ground-truth labels and print the evaluation metrics.

The goal is an NLP-based framework that parses an input question and assigns its intent to one of the question types below.

Question Types:
1. Why is action A not used in the plan, rather than being used?
2. Why is action A used in the plan, rather than not being used?
3. Why is action A used in state S, rather than action B?

## Single Text Prediction

In [2]:
from transformers import pipeline
from pprint import pprint

In [3]:
# Load a pre-trained zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

In [4]:
# Define the query and candidate labels
candidate_labels = ["Why is action A not used in the plan?", 
                    "Why is action A used in the plan?", 
                    "Why is action A used in state S, rather than action B?"]
query = "What made 'push box to the left' more suitable than 'move to the right'?"

# Perform zero-shot classification
result = classifier(query, candidate_labels)
pprint(result, width=100)

{'labels': ['Why is action A used in the plan?',
            'Why is action A not used in the plan?',
            'Why is action A used in state S, rather than action B?'],
 'scores': [0.3973885476589203, 0.3694774806499481, 0.2331339716911316],
 'sequence': "What made 'push box to the left' more suitable than 'move to the right'?"}
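
By default the pipeline turns each candidate label into an NLI hypothesis by substituting it into the template `"This example is {}."`, then scores how strongly the query entails that hypothesis. Because the labels here are already full questions, the resulting hypotheses read awkwardly. The sketch below only reproduces the string formatting so the hypotheses can be inspected; `build_hypotheses` is a hypothetical helper, not part of Transformers.

```python
# Hypothetical helper: mirrors how a hypothesis template is filled in per label.
DEFAULT_TEMPLATE = "This example is {}."  # default template of the zero-shot pipeline

def build_hypotheses(labels, template=DEFAULT_TEMPLATE):
    """Return the hypothesis string produced for each candidate label."""
    return [template.format(label) for label in labels]

candidate_labels = ["Why is action A not used in the plan?",
                    "Why is action A used in the plan?",
                    "Why is action A used in state S, rather than action B?"]

for hypothesis in build_hypotheses(candidate_labels):
    print(hypothesis)
```

The pipeline call accepts a `hypothesis_template` argument, for example `classifier(query, candidate_labels, hypothesis_template="{}")`, which would pass the labels through unchanged. Whether that improves the scores on this dataset is untested here.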


In [5]:
# Same query as before, but the third label drops the "in state S" qualifier
candidate_labels = ["Why is action A not used in the plan?", 
                    "Why is action A used in the plan?", 
                    "Why is action A used rather than action B?"]
query = "What made 'push box to the left' more suitable than 'move to the right'?"

# Perform zero-shot classification
result = classifier(query, candidate_labels)
pprint(result, width=100)

{'labels': ['Why is action A used rather than action B?',
            'Why is action A used in the plan?',
            'Why is action A not used in the plan?'],
 'scores': [0.7602096199989319, 0.12425892055034637, 0.11553144454956055],
 'sequence': "What made 'push box to the left' more suitable than 'move to the right'?"}


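Dropping the "in state S" qualifier changes the outcome. "Why is action A used rather than action B?" ranks first with a score of 0.76, whereas "Why is action A used in state S, rather than action B?" ranked last with a score of 0.23. For this query, the shorter wording is the better intent category label.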
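
One way to quantify the difference between the two label sets is the gap between the top two scores. In the default single-label setting the scores are normalized across the candidate labels and sum to 1, so a small gap means the model barely separates them. The sketch below uses the scores printed in the two outputs above; `top_margin` is a hypothetical helper.

```python
def top_margin(scores):
    """Gap between the best and the second-best score."""
    ranked = sorted(scores, reverse=True)
    return ranked[0] - ranked[1]

# Scores copied from the two pipeline outputs above
scores_with_state = [0.3973885476589203, 0.3694774806499481, 0.2331339716911316]
scores_without_state = [0.7602096199989319, 0.12425892055034637, 0.11553144454956055]

print(f"{top_margin(scores_with_state):.3f}")     # 0.028
print(f"{top_margin(scores_without_state):.3f}")  # 0.636
```

Note that in the first run the small margin separates two incorrect labels, while the intended label sits last at 0.23.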


## Multiple Text Prediction and Evaluation
Predict the intent of every sentence in the `text` column of the dataset CSV, compare the predictions with the ground-truth values in the `label` column, and print the evaluation metrics.

In [12]:
import pandas as pd
from transformers import pipeline
from sklearn.metrics import classification_report, accuracy_score

# Load the CSV file into a DataFrame
df = pd.read_csv('./data/intent_classification_dataset.csv')
print(f"Number of rows in the dataset: {df.shape[0]}")
df.head()

Number of rows in the dataset: 107


Unnamed: 0,text,label
0,Why is action A not included in the project ro...,1
1,What are the reasons for excluding action A fr...,1
2,Why was action A omitted from the strategy?,1
3,Why didn't we consider action A for the projec...,1
4,Why was action A left out of the final plan?,1


In [7]:
# Define the candidate labels and their corresponding intent numbers
candidate_labels = ["Why is action A not used in the plan?", 
                    "Why is action A used in the plan?", 
                    "Why is action A used rather than action B?"]

# Keys are the label strings, values are the intent numbers (1-3) used in the `label` column
intent_to_label = {label: intent for intent, label in enumerate(candidate_labels, start=1)}
intent_to_label

{'Why is action A not used in the plan?': 1,
 'Why is action A used in the plan?': 2,
 'Why is action A used rather than action B?': 3}

In [8]:
# Load a pre-trained zero-shot classification pipeline
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Return the intent number of the top-scoring label for a single text
# (the pipeline returns labels sorted by descending score)
def get_prediction(text):
    result = classifier(text, candidate_labels)
    predicted_label = intent_to_label[result['labels'][0]]
    return predicted_label

In [9]:
# Apply the function to the text column
df['predicted_label'] = df['text'].apply(get_prediction)
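
Calling the classifier once per row works but can be slow on larger datasets. The pipeline also accepts a list of texts and returns one result dict per text, so the whole column can be passed in a single call. The sketch below relies on that list-in, list-out behaviour; `predict_intents` and `fake_classifier` are hypothetical names, and the stub stands in for the real model so the example runs without downloading weights.

```python
def predict_intents(classifier, texts, candidate_labels, label_to_intent):
    """Classify all texts in one call and map each top label to its intent number."""
    results = classifier(list(texts), candidate_labels)
    return [label_to_intent[result["labels"][0]] for result in results]

def fake_classifier(texts, labels):
    """Stand-in for the real pipeline: always ranks the first label on top."""
    return [{"sequence": text, "labels": list(labels)} for text in texts]

candidate_labels = ["Why is action A not used in the plan?",
                    "Why is action A used in the plan?",
                    "Why is action A used rather than action B?"]
label_to_intent = {label: i for i, label in enumerate(candidate_labels, start=1)}

predictions = predict_intents(fake_classifier,
                              ["Why was action A omitted from the strategy?",
                               "Why was action A left out of the final plan?"],
                              candidate_labels, label_to_intent)
print(predictions)  # [1, 1] with the stub, since it always ranks label 1 first
```

With the real pipeline the equivalent call would be `df['predicted_label'] = predict_intents(classifier, df['text'], candidate_labels, intent_to_label)`. The pipeline call also accepts a `batch_size` argument, which may help on a GPU.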

In [10]:
# Compare predicted labels with actual labels
y_true = df['label']
y_pred = df['predicted_label']

# Print the classification report
print(classification_report(y_true, y_pred))
print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")

              precision    recall  f1-score   support

           1       1.00      0.97      0.99        36
           2       0.97      1.00      0.99        35
           3       1.00      1.00      1.00        36

    accuracy                           0.99       107
   macro avg       0.99      0.99      0.99       107
weighted avg       0.99      0.99      0.99       107

Accuracy: 0.99


In [11]:
# Display the rows in which the predictions didn't match the label
incorrect_predictions = df[df['label'] != df['predicted_label']]
incorrect_predictions

Unnamed: 0,text,label,predicted_label
35,The player doesn't push any boxes. Shouldn't p...,1,2
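
This single error accounts for every non-perfect number in the classification report. The sketch below rebuilds the per-class counts from the report's support values and that one misclassification (a class-1 example predicted as class 2), then recomputes precision, recall and accuracy in plain Python. The `confusion` dictionary is reconstructed from the printed outputs, not read from the dataset.

```python
# confusion[(true_label, predicted_label)] = count, reconstructed from the report
confusion = {(1, 1): 35, (1, 2): 1, (2, 2): 35, (3, 3): 36}
classes = [1, 2, 3]

def precision(c):
    """Correct predictions of class c divided by all predictions of class c."""
    predicted = sum(n for (_, pred), n in confusion.items() if pred == c)
    return confusion.get((c, c), 0) / predicted

def recall(c):
    """Correct predictions of class c divided by all true examples of class c."""
    actual = sum(n for (true, _), n in confusion.items() if true == c)
    return confusion.get((c, c), 0) / actual

for c in classes:
    print(c, f"precision={precision(c):.2f}", f"recall={recall(c):.2f}")

correct = sum(confusion.get((c, c), 0) for c in classes)
total = sum(confusion.values())
print(f"Accuracy: {correct / total:.2f}")  # 106 / 107
```

Class 1 loses recall (35 of 36 found) and class 2 loses precision (35 of 36 predictions correct), which matches the 0.97 entries in the report.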
