# Zero-Shot Classification

This notebook shows how to classify text when no labeled dataset is available for training. Each review is assigned the label whose description is most similar to it in embedding space.

**Steps:**
1. Load the `rotten_tomatoes` dataset.
2. Load the `sentence-transformers/all-mpnet-base-v2` model.
3. Encode the test split.
4. Define the labels and encode them.
5. Use `cosine_similarity` to compare every review embedding with every label embedding.
6. Pick the most similar label for each review and evaluate the performance.

In [1]:
from datasets import load_dataset

dataset = load_dataset("rotten_tomatoes")

In [8]:
from sentence_transformers import SentenceTransformer

model_name = "sentence-transformers/all-mpnet-base-v2"

model = SentenceTransformer(model_name)

In [9]:
# Order matters: index 0 is negative and index 1 is positive, matching the dataset's label ids.
labels = [
    "A negative movie review.",
    "A positive movie review.",
]


In [10]:
# Embed the test reviews and the label descriptions with the same model.
encoded_sentences = model.encode(dataset["test"]["text"])
encoded_labels = model.encode(labels)

In [11]:
from sklearn.metrics.pairwise import cosine_similarity

# One row per review, one column per label: shape (n_reviews, n_labels).
sim_matrix = cosine_similarity(encoded_sentences, encoded_labels)
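
To make the similarity step concrete, here is a minimal, dependency-free sketch of the pairwise cosine computation. The helper `cosine` and the 2-D vectors are hypothetical and chosen only for illustration; real embeddings from the model have many more dimensions.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 2-D "embeddings", not produced by any model.
toy_sentences = [[1.0, 0.0], [1.0, 1.0]]
toy_labels = [[1.0, 0.0], [0.0, 1.0]]

# One row per sentence, one column per label, mirroring the layout of sim_matrix.
toy_sim = [[cosine(s, l) for l in toy_labels] for s in toy_sentences]
print(toy_sim)
```

The first toy sentence points in the same direction as the first label, so it scores 1 against it and 0 against the second; the second sentence sits at 45 degrees to both labels and scores the same against each.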


In [14]:
import numpy as np

# For each review, take the index of the label with the highest similarity.
y_pred = np.argmax(sim_matrix, axis=1)
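
As a small illustration of the prediction step, the sketch below applies `np.argmax` to a made-up similarity matrix and maps the indices back to readable names. The scores, `label_names`, and the `margin` heuristic are assumptions added for this example and are not part of the notebook's pipeline.

```python
import numpy as np

# Hypothetical similarity scores for three reviews against the two labels.
toy_sim = np.array([
    [0.12, 0.31],  # closer to label 1
    [0.40, 0.05],  # closer to label 0
    [0.22, 0.25],  # narrow margin, still label 1
])
label_names = ["negative", "positive"]  # same order as the label descriptions

# Index of the highest-scoring label in each row.
toy_pred = np.argmax(toy_sim, axis=1)
toy_names = [label_names[i] for i in toy_pred]

# Gap between the best and second-best score, a rough signal of how decisive a prediction is.
sorted_scores = np.sort(toy_sim, axis=1)
margin = sorted_scores[:, -1] - sorted_scores[:, -2]
print(toy_names, margin)
```

A small margin, as in the third row, marks a review that the two label descriptions barely separate.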

In [15]:
from sklearn.metrics import classification_report

def evaluate_performance(y_true, y_pred):
    """Print per-class precision, recall, and F1 for the predictions."""
    performance = classification_report(
        y_true, y_pred, target_names=["Negative Review", "Positive Review"]
    )
    print(performance)

In [16]:
evaluate_performance(dataset["test"]["label"], y_pred)

                 precision    recall  f1-score   support

Negative Review       0.83      0.78      0.80       533
Positive Review       0.79      0.83      0.81       533

       accuracy                           0.81      1066
      macro avg       0.81      0.81      0.81      1066
   weighted avg       0.81      0.81      0.81      1066
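
For readers unfamiliar with the columns in the report, the following sketch computes precision, recall, and F1 for one class from raw predictions. The helper `precision_recall_f1` and the six toy labels are hypothetical; the numbers in the report come from `classification_report`, not from this code.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for the class `positive`.

    Assumes the class is both predicted and present at least once,
    so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up labels for illustration only (0 = negative, 1 = positive).
toy_true = [0, 0, 0, 1, 1, 1]
toy_pred = [0, 0, 1, 1, 1, 0]

p, r, f = precision_recall_f1(toy_true, toy_pred, positive=1)
print(p, r, f)
```

Precision asks how many of the reviews predicted as a class really belong to it, recall asks how many reviews of that class were found, and F1 is their harmonic mean.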

