# Sentiment Classification Demo Notebook

Welcome to the inference notebook for our sentiment classification models. Below you’ll find two sections:

---

## 1. Interactive Inference with Our Toy DistilBERT Model

Use the cell below to classify your own sentences as **positive** or **negative**. Simply edit the `sentences` list at the top with the text you want to evaluate.

> **Note:** This is a **toy model** trained on only 40 examples due to limited compute resources. In a real‐world scenario you would fine-tune on the full SST-2 dataset (≈67 000 examples), use more epochs, larger batches, and (ideally) GPU acceleration to achieve much higher accuracy.

**Steps:**
1. **Specify your sentences**  
   At the very top of the cell, replace or extend the entries in the `sentences` list:
   ```python
   sentences = [
       "I enjoyed this product!",
       "Really awful experience.",
       "Your custom sentence here.",
   ]
   ```
2. **Load the fine-tuned model**  
   The code automatically loads the tokenizer and model from the `distilbert_finetuned/` directory where our toy classifier is saved.
3. **Tokenise & Infer**  
   Each sentence is tokenised (max length 64), converted to PyTorch tensors, and fed into the model in evaluation mode.
4. **Map predictions to labels**  
   The numeric outputs (`0` or `1`) are mapped back to human-readable labels (`"negative"` or `"positive"`) and printed.
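
The label-mapping step keeps only the highest-scoring class. With a model trained on 40 examples, it can help to see how close the two scores were. The sketch below is a minimal pure-Python illustration, assuming each row of logits is ordered `[negative, positive]` as in the cell's `id2label`. The helper name `label_with_confidence` and the sample logits are hypothetical, not values produced by the toy model. Inside the notebook, `torch.softmax(logits, dim=-1)` computes the same probabilities.

```python
import math

# Same index-to-label mapping the toy cell uses.
ID2LABEL = {0: "negative", 1: "positive"}

def softmax(row):
    """Convert one row of raw logits into probabilities that sum to 1."""
    shift = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def label_with_confidence(logit_rows):
    """Return (label, probability) for the highest-scoring class of each row."""
    out = []
    for row in logit_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        out.append((ID2LABEL[best], probs[best]))
    return out

# Hypothetical logits, made up for illustration only.
example_logits = [[0.2, 0.1], [2.0, -1.0]]
for label, prob in label_with_confidence(example_logits):
    print(f"{label} ({prob:.2f})")
```

A probability near 0.5 suggests the model is close to guessing, which is worth knowing before trusting a single label.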

---

## 2. Interactive Sentiment Inference Using a Pre-Trained SST-2 Model

This cell lets you classify any English sentences as **POSITIVE** or **NEGATIVE** using Hugging Face’s official DistilBERT checkpoint fine-tuned on the full SST-2 dataset.

**Steps:**
1. **Specify your sentences**  
   At the top of the cell, edit the `examples` list with any sentences you want to evaluate:
   ```python
   examples = [
       "I don't understand why people even buy this.",
       "Really awful experience.",
       "The product is excellent—highly recommended",
       # Add your custom sentences here.
   ]
   ```
2. **Load the official SST-2 model**  
   The code uses the checkpoint `distilbert-base-uncased-finetuned-sst-2-english`.
3. **Tokenise & Infer**  
   Sentences are tokenised to max length 64, converted to tensors, and passed into the model in evaluation mode.
4. **Print predictions**  
   Each input is printed alongside its predicted label (`"POSITIVE"` or `"NEGATIVE"`).
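
The tokeniser call in both cells uses `padding=True`, `truncation=True`, and `max_length=64`. In Hugging Face tokenizers, `padding=True` pads to the longest sequence in the batch rather than to `max_length`, while truncation cuts anything longer than `max_length`. The sketch below mimics that behaviour in plain Python on made-up token ids. The function name `pad_and_truncate`, the pad id `0`, and the ids themselves are assumptions for illustration, and special tokens such as `[CLS]` and `[SEP]` are ignored.

```python
def pad_and_truncate(batch, max_length=64, pad_id=0):
    """Truncate each sequence to max_length, then pad to the longest one left."""
    truncated = [ids[:max_length] for ids in batch]
    width = max((len(ids) for ids in truncated), default=0)
    input_ids = [ids + [pad_id] * (width - len(ids)) for ids in truncated]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in truncated]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Made-up token ids standing in for three tokenised sentences;
# the third one is 70 ids long, so it gets truncated to 64.
toy_batch = [[11, 12, 13], [21], list(range(100, 170))]
encoded = pad_and_truncate(toy_batch, max_length=64)
print(len(encoded["input_ids"][0]), sum(encoded["attention_mask"][1]))
```

The attention mask is what lets the model skip the padded positions, which is why the real tokeniser returns it alongside `input_ids`.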

Feel free to experiment with both the toy and the official models. For production-grade performance, follow the fine-tuning pipeline on the full dataset as shown in the previous notebook.





In [1]:
!pip install --quiet --upgrade --force-reinstall torch --index-url https://download.pytorch.org/whl/cpu


  You can safely remove it manually.
  You can safely remove it manually.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyterlab 4.2.4 requires httpx>=0.25.0, but you have httpx 0.13.3 which is incompatible.
torchvision 0.22.0 requires torch==2.7.0, but you have torch 2.7.1+cpu which is incompatible.
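
The resolver error above reports that the installed `torchvision 0.22.0` expects exactly `torch 2.7.0`, while the reinstall pulled `2.7.1+cpu`. A quick look at what is installed can confirm whether that mismatch is present in the running kernel. A minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical:

```python
from importlib import metadata

def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Packages named in the pip output above.
for name, version in installed_versions(["torch", "torchvision", "httpx"]).items():
    print(f"{name}: {version or 'not installed'}")
```

If the reported versions disagree with the pins in the error, one option is to install a matching `torch` and `torchvision` pair from the same index.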


## Interactive Inference with Our Toy DistilBERT Model

Use the cell below to classify your own sentences as **positive** or **negative**. Simply edit the `sentences` list at the top with the text you want to evaluate.

> **Note:** This is a **toy model** trained on only 40 examples due to limited compute resources. In a real‐world scenario you would fine-tune on the full SST-2 dataset (≈67k examples), use more epochs, larger batches, and (ideally) GPU acceleration to achieve much higher accuracy.

---

1. **Specify your sentences**  
   At the very top of the cell, replace or extend the entries in the `sentences` list:
   ```python
   sentences = [
       "I enjoyed this product!",
       "Really awful experience.",
       "Your custom sentence here.",
   ]
   ```

2. **Load the fine-tuned model**  
   The code will automatically load the tokenizer and model from the `distilbert_finetuned/` directory where our toy classifier is saved.

3. **Tokenise & Infer**  
   Each sentence is tokenised (max length 64), converted to PyTorch tensors, and fed into the model in evaluation mode.

4. **Map predictions to labels**  
   The numeric outputs (`0` or `1`) are mapped back to human-readable labels (`"negative"` or `"positive"`) and printed.

---

Feel free to experiment with different sentences. If you need higher fidelity, follow the same pipeline but fine-tune on a larger dataset and adjust hyperparameters accordingly.

In [2]:
# 0) 👉 Enter your own sentences below to classify
#    Replace or extend the list with any sentences you want the model to predict.
sentences = [
    "I enjoyed this product!",
    "Really awful experience.",
    "Product is simply Excellent!",
    # Add more sentences here, e.g. "Your custom sentence here."
]

# 1) Imports
import torch
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast

# 2) Point to your fine-tuned model directory
model_path = r"C:\Users\IAGhe\OneDrive\Documents\Learning\portfolio\toy_transformer_sentiment\Model\distilbert_finetuned"

# 3) Load tokenizer & model
tokenizer = DistilBertTokenizerFast.from_pretrained(model_path)
model     = DistilBertForSequenceClassification.from_pretrained(model_path)
model.eval()   # set to evaluation mode

# 4) Tokenise and convert to tensors
inputs = tokenizer(
    sentences,
    padding=True,
    truncation=True,
    max_length=64,
    return_tensors="pt"
)

# 5) Run inference (CPU-only)
with torch.no_grad():
    logits = model(**inputs).logits
    preds  = logits.argmax(dim=-1).tolist()

# 6) Map to human-readable labels and print results
id2label = {0: "negative", 1: "positive"}
results = [{"sentence": s, "prediction": id2label[p]} for s, p in zip(sentences, preds)]
for r in results:
    print(f"‣ \"{r['sentence']}\" → {r['prediction']}")


‣ "I enjoyed this product!" → negative
‣ "Really awful experience." → negative
‣ "Product is simply Excellent!" → negative
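
The cell above hard-codes an absolute Windows path in `model_path`, so it runs only on a machine with that exact folder. A minimal sketch of a more portable lookup, assuming the notebook is launched from a project folder that contains `Model/distilbert_finetuned` (the layout suggested by the path above). The helper name `find_model_dir` and its `project_root` parameter are hypothetical:

```python
from pathlib import Path

def find_model_dir(project_root=None):
    """Return <project_root>/Model/distilbert_finetuned, or raise if it is missing."""
    root = Path(project_root) if project_root is not None else Path.cwd()
    model_dir = root / "Model" / "distilbert_finetuned"
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    return model_dir

try:
    # In the notebook, pass str(model_path) to from_pretrained(...).
    model_path = find_model_dir()
    print(f"Loading model from {model_path}")
except FileNotFoundError as err:
    print(err)
```

Failing early with a clear message is easier to debug than the error raised when `from_pretrained` is given a directory that does not exist.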


## Interactive Sentiment Inference Using a Pre-Trained SST-2 Model

This cell lets you classify any English sentences as **POSITIVE** or **NEGATIVE** using Hugging Face’s official DistilBERT checkpoint fine-tuned on SST-2.

1. **Specify your sentences**
   At the top, edit the `examples` list with any sentences you want to evaluate:

   ```python
   examples = [
       "I don't understand why people even buy this.",
       "Really awful experience.",
       "The product is excellent—highly recommended",
       # Add your custom sentences here.
   ]
   ```


In [7]:
# 0) 👉 Enter your own sentences below to classify
#    Replace or extend this list with any sentences you want to evaluate.
examples = [
    "I don't understand why people even buy this.",
    "Really awful experience.",
    "The product is excellent—highly recommended",
    # e.g. "Your custom sentence here."
]

# 1) Imports
import torch
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

# 2) Load the official SST-2 fine-tuned DistilBERT model
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer  = DistilBertTokenizerFast.from_pretrained(checkpoint)
model      = DistilBertForSequenceClassification.from_pretrained(checkpoint)
model.eval()

# 3) Inference helper function
def predict(sentences):
    enc = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=64,
        return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**enc).logits
        preds  = logits.argmax(dim=-1).tolist()
    return [model.config.id2label[p] for p in preds]

# 4) Run predictions on your examples
results = predict(examples)
for sentence, label in zip(examples, results):
    print(f"» \"{sentence}\" → {label}")


» "I don't understand why people even buy this." → NEGATIVE
» "Really awful experience." → NEGATIVE
» "The product is excellent—highly recommended" → POSITIVE
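
`predict` tokenises and scores the whole list in one forward pass, which is fine for three sentences but can use a lot of memory on long lists. One common workaround is to score fixed-size chunks. A sketch under that assumption: `predict_in_batches` and its `batch_size` default are hypothetical, and `predict_fn` stands for any function with the same list-in, list-out shape as `predict` above.

```python
def predict_in_batches(sentences, predict_fn, batch_size=32):
    """Apply predict_fn to consecutive slices of `sentences` and join the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    labels = []
    for start in range(0, len(sentences), batch_size):
        labels.extend(predict_fn(sentences[start:start + batch_size]))
    return labels

# Stand-in predictor so the sketch runs without downloading a model;
# in the notebook you would pass the `predict` function instead.
def fake_predict(batch):
    return ["POSITIVE" if "excellent" in s.lower() else "NEGATIVE" for s in batch]

demo = ["Really awful experience.", "The product is excellent", "Meh."]
print(predict_in_batches(demo, fake_predict, batch_size=2))
```

Because each chunk is padded only to its own longest sentence, batching can also reduce wasted computation when sentence lengths vary widely.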
