# Sentence Transformers & Multi-Task Learning

### Task 1: Implementing the Sentence Transformer Model

In this task, we will implement a Sentence Transformer model that encodes sentences into fixed-length embeddings. We will also test the model by encoding a few example sentences and printing their embeddings.

We use the all-MiniLM-L6-v2 model. It is lightweight and efficient, and it produces good general-purpose sentence embeddings for NLP tasks.

We will use the sentence-transformers library, which provides a convenient way to load pre-trained transformer models and generate embeddings.

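A SentenceTransformer model such as 'all-MiniLM-L6-v2' produces an embedding of the same length (384 dimensions for this model) for every input sentence, no matter how short or long it is. Inputs longer than the model's maximum sequence length are truncated. This fixed size is what lets us compare embeddings directly or feed them to a classifier head.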
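As a rough illustration of why the embedding length does not depend on the sentence length: the model card for all-MiniLM-L6-v2 describes mean pooling over the token embeddings. The sketch below imitates only that pooling step, with random NumPy arrays standing in for the transformer's token outputs. The helper `mean_pool` and the token counts are hypothetical and are not part of the sentence-transformers library.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, ignoring padded positions."""
    mask = attention_mask[:, None].astype(token_embeddings.dtype)  # (tokens, 1)
    summed = (token_embeddings * mask).sum(axis=0)                 # (dim,)
    count = max(mask.sum(), 1.0)                                   # avoid division by zero
    return summed / count

rng = np.random.default_rng(0)
dim = 384  # embedding size of all-MiniLM-L6-v2

# Random stand-ins for token outputs: a 4-token and a 30-token sentence.
short_tokens = rng.normal(size=(4, dim))
long_tokens = rng.normal(size=(30, dim))

short_vec = mean_pool(short_tokens, np.ones(4))
long_vec = mean_pool(long_tokens, np.ones(30))
print(short_vec.shape, long_vec.shape)  # both are (384,)
```

Whatever the number of tokens, averaging over the token axis leaves a single vector of size `dim`.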


In [95]:
import numpy as np
import pandas as pd

In [96]:
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('all-MiniLM-L6-v2')
print("Model loaded successfully!")

Model loaded successfully!


In [97]:
# Example sentences to encode
sentences = ['This frame work generates embeddings for each sentence',
             'Sentences are passed as a list of string',
             "I am eating an apple",
             "I like fruits",
             "The weather is great today",
             "I love programming"]
# encode() returns one embedding per sentence; convert_to_tensor=True gives a torch tensor
embeddings = model.encode(sentences, convert_to_tensor=True)
for sentence, embedding in zip(sentences, embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding)
    print("")

Sentence: This frame work generates embeddings for each sentence
Embedding: tensor([-8.7361e-03, -1.1969e-02,  2.0720e-03, -1.1325e-02,  7.7704e-02,
         6.9390e-02,  5.7189e-02, -7.0450e-02,  4.5878e-02, -8.0678e-02,
         1.4686e-02,  1.2852e-02, -2.8919e-02,  3.4489e-02, -6.1153e-02,
         5.9264e-02,  2.2751e-02,  4.9651e-02, -3.9746e-02, -1.1132e-01,
         3.1434e-03,  6.0088e-02,  1.1203e-01, -4.1000e-02,  8.8195e-03,
         8.4127e-02, -1.0279e-01,  1.4482e-02,  1.0780e-01, -2.5484e-03,
         5.7024e-02, -2.6720e-02,  6.0212e-03,  4.0472e-02,  6.2981e-02,
         1.3852e-02, -2.7132e-02,  6.9404e-02, -3.7095e-02, -1.4485e-03,
        -2.3013e-03,  2.7515e-02,  6.8190e-02,  9.7448e-02,  1.8713e-02,
        -2.4885e-02, -5.6664e-02, -2.9501e-02, -2.5298e-02, -1.3050e-02,
        -9.0464e-02, -3.7461e-02,  9.9929e-03, -2.4404e-02, -3.3269e-02,
        -1.4803e-02,  1.8427e-02,  4.5409e-03,  3.9333e-02, -4.9104e-02,
        -5.5657e-02, -2.6084e-02,  1.0064e-02,  

Next, we compute the cosine similarity between two sentence embeddings. A value close to 1 means the two sentences are semantically similar.

In [98]:
emb1 = model.encode("I am eating apple")
emb2 = model.encode("I like fruits")
cos_sim = util.cos_sim(emb1, emb2)  # returns a 1x1 tensor for two single embeddings
print("Cosine Similarity:", cos_sim)

Cosine Similarity: tensor([[0.5398]])
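For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The following is a minimal NumPy sketch of that formula on toy 2-D vectors; `cosine_similarity` is a hypothetical helper written for illustration, not the library's `util.cos_sim`.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their L2 norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

The score depends only on direction, not on vector length, so the 0.5398 above can be read as a moderate semantic overlap between the two sentences.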


### Task 2: Multi-Task Learning Expansion

In [99]:
import torch
import torch.nn as nn

In [100]:
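class MultiTaskModel(nn.Module):
    """Shared sentence encoder with one linear classification head per task."""

    def __init__(self, model_name='all-MiniLM-L6-v2', num_classes_taskA=3, num_classes_taskB=3):
        super().__init__()
        self.transformer = SentenceTransformer(model_name)  # shared backbone
        dim = self.transformer.get_sentence_embedding_dimension()
        self.class_head = nn.Linear(dim, num_classes_taskA)  # Task A: category
        self.sent_head = nn.Linear(dim, num_classes_taskB)   # Task B: sentiment

    def forward(self, sentences):
        # encode() is an inference helper and does not track gradients,
        # so only the two heads are updated when this model is trained.
        embeddings = self.transformer.encode(sentences, convert_to_tensor=True)
        # Both heads read the same sentence embeddings.
        return self.class_head(embeddings), self.sent_head(embeddings)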
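One consequence of computing the embeddings with `encode()` is worth spelling out: to our understanding, `encode()` runs the transformer without gradient tracking, so the backbone behaves as if it were frozen. The sketch below shows the same effect with plain PyTorch modules. The small `nn.Linear` layer named `backbone` is a hypothetical stand-in for the sentence encoder, and the shapes are made up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

backbone = nn.Linear(8, 4)    # hypothetical stand-in for the sentence encoder
class_head = nn.Linear(4, 3)  # Task A head
sent_head = nn.Linear(4, 3)   # Task B head

x = torch.randn(5, 8)         # fake batch of 5 inputs
with torch.no_grad():         # mimics an inference-only encoding step
    embeddings = backbone(x)

# Both heads read the same embeddings; sum their logits into a dummy loss.
loss = class_head(embeddings).sum() + sent_head(embeddings).sum()
loss.backward()

print(backbone.weight.grad)                 # None: no gradient reached the backbone
print(class_head.weight.grad is not None)   # True
print(sent_head.weight.grad is not None)    # True
```

Fine-tuning the backbone as well would require computing the embeddings with gradient tracking enabled instead of calling `encode()`.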

### Task 3: Training Considerations


Task 3 Solution
1. Entire Network Frozen
Implications: Neither the transformer nor the heads are updated, so no learning happens.
Advantages: Fast; no training is needed and the existing weights are used as-is.
Rationale & How to Train: This only makes sense if the heads have already been trained for your tasks (e.g., 3 categories, 3 sentiments); freshly initialized heads give arbitrary predictions. There is no training step: just run model(sentences).

2. Transformer Backbone Frozen
Implications: self.transformer stays fixed; class_head and sent_head are updated.
Advantages: Quick training; keeps the pre-trained embeddings intact.
Rationale & How to Train: The pre-trained embeddings are already informative, so we only train the heads to fit our data. Freeze the backbone with model.transformer.eval() and for param in model.transformer.parameters(): param.requires_grad = False, then run a training loop.

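3. One Task-Specific Head Frozen (e.g., Classification Head)
Implications: class_head stays fixed; self.transformer and sent_head learn.
Advantages: Saves effort when one task already performs well, while the other keeps improving.
Rationale & How to Train: If Task A is already good enough, focus on Task B. Freeze the head with for param in model.class_head.parameters(): param.requires_grad = False, and train with both losses so the shared backbone does not drift away from Task A. Note that the backbone only learns if forward() computes the embeddings with gradient tracking, which encode() does not do.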
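The three freezing scenarios above can be sketched with a single helper that toggles `requires_grad`. `TinyMultiTask` is a hypothetical stand-in that uses small linear layers in place of the real transformer, and `set_trainable` and `count_trainable` are illustrative names rather than library functions.

```python
import torch.nn as nn

class TinyMultiTask(nn.Module):
    """Hypothetical stand-in with the same three parts as MultiTaskModel."""
    def __init__(self):
        super().__init__()
        self.transformer = nn.Linear(8, 4)  # placeholder for the backbone
        self.class_head = nn.Linear(4, 3)
        self.sent_head = nn.Linear(4, 3)

def set_trainable(module, trainable):
    """Turn gradient updates on or off for every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = trainable

def count_trainable(model):
    """Number of scalar parameters that will receive gradient updates."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

all_frozen = TinyMultiTask()
set_trainable(all_frozen, False)                    # scenario 1

backbone_frozen = TinyMultiTask()
set_trainable(backbone_frozen.transformer, False)   # scenario 2

one_head_frozen = TinyMultiTask()
set_trainable(one_head_frozen.class_head, False)    # scenario 3

for name, m in [("all frozen", all_frozen),
                ("backbone frozen", backbone_frozen),
                ("class_head frozen", one_head_frozen)]:
    print(name, count_trainable(m))
```

Counting trainable parameters before training is a cheap way to confirm that the intended parts are frozen.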

Transfer Learning Approach
Scenario: Most beneficial when labeled data is scarce, because the model can lean on pre-trained knowledge.
1. Pre-Trained Model: 'all-MiniLM-L6-v2', which is small, fast, and pre-trained on more than 1B sentence pairs.
2. Freeze/Unfreeze: Freeze self.transformer; keep class_head and sent_head trainable.
3. Rationale: The transformer's embeddings are reliable from pre-training, so training only the heads fits them to the 3 categories and 3 sentiments quickly.


### Task 4: Training Loop Implementation

In [101]:
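def train_model(model, reviews, labels_a, labels_b, epochs=5):
    """Train on the full review list as one batch, summing the two task losses."""
    # lr=0.01 is high for fine-tuning a transformer, but here only the small
    # linear heads receive gradients (the backbone is run through encode()).
    optimizer = torch.optim.AdamW(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    labels_a = torch.tensor(labels_a)  # Category labels
    labels_b = torch.tensor(labels_b)  # Sentiment labels
    for _ in range(epochs):
        class_logits, sent_logits = model(reviews)
        loss_a = loss_fn(class_logits, labels_a)  # Task A: category loss
        loss_b = loss_fn(sent_logits, labels_b)   # Task B: sentiment loss
        loss = loss_a + loss_b                    # equal weight for both tasks
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()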
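To make the joint loss concrete, here is a small NumPy sketch of what a mean-reduced cross-entropy on logits computes, followed by the same `loss_a + loss_b` sum used in the loop. The logits and labels are made up for illustration, and `cross_entropy` is a hypothetical helper rather than the PyTorch implementation.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean of -log softmax(logits)[label] over the batch."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Made-up logits for two reviews and three classes per task.
class_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
sent_logits = [[0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
labels_a = [0, 1]
labels_b = [0, 1]

loss_a = cross_entropy(class_logits, labels_a)
loss_b = cross_entropy(sent_logits, labels_b)
print(loss_a, loss_b, loss_a + loss_b)
```

Summing the two losses with equal weight is the simplest choice; if one task dominated, a weighted sum would be a natural variation.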

In [102]:
import json

def predict_reviews(model, reviews, categories, sentiments):
    """Print, log, and return the predicted category and sentiment per review."""
    model.eval()
    with torch.no_grad():  # inference only, no gradient bookkeeping needed
        class_logits, sent_logits = model(reviews)
    # Pick the highest-scoring class index for each task
    class_preds = torch.argmax(class_logits, dim=1).tolist()
    sent_preds = torch.argmax(sent_logits, dim=1).tolist()

    results = []
    for review, c, s in zip(reviews, class_preds, sent_preds):
        result = f"{categories[c]} - {sentiments[s]}"
        print(f"Review: {review}\nResult: {result}\n")
        results.append({"review": review, "result": result})

    # Append one JSON object per line to the log file
    with open("log.jsonl", "a") as f:
        for r in results:
            f.write(json.dumps(r) + "\n")
    return results
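Since `predict_reviews` appends one JSON object per line, the log can be read back line by line. The sketch below round-trips a single record through a temporary file so the real log.jsonl is left untouched. `read_jsonl` and the file name `demo_log.jsonl` are hypothetical names used only for this example.

```python
import json
import os
import tempfile

def read_jsonl(path):
    """Return one dict per non-empty line of a JSON Lines file."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# Round-trip demo on a temporary file, written the same way as the log above.
records = [{"review": "Love this comfy sweater!", "result": "Clothing - Positive"}]
path = os.path.join(tempfile.mkdtemp(), "demo_log.jsonl")
with open(path, "a") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

loaded = read_jsonl(path)
print(loaded[0]["result"])
```

Because the file is opened in append mode, repeated runs accumulate records, so the log may contain several predictions for the same review.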

In [103]:
# Example data
categories = ["Tech", "Clothing", "Books"]
sentiments = ["Negative", "Positive", "Neutral"]

reviews = [
    "This phone crashes all the time!",
    "Love this comfy sweater!",
    "The book was pretty average.",
    "Amazing tablet, super fast!",
    "These pants ripped in a day.",
    "Cool headphones, decent sound.",
    "This novel is so dull.",
    "Best jacket ever, so warm!",
    "Laptop battery dies too quick.",
    "Interesting story, worth a read.",
]

labels_a = [0, 1, 2, 0, 1, 0, 2, 1, 0, 2]
labels_b = [0, 1, 2, 1, 0, 2, 0, 1, 0, 1]

# Run
if __name__ == "__main__":
    model = MultiTaskModel('all-MiniLM-L6-v2', num_classes_taskA=len(categories), num_classes_taskB=len(sentiments))
    train_model(model, reviews, labels_a, labels_b)
    predict_reviews(model, reviews, categories, sentiments)

Review: This phone crashes all the time!
Result: Tech - Negative

Review: Love this comfy sweater!
Result: Clothing - Positive

Review: The book was pretty average.
Result: Books - Neutral

Review: Amazing tablet, super fast!
Result: Tech - Positive

Review: These pants ripped in a day.
Result: Clothing - Negative

Review: Cool headphones, decent sound.
Result: Tech - Neutral

Review: This novel is so dull.
Result: Books - Negative

Review: Best jacket ever, so warm!
Result: Clothing - Positive

Review: Laptop battery dies too quick.
Result: Tech - Negative

Review: Interesting story, worth a read.
Result: Books - Positive
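As a quick sanity check, the printed results can be compared with the labels from the example data. The sketch below copies the predictions from the output above and computes per-task accuracy; `accuracy` is a hypothetical helper added for illustration.

```python
categories = ["Tech", "Clothing", "Books"]
sentiments = ["Negative", "Positive", "Neutral"]
labels_a = [0, 1, 2, 0, 1, 0, 2, 1, 0, 2]
labels_b = [0, 1, 2, 1, 0, 2, 0, 1, 0, 1]

# Predictions copied from the printed results above, in order.
printed = [
    "Tech - Negative", "Clothing - Positive", "Books - Neutral",
    "Tech - Positive", "Clothing - Negative", "Tech - Neutral",
    "Books - Negative", "Clothing - Positive", "Tech - Negative",
    "Books - Positive",
]

def accuracy(preds, labels):
    """Fraction of predictions that equal their label."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Map each "Category - Sentiment" string back to its two class indices.
preds_a = [categories.index(r.split(" - ")[0]) for r in printed]
preds_b = [sentiments.index(r.split(" - ")[1]) for r in printed]
print(accuracy(preds_a, labels_a), accuracy(preds_b, labels_b))  # 1.0 1.0
```

These are the same ten reviews the model was trained on, so the score measures fit to the training data rather than generalization; a held-out set would be needed for that.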

