# 🎯 Assessment 3: NLP and Computer Vision (Multi-Modal Sentiment Analysis)

## Objective
The goal of this project is to build a multi-modal sentiment analysis system that combines Natural Language Processing (NLP) and Computer Vision (CV). Given a tweet's caption and its image, the system classifies the sentiment as positive, negative, or neutral. The work breaks down into four tasks:

- Preprocess the text and image data.
- Build a model that analyzes sentiment in text (NLP).
- Build a model that analyzes sentiment in images (computer vision).
- Combine both models to produce the final sentiment classification.


## Step 1: Dataset Preparation

In [3]:
!pip install kaggle




In [4]:
from google.colab import files
files.upload()


Saving kaggle.json to kaggle.json


{'kaggle.json': b'{"username":"aniajaca","key":"<redacted>"}'}
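
Because `files.upload()` is the last expression in the cell, its return value (including the API key) is echoed into the notebook output. The following is a minimal sketch of one way to avoid that, assuming the upload result is a dict mapping file names to bytes, as the output above suggests: assign the result to a variable, then write the credentials file with owner-only permissions. The helper name `save_kaggle_credentials` and the demo credentials are hypothetical.

```python
import json
import os
import stat
import tempfile


def save_kaggle_credentials(raw_bytes, config_dir):
    """Write kaggle.json into config_dir with owner-only (0o600) permissions.

    Hypothetical helper for this sketch, not part of the Kaggle CLI.
    """
    creds = json.loads(raw_bytes.decode("utf-8"))
    # Fail early if the uploaded file is not a credentials file
    if not {"username", "key"} <= creds.keys():
        raise ValueError("kaggle.json must contain 'username' and 'key'")
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, "kaggle.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(creds, fh)
    # Restrict the file to read/write for the owner only
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


# In Colab this would be `uploaded = files.upload()`; assigning the result
# keeps the file contents out of the cell output.
uploaded = {"kaggle.json": b'{"username": "example-user", "key": "example-key"}'}
demo_dir = tempfile.mkdtemp()
creds_path = save_kaggle_credentials(uploaded["kaggle.json"], demo_dir)
print(oct(os.stat(creds_path).st_mode & 0o777))
```

Pointing `KAGGLE_CONFIG_DIR` at the directory that holds this file, as the next cell does with `/content`, lets the CLI pick it up.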

In [5]:
import os
os.environ['KAGGLE_CONFIG_DIR'] = "/content"

!kaggle datasets download -d dunyajasim/twitter-dataset-for-sentiment-analysis

!unzip twitter-dataset-for-sentiment-analysis.zip

Dataset URL: https://www.kaggle.com/datasets/dunyajasim/twitter-dataset-for-sentiment-analysis
License(s): GNU Lesser General Public License 3.0
Archive:  twitter-dataset-for-sentiment-analysis.zip
  inflating: Images/Images/Negative/1.jpg  
  inflating: Images/Images/Negative/10.jpg  
  inflating: Images/Images/Negative/1003.jpg  
  inflating: Images/Images/Negative/1004.jpg  
  inflating: Images/Images/Negative/1006.jpg  
  inflating: Images/Images/Negative/1007.jpg  
  inflating: Images/Images/Negative/1008.jpg  
  inflating: Images/Images/Negative/101.jpg  
  inflating: Images/Images/Negative/1010.jpg  
  inflating: Images/Images/Negative/1013.jpg  
  inflating: Images/Images/Negative/1014.jpg  
  inflating: Images/Images/Negative/1015.jpg  
  inflating: Images/Images/Negative/103.jpg  
  inflating: Images/Images/Negative/1041.jpg  
  inflating: Images/Images/Negative/1045.jpg  
  inflating: Images/Images/Negative/1050.jpg  
  inflating: Images/Images/Negative/1052.jpg  
  inflatin

In [6]:
os.listdir('/content/')

['.config',
 'twitter-dataset-for-sentiment-analysis.zip',
 'kaggle.json',
 'LabeledText.xlsx',
 'Read me.txt',
 'Images',
 'sample_data']

In [7]:
import pandas as pd

# Load the Excel file
data = pd.read_excel('/content/LabeledText.xlsx')

# Inspect the first few rows
data.head()

Unnamed: 0,File Name,Caption,LABEL
0,1.txt,How I feel today #legday #jelly #aching #gym,negative
1,10.txt,@ArrivaTW absolute disgrace two carriages from...,negative
2,100.txt,This is my Valentine's from 1 of my nephews. I...,positive
3,1000.txt,betterfeelingfilms: RT via Instagram: First da...,neutral
4,1001.txt,Zoe's first love #Rattled @JohnnyHarper15,positive


In [8]:
import re
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download NLTK resources
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('punkt_tab')

# Build the stopword set once instead of on every call
STOP_WORDS = set(stopwords.words('english'))

# Define a function to clean and preprocess the text
def preprocess_text(text):
    # Convert to lowercase
    text = text.lower()
    # Keep only lowercase letters and whitespace (drops digits, punctuation, '#' and '@')
    text = re.sub(r'[^a-z\s]', '', text)
    # Tokenize the text
    tokens = word_tokenize(text)
    # Remove stopwords
    tokens = [word for word in tokens if word not in STOP_WORDS]
    return " ".join(tokens)

# Apply preprocessing to the 'Caption' column
data['cleaned_caption'] = data['Caption'].apply(preprocess_text)

# Check the cleaned text
data[['Caption', 'cleaned_caption']].head()

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


Unnamed: 0,Caption,cleaned_caption
0,How I feel today #legday #jelly #aching #gym,feel today legday jelly aching gym
1,@ArrivaTW absolute disgrace two carriages from...,arrivatw absolute disgrace two carriages bango...
2,This is my Valentine's from 1 of my nephews. I...,valentines nephews elated sometimes little thi...
3,betterfeelingfilms: RT via Instagram: First da...,betterfeelingfilms rt via instagram first day ...
4,Zoe's first love #Rattled @JohnnyHarper15,zoes first love rattled johnnyharper
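
The cleaned captions above keep user handles as ordinary words (`johnnyharper`, `arrivatw`), because the regex only strips the `@` and the digits. The sketch below shows one possible variant that drops `@mentions` and URLs before the letter filter. It uses only the standard library: whitespace splitting stands in for `word_tokenize`, and `DEMO_STOP_WORDS` is a tiny made-up stand-in for the NLTK stopword list, so treat it as an illustration rather than a replacement for the cell above.

```python
import re

# Small stand-in for stopwords.words('english'); an assumption for this sketch only
DEMO_STOP_WORDS = {"i", "how", "my", "is", "this", "from", "of", "the", "a"}


def preprocess_tweet(text, stop_words=DEMO_STOP_WORDS):
    """Lowercase, drop URLs and @mentions, keep letters only, remove stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions entirely
    text = re.sub(r"[^a-z\s]", "", text)       # keep letters and whitespace only
    tokens = text.split()                       # whitespace tokenization
    return " ".join(t for t in tokens if t not in stop_words)


print(preprocess_tweet("Zoe's first love #Rattled @JohnnyHarper15"))
# zoes first love rattled
```

Whether handles carry sentiment signal is a modeling choice; hashtags are kept here because they often do.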


## Step 2: NLP Component (Text Analysis)

In [9]:
!pip install transformers
!pip install torch


In [10]:
from transformers import BertTokenizer, BertForSequenceClassification
from torch.utils.data import DataLoader, TensorDataset
import torch
from sklearn.model_selection import train_test_split

# Initialize the tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)  # 3 classes (positive, negative, neutral)

# Tokenize the text data
def encode_text(text):
    # Ensure padding to max length of 512 tokens and truncation
    return tokenizer(text, padding='max_length', truncation=True, max_length=512, return_tensors='pt')

# Encode the 'cleaned_caption' column
encoded_texts = [encode_text(text) for text in data['cleaned_caption']]

# Extract input_ids and attention_mask
input_ids = torch.cat([e['input_ids'] for e in encoded_texts], dim=0)
attention_mask = torch.cat([e['attention_mask'] for e in encoded_texts], dim=0)

# Convert labels to numerical format
label_map = {'negative': 0, 'neutral': 1, 'positive': 2}
labels = torch.tensor([label_map[label] for label in data['LABEL']])

# Split input IDs, attention masks, and labels in one call so the rows stay aligned
train_inputs, test_inputs, train_mask, test_mask, train_labels, test_labels = train_test_split(
    input_ids, attention_mask, labels, test_size=0.2, random_state=42
)

# Create DataLoader for batching
train_data = TensorDataset(train_inputs, train_mask, train_labels)
train_dataloader = DataLoader(train_data, batch_size=16, shuffle=True)

test_data = TensorDataset(test_inputs, test_mask, test_labels)
test_dataloader = DataLoader(test_data, batch_size=16)




Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [11]:
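from torch.optim import AdamW  # transformers.AdamW is deprecated in favor of torch's implementation
from tqdm import tqdm

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define the optimizer and learning rate (weight_decay=0.0 matches the transformers.AdamW default)
optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.0)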

# Define the training loop
def train_model(model, train_dataloader, epochs=3):
    model.train()
    for epoch in range(epochs):
        total_train_loss = 0
        for batch in tqdm(train_dataloader, desc=f"Training Epoch {epoch+1}"):
            # Unpack the batch
            b_input_ids, b_input_mask, b_labels = [t.to(device) for t in batch]

            # Zero the gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = model(b_input_ids, attention_mask=b_input_mask, labels=b_labels)
            loss = outputs.loss
            total_train_loss += loss.item()

            # Backward pass
            loss.backward()
            optimizer.step()

        avg_train_loss = total_train_loss / len(train_dataloader)
        print(f"Average training loss for epoch {epoch+1}: {avg_train_loss:.4f}")

# Train the model
train_model(model, train_dataloader, epochs=3)

Training Epoch 1: 100%|██████████| 244/244 [01:14<00:00,  3.28it/s]


Average training loss for epoch 1: 0.7918


Training Epoch 2: 100%|██████████| 244/244 [01:13<00:00,  3.32it/s]


Average training loss for epoch 2: 0.4887


Training Epoch 3: 100%|██████████| 244/244 [01:13<00:00,  3.32it/s]

Average training loss for epoch 3: 0.3323





In [12]:
# Check if GPU is available and use it, otherwise fall back to CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Move the model to the device (GPU/CPU)
model.to(device)

# Evaluate the model on the held-out test set; each batch is moved to the model's device inside the loop
model.eval()  # Set the model to evaluation mode

# Initialize variables to track loss and accuracy
val_loss = 0
val_accuracy = 0

for batch in test_dataloader:
    # Move input tensors to the same device as the model
    b_input_ids, b_input_mask, b_labels = [tensor.to(device) for tensor in batch]

    with torch.no_grad():  # Disable gradient calculation during evaluation
        outputs = model(b_input_ids, attention_mask=b_input_mask, labels=b_labels)
        loss = outputs.loss
        logits = outputs.logits

    val_loss += loss.item()

    # Calculate accuracy
    preds = torch.argmax(logits, dim=-1)
    val_accuracy += (preds == b_labels).sum().item()

# Calculate the average validation loss and accuracy
val_loss /= len(test_dataloader)
val_accuracy /= len(test_dataloader.dataset)

print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")

Validation Loss: 0.6642112480323823
Validation Accuracy: 0.7607802874743327
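
A single accuracy figure does not show which classes are being confused with each other. The sketch below builds a confusion matrix and per-class recall in plain Python, using the index convention of `label_map` in this step (0 = negative, 1 = neutral, 2 = positive). The label lists are made up for illustration and are not the model's predictions.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes=3):
    """Return matrix[t][p] = number of samples with true class t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Made-up labels for illustration (0 = negative, 1 = neutral, 2 = positive)
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 2]

cm = confusion_matrix(y_true, y_pred)
print(cm)                    # [[1, 1, 0], [0, 2, 1], [1, 0, 2]]
print(per_class_recall(cm))
```

In the evaluation loop above, `preds` and `b_labels` could be accumulated into two lists (after `.cpu().tolist()`) and passed to the same functions.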


In [13]:
# Save the model and tokenizer
model.save_pretrained('/content/sentiment_model')
tokenizer.save_pretrained('/content/sentiment_model')

from google.colab import drive
drive.mount('/content/drive')
model.save_pretrained('/content/drive/MyDrive/sentiment_model')
tokenizer.save_pretrained('/content/drive/MyDrive/sentiment_model')


Mounted at /content/drive


('/content/drive/MyDrive/sentiment_model/tokenizer_config.json',
 '/content/drive/MyDrive/sentiment_model/special_tokens_map.json',
 '/content/drive/MyDrive/sentiment_model/vocab.txt',
 '/content/drive/MyDrive/sentiment_model/added_tokens.json')

In [15]:
from transformers import BertForSequenceClassification, BertTokenizer

# Reload the model and tokenizer
model = BertForSequenceClassification.from_pretrained('/content/sentiment_model')
tokenizer = BertTokenizer.from_pretrained('/content/sentiment_model')

## Step 3: Computer Vision Component (Image Analysis)

In [35]:
import torch
from torchvision import models, transforms
from PIL import Image
import os
import glob

# Define the image preprocessing pipeline
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load ImageNet-pretrained ResNet50 and remove the classification layer
# (the `weights=` argument replaces the deprecated `pretrained=True`)
resnet_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet_model = torch.nn.Sequential(*list(resnet_model.children())[:-1])  # remove the final FC layer
resnet_model.eval()

# Folder names as they appear on disk (note the lowercase 'positive'):
sentiment_folder = {
    'positive': 'positive',
    'neutral': 'Neutral',
    'negative': 'Negative'
}

# Base path for images:
base_path = '/content/Images/Images/'

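def extract_image_features(image_name, sentiment_label):
    # Keep only the digits of the file name, e.g. '1.txt' -> '1'
    numeric_part = ''.join(filter(str.isdigit, image_name))
    folder = sentiment_folder.get(sentiment_label.lower())
    if folder is None:
        raise ValueError(f"Invalid sentiment label: {sentiment_label}")

    # Prefix match: the trailing '*' means '1' also matches '10.jpg', '1555.jpg', ...
    # so the first hit is not guaranteed to be '<numeric_part>.jpg'
    pattern = os.path.join(base_path, folder, f"{numeric_part}*.jpg")
    matching_files = glob.glob(pattern)

    if not matching_files:
        raise FileNotFoundError(f"No image found matching pattern: {pattern}")

    # glob returns matches in arbitrary order; take the first one
    image_path = matching_files[0]
    print("Found image path:", image_path)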

    img = Image.open(image_path).convert('RGB')
    img = transform(img).unsqueeze(0)  # Add batch dimension

    with torch.no_grad():
        features = resnet_model(img)              # shape: [1, 2048, 1, 1]
        features = features.view(features.size(0), -1)  # flatten to [1, 2048]
    return features

# Example
image_name = '1.txt'
sentiment_label = 'positive'

features = extract_image_features(image_name, sentiment_label)
print("Extracted features shape:", features.shape)

Found image path: /content/Images/Images/positive/1555.jpg
Extracted features shape: torch.Size([1, 2048])
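
The example returned `1555.jpg` for `1.txt` because the glob pattern `1*.jpg` matches every file name that starts with `1`. The sketch below shows an exact-match lookup instead, assuming the image for `N.txt` is stored as `N.jpg` inside the folder of its label (the unzip listing is consistent with this, but it is an assumption). `find_image_path` is a hypothetical helper, and the demo runs on a temporary directory with empty placeholder files rather than the real dataset.

```python
import os
import tempfile

SENTIMENT_FOLDER = {"positive": "positive", "neutral": "Neutral", "negative": "Negative"}


def find_image_path(base_path, file_name, sentiment_label):
    """Return the path of '<stem>.jpg' in the label's folder, or raise if missing."""
    folder = SENTIMENT_FOLDER.get(sentiment_label.lower())
    if folder is None:
        raise ValueError(f"Invalid sentiment label: {sentiment_label}")
    stem = os.path.splitext(file_name)[0]  # '1.txt' -> '1'
    image_path = os.path.join(base_path, folder, f"{stem}.jpg")
    if not os.path.isfile(image_path):
        raise FileNotFoundError(f"No image found at: {image_path}")
    return image_path


# Demo on a throwaway directory tree with empty placeholder files
demo_base = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_base, "Negative"))
for name in ("1.jpg", "10.jpg", "1555.jpg"):
    open(os.path.join(demo_base, "Negative", name), "w").close()

print(os.path.basename(find_image_path(demo_base, "1.txt", "negative")))  # 1.jpg
```

With an exact lookup, a caption whose label folder does not contain its image raises `FileNotFoundError` instead of silently pairing the caption with a different picture.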


In [36]:
import torch.nn as nn

class ImageSentimentClassifier(nn.Module):
    def __init__(self, input_size, num_classes=3):
        super(ImageSentimentClassifier, self).__init__()
        self.fc1 = nn.Linear(input_size, 256)  # First fully connected layer
        self.relu = nn.ReLU()                # Activation function
        self.fc2 = nn.Linear(256, num_classes)  # Output layer

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Assuming ResNet features have 2048 dimensions
input_size = 2048
model = ImageSentimentClassifier(input_size)

In [38]:
# Extract image features for all sentiment labels
def extract_all_features():
    all_features = {'positive': [], 'neutral': [], 'negative': []}

    # Loop through each sentiment label
    for sentiment_label in sentiment_folder.keys():
        sentiment_images = os.listdir(os.path.join(base_path, sentiment_folder[sentiment_label]))
        for image_name in sentiment_images:
            features = extract_image_features(image_name, sentiment_label)
            all_features[sentiment_label].append(features)

    return all_features

# Call the function to extract features
image_features = extract_all_features()

Found image path: /content/Images/Images/positive/291.jpg
Found image path: /content/Images/Images/positive/3694.jpg
Found image path: /content/Images/Images/positive/2903.jpg
Found image path: /content/Images/Images/positive/1555.jpg
Found image path: /content/Images/Images/positive/3940.jpg
Found image path: /content/Images/Images/positive/1834.jpg
Found image path: /content/Images/Images/positive/801.jpg
Found image path: /content/Images/Images/positive/1976.jpg
Found image path: /content/Images/Images/positive/4321.jpg
Found image path: /content/Images/Images/positive/4644.jpg
Found image path: /content/Images/Images/positive/652.jpg
Found image path: /content/Images/Images/positive/1611.jpg
Found image path: /content/Images/Images/positive/2791.jpg
Found image path: /content/Images/Images/positive/33.jpg
Found image path: /content/Images/Images/positive/21.jpg
Found image path: /content/Images/Images/positive/2665.jpg
Found image path: /content/Images/Images/positive/3330.jpg
Foun

In [39]:
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

# Convert extracted features into tensors
features = {
    'positive': torch.stack(image_features['positive']),
    'neutral': torch.stack(image_features['neutral']),
    'negative': torch.stack(image_features['negative'])
}

# Create integer class labels: 0 = positive, 1 = neutral, 2 = negative
labels = {
    'positive': torch.zeros(len(features['positive']), dtype=torch.long),       # label 0
    'neutral': torch.ones(len(features['neutral']), dtype=torch.long),          # label 1
    'negative': torch.full((len(features['negative']),), 2, dtype=torch.long)   # label 2
}

# Combine features and labels
all_features = torch.cat([features['positive'], features['neutral'], features['negative']], dim=0)
all_labels = torch.cat([labels['positive'], labels['neutral'], labels['negative']], dim=0)

# Split into training and testing sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(all_features, all_labels, test_size=0.2, random_state=42)

# PyTorch DataLoader
train_data = TensorDataset(X_train, y_train)
test_data = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_data, batch_size=32, shuffle=True)
test_loader = DataLoader(test_data, batch_size=32, shuffle=False)
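
Step 2 maps `negative` to 0 and `positive` to 2, while this cell uses 0 for positive and 2 for negative. Each model is trained and evaluated with its own convention, so nothing breaks here, but predictions from the two models cannot be compared index by index. One way to avoid that, sketched below with hypothetical names, is a single mapping that every stage imports.

```python
# One mapping shared by the text, image, and fusion stages (hypothetical names)
LABEL_TO_INDEX = {"negative": 0, "neutral": 1, "positive": 2}
INDEX_TO_LABEL = {index: label for label, index in LABEL_TO_INDEX.items()}


def encode_labels(label_names):
    """Map label strings (any casing) to class indices."""
    return [LABEL_TO_INDEX[name.strip().lower()] for name in label_names]


def decode_labels(indices):
    """Map class indices back to label strings."""
    return [INDEX_TO_LABEL[int(i)] for i in indices]


encoded = encode_labels(["negative", "Positive", "neutral"])
print(encoded)                 # [0, 2, 1]
print(decode_labels(encoded))  # ['negative', 'positive', 'neutral']
```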

In [40]:
print("Positive features shape:", features['positive'].shape)
print("Neutral features shape:", features['neutral'].shape)
print("Negative features shape:", features['negative'].shape)


Positive features shape: torch.Size([1646, 1, 2048])
Neutral features shape: torch.Size([1771, 1, 2048])
Negative features shape: torch.Size([1452, 1, 2048])


In [41]:
print("Training data size:", len(train_loader.dataset))
print("Testing data size:", len(test_loader.dataset))


Training data size: 3895
Testing data size: 974


In [71]:
# Flatten the input from [batch_size, 1, 2048] to [batch_size, 2048]
X_train = X_train.squeeze(1)
X_test = X_test.squeeze(1)

import torch.nn as nn

class ImageSentimentClassifier(nn.Module):
    def __init__(self, input_size=2048, num_classes=3):
        super(ImageSentimentClassifier, self).__init__()
        self.fc1 = nn.Linear(input_size, 512)
        self.bn1 = nn.BatchNorm1d(512)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # Flatten if needed
        x = self.fc1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        return x

model = ImageSentimentClassifier()

In [72]:
import torch
import torch.nn as nn
import torch.optim as optim

# Move the classifier to the available device and set up the loss and optimizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=3e-4)

In [73]:
# Compute balanced class weights from the labels of the full dataset

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

classes = np.array([0, 1, 2])
weights = compute_class_weight(class_weight='balanced', classes=classes, y=all_labels.numpy())
weights = torch.tensor(weights, dtype=torch.float32).to(device)

# Use in loss function
criterion = nn.CrossEntropyLoss(weight=weights)
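
With `class_weight='balanced'`, each class receives the weight `n_samples / (n_classes * count)`, so rarer classes contribute more to the loss. The sketch below reproduces that arithmetic in plain Python for the per-class counts printed earlier in this notebook (1646 positive, 1771 neutral, 1452 negative); the function name is made up for illustration.

```python
def balanced_class_weights(counts):
    """n_samples / (n_classes * count) for each class, as in the 'balanced' heuristic."""
    n_samples = sum(counts)
    n_classes = len(counts)
    return [n_samples / (n_classes * c) for c in counts]


# Class order follows this step: 0 = positive, 1 = neutral, 2 = negative
counts = [1646, 1771, 1452]
weights = balanced_class_weights(counts)
print([round(w, 3) for w in weights])  # [0.986, 0.916, 1.118]
```

The weights stay close to 1 because the three classes are only mildly imbalanced, so the weighted loss is expected to behave much like the unweighted one.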

In [74]:
# Set Up Training Loop

num_epochs = 20

for epoch in range(num_epochs):
    model.train()
    total_loss = 0

    for images, labels in train_loader:
        images = images.squeeze(1).to(device)  # Shape: [batch, 2048]
        labels = labels.long().to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch+1} | Training Loss: {avg_loss:.4f}")

Epoch 1 | Training Loss: 1.1748
Epoch 2 | Training Loss: 0.9847
Epoch 3 | Training Loss: 0.9230
Epoch 4 | Training Loss: 0.8616
Epoch 5 | Training Loss: 0.8088
Epoch 6 | Training Loss: 0.7432
Epoch 7 | Training Loss: 0.6892
Epoch 8 | Training Loss: 0.6450
Epoch 9 | Training Loss: 0.6076
Epoch 10 | Training Loss: 0.5631
Epoch 11 | Training Loss: 0.5280
Epoch 12 | Training Loss: 0.4842
Epoch 13 | Training Loss: 0.4532
Epoch 14 | Training Loss: 0.4130
Epoch 15 | Training Loss: 0.3866
Epoch 16 | Training Loss: 0.3781
Epoch 17 | Training Loss: 0.3525
Epoch 18 | Training Loss: 0.3250
Epoch 19 | Training Loss: 0.2802
Epoch 20 | Training Loss: 0.2983


In [75]:
# Evaluate the Model on the Test Set

model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        images = images.squeeze(1).to(device)
        labels = labels.long().to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = correct / total
print(f"Test Accuracy: {accuracy:.4f}")

Test Accuracy: 0.4815


## Step 4: Fusion and Final Classification

In [85]:
import pandas as pd

data = pd.read_excel('/content/LabeledText.xlsx')


In [86]:
data.head()


Unnamed: 0,File Name,Caption,LABEL
0,1.txt,How I feel today #legday #jelly #aching #gym,negative
1,10.txt,@ArrivaTW absolute disgrace two carriages from...,negative
2,100.txt,This is my Valentine's from 1 of my nephews. I...,positive
3,1000.txt,betterfeelingfilms: RT via Instagram: First da...,neutral
4,1001.txt,Zoe's first love #Rattled @JohnnyHarper15,positive


In [182]:
from transformers import BertTokenizer, BertModel
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert_model = BertModel.from_pretrained('bert-base-uncased')
bert_model.to(device)
bert_model.eval()

# Tokenize and encode all captions
captions = list(data['Caption'])  # raw captions; the BERT tokenizer does its own lowercasing and splitting
encoded = tokenizer(
    captions,
    padding='max_length',
    truncation=True,
    max_length=128,
    return_tensors='pt'
)

input_ids = encoded['input_ids'].to(device)
attention_mask = encoded['attention_mask'].to(device)

# Extract BERT features
with torch.no_grad():
    outputs = bert_model(input_ids, attention_mask=attention_mask)
    text_features = outputs.pooler_output  # shape: [num_samples, 768]

In [183]:
image_feature_list = []

for i in range(len(data)):
    fname = data.loc[i, 'File Name']
    label = data.loc[i, 'LABEL'].lower()

    feature = extract_image_features(fname, label)
    feature = feature.view(-1)  # Flatten to shape [2048]
    image_feature_list.append(feature)

image_tensor = torch.stack(image_feature_list)  # shape: [num_samples, 2048]

Found image path: /content/Images/Images/Negative/1604.jpg
Found image path: /content/Images/Images/Negative/1003.jpg
Found image path: /content/Images/Images/positive/1009.jpg
Found image path: /content/Images/Images/Neutral/1000.jpg
Found image path: /content/Images/Images/positive/1001.jpg
Found image path: /content/Images/Images/positive/1002.jpg
Found image path: /content/Images/Images/Negative/1003.jpg
Found image path: /content/Images/Images/Negative/1004.jpg
Found image path: /content/Images/Images/Neutral/1005.jpg
Found image path: /content/Images/Images/Negative/1006.jpg
Found image path: /content/Images/Images/Negative/1007.jpg
Found image path: /content/Images/Images/Negative/1008.jpg
Found image path: /content/Images/Images/positive/1009.jpg
Found image path: /content/Images/Images/Negative/101.jpg
Found image path: /content/Images/Images/Negative/1010.jpg
Found image path: /content/Images/Images/Neutral/1011.jpg
Found image path: /content/Images/Images/Neutral/1012.jpg
Fo

In [191]:
label_map = {'positive': 0, 'neutral': 1, 'negative': 2}
labels = torch.tensor([label_map[label.lower()] for label in data['LABEL']])

In [192]:
import torch.nn as nn

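# Standalone dropout modules; nn.Module instances start in training mode,
# so each call below zeroes about 30% of the values and rescales the rest
text_dropout = nn.Dropout(0.3)
image_dropout = nn.Dropout(0.3)

# Apply once to the precomputed features (train and test rows alike) before fusion
text_features = text_dropout(text_features)
image_tensor = image_dropout(image_tensor)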
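
In training mode, dropout zeroes each element with probability `p` and scales the surviving elements by `1 / (1 - p)`. Applied once to precomputed features, as in this cell, the random mask becomes a permanent part of the data rather than a fresh mask per training step. The plain-Python sketch below mimics that rule on a short list; it is an illustration of the idea and not PyTorch's implementation.

```python
import random


def inverted_dropout(values, p, rng):
    """Zero each value with probability p and scale the rest by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


rng = random.Random(0)
features = [1.0] * 10
dropped = inverted_dropout(features, p=0.3, rng=rng)
print(dropped)  # each entry is either 0.0 or 1 / 0.7
```

Placing the dropout layers inside the classifier's `forward` would let `model.train()` and `model.eval()` switch them on and off, which is the usual arrangement.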


In [193]:
# Concatenate Text and Image Features
# Ensure same device
text_features = text_features.to(device)
image_tensor = image_tensor.to(device)
labels = labels.to(device)

# Combine features
fused_features = torch.cat((text_features, image_tensor), dim=1)  # [4869, 2816]

In [194]:
# Train-Test Split

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    fused_features, labels, test_size=0.2, random_state=42
)

from torch.utils.data import TensorDataset, DataLoader

train_dataset = TensorDataset(X_train, y_train)
test_dataset = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32)

In [195]:
# Define the Fusion Model

import torch.nn as nn

class FusionSentimentClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512, num_classes=3):
        super(FusionSentimentClassifier, self).__init__()
        self.fc1 = nn.Linear(text_dim + image_dim, hidden_dim)
        self.bn1 = nn.BatchNorm1d(hidden_dim)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, text_feat, image_feat):
        x = torch.cat((text_feat, image_feat), dim=1)
        x = self.fc1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        return x

In [196]:
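# Train the Fusion Model

import torch.optim as optim

# Instantiate the fusion classifier and move it to the same device as the features
fusion_model = FusionSentimentClassifier().to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(fusion_model.parameters(), lr=1e-4)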

# Training loop
for epoch in range(10):
    fusion_model.train()
    total_loss = 0

    for batch in train_loader:
        inputs, targets = batch
        optimizer.zero_grad()
        outputs = fusion_model(inputs[:, :768], inputs[:, 768:])  # Split back into text and image parts
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch+1} | Training Loss: {avg_loss:.4f}")

Epoch 1 | Training Loss: 0.5579
Epoch 2 | Training Loss: 0.4344
Epoch 3 | Training Loss: 0.3412
Epoch 4 | Training Loss: 0.2699
Epoch 5 | Training Loss: 0.2127
Epoch 6 | Training Loss: 0.1662
Epoch 7 | Training Loss: 0.1349
Epoch 8 | Training Loss: 0.1094
Epoch 9 | Training Loss: 0.0929
Epoch 10 | Training Loss: 0.0732


In [197]:
# Accuracy Test

fusion_model.eval()
correct = 0
total = 0

with torch.no_grad():
    for batch in test_loader:
        inputs, targets = batch
        outputs = fusion_model(inputs[:, :768], inputs[:, 768:])
        predictions = torch.argmax(outputs, dim=1)
        correct += (predictions == targets).sum().item()
        total += targets.size(0)

accuracy = correct / total
print(f"Fusion Model Test Accuracy: {accuracy:.4f}")


Fusion Model Test Accuracy: 0.4466


In [198]:
from sklearn.metrics import precision_recall_fscore_support

# Collect predictions and true labels
all_preds = []
all_labels = []

with torch.no_grad():
    for batch in test_loader:
        inputs, targets = batch
        outputs = fusion_model(inputs[:, :768], inputs[:, 768:])
        predictions = torch.argmax(outputs, dim=1)

        all_preds.extend(predictions.cpu().numpy())
        all_labels.extend(targets.cpu().numpy())

# Compute additional metrics
precision, recall, f1, _ = precision_recall_fscore_support(
    all_labels, all_preds, average='weighted'
)

print("\nAdditional Evaluation Metrics:")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1 Score:  {f1:.4f}")


Additional Evaluation Metrics:
Precision: 0.4460
Recall:    0.4466
F1 Score:  0.4461
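
With `average='weighted'`, each class's score is weighted by its share of the true labels. For recall this sum collapses to the fraction of correct predictions, which is why the weighted recall above equals the 0.4466 test accuracy. The sketch below shows the computation in plain Python on made-up labels; it is a simplified illustration, not scikit-learn's code.

```python
from collections import Counter


def weighted_precision_recall(y_true, y_pred):
    """Support-weighted precision and recall over the classes present in y_true."""
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    precision = sum(
        (support[c] / total) * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    recall = sum((support[c] / total) * (correct[c] / support[c]) for c in support)
    return precision, recall


# Made-up labels for illustration
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 2]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = weighted_precision_recall(y_true, y_pred)
print(accuracy, recall)
```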


In [199]:
torch.save(fusion_model.state_dict(), 'fusion_model.pt')