# Project Supervised Learning - Self Supervised Part


---

This notebook contains our implementation of the self-supervised learning (SSL) part of the project, based on the SimCLR framework. We use a custom CNN encoder trained without labels, and evaluate the learned representations by training various traditional classifiers on top of extracted features.

The notebook covers the full SSL pipeline, from data augmentation and model setup to evaluation with logistic regression, SVM, random forest, and a neural network head.

---

## 1. Data Preparation
- Unzip and organize the dataset structure
- Define `SSLFoodDataset` for generating augmented view pairs
- Apply SimCLR-style data augmentations
- Compute and use dataset-specific normalization stats

## 2. Model Architecture
- Define a ResNet18-inspired CNN (`CustomCNN`) with CBAM
- Wrap it in a `SimCLR` model with a projection head
- Implement NT-Xent contrastive loss

## 3. SSL Pretraining (SimCLR)
- Train the SimCLR model on the training set (20 or 35 epochs)
- Update encoder and projection head jointly; no labels involved
- Periodically save model checkpoints

## 4. Feature Extraction
- Use frozen encoder to extract 256-dimensional vectors
- Generate train/val/test sets for downstream classifiers
- Save feature arrays for reuse

## 5. Training Traditional Classifiers
- Train and evaluate:
  - Logistic Regression
  - Linear SVM (full + val-only)
  - RBF SVM (subset only)
  - Random Forest
  - MLP (PyTorch)

## 6. Evaluation & Analysis
- Compare classifier performance on extracted SSL features
- Analyze effects of data volume, label quality, and model type
- Reflect on constraints and generalization



In [None]:
#pip install tqdm

In [1]:
# Import libraries
import os
# os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import shutil
import zipfile
import random
import time
from pathlib import Path

import pandas as pd
import numpy as np
from PIL import Image
from tqdm.notebook import tqdm
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

import torch
# torch.cuda.empty_cache()
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, Subset
from torchvision import transforms
import multiprocessing





In [2]:
# --- for reproducibility ---
def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

def seed_worker(worker_id):
    seed = torch.initial_seed() % 2**32
    np.random.seed(seed)
    random.seed(seed)
    
set_seed(42)



---


# General Data Loader

In [None]:
data_dir = Path('/kaggle/input/ifood-2019-fgvc6')
working_dir = Path('/kaggle/working')
TRAIN_CSV = data_dir / 'train_labels.csv'

# Extract and flatten function
def extract_and_flatten(zip_filename, folder_name, internal_folder_name):
    zip_path = data_dir / zip_filename
    extract_path = working_dir  / (folder_name + "_temp")
    final_path = working_dir / folder_name

    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(extract_path)

    nested = extract_path / internal_folder_name
    final_path.mkdir(exist_ok=True)
    for fname in os.listdir(nested):
        shutil.move(str(nested / fname), str(final_path / fname))

    shutil.rmtree(extract_path)


# Unpack all datasets
extract_and_flatten("train_set.zip", "train", "train_set")
extract_and_flatten("val_set.zip", "val", "val_set")
extract_and_flatten("test_set.zip", "test", "test_set")


In [None]:
# --- Custom Dataset Class for our Purposes ---
class FoodDataset(Dataset):
    def __init__(self, image_dir, labels_df, transform, class_to_idx):
        self.image_dir = image_dir
        self.labels_df = labels_df
        self.transform = transform
        self.class_to_idx = class_to_idx

    def __len__(self):
        return len(self.labels_df)

    def __getitem__(self, idx):
        row = self.labels_df.iloc[idx]
        img_path = os.path.join(self.image_dir, row['img_name'])
        image = Image.open(img_path).convert('RGB')
        label = self.class_to_idx[row['label']]

        image = self.transform(image)
        return image, label

In [None]:
# --- Compute dataset-specific mean and std ---
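def compute_mean_std(image_dir):
    """
    Estimates the per-channel mean and standard deviation of the RGB images in image_dir.
    Both statistics are computed per image and then averaged over images, so the
    std is an approximation of the dataset-wide value.
    """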
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor()
    ])

    image_paths = list(Path(image_dir).glob("*.jpg"))

    mean = torch.zeros(3)
    std = torch.zeros(3)
    total_images = 0

    for img_path in tqdm(image_paths, desc="Computing mean/std"):
        image = Image.open(img_path).convert("RGB")
        tensor = transform(image)
        mean += tensor.mean(dim=(1, 2))
        std += tensor.std(dim=(1, 2))
        total_images += 1

    mean /= total_images
    std /= total_images

    print("Mean:", mean.tolist())
    print("Std:", std.tolist())
    return mean.tolist(), std.tolist()

# Already computed → no need to run again
# compute_mean_std(working_dir / 'train')
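
Note that `compute_mean_std` averages per-image standard deviations. For equally sized images this never exceeds the standard deviation taken over all pixels, because brightness differences between images are ignored. The sketch below shows one way to compute the pooled statistics from running sums. It assumes the images are already available as float arrays of shape `(3, H, W)`; `pooled_mean_std` and the random demo arrays are hypothetical and not part of the notebook.

```python
import numpy as np

def pooled_mean_std(images):
    """Per-channel mean/std over all pixels of all images (population std)."""
    channel_sum = np.zeros(3)
    channel_sq_sum = np.zeros(3)
    pixel_count = 0
    for img in images:
        img = np.asarray(img, dtype=np.float64)      # shape (3, H, W)
        channel_sum += img.sum(axis=(1, 2))
        channel_sq_sum += (img ** 2).sum(axis=(1, 2))
        pixel_count += img.shape[1] * img.shape[2]
    mean = channel_sum / pixel_count
    var = channel_sq_sum / pixel_count - mean ** 2   # E[x^2] - E[x]^2
    return mean, np.sqrt(np.maximum(var, 0.0))

# Two synthetic "images" with very different brightness ranges
rng = np.random.default_rng(0)
demo_images = [rng.random((3, 8, 8)) * scale for scale in (0.2, 1.0)]

pooled_mean, pooled_std = pooled_mean_std(demo_images)
averaged_std = np.mean([img.std(axis=(1, 2)) for img in demo_images], axis=0)

print("pooled std:            ", pooled_std)
print("mean of per-image stds:", averaged_std)
```

Whether the difference matters for training is an open question here; the normalization constants used in this notebook come from the averaged variant.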


In [None]:
# --- Transforms for supervised and self-supervised training ---

# Augmented transform for supervised training
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.02),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.6388, 0.5444, 0.4448], std=[0.2229, 0.2414, 0.2638])
])

# Validation/test transform (no augmentation)
val_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.6388, 0.5444, 0.4448], std=[0.2229, 0.2414, 0.2638])
])

# SimCLR-style augmentations for SSL
simclr_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.5, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.6388, 0.5444, 0.4448],
                         std=[0.2229, 0.2414, 0.2638])
])



In [None]:
# --- Label encoding ---
train_csv = data_dir / 'train_labels.csv'
labels_df = pd.read_csv(train_csv)

# Sort class names for consistent indexing
classes = sorted(labels_df['label'].unique())
class_to_idx = {cls_name: idx for idx, cls_name in enumerate(classes)}

# Map string labels to class indices
labels = labels_df['label'].map(class_to_idx)


In [None]:
# --- 1. Stratified Train/Validation Split ---
train_idx, val_idx = train_test_split(
    np.arange(len(labels_df)),
    test_size=0.2,
    stratify=labels,
    random_state=42
)

# Create new DataFrames
train_df = labels_df.iloc[train_idx].reset_index(drop=True)
val_df = labels_df.iloc[val_idx].reset_index(drop=True)


In [None]:
# --- 2. Dataset Instances with Transforms ---
train_dataset = FoodDataset(
    working_dir / 'train',
    train_df,
    transform=train_transform,
    class_to_idx=class_to_idx
)

val_dataset = FoodDataset(
    working_dir / 'train',
    val_df,
    transform=val_transform,
    class_to_idx=class_to_idx
)


In [None]:
# --- DataLoader Setup ---
# Adjust number of workers based on system
cpu_count = multiprocessing.cpu_count()
num_workers = 2  # Adjustable; 2 is safe default

# Training Loader
train_loader = DataLoader(
    train_dataset,
    batch_size=64,
    shuffle=True,
    num_workers=num_workers,
    pin_memory=True,
    persistent_workers=True,
    worker_init_fn=seed_worker
)

# Validation Loader
val_loader = DataLoader(
    val_dataset,
    batch_size=64,
    shuffle=False,
    num_workers=num_workers,
    pin_memory=True,
    persistent_workers=True,
    worker_init_fn=seed_worker
)


In [None]:
# --- Test Loader: from official validation set ---
test_csv = data_dir / 'val_labels.csv'
test_labels_df = pd.read_csv(test_csv).reset_index(drop=True)

test_dataset = FoodDataset(
    working_dir / 'val',
    test_labels_df,
    transform=val_transform,
    class_to_idx=class_to_idx
)

test_loader = DataLoader(
    test_dataset,
    batch_size=64,
    shuffle=False,
    num_workers=num_workers,
    pin_memory=True,
    persistent_workers=True,
    worker_init_fn=seed_worker
)


---

# Definition of the original Net


In [None]:
# --- Custom Residual Block with SiLU Activation ---
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU(inplace=True)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.downsample = downsample

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += identity
        return self.act(out)


# --- CBAM Attention Module (Channel + Spatial Attention) ---
class CBAMBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)

        self.channel_fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1, bias=False)
        )

        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False),
            nn.Sigmoid()
        )

        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # Channel attention
        avg_out = self.channel_fc(self.avg_pool(x))
        max_out = self.channel_fc(self.max_pool(x))
        channel_att = self.sigmoid(avg_out + max_out)
        x = x * channel_att

        # Spatial attention
        avg_pool = torch.mean(x, dim=1, keepdim=True)
        max_pool, _ = torch.max(x, dim=1, keepdim=True)
        spatial_att = self.spatial(torch.cat([avg_pool, max_pool], dim=1))
        return x * spatial_att

# --- Custom CNN Definition (ResNet-like with CBAM and SiLU) ---
class CustomCNN(nn.Module):
    def __init__(self, block, layers, num_classes=251, base_width=32):
        super().__init__()
        self.in_channels = base_width

        self.conv1 = nn.Conv2d(3, base_width, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(base_width)
        self.act = nn.SiLU(inplace=True)

        self.layer1 = self._make_layer(block, base_width, layers[0])
        self.layer2 = self._make_layer(block, base_width*2, layers[1], stride=2)
        self.layer3 = self._make_layer(block, base_width*4, layers[2], stride=2)
        self.layer4 = self._make_layer(block, base_width*8, layers[3], stride=2)

        self.cbam = CBAMBlock(base_width * 8)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.dropout = nn.Dropout(0.3)

        self.fc = nn.Sequential(
            nn.Linear(base_width * 8, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes)
        )

        self._init_weights()


    def _make_layer(self, block, out_channels, blocks, stride=1):
        """
        Creates a sequence of residual blocks, optionally with downsampling.
        """
        downsample = None
        if stride != 1 or self.in_channels != out_channels * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channels, out_channels * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels * block.expansion)
            )

        layers = [block(self.in_channels, out_channels, stride, downsample)]
        self.in_channels = out_channels * block.expansion
        layers.extend([block(self.in_channels, out_channels) for _ in range(1, blocks)])
        return nn.Sequential(*layers)


    def _init_weights(self):
        """
        Initializes weights using He initialization (Kaiming) for conv layers.
        """
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        """
        Full forward pass with classification head.
        """
        x = self.act(self.bn1(self.conv1(x)))
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.cbam(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.dropout(x)
        x = self.fc(x)
        return x

        
    def extract_features(self, x):
        """
        Feature extractor for SSL: excludes final classification head.
        """
        x = self.act(self.bn1(self.conv1(x)))
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.cbam(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return x



# --- Wrapper function ---
def build_custom_cnn(num_classes=251):
    """
    Builds ResNet18-like model with custom blocks and CBAM.
    """
    return CustomCNN(BasicBlock, [2, 2, 2, 2], num_classes=num_classes, base_width=32)


In [None]:
# --- Inspect Model Summary to check the Number of Parameters (<5M) ---
from torchsummary import summary

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CustomCNN(BasicBlock, [2, 2, 2, 2], num_classes=251, base_width=32).to(device)

summary(model, input_size=(3, 224, 224), device=str(device))
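
Besides the parameter count, activation memory depends on the spatial size of the feature maps. Since `conv1` uses stride 1 and no pooling layer follows it, the first stages run at full 224x224 resolution. The sketch below traces the sizes with the standard convolution output formula. It only mirrors the strides and widths defined in `CustomCNN`; `conv_out_size` and `trace_feature_maps` are hypothetical helpers.

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_feature_maps(input_size=224, base_width=32):
    # (stage name, output channels, stride of the stage's first 3x3 conv)
    stages = [
        ("conv1", base_width, 1),
        ("layer1", base_width, 1),
        ("layer2", base_width * 2, 2),
        ("layer3", base_width * 4, 2),
        ("layer4", base_width * 8, 2),
    ]
    size, shapes = input_size, []
    for name, channels, stride in stages:
        # The remaining 3x3 convs in a stage use stride 1 and padding 1,
        # so they keep the spatial size unchanged.
        size = conv_out_size(size, kernel=3, stride=stride, padding=1)
        shapes.append((name, channels, size))
    return shapes

for name, channels, size in trace_feature_maps():
    print(f"{name}: {channels} x {size} x {size}")
```

The final stage yields 256 channels, which matches the 256-dimensional vector returned by `extract_features` after global average pooling.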


---

# Self-Supervised Learning - SimCLR Model



In [None]:
# --- SimCLR Model: Encoder + Projection Head ---
class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim=128):
        """
        Wraps a base encoder (e.g., ResNet or custom CNN)
        and adds a projection head for contrastive learning.
        """
        super(SimCLR, self).__init__()
        self.encoder = base_encoder
        feature_dim = 256  # Output dim from extract_features()

        self.projector = nn.Sequential(
            nn.Linear(feature_dim, feature_dim),
            nn.ReLU(),
            nn.Linear(feature_dim, projection_dim)
        )

    def forward(self, x):
        features = self.encoder.extract_features(x)         # (B, 256)
        projections = self.projector(features)              # (B, 128)
        return nn.functional.normalize(projections, dim=1)  # L2 normalization for contrastive loss


In [None]:
# --- NT-Xent Loss (Contrastive) ---
class NTXentLoss(nn.Module):
    def __init__(self, batch_size, temperature=0.5):
        """
        Implements the Normalized Temperature-scaled Cross Entropy loss
        as used in SimCLR.
        """
        super(NTXentLoss, self).__init__()
        self.batch_size = batch_size
        self.temperature = temperature
        self.criterion = nn.CrossEntropyLoss(reduction="sum")
        self.mask = self._get_correlated_mask().type(torch.bool)

    def _get_correlated_mask(self):
        N = 2 * self.batch_size
        mask = torch.ones((N, N)) - torch.eye(N)
        return mask

    def forward(self, zis, zjs):
        """
        zis and zjs: (B, D) projections from two augmented views
        """
        device = zis.device
        N = 2 * self.batch_size

        # Concatenate all projections
        z = torch.cat([zis, zjs], dim=0)  # (2B, D), i.e. (N, D)
        
        # Cosine similarity between all pairs
        sim_matrix = nn.functional.cosine_similarity(z.unsqueeze(1), z.unsqueeze(0), dim=2)  # (N, N)

        # Positive pairs are at offsets ±batch_size
        positives = torch.cat([
            torch.diag(sim_matrix, self.batch_size),
            torch.diag(sim_matrix, -self.batch_size)
        ], dim=0)

        # Every off-diagonal entry goes into the denominator. The mask only removes
        # self-similarity, so each row's positive pair is included here as well.
        negatives = sim_matrix[self.mask.to(device)].view(N, -1)

        # Construct logits and labels
        logits = torch.cat([positives.unsqueeze(1), negatives], dim=1)
        labels = torch.zeros(N).long().to(device)  # positive pair = class 0

        # Scale by temperature and apply loss
        logits = logits / self.temperature
        return self.criterion(logits, labels) / N
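
For reference, the loss can be cross-checked against a small standalone computation. The NumPy sketch below follows the usual SimCLR formulation, in which each of the 2B embeddings is classified against the other 2B - 1, with its augmented partner as the target. It is not a drop-in replacement for `NTXentLoss`; `nt_xent` and the random demo vectors are hypothetical names used only for illustration.

```python
import numpy as np

def nt_xent(z_i, z_j, temperature=0.5):
    """NT-Xent loss for two (B, D) arrays of projections (one row per image)."""
    z = np.concatenate([z_i, z_j], axis=0).astype(np.float64)   # (2B, D)
    z /= np.linalg.norm(z, axis=1, keepdims=True)                # unit length
    n = z.shape[0]
    b = n // 2

    logits = (z @ z.T) / temperature       # cosine similarity / temperature
    np.fill_diagonal(logits, -np.inf)      # exclude self-similarity

    # Numerically stable log-sum-exp over each row
    row_max = logits.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))

    # Row k's positive partner sits B positions away (wrapping around)
    positive_index = (np.arange(n) + b) % n
    positive_logit = logits[np.arange(n), positive_index]
    return float(np.mean(log_denominator - positive_logit))

rng = np.random.default_rng(0)
views = rng.normal(size=(4, 8))

loss_matched = nt_xent(views, views)           # partners are identical
loss_mismatched = nt_xent(views, views[::-1])  # partners are unrelated vectors
print(f"matched: {loss_matched:.4f}  mismatched: {loss_mismatched:.4f}")
```

When all embeddings coincide, this loss equals log(2B - 1), which makes a handy sanity check.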



## SSL Training - Setup


In [None]:
# --- SSL Dataset Class for SimCLR ---
class SSLFoodDataset(Dataset):
    """
    Dataset for SimCLR-style self-supervised training.
    Returns two differently augmented views of the same image.
    """
    def __init__(self, image_dir, transform, image_names=None):
        self.image_dir = image_dir
        self.transform = transform
        self.image_names = image_names or os.listdir(image_dir)

    def __len__(self):
        return len(self.image_names)

    def __getitem__(self, idx):
        img_name = self.image_names[idx]
        img_path = os.path.join(self.image_dir, img_name)
        image = Image.open(img_path).convert('RGB')
        
        xi = self.transform(image)
        xj = self.transform(image)
        return xi, xj


In [None]:
# --- SSL Training Parameters ---
batch_size_ssl = 48  # Works best with the Kaggle GPU

ssl_dataset = SSLFoodDataset(
    image_dir=working_dir / 'train',
    transform=simclr_transform
)

ssl_loader = DataLoader(
    ssl_dataset,
    batch_size=batch_size_ssl,
    shuffle=True,
    num_workers=2,  # Adjust if needed
    pin_memory=True,
    persistent_workers=True,
    drop_last=True,  # Required for SimCLR (2N logic)
    worker_init_fn=seed_worker
)



In [None]:
# --- Model, Optimizer, Loss ---
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

base_encoder = build_custom_cnn(num_classes=251)  # num_classes irrelevant for SSL
model = SimCLR(base_encoder=base_encoder).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = NTXentLoss(batch_size=batch_size_ssl, temperature=0.5)


## SSL Pretraining Loop

In [None]:
from tqdm import tqdm
import gc
torch.cuda.empty_cache()
gc.collect()



# Number epochs for SSL Pretraining
ssl_epochs = 20 # Adjust
model.train()

for epoch in range(ssl_epochs):
    total_loss = 0.0
    num_batches = 0

    progress_bar = tqdm(ssl_loader, desc=f"Epoch {epoch+1}/{ssl_epochs}")
    
    for xi, xj in progress_bar:
        xi, xj = xi.to(device), xj.to(device)

        # Embeddings 
        zis = model(xi)  # (B, D)
        zjs = model(xj)  # (B, D)

        # NT-Xent Loss
        loss = loss_fn(zis, zjs)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()
        num_batches += 1
        progress_bar.set_postfix(loss=loss.item())

    avg_loss = total_loss / num_batches
    print(f"[Epoch {epoch+1}] Average Loss: {avg_loss:.4f}")
    if (epoch + 1) % 4 == 0:  # Save a checkpoint every 4 epochs
        torch.save(model.state_dict(), f"/kaggle/working/simclr_checkpoint_ep{epoch+1}.pth")
        print(torch.cuda.memory_summary(device=None, abbreviated=True))



In [None]:
import shutil

torch.save(model.state_dict(), "/kaggle/working/simclr_final.pth")
# shutil.copy("/kaggle/working/simclr_final.pth", "/kaggle/outputs/")


### Extracting Feature Vectors


In [None]:
# --- Feature Extraction from Frozen SSL Encoder ---
# Load pretrained encoder weights
model.load_state_dict(torch.load("/kaggle/input/ssl_final/pytorch/default/1/simclr_final.pth", map_location=device))
model = model.to(device).eval()

# Freeze encoder weights (no gradient updates)
for param in model.parameters():
    param.requires_grad = False


'''
We just have to execute this once with our best model and save the feature matrices. 
We can load those matrices again for training the traditional classifiers.
'''

# --- Extract Features & Labels from Dataloader ---
def extract_features_and_labels(dataloader, model, device):
    """
    Extracts feature vectors using the frozen encoder
    and returns numpy arrays for downstream classifier training.
    """
    features, labels = [], []
    with torch.no_grad():
        for images, lbls in tqdm(dataloader, desc="Extracting features"):
            images = images.to(device)
            feats = model.encoder.extract_features(images).cpu().numpy()
            features.append(feats)
            labels.append(lbls.numpy())
    return np.concatenate(features), np.concatenate(labels)

# --- Run once to cache feature matrices ---
X_train, y_train = extract_features_and_labels(train_loader, model, device)
X_val, y_val     = extract_features_and_labels(val_loader, model, device)
X_test, y_test   = extract_features_and_labels(test_loader, model, device)

# Save to disk for reuse (e.g. in classifier experiments)
np.save("X_train.npy", X_train)
np.save("y_train.npy", y_train)
np.save("X_val.npy", X_val)
np.save("y_val.npy", y_val)
np.save("X_test.npy", X_test)
np.save("y_test.npy", y_test)
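
Six separate `.npy` files are easy to mix up when they are later uploaded as datasets. As an alternative, all arrays can be kept in one compressed archive. This is a minimal sketch, assuming the arrays fit in memory; `save_feature_cache`, `load_feature_cache`, the file name and the random demo arrays are hypothetical.

```python
import os
import tempfile
import numpy as np

def save_feature_cache(path, **arrays):
    """Stores all named arrays in a single compressed .npz archive."""
    np.savez_compressed(path, **arrays)

def load_feature_cache(path):
    """Loads every array from the archive into a plain dict."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}

# Small stand-ins for the real feature matrices (256-dim features, 251 classes)
rng = np.random.default_rng(0)
demo = {
    "X_train": rng.normal(size=(10, 256)).astype(np.float32),
    "y_train": rng.integers(0, 251, size=10),
    "X_test": rng.normal(size=(4, 256)).astype(np.float32),
    "y_test": rng.integers(0, 251, size=4),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "ssl_features.npz")
    save_feature_cache(cache_path, **demo)
    restored = load_feature_cache(cache_path)

print({name: array.shape for name, array in restored.items()})
```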


## Training traditional classifiers on feature vectors

Following the project instructions, we extracted features from a CNN encoder trained with SimCLR,
then trained a set of traditional classifiers on top of those features:

1. Logistic Regression
2. Linear SVM (via SGDClassifier)
3. Linear SVM trained only on validation set (limited label scenario)
4. RBF SVM (trained on a 3k validation subset due to runtime)
5. Random Forest
6. MLP (trained separately in PyTorch)

This allows us to evaluate the quality of the learned representations without relying on end-to-end backpropagation.
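
Linear models such as `SGDClassifier` and `LinearSVC` are sensitive to feature scale, and the encoder outputs are saved without any normalization. One common preprocessing step, not applied in the cells below, is to standardize each feature dimension using statistics from the training features only. The sketch below illustrates this with NumPy; `fit_standardizer`, `apply_standardizer` and the synthetic data are hypothetical. scikit-learn's `StandardScaler` provides the same functionality.

```python
import numpy as np

def fit_standardizer(X_train, eps=1e-8):
    """Per-feature mean and std, estimated on the training features only."""
    mean = X_train.mean(axis=0)
    std = np.maximum(X_train.std(axis=0), eps)   # guard against constant features
    return mean, std

def apply_standardizer(X, mean, std):
    """Applies the training statistics to any split (train, val or test)."""
    return (X - mean) / std

# Synthetic features whose columns live on very different scales
rng = np.random.default_rng(0)
column_scales = np.array([0.1, 1.0, 10.0, 100.0])
X_train_demo = rng.normal(size=(200, 4)) * column_scales + 5.0
X_test_demo = rng.normal(size=(50, 4)) * column_scales + 5.0

mean, std = fit_standardizer(X_train_demo)
X_train_std = apply_standardizer(X_train_demo, mean, std)
X_test_std = apply_standardizer(X_test_demo, mean, std)

print("train std before:", X_train_demo.std(axis=0).round(2))
print("train std after: ", X_train_std.std(axis=0).round(2))
```

Reusing the training statistics for the validation and test features keeps information from those splits out of the preprocessing.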

In [1]:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC, SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
import pandas as pd

# --- Load feature vectors ---
X_train = np.load('/kaggle/input/features/X_train.npy')
y_train = np.load('/kaggle/input/features/y_train.npy')

X_val = np.load('/kaggle/input/features2/X_val.npy')
y_val = np.load('/kaggle/input/features2/y_val.npy')

X_test = np.load('/kaggle/input/features2/X_test.npy')
y_test = np.load('/kaggle/input/features2/y_test.npy')

# --- Evaluation helper ---

def evaluate_model(y_true, y_pred):
    return {
        'Accuracy_test': accuracy_score(y_true, y_pred),
        'F1_weighted': f1_score(y_true, y_pred, average='weighted'),
        'F1_macro': f1_score(y_true, y_pred, average='macro'),
        'Precision_macro': precision_score(y_true, y_pred, average='macro'),
        'Recall_macro': recall_score(y_true, y_pred, average='macro'),
    }

# --- Model training and evaluation ---
results = {}
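
With 251 classes, a classifier may never predict some of them. Per-class precision is then 0/0, which scikit-learn sets to 0 (with a warning) unless the `zero_division` argument says otherwise. The pure-Python sketch below reproduces the macro-averaged case to show how such classes pull the score down; `macro_precision` and the toy labels are hypothetical.

```python
from collections import Counter

def macro_precision(y_true, y_pred, zero_division=0.0):
    """Unweighted mean of per-class precision over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)                                  # predictions per class
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = []
    for c in classes:
        if predicted[c] == 0:
            per_class.append(zero_division)   # class never predicted: 0/0
        else:
            per_class.append(correct[c] / predicted[c])
    return sum(per_class) / len(per_class)

# Class 2 occurs in the ground truth but is never predicted
toy_true = [0, 0, 1, 2]
toy_pred = [0, 1, 1, 1]
score = macro_precision(toy_true, toy_pred)
print(f"macro precision: {score:.4f}")
```

Here the three per-class values are 1, 1/3 and 0, so one unpredicted class alone caps the macro average at 2/3 of what the other classes achieve.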


In [None]:
# 1. Logistic Regression (Full train set)
lr = LogisticRegression(max_iter=1000)
lr.fit(X_train, y_train)
results['LogReg (full)'] = evaluate_model(y_test, lr.predict(X_test))


In [2]:
# 2. SGDClassifier with hinge loss (full train set) as a faster alternative to a linear SVM
from sklearn.linear_model import SGDClassifier

svm_sgd = SGDClassifier(loss='hinge', max_iter=1000, tol=1e-3)
svm_sgd.fit(X_train, y_train)
results['Linear SVM (full)'] = evaluate_model(y_test, svm_sgd.predict(X_test))



In [2]:
# 3. Random Forest (Full train set)
rf_full = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1)
rf_full.fit(X_train, y_train)
results['Random Forest (full)'] = evaluate_model(y_test, rf_full.predict(X_test))



In [4]:
# 4. Linear SVM  (Val set only)
svm_linear_val = LinearSVC()
svm_linear_val.fit(X_val, y_val)
results['Linear SVM (val only)'] = evaluate_model(y_test, svm_linear_val.predict(X_test))




In [7]:
# 5. RBF SVM (subset of the val set, because training on all of it is too slow)
X_val_sub, _, y_val_sub, _ = train_test_split(X_val, y_val, train_size=3000, stratify=y_val)
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_val_sub, y_val_sub)
results['RBF SVM (val subset)'] = evaluate_model(y_test, svm_rbf.predict(X_test))




In [None]:
# --- Display as DataFrame ---
df_results = pd.DataFrame(results).T  # transpose for better readability
print(df_results)

# evaluate_model() stores the accuracy under the key 'Accuracy_test'
df_results['Accuracy_test'] = pd.to_numeric(df_results['Accuracy_test'], errors='coerce')


df_results = df_results.sort_values(by='Accuracy_test', ascending=False)
print(df_results)

In [None]:
# --- Load extracted feature vectors from disk if not done already ---
X_train = np.load('/kaggle/input/features/X_train.npy')
y_train = np.load('/kaggle/input/features/y_train.npy')

X_val = np.load('/kaggle/input/features/X_val.npy')
y_val = np.load('/kaggle/input/features/y_val.npy')

X_test = np.load('/kaggle/input/features/X_test.npy')
y_test = np.load('/kaggle/input/features/y_test.npy')


In [5]:
# --- Convert to PyTorch tensors ---
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.long)
X_val_tensor   = torch.tensor(X_val, dtype=torch.float32)
y_val_tensor   = torch.tensor(y_val, dtype=torch.long)
X_test_tensor  = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor  = torch.tensor(y_test, dtype=torch.long)

# --- Create DataLoaders ---
mlp_train_loader = DataLoader(TensorDataset(X_train_tensor, y_train_tensor), batch_size=256, shuffle=True)
mlp_val_loader   = DataLoader(TensorDataset(X_val_tensor, y_val_tensor), batch_size=256)
mlp_test_loader  = DataLoader(TensorDataset(X_test_tensor, y_test_tensor), batch_size=256)


# --- Define simple 2-layer MLP ---
class MLP(nn.Module):
    def __init__(self, input_dim=256, hidden_dim=512, output_dim=251):
        super(MLP, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, output_dim)
        )

    def forward(self, x):
        return self.model(x)

# --- MLP Training Setup ---
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
mlp_model = MLP(input_dim=X_train.shape[1]).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(mlp_model.parameters(), lr=1e-3)

# Early stopping
best_val_acc = 0
patience = 5
epochs_no_improve = 0
n_epochs = 50 # 50 is okay since we have early stopping

for epoch in range(n_epochs):
    # --- Training ---
    mlp_model.train()
    train_loss, correct, total = 0, 0, 0
    for x_batch, y_batch in mlp_train_loader:
        x_batch, y_batch = x_batch.to(device), y_batch.to(device)
        optimizer.zero_grad()
        out = mlp_model(x_batch)
        loss = criterion(out, y_batch)
        loss.backward()
        optimizer.step()

        train_loss += loss.item() * x_batch.size(0)
        preds = out.argmax(dim=1)
        correct += (preds == y_batch).sum().item()
        total += y_batch.size(0)

    train_acc = correct / total
    avg_train_loss = train_loss / total

    # --- Validation ---
    mlp_model.eval()
    val_loss, val_correct, val_total = 0, 0, 0
    with torch.no_grad():
        for x_batch, y_batch in mlp_val_loader:
            x_batch, y_batch = x_batch.to(device), y_batch.to(device)
            out = mlp_model(x_batch)
            loss = criterion(out, y_batch)

            val_loss += loss.item() * x_batch.size(0)
            preds = out.argmax(dim=1)
            val_correct += (preds == y_batch).sum().item()
            val_total += y_batch.size(0)

    val_acc = val_correct / val_total
    avg_val_loss = val_loss / val_total

    print(f"Epoch {epoch+1}: "
          f"Train Loss = {avg_train_loss:.4f}, Acc = {train_acc:.4f} | "
          f"Val Loss = {avg_val_loss:.4f}, Acc = {val_acc:.4f}")

    # --- Early Stopping ---
    if val_acc > best_val_acc:
        best_val_acc = val_acc
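        # state_dict() returns references to the live tensors, so clone them for a real snapshot
        best_model_state = {k: v.detach().clone() for k, v in mlp_model.state_dict().items()}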
        epochs_no_improve = 0
    else:
        epochs_no_improve += 1
        if epochs_no_improve >= patience:
            print(f"Early stopping triggered at epoch {epoch+1}")
            break



Epoch 1: Train Loss = 4.8205, Acc = 0.0626 | Val Loss = 4.4947, Acc = 0.0896
Epoch 2: Train Loss = 4.3761, Acc = 0.1079 | Val Loss = 4.3352, Acc = 0.1097
Epoch 3: Train Loss = 4.2426, Acc = 0.1247 | Val Loss = 4.2651, Acc = 0.1179
Epoch 4: Train Loss = 4.1620, Acc = 0.1342 | Val Loss = 4.2010, Acc = 0.1297
Epoch 5: Train Loss = 4.1031, Acc = 0.1430 | Val Loss = 4.1628, Acc = 0.1372
Epoch 6: Train Loss = 4.0577, Acc = 0.1510 | Val Loss = 4.1462, Acc = 0.1375
Epoch 7: Train Loss = 4.0132, Acc = 0.1558 | Val Loss = 4.1147, Acc = 0.1439
Epoch 8: Train Loss = 3.9787, Acc = 0.1616 | Val Loss = 4.0875, Acc = 0.1500
Epoch 9: Train Loss = 3.9450, Acc = 0.1666 | Val Loss = 4.0692, Acc = 0.1544
Epoch 10: Train Loss = 3.9181, Acc = 0.1715 | Val Loss = 4.0609, Acc = 0.1549
Epoch 11: Train Loss = 3.8917, Acc = 0.1757 | Val Loss = 4.0582, Acc = 0.1564
Epoch 12: Train Loss = 3.8652, Acc = 0.1784 | Val Loss = 4.0444, Acc = 0.1595
Epoch 13: Train Loss = 3.8442, Acc = 0.1821 | Val Loss = 4.0375, Acc = 0.

<All keys matched successfully>

In [8]:
# --- Load best model and test ---
mlp_model.load_state_dict(best_model_state)

from sklearn.metrics import classification_report

mlp_model.eval()
all_preds, all_labels = [], []
with torch.no_grad():
    for x_batch, y_batch in mlp_test_loader:
        x_batch = x_batch.to(device)
        preds = mlp_model(x_batch).argmax(dim=1).cpu()
        all_preds.extend(preds.numpy())
        all_labels.extend(y_batch.numpy())

# Final evaluation
print(classification_report(all_labels, all_preds, digits=4))


              precision    recall  f1-score   support

           0     0.2419    0.5455    0.3352        55
           1     0.4865    0.2951    0.3673        61
           2     0.5556    0.0943    0.1613        53
           3     0.1485    0.2941    0.1974        51
           4     0.4211    0.1951    0.2667        41
           5     0.2245    0.2115    0.2178        52
           6     0.2609    0.1053    0.1500        57
           7     0.0465    0.0370    0.0412        54
           8     0.2262    0.3878    0.2857        49
           9     0.2353    0.1739    0.2000        46
          10     0.1667    0.0208    0.0370        48
          11     0.2024    0.3617    0.2595        47
          12     0.1818    0.1778    0.1798        45
          13     0.1875    0.0984    0.1290        61
          14     0.0588    0.0217    0.0317        46
          15     0.3333    0.2807    0.3048        57
          16     0.3077    0.4898    0.3780        49
          17     0.0952    

