# Explanation of the SimCLR Implementation with ResNet50

**Note**: Due to GPU memory limitations, I could not run the ResNet50 implementation. Below is an explanation of its structure and key features:

## 1. Data Preprocessing and Augmentation

### Normalization and Resizing:
- Input images are resized to `224x224` and normalized with mean `[0.5]` and standard deviation `[0.5]` to match the expected range for ResNet-50.

### SimCLR Augmentation:
- Transformations include:
  - `RandomResizedCrop`
  - `RandomHorizontalFlip`
  - `ColorJitter`
  - `GaussianBlur`
  - `RandomApply`
- These augmentations generate two views of the same image, critical for contrastive learning, ensuring the model learns representations invariant to these transformations.

---

## 2. Dataset and DataLoader

### Custom Dataset:
- Provides two augmented views of each image for SimCLR training.

### Batch Size:
- Large batch sizes (`1024` for contrastive learning, `64` for fine-tuning) align with SimCLR's reliance on a diverse batch for contrastive loss effectiveness.

---

## 3. SimCLR Model Architecture

### Base Encoder:
- A modified ResNet-50 backbone, pre-trained on ImageNet:
  - The final `fc` layer is replaced with an `Identity` layer to retain feature embeddings.
  - Input channels are modified to handle single-channel (grayscale) images.

### Projection Head:
- A two-layer neural network:
  - `Linear(2048, 1024)` → `ReLU` → `Linear(1024, 128)`.
- Projects features into a lower-dimensional space.
- Only encoder features are used for downstream tasks, while the projection head aids in learning representations during pre-training.

---

## 4. Loss Function

### NT-Xent Loss (Normalized Temperature-Scaled Cross-Entropy Loss):
- Measures similarity between positive pairs (two views of the same image) and contrasts them against negative pairs (different images in the batch).
- Features:
  - Temperature scaling improves gradient flow and learning.
  - Implements a softmax over the similarity matrix.
  - CrossEntropyLoss calculates the loss.

---

## 5. Training Procedure

### Contrastive Learning Stage:
- The SimCLR model is trained using:
  - Augmented image pairs.
  - NT-Xent loss.

### Optimization:
- **Optimizer**: Adam with a learning rate of `3e-4`.
- **Scheduler**: CosineAnnealingLR for gradual learning rate decay.

### Key Factors:
- Diverse data augmentation.
- Sufficiently large batch size.
- Effective contrastive loss with temperature scaling.

---

## 6. Downstream Task: Classification

### Linear Evaluation Protocol:
- Pre-trained encoder representations are fine-tuned using a linear classification head (`Linear(2048, num_classes)`).

### Optimization:
- **Loss**: CrossEntropyLoss.
- Encoder weights are fine-tuned at a smaller learning rate (`1e-5`), while the classification head adapts with a higher learning rate (`3e-4`).

---

## 7. Key Features Contributing to Accuracy

1. **Augmentation Diversity**:
   - Rich transformations help the model learn invariant features that generalize well to unseen data.
   
2. **Projection Head**:
   - Helps learn meaningful low-dimensional embeddings, improving representation quality.

3. **ResNet-50 Encoder**:
   - A powerful pre-trained backbone provides a strong initialization.

4. **NT-Xent Loss**:
   - Promotes robust and discriminative embeddings through contrastive learning.

5. **Large Batch Sizes**:
   - Increases the number of negative samples, enhancing contrastive loss effectiveness.

6. **Cosine Learning Rate Scheduler**:
   - Smoothly decays the learning rate, ensuring stable convergence.

7. **Separate Fine-Tuning**:
   - Retains learned representations in the encoder while allowing the classification head to adapt to the specific task.


In [None]:
from torchvision.models import ResNet50_Weights
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
from torchvision.models import resnet50
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR
from PIL import Image

# Load and preprocess the dataset
train_images_path = "brain_train_image_final.npy"
train_labels_path = "brain_train_label.npy"
test_images_path = "brain_test_image_final.npy"
test_labels_path = "brain_test_label.npy"

# Load the data
final_X_train_modified = np.load(train_images_path)[:, 1, :, :]
final_X_test_modified = np.load(test_images_path)[:, 1, :, :]
train_labels = np.load(train_labels_path)
test_labels = np.load(test_labels_path)

# Normalize and Resize Images using Pillow
def normalize_and_resize(images, target_size=(224, 224)):
    resized_images = []
    for img in images:
        img = Image.fromarray((img * 255).astype(np.uint8))
        img_resized = img.resize(target_size, Image.Resampling.LANCZOS)
        resized_images.append(np.array(img_resized) / 255.0)
    return np.array(resized_images)

final_X_train_resized = normalize_and_resize(final_X_train_modified)
final_X_test_resized = normalize_and_resize(final_X_test_modified)

# Define SimCLR Augmentation Transform
transform_simclr = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomResizedCrop(size=224, scale=(0.08, 1.0), ratio=(3/4, 4/3)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.8, contrast=0.8, saturation=0.8, hue=0.2),
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=23, sigma=(0.1, 2.0))], p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

# Custom Dataset for SimCLR
class SimCLRDataset(Dataset):
    def __init__(self, images, transform):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        img_1 = self.transform(img)
        img_2 = self.transform(img)
        return img_1, img_2

train_dataset = SimCLRDataset(final_X_train_resized, transform_simclr)
train_loader = DataLoader(train_dataset, batch_size=1024, shuffle=True)

# Define SimCLR Model
class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder
        self.projector = nn.Sequential(
            nn.Linear(2048, 1024),  # Reduce dimensions here
            nn.ReLU(),
            nn.Linear(1024, projection_dim)
        )
        

    def forward(self, x):
        h = self.encoder(x)
        z = self.projector(h)
        return z

# Define NT-Xent Loss
class NTXentLoss(nn.Module):
    def __init__(self, temperature):
        super(NTXentLoss, self).__init__()
        self.temperature = temperature
        self.criterion = nn.CrossEntropyLoss(reduction="mean")

    def forward(self, z_i, z_j):
        N = z_i.size(0) + z_j.size(0)
        z = torch.cat((z_i, z_j), dim=0)
        sim = torch.mm(z, z.T) / self.temperature
        sim = torch.nn.functional.softmax(sim, dim=1)

        labels = torch.cat([
            torch.arange(z_i.size(0), device=z.device),
            torch.arange(z_j.size(0), device=z.device)
        ])
        loss = self.criterion(sim, labels)
        return loss

# Initialize ResNet-50 Encoder with updated weights argument
base_encoder = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
base_encoder.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
base_encoder.fc = nn.Identity()

# Initialize SimCLR Model
model = SimCLR(base_encoder, projection_dim=128).to("cuda")
optimizer = optim.Adam(model.parameters(), lr=3e-4)
criterion = NTXentLoss(temperature=0.5)
scheduler = CosineAnnealingLR(optimizer, T_max=100)

# Train SimCLR Model
for epoch in range(100):
    total_loss = 0
    model.train()
    for img_1, img_2 in train_loader:
        img_1, img_2 = img_1.to("cuda"), img_2.to("cuda")
        z_i = model(img_1)
        z_j = model(img_2)

        loss = criterion(z_i, z_j)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    scheduler.step()
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}")

# Define Dataset for Classification
class TestDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        if self.transform:
            img = self.transform(img)
        return img, label

# Initialize Training and Test Dataset and DataLoader
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

train_dataset = TestDataset(final_X_train_resized, train_labels, transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

test_dataset = TestDataset(final_X_test_resized, test_labels, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Add Classification Head
class ClassificationHead(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(ClassificationHead, self).__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

classification_head = ClassificationHead(input_dim=2048, num_classes=len(np.unique(train_labels))).to("cuda")
optimizer_cls = optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},
    {"params": classification_head.parameters(), "lr": 3e-4},
])
criterion_cls = nn.CrossEntropyLoss()

# Fine-tune Classification Head
for epoch in range(100):
    model.encoder.train()
    classification_head.train()
    total_loss = 0
    correct = 0
    for img, label in train_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        loss = criterion_cls(logits, label)

        optimizer_cls.zero_grad()
        loss.backward()
        optimizer_cls.step()

        total_loss += loss.item()
        correct += (logits.argmax(dim=1) == label).sum().item()

    accuracy = correct / len(train_labels)
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}, Accuracy: {accuracy:.4f}")

# Evaluate on Test Dataset
classification_head.eval()
correct = 0
with torch.no_grad():
    for img, label in test_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        correct += (logits.argmax(dim=1) == label).sum().item()

test_accuracy = correct / len(test_labels)
print(f"Test Accuracy: {test_accuracy:.4f}")


# The Below one got 90% accuracy, with ResNet18

# SimCLR Implementation Explanation

## **1. Data Preprocessing and Augmentation**
- **Normalization and Resizing**:
  - Images are resized to `224x224` using Lanczos resampling and normalized with a mean of `0.5` and standard deviation of `0.5`.
- **Data Augmentation**:
  - Augmentation pipeline includes:
    - `RandomResizedCrop`: Crops images to random sizes and aspect ratios.
    - `RandomHorizontalFlip`: Randomly flips the image horizontally.
    - `ColorJitter`: Randomly changes brightness, contrast, saturation, and hue.
    - `GaussianBlur`: Blurs the image with a random kernel size and sigma.
  - These augmentations help generate two different views of the same image, which is crucial for contrastive learning.

---

## **2. Custom Dataset**
- **SimCLRDataset**:
  - Generates two augmented views of each input image to be used as positive pairs for contrastive learning.

---

## **3. SimCLR Model Architecture**
- **Base Encoder**:
  - A pre-trained `ResNet-18` is used as the backbone.
  - The first convolutional layer (`conv1`) is modified to handle single-channel (grayscale) images.
  - The fully connected layer (`fc`) is replaced with an `Identity` layer to output feature embeddings.
- **Projection Head**:
  - A two-layer MLP with a hidden size of `256` and output size of `128`.
  - Nonlinear activation (`ReLU`) is applied between the layers.
  - The projection head helps learn representations better suited for contrastive loss.

---

## **4. Contrastive Loss (NT-Xent Loss)**
- **Normalized Temperature-Scaled Cross-Entropy Loss**:
  - Encourages the model to maximize the similarity of positive pairs (different views of the same image) while minimizing similarity to negative pairs (different images in the batch).
  - Implements temperature scaling and softmax normalization for better gradient flow.
  - Cross-entropy loss is used to optimize the logits.

---

## **5. Training Procedure**
- **Contrastive Learning Stage**:
  - The SimCLR model is trained using the contrastive loss on augmented image pairs.
  - Key hyperparameters:
    - Batch size: `512`
    - Learning rate: `1e-4`
    - Temperature: `0.5`
  - The Adam optimizer is used with a `StepLR` scheduler for learning rate decay.
- **Training Outputs**:
  - Loss values are logged for each epoch to monitor the learning process.

---

## **6. Downstream Task: Classification**
- **Linear Evaluation Protocol**:
  - The pre-trained encoder is fine-tuned with a classification head for supervised learning on the labeled dataset.
  - Classification head:
    - A single linear layer mapping encoder outputs (`512`) to the number of classes.
  - Fine-tuning:
    - Encoder weights are updated at a slower learning rate (`1e-5`) than the classification head (`1e-3`).
    - Cross-entropy loss is used for optimization.
- **Performance Evaluation**:
  - Test accuracy is computed on the held-out test set to measure the model's performance.

---

## **7. Key Features Contributing to Accuracy**
1. **Data Augmentation**:
   - Rich transformations ensure the model learns robust and invariant features.
2. **Pre-trained ResNet-18 Encoder**:
   - Provides a strong starting point for feature extraction.
3. **Projection Head**:
   - Helps improve representation learning by mapping features to a contrastive space.
4. **Contrastive Loss**:
   - NT-Xent loss ensures the learned features are discriminative and generalizable.
5. **Fine-tuning Strategy**:
   - Gradual learning rate decay and careful tuning of encoder and head ensure effective adaptation to the downstream task.

---




In [1]:
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
from torchvision.models import resnet18
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR
from PIL import Image
import matplotlib.pyplot as plt

# Load and preprocess the dataset
train_images_path = "brain_train_image_final.npy"
train_labels_path = "brain_train_label.npy"
test_images_path = "brain_test_image_final.npy"
test_labels_path = "brain_test_label.npy"

# Load the data
final_X_train_modified = np.load(train_images_path)[:, 1, :, :]
final_X_test_modified = np.load(test_images_path)[:, 1, :, :]
train_labels = np.load(train_labels_path)
test_labels = np.load(test_labels_path)

# Normalize and Resize Images using Pillow
def normalize_and_resize(images, target_size=(224, 224)):
    resized_images = []
    for img in images:
        img = Image.fromarray((img * 255).astype(np.uint8))
        img_resized = img.resize(target_size, Image.Resampling.LANCZOS)
        resized_images.append(np.array(img_resized) / 255.0)
    return np.array(resized_images)

final_X_train_resized = normalize_and_resize(final_X_train_modified)
final_X_test_resized = normalize_and_resize(final_X_test_modified)

# Define SimCLR Augmentation Transform
transform_simclr = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomResizedCrop(size=224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1),
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

# Custom Dataset for SimCLR
class SimCLRDataset(Dataset):
    def __init__(self, images, transform):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        img_1 = self.transform(img)
        img_2 = self.transform(img)
        return img_1, img_2

train_dataset = SimCLRDataset(final_X_train_resized, transform_simclr)
train_loader = DataLoader(train_dataset, batch_size=512, shuffle=True)

# Define SimCLR Model
class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder
        self.projector = nn.Sequential(
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, projection_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        z = self.projector(h)
        return z

# Define NT-Xent Loss
class NTXentLoss(nn.Module):
    def __init__(self, batch_size, temperature):
        super(NTXentLoss, self).__init__()
        self.batch_size = batch_size
        self.temperature = temperature
        self.criterion = nn.CrossEntropyLoss(reduction="sum")

    def forward(self, z_i, z_j):
        N = z_i.size(0) + z_j.size(0)
        z = torch.cat((z_i, z_j), dim=0)
        sim = torch.matmul(z, z.T) / self.temperature
        mask = ~torch.eye(N, dtype=torch.bool, device=z.device)

        positives = torch.cat([
            torch.diag(sim, z_i.size(0)),
            torch.diag(sim, -z_i.size(0))
        ])

        negatives = sim[mask].view(N, -1)
        logits = torch.cat((positives.unsqueeze(1), negatives), dim=1)
        labels = torch.zeros(N, dtype=torch.long, device=z.device)
        loss = self.criterion(logits, labels) / N
        return loss

# Initialize ResNet-18 Encoder
base_encoder = resnet18(pretrained=True)
base_encoder.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
base_encoder.fc = nn.Identity()

# Initialize SimCLR Model
model = SimCLR(base_encoder, projection_dim=128).to("cuda")
optimizer = optim.Adam(model.parameters(), lr=1e-4)
criterion = NTXentLoss(batch_size=512, temperature=0.5)

# Train SimCLR Model
for epoch in range(100):
    total_loss = 0
    model.train()
    for img_1, img_2 in train_loader:
        img_1, img_2 = img_1.to("cuda"), img_2.to("cuda")
        z_i = model(img_1)
        z_j = model(img_2)

        loss = criterion(z_i, z_j)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}")
    
    

# Define Dataset for Classification
class TestDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        if self.transform:
            img = self.transform(img)
        return img, label

# Initialize Training and Test Dataset and DataLoader
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

train_dataset = TestDataset(final_X_train_resized, train_labels, transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)

test_dataset = TestDataset(final_X_test_resized, test_labels, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=128, shuffle=False)

# Add Classification Head
class ClassificationHead(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(ClassificationHead, self).__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

classification_head = ClassificationHead(input_dim=512, num_classes=len(np.unique(train_labels))).to("cuda")
optimizer_cls = optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},
    {"params": classification_head.parameters(), "lr": 1e-3},
])
scheduler_cls = StepLR(optimizer_cls, step_size=10, gamma=0.5)
criterion_cls = nn.CrossEntropyLoss()

# Fine-tune Classification Head
for epoch in range(100):
    model.encoder.train()
    classification_head.train()
    total_loss = 0
    correct = 0
    for img, label in DataLoader(train_dataset, batch_size=128, shuffle=True):
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        loss = criterion_cls(logits, label)

        optimizer_cls.zero_grad()
        loss.backward()
        optimizer_cls.step()

        total_loss += loss.item()
        correct += (logits.argmax(dim=1) == label).sum().item()

    accuracy = correct / len(train_labels)
    scheduler_cls.step()
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}, Accuracy: {accuracy:.4f}")

# Evaluate on Test Dataset
classification_head.eval()
correct = 0
with torch.no_grad():
    for img, label in DataLoader(test_dataset, batch_size=128, shuffle=False):
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        correct += (logits.argmax(dim=1) == label).sum().item()

test_accuracy = correct / len(test_labels)
print(f"Test Accuracy: {test_accuracy:.4f}")




Epoch [1/100], Loss: 8.1232
Epoch [2/100], Loss: 6.6977
Epoch [3/100], Loss: 6.5628
Epoch [4/100], Loss: 6.4204
Epoch [5/100], Loss: 6.2939
Epoch [6/100], Loss: 6.2331
Epoch [7/100], Loss: 6.1769
Epoch [8/100], Loss: 6.0556
Epoch [9/100], Loss: 5.9819
Epoch [10/100], Loss: 5.8945
Epoch [11/100], Loss: 5.7674
Epoch [12/100], Loss: 5.6089
Epoch [13/100], Loss: 5.5685
Epoch [14/100], Loss: 5.4512
Epoch [15/100], Loss: 5.3110
Epoch [16/100], Loss: 5.1675
Epoch [17/100], Loss: 5.1007
Epoch [18/100], Loss: 4.9024
Epoch [19/100], Loss: 4.8179
Epoch [20/100], Loss: 4.7292
Epoch [21/100], Loss: 4.6493
Epoch [22/100], Loss: 4.5762
Epoch [23/100], Loss: 4.4276
Epoch [24/100], Loss: 4.3101
Epoch [25/100], Loss: 4.2424
Epoch [26/100], Loss: 4.1504
Epoch [27/100], Loss: 4.0430
Epoch [28/100], Loss: 3.9242
Epoch [29/100], Loss: 3.8458
Epoch [30/100], Loss: 3.7267
Epoch [31/100], Loss: 3.8131
Epoch [32/100], Loss: 3.7184
Epoch [33/100], Loss: 3.6482
Epoch [34/100], Loss: 3.5053
Epoch [35/100], Loss: 3

# The below one got 85% accuracy, with ResNet34

# SimCLR Implementation with ResNet-34

## **1. Data Preprocessing and Augmentation**
- **Normalization and Resizing**:
  - Input images are resized to `224x224` and normalized with a mean of `0.5` and standard deviation of `0.5` for compatibility with the ResNet-34 model.
- **Data Augmentation**:
  - Rich augmentation pipeline includes:
    - `RandomResizedCrop`: Crops images randomly with varying sizes and aspect ratios.
    - `RandomHorizontalFlip`: Flips the image horizontally with a probability of `0.5`.
    - `ColorJitter`: Modifies brightness, contrast, saturation, and hue to simulate various lighting conditions.
    - `GaussianBlur`: Adds random blur to images for robustness.
  - These augmentations generate two augmented views of the same image, ensuring robust and invariant feature learning.

---

## **2. Custom Dataset**
- **SimCLRDataset**:
  - Provides two augmented views of each image to act as positive pairs for contrastive learning.

---

## **3. SimCLR Model Architecture**
- **Base Encoder**:
  - A pre-trained `ResNet-34` backbone from ImageNet is used for feature extraction.
  - Input channels are modified to handle single-channel (grayscale) images.
  - The fully connected (`fc`) layer is replaced with an `Identity` layer to output raw feature embeddings.
- **Projection Head**:
  - A two-layer MLP maps feature embeddings to a lower-dimensional space:
    - First layer: `Linear(512, 512)` followed by `ReLU`.
    - Second layer: `Linear(512, 128)` for projection.
  - This helps learn more effective representations suited for contrastive learning.

---

## **4. Contrastive Loss (NT-Xent Loss)**
- **Normalized Temperature-Scaled Cross-Entropy Loss**:
  - Ensures that the similarity between positive pairs is maximized while minimizing similarity to negative pairs within the same batch.
  - Applies temperature scaling and uses softmax normalization for improved gradient flow.
  - Cross-entropy loss is used to optimize the similarity logits.

---

## **5. Training Procedure**
- **Contrastive Learning Stage**:
  - The SimCLR model is trained on the augmented pairs using contrastive loss.
  - Key hyperparameters:
    - Batch size: `512`
    - Learning rate: `3e-4`
    - Temperature: `0.5`
  - Optimizer: Adam
  - Scheduler: `CosineAnnealingLR` with `T_max=100` for smooth learning rate decay.

---

## **6. Downstream Task: Classification**
- **Linear Evaluation Protocol**:
  - The pre-trained encoder is fine-tuned with a linear classification head for the labeled dataset.
  - Classification head:
    - A single `Linear(512, num_classes)` layer maps encoder features to class logits.
  - Fine-tuning strategy:
    - Encoder weights updated at a slower learning rate (`1e-5`) compared to the classification head (`3e-4`).
    - Cross-entropy loss is used for optimization.
- **Performance Evaluation**:
  - Accuracy on the test set is computed to evaluate model performance.

---

## **7. Key Features Contributing to Accuracy**
1. **Data Augmentation**:
   - Extensive augmentations ensure the model learns invariant features that generalize well.
2. **ResNet-34 Encoder**:
   - A pre-trained encoder offers robust initial feature representations.
3. **Projection Head**:
   - A low-dimensional mapping improves representation quality for contrastive learning.
4. **Contrastive Loss**:
   - Effectively encourages discriminative feature learning.
5. **Large Batch Size**:
   - Provides a diverse set of negative samples, enhancing contrastive loss effectiveness.
6. **Cosine Learning Rate Scheduler**:
   - Smooth decay of the learning rate ensures stable training.
7. **Fine-tuning Protocol**:
   - Differential learning rates for the encoder and classification head improve adaptation for downstream tasks.

---



In [2]:
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
from torchvision.models import resnet34, ResNet34_Weights
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR
from PIL import Image

# Load and preprocess the dataset
train_images_path = "brain_train_image_final.npy"
train_labels_path = "brain_train_label.npy"
test_images_path = "brain_test_image_final.npy"
test_labels_path = "brain_test_label.npy"

# Load the data
final_X_train_modified = np.load(train_images_path)[:, 1, :, :]
final_X_test_modified = np.load(test_images_path)[:, 1, :, :]
train_labels = np.load(train_labels_path)
test_labels = np.load(test_labels_path)

# Normalize and Resize Images using Pillow
def normalize_and_resize(images, target_size=(224, 224)):
    resized_images = []
    for img in images:
        img = Image.fromarray((img * 255).astype(np.uint8))
        img_resized = img.resize(target_size, Image.Resampling.LANCZOS)
        resized_images.append(np.array(img_resized) / 255.0)
    return np.array(resized_images)

final_X_train_resized = normalize_and_resize(final_X_train_modified)
final_X_test_resized = normalize_and_resize(final_X_test_modified)

# Define SimCLR Augmentation Transform
transform_simclr = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomResizedCrop(size=224, scale=(0.08, 1.0), ratio=(3/4, 4/3)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.8, contrast=0.8, saturation=0.8, hue=0.2),
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=23, sigma=(0.1, 2.0))], p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

# Custom Dataset for SimCLR
class SimCLRDataset(Dataset):
    def __init__(self, images, transform):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        img_1 = self.transform(img)
        img_2 = self.transform(img)
        return img_1, img_2

train_dataset = SimCLRDataset(final_X_train_resized, transform_simclr)
train_loader = DataLoader(train_dataset, batch_size=512, shuffle=True)

# Define SimCLR Model
class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder
        self.projector = nn.Sequential(
            nn.Linear(512, 512),  # Adjusted for ResNet-34 output
            nn.ReLU(),
            nn.Linear(512, projection_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        z = self.projector(h)
        return z

# Define NT-Xent Loss
class NTXentLoss(nn.Module):
    def __init__(self, temperature):
        super(NTXentLoss, self).__init__()
        self.temperature = temperature
        self.criterion = nn.CrossEntropyLoss(reduction="mean")

    def forward(self, z_i, z_j):
        N = z_i.size(0) + z_j.size(0)
        z = torch.cat((z_i, z_j), dim=0)
        sim = torch.mm(z, z.T) / self.temperature
        sim = torch.nn.functional.softmax(sim, dim=1)

        labels = torch.cat([
            torch.arange(z_i.size(0), device=z.device),
            torch.arange(z_j.size(0), device=z.device)
        ])
        loss = self.criterion(sim, labels)
        return loss

# Initialize ResNet-34 Encoder with updated weights argument
base_encoder = resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
base_encoder.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
base_encoder.fc = nn.Identity()

# Initialize SimCLR Model
model = SimCLR(base_encoder, projection_dim=128).to("cuda")
optimizer = optim.Adam(model.parameters(), lr=3e-4)
criterion = NTXentLoss(temperature=0.5)
scheduler = CosineAnnealingLR(optimizer, T_max=100)

# Train SimCLR Model
for epoch in range(100):
    total_loss = 0
    model.train()
    for img_1, img_2 in train_loader:
        img_1, img_2 = img_1.to("cuda"), img_2.to("cuda")
        z_i = model(img_1)
        z_j = model(img_2)

        loss = criterion(z_i, z_j)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    scheduler.step()
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}")

# Define Dataset for Classification
class TestDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        if self.transform:
            img = self.transform(img)
        return img, label

# Initialize Training and Test Dataset and DataLoader
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

train_dataset = TestDataset(final_X_train_resized, train_labels, transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=512, shuffle=True)

test_dataset = TestDataset(final_X_test_resized, test_labels, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=512, shuffle=False)

# Add Classification Head
class ClassificationHead(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(ClassificationHead, self).__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

classification_head = ClassificationHead(input_dim=512, num_classes=len(np.unique(train_labels))).to("cuda")
optimizer_cls = optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},
    {"params": classification_head.parameters(), "lr": 3e-4},
])
criterion_cls = nn.CrossEntropyLoss()

# Fine-tune Classification Head
for epoch in range(100):
    model.encoder.train()
    classification_head.train()
    total_loss = 0
    correct = 0
    for img, label in train_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        loss = criterion_cls(logits, label)

        optimizer_cls.zero_grad()
        loss.backward()
        optimizer_cls.step()

        total_loss += loss.item()
        correct += (logits.argmax(dim=1) == label).sum().item()

    accuracy = correct / len(train_labels)
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}, Accuracy: {accuracy:.4f}")

# Evaluate on Test Dataset
classification_head.eval()
correct = 0
with torch.no_grad():
    for img, label in test_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        correct += (logits.argmax(dim=1) == label).sum().item()

test_accuracy = correct / len(test_labels)
print(f"Test Accuracy: {test_accuracy*100:.4f}")


Epoch [1/100], Loss: 6.3819
Epoch [2/100], Loss: 6.1793
Epoch [3/100], Loss: 6.1024
Epoch [4/100], Loss: 6.0910
Epoch [5/100], Loss: 6.0812
Epoch [6/100], Loss: 6.0800
Epoch [7/100], Loss: 6.0776
Epoch [8/100], Loss: 6.0784
Epoch [9/100], Loss: 6.0793
Epoch [10/100], Loss: 6.0791
Epoch [11/100], Loss: 6.0776
Epoch [12/100], Loss: 6.0777
Epoch [13/100], Loss: 6.0779
Epoch [14/100], Loss: 6.0781
Epoch [15/100], Loss: 6.0788
Epoch [16/100], Loss: 6.0781
Epoch [17/100], Loss: 6.0792
Epoch [18/100], Loss: 6.0778
Epoch [19/100], Loss: 6.0781
Epoch [20/100], Loss: 6.0769
Epoch [21/100], Loss: 6.0781
Epoch [22/100], Loss: 6.0765
Epoch [23/100], Loss: 6.0763
Epoch [24/100], Loss: 6.0758
Epoch [25/100], Loss: 6.0762
Epoch [26/100], Loss: 6.0758
Epoch [27/100], Loss: 6.0760
Epoch [28/100], Loss: 6.0758
Epoch [29/100], Loss: 6.0762
Epoch [30/100], Loss: 6.0770
Epoch [31/100], Loss: 6.0752
Epoch [32/100], Loss: 6.0767
Epoch [33/100], Loss: 6.0752
Epoch [34/100], Loss: 6.0766
Epoch [35/100], Loss: 6

# The below one is used the previous SimCLR but has different epoches=200 for testing. the accuracy is 89%

# SimCLR Implementation with Alternative Testing Approach

## **1. Dataset for Classification**
- **TestDataset**:
  - A custom dataset class that accepts:
    - Images and their corresponding labels.
    - An optional transformation function to preprocess input images.
  - Returns a single image-label pair at a time.

---

## **2. Data Preprocessing for Testing**
- **Training and Testing Transformations**:
  - **Training**:
    - `Resize`: Resizes the input to `224x224`.
    - `ToTensor`: Converts the image into a PyTorch tensor.
    - `Normalize`: Normalizes pixel values with a mean of `0.5` and standard deviation of `0.5`.
  - **Testing**:
    - Applies the same resizing and normalization as training to maintain consistency.

---

## **3. DataLoader Configuration**
- **Batch Size**:
  - `512` for both training and testing.
- **Shuffling**:
  - Training set is shuffled to enhance generalization during training.
  - Testing set is not shuffled, ensuring deterministic evaluation.

---

## **4. Fine-Tuning with a Classification Head**
- **Classification Head**:
  - A single linear layer (`Linear(512, num_classes)`) maps encoder features to the output class logits.
- **Fine-Tuning Procedure**:
  - The SimCLR encoder is fine-tuned along with the classification head.
  - Hyperparameters:
    - Learning rate for encoder: `1e-5`
    - Learning rate for classification head: `3e-4`
    - Loss function: `CrossEntropyLoss`
    - Epochs: `200`
  - Optimizer: Adam optimizes both the encoder and classification head weights.
- **Training Outputs**:
  - The average training loss and accuracy are logged for each epoch.

---

## **5. Alternative Testing Procedure**
- **Evaluation on Test Dataset**:
  - The classification head's performance is evaluated on the held-out test set.
  - The encoder features are passed to the classification head for predictions.
- **Metrics**:
  - Accuracy is computed as the ratio of correctly predicted labels to the total number of samples in the test set.
- **Logging**:
  - The final test accuracy is printed as a percentage for easy interpretation.

---

## **6. Key Changes in Testing**
1. **Direct Evaluation with the Classification Head**:
   - Features extracted by the encoder are passed to the classification head for inference.
2. **Consistency in Data Transformations**:
   - The same resizing and normalization steps are applied to training and testing datasets.
3. **Logging Metrics**:
   - Both training and testing accuracy are explicitly calculated and printed for performance monitoring.

---





In [3]:
# Define Dataset for Classification
class TestDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        if self.transform:
            img = self.transform(img)
        return img, label

# Initialize Training and Test Dataset and DataLoader
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

train_dataset = TestDataset(final_X_train_resized, train_labels, transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=512, shuffle=True)

test_dataset = TestDataset(final_X_test_resized, test_labels, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=512, shuffle=False)

# Add Classification Head
class ClassificationHead(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(ClassificationHead, self).__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

classification_head = ClassificationHead(input_dim=512, num_classes=len(np.unique(train_labels))).to("cuda")
optimizer_cls = optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},
    {"params": classification_head.parameters(), "lr": 3e-4},
])
criterion_cls = nn.CrossEntropyLoss()

# Fine-tune Classification Head
for epoch in range(200):
    model.encoder.train()
    classification_head.train()
    total_loss = 0
    correct = 0
    for img, label in train_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        loss = criterion_cls(logits, label)

        optimizer_cls.zero_grad()
        loss.backward()
        optimizer_cls.step()

        total_loss += loss.item()
        correct += (logits.argmax(dim=1) == label).sum().item()

    accuracy = correct / len(train_labels)
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}, Accuracy: {accuracy:.4f}")

# Evaluate on Test Dataset
classification_head.eval()
correct = 0
with torch.no_grad():
    for img, label in test_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        correct += (logits.argmax(dim=1) == label).sum().item()

test_accuracy = correct / len(test_labels)
print(f"Test Accuracy: {test_accuracy*100:.4f}")


Epoch [1/100], Loss: 1.1620, Accuracy: 0.3193
Epoch [2/100], Loss: 0.6423, Accuracy: 0.8497
Epoch [3/100], Loss: 0.3671, Accuracy: 0.9897
Epoch [4/100], Loss: 0.2183, Accuracy: 0.9976
Epoch [5/100], Loss: 0.1332, Accuracy: 0.9994
Epoch [6/100], Loss: 0.0877, Accuracy: 0.9994
Epoch [7/100], Loss: 0.0580, Accuracy: 0.9994
Epoch [8/100], Loss: 0.0429, Accuracy: 0.9994
Epoch [9/100], Loss: 0.0326, Accuracy: 1.0000
Epoch [10/100], Loss: 0.0248, Accuracy: 1.0000
Epoch [11/100], Loss: 0.0214, Accuracy: 1.0000
Epoch [12/100], Loss: 0.0174, Accuracy: 1.0000
Epoch [13/100], Loss: 0.0147, Accuracy: 1.0000
Epoch [14/100], Loss: 0.0130, Accuracy: 1.0000
Epoch [15/100], Loss: 0.0127, Accuracy: 1.0000
Epoch [16/100], Loss: 0.0102, Accuracy: 1.0000
Epoch [17/100], Loss: 0.0101, Accuracy: 1.0000
Epoch [18/100], Loss: 0.0088, Accuracy: 1.0000
Epoch [19/100], Loss: 0.0082, Accuracy: 1.0000
Epoch [20/100], Loss: 0.0072, Accuracy: 1.0000
Epoch [21/100], Loss: 0.0086, Accuracy: 1.0000
Epoch [22/100], Loss: 

# The below one is used the previous SimCLR but has different batch size=128 for testing. the accuracy is 88%

# SimCLR with Fine-Tuning and Classification Evaluation

## **1. Dataset for Classification**
- **TestDataset**:
  - Custom dataset class designed to handle:
    - Images and their corresponding labels.
    - Optional transformations for preprocessing.
  - Returns a processed image and its label.
- **Transformations**:
  - Applied to both training and testing datasets:
    - `Resize`: Resizes images to `224x224`.
    - `ToTensor`: Converts images into PyTorch tensors.
    - `Normalize`: Scales pixel values to have a mean of `0.5` and standard deviation of `0.5`.

---

## **2. DataLoader Initialization**
- **Batch Size**:
  - Training: `128`
  - Testing: `128`
- **Shuffling**:
  - Enabled for training (`shuffle=True`) to improve generalization.
  - Disabled for testing (`shuffle=False`) to maintain consistency in evaluation.

---

## **3. Classification Head**
- **Structure**:
  - Single `Linear` layer with input dimension `512` (encoder output) and output dimension equal to the number of classes (`num_classes`).
- **Purpose**:
  - Maps features extracted by the encoder to class logits for classification.

---

## **4. Fine-Tuning**
- **Objective**:
  - Fine-tune the encoder with a classification head on labeled training data.
- **Optimization**:
  - Encoder: Low learning rate (`1e-5`) to retain pre-trained features.
  - Classification Head: Higher learning rate (`3e-4`) to adapt to the task-specific dataset.
  - Loss Function: CrossEntropyLoss for classification.
  - Optimizer: Adam.
- **Epoch-wise Metrics**:
  - Logs the average training loss and accuracy per epoch.
  - Accuracy is calculated as the percentage of correct predictions on the training dataset.

---

## **5. Evaluation on Test Dataset**
- **Procedure**:
  - The trained encoder and classification head are evaluated on the test dataset.
  - For each test image:
    - Encoder extracts features.
    - Classification head predicts the class logits.
  - Correct predictions are summed to compute the total test accuracy.
- **Metric**:
  - Test accuracy is reported as the ratio of correct predictions to the total number of test samples.

---

## **6. Key Features**
1. **Separate Training and Testing Pipelines**:
   - Ensures robust and unbiased evaluation.
2. **Consistent Data Transformations**:
   - Uniform resizing and normalization enhance compatibility between training and testing datasets.
3. **Fine-Tuning Strategy**:
   - Differential learning rates for the encoder and classification head maximize performance.
4. **Batch-Wise Evaluation**:
   - Efficient and scalable evaluation with DataLoader.




In [2]:
# Define Dataset for Classification
class TestDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        if self.transform:
            img = self.transform(img)
        return img, label

# Initialize Training and Test Dataset and DataLoader
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

train_dataset = TestDataset(final_X_train_resized, train_labels, transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)

test_dataset = TestDataset(final_X_test_resized, test_labels, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=128, shuffle=False)

# Add Classification Head
class ClassificationHead(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(ClassificationHead, self).__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

classification_head = ClassificationHead(input_dim=512, num_classes=len(np.unique(train_labels))).to("cuda")
optimizer_cls = optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},
    {"params": classification_head.parameters(), "lr": 3e-4},
])
criterion_cls = nn.CrossEntropyLoss()

# Fine-tune Classification Head
for epoch in range(100):
    model.encoder.train()
    classification_head.train()
    total_loss = 0
    correct = 0
    for img, label in train_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        loss = criterion_cls(logits, label)

        optimizer_cls.zero_grad()
        loss.backward()
        optimizer_cls.step()

        total_loss += loss.item()
        correct += (logits.argmax(dim=1) == label).sum().item()

    accuracy = (correct / len(train_labels))*100
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}, Accuracy: {accuracy:.4f}")

# Evaluate on Test Dataset
classification_head.eval()
correct = 0
with torch.no_grad():
    for img, label in test_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        correct += (logits.argmax(dim=1) == label).sum().item()

test_accuracy = correct / len(test_labels)
print(f"Test Accuracy: {test_accuracy:.4f}")

Epoch [1/100], Loss: 0.6271, Accuracy: 80.8087
Epoch [2/100], Loss: 0.1275, Accuracy: 100.0000
Epoch [3/100], Loss: 0.0388, Accuracy: 100.0000
Epoch [4/100], Loss: 0.0207, Accuracy: 100.0000
Epoch [5/100], Loss: 0.0142, Accuracy: 100.0000
Epoch [6/100], Loss: 0.0103, Accuracy: 100.0000
Epoch [7/100], Loss: 0.0081, Accuracy: 100.0000
Epoch [8/100], Loss: 0.0070, Accuracy: 100.0000
Epoch [9/100], Loss: 0.0057, Accuracy: 100.0000
Epoch [10/100], Loss: 0.0052, Accuracy: 100.0000
Epoch [11/100], Loss: 0.0043, Accuracy: 100.0000
Epoch [12/100], Loss: 0.0039, Accuracy: 100.0000
Epoch [13/100], Loss: 0.0032, Accuracy: 100.0000
Epoch [14/100], Loss: 0.0034, Accuracy: 100.0000
Epoch [15/100], Loss: 0.0027, Accuracy: 100.0000
Epoch [16/100], Loss: 0.0023, Accuracy: 100.0000
Epoch [17/100], Loss: 0.0021, Accuracy: 100.0000
Epoch [18/100], Loss: 0.0020, Accuracy: 100.0000
Epoch [19/100], Loss: 0.0018, Accuracy: 100.0000
Epoch [20/100], Loss: 0.0018, Accuracy: 100.0000
Epoch [21/100], Loss: 0.0015, 

# The Below one is with ResNet18. accuracy is 85%


# SimCLR Implementation Explanation

## **1. Data Preprocessing and Augmentation**
- **Normalization and Resizing**:
  - Images are resized to `224x224` using Lanczos resampling and normalized with a mean of `0.5` and standard deviation of `0.5`.
- **Data Augmentation**:
  - Augmentation pipeline includes:
    - `RandomResizedCrop`: Crops images to random sizes and aspect ratios.
    - `RandomHorizontalFlip`: Randomly flips the image horizontally.
    - `ColorJitter`: Randomly changes brightness, contrast, saturation, and hue.
    - `GaussianBlur`: Blurs the image with a random kernel size and sigma.
  - These augmentations help generate two different views of the same image, which is crucial for contrastive learning.

---

## **2. Custom Dataset**
- **SimCLRDataset**:
  - Generates two augmented views of each input image to be used as positive pairs for contrastive learning.

---

## **3. SimCLR Model Architecture**
- **Base Encoder**:
  - A pre-trained `ResNet-18` is used as the backbone.
  - The first convolutional layer (`conv1`) is modified to handle single-channel (grayscale) images.
  - The fully connected layer (`fc`) is replaced with an `Identity` layer to output feature embeddings.
- **Projection Head**:
  - A two-layer MLP with a hidden size of `256` and output size of `128`.
  - Nonlinear activation (`ReLU`) is applied between the layers.
  - The projection head helps learn representations better suited for contrastive loss.

---

## **4. Contrastive Loss (NT-Xent Loss)**
- **Normalized Temperature-Scaled Cross-Entropy Loss**:
  - Encourages the model to maximize the similarity of positive pairs (different views of the same image) while minimizing similarity to negative pairs (different images in the batch).
  - Implements temperature scaling and softmax normalization for better gradient flow.
  - Cross-entropy loss is used to optimize the logits.

---

## **5. Training Procedure**
- **Contrastive Learning Stage**:
  - The SimCLR model is trained using the contrastive loss on augmented image pairs.
  - Key hyperparameters:
    - Batch size: `1024`
    - Learning rate: `3e-4`
    - Temperature: `0.5`
  - The Adam optimizer is used with a `CosineAnnealingLR` scheduler for gradual learning rate decay.
- **Training Outputs**:
  - Loss values are logged for each epoch to monitor the learning process.

---

## **6. Downstream Task: Classification**
- **Linear Evaluation Protocol**:
  - The pre-trained encoder is fine-tuned with a classification head for supervised learning on the labeled dataset.
  - Classification head:
    - A single linear layer mapping encoder outputs (`512`) to the number of classes.
  - Fine-tuning:
    - Encoder weights are updated at a slower learning rate (`1e-5`) than the classification head (`3e-4`).
    - Cross-entropy loss is used for optimization.
- **Performance Evaluation**:
  - Test accuracy is computed on the held-out test set to measure the model's performance.

---

## **7. Key Features Contributing to Accuracy**
1. **Data Augmentation**:
   - Rich transformations ensure the model learns robust and invariant features.
2. **Pre-trained ResNet-18 Encoder**:
   - Provides a strong starting point for feature extraction.
3. **Projection Head**:
   - Helps improve representation learning by mapping features to a contrastive space.
4. **Contrastive Loss**:
   - NT-Xent loss ensures the learned features are discriminative and generalizable.
5. **Fine-tuning Strategy**:
   - Gradual learning rate decay and careful tuning of encoder and head ensure effective adaptation to the downstream task.
---


In [4]:
from torchvision.models import resnet18, ResNet18_Weights
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
import torchvision.transforms as transforms
from torchvision.models import resnet50
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR
from PIL import Image

# Load and preprocess the dataset
train_images_path = "brain_train_image_final.npy"
train_labels_path = "brain_train_label.npy"
test_images_path = "brain_test_image_final.npy"
test_labels_path = "brain_test_label.npy"

# Load the data
final_X_train_modified = np.load(train_images_path)[:, 1, :, :]
final_X_test_modified = np.load(test_images_path)[:, 1, :, :]
train_labels = np.load(train_labels_path)
test_labels = np.load(test_labels_path)

# Normalize and Resize Images using Pillow
def normalize_and_resize(images, target_size=(224, 224)):
    resized_images = []
    for img in images:
        img = Image.fromarray((img * 255).astype(np.uint8))
        img_resized = img.resize(target_size, Image.Resampling.LANCZOS)
        resized_images.append(np.array(img_resized) / 255.0)
    return np.array(resized_images)

final_X_train_resized = normalize_and_resize(final_X_train_modified)
final_X_test_resized = normalize_and_resize(final_X_test_modified)

# Define SimCLR Augmentation Transform
transform_simclr = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomResizedCrop(size=224, scale=(0.08, 1.0), ratio=(3/4, 4/3)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.8, contrast=0.8, saturation=0.8, hue=0.2),
    transforms.RandomApply([transforms.GaussianBlur(kernel_size=23, sigma=(0.1, 2.0))], p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

# Custom Dataset for SimCLR
class SimCLRDataset(Dataset):
    def __init__(self, images, transform):
        self.images = images
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        img_1 = self.transform(img)
        img_2 = self.transform(img)
        return img_1, img_2

train_dataset = SimCLRDataset(final_X_train_resized, transform_simclr)
train_loader = DataLoader(train_dataset, batch_size=1024, shuffle=True)

# Define SimCLR Model
class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder
        self.projector = nn.Sequential(
            nn.Linear(512, 256),  # Match encoder output size
            nn.ReLU(),
            nn.Linear(256, projection_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        z = self.projector(h)
        return z


# Define NT-Xent Loss
class NTXentLoss(nn.Module):
    def __init__(self, temperature):
        super(NTXentLoss, self).__init__()
        self.temperature = temperature
        self.criterion = nn.CrossEntropyLoss(reduction="mean")

    def forward(self, z_i, z_j):
        N = z_i.size(0) + z_j.size(0)
        z = torch.cat((z_i, z_j), dim=0)
        sim = torch.mm(z, z.T) / self.temperature
        sim = torch.nn.functional.softmax(sim, dim=1)

        labels = torch.cat([
            torch.arange(z_i.size(0), device=z.device),
            torch.arange(z_j.size(0), device=z.device)
        ])
        loss = self.criterion(sim, labels)
        return loss

# Initialize ResNet-18 Encoder
base_encoder = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)
base_encoder.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
base_encoder.fc = nn.Identity()

# Initialize SimCLR Model
model = SimCLR(base_encoder, projection_dim=128).to("cuda")
optimizer = optim.Adam(model.parameters(), lr=3e-4)
criterion = NTXentLoss(temperature=0.5)
scheduler = CosineAnnealingLR(optimizer, T_max=100)

# Train SimCLR Model
for epoch in range(100):
    total_loss = 0
    model.train()
    for img_1, img_2 in train_loader:
        img_1, img_2 = img_1.to("cuda"), img_2.to("cuda")
        z_i = model(img_1)
        z_j = model(img_2)

        loss = criterion(z_i, z_j)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    scheduler.step()
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}")

# Define Dataset for Classification
class TestDataset(Dataset):
    def __init__(self, images, labels, transform=None):
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        if self.transform:
            img = self.transform(img)
        return img, label

# Initialize Training and Test Dataset and DataLoader
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5])
])

train_dataset = TestDataset(final_X_train_resized, train_labels, transform=train_transform)
train_loader = DataLoader(train_dataset, batch_size=512, shuffle=True)

test_dataset = TestDataset(final_X_test_resized, test_labels, transform=test_transform)
test_loader = DataLoader(test_dataset, batch_size=512, shuffle=False)

# Add Classification Head
class ClassificationHead(nn.Module):
    def __init__(self, input_dim, num_classes):
        super(ClassificationHead, self).__init__()
        self.fc = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        return self.fc(x)

classification_head = ClassificationHead(input_dim=512, num_classes=len(np.unique(train_labels))).to("cuda")

optimizer_cls = optim.Adam([
    {"params": model.encoder.parameters(), "lr": 1e-5},
    {"params": classification_head.parameters(), "lr": 3e-4},
])
criterion_cls = nn.CrossEntropyLoss()

# Fine-tune Classification Head
for epoch in range(100):
    model.encoder.train()
    classification_head.train()
    total_loss = 0
    correct = 0
    for img, label in train_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        loss = criterion_cls(logits, label)

        optimizer_cls.zero_grad()
        loss.backward()
        optimizer_cls.step()

        total_loss += loss.item()
        correct += (logits.argmax(dim=1) == label).sum().item()

    accuracy = correct / len(train_labels)
    print(f"Epoch [{epoch+1}/100], Loss: {total_loss/len(train_loader):.4f}, Accuracy: {accuracy:.4f}")

# Evaluate on Test Dataset
classification_head.eval()
correct = 0
with torch.no_grad():
    for img, label in test_loader:
        img, label = img.to("cuda"), label.to("cuda")
        features = model.encoder(img)
        logits = classification_head(features)
        correct += (logits.argmax(dim=1) == label).sum().item()

test_accuracy = correct / len(test_labels)
print(f"Test Accuracy: {test_accuracy*100:.4f}")


Epoch [1/100], Loss: 7.3593
Epoch [2/100], Loss: 7.3082
Epoch [3/100], Loss: 7.1995
Epoch [4/100], Loss: 7.1010
Epoch [5/100], Loss: 7.0257
Epoch [6/100], Loss: 6.9785
Epoch [7/100], Loss: 6.9562
Epoch [8/100], Loss: 6.9369
Epoch [9/100], Loss: 6.9158
Epoch [10/100], Loss: 6.9056
Epoch [11/100], Loss: 6.9009
Epoch [12/100], Loss: 6.8963
Epoch [13/100], Loss: 6.8949
Epoch [14/100], Loss: 6.8927
Epoch [15/100], Loss: 6.8909
Epoch [16/100], Loss: 6.8916
Epoch [17/100], Loss: 6.8904
Epoch [18/100], Loss: 6.8895
Epoch [19/100], Loss: 6.8896
Epoch [20/100], Loss: 6.8895
Epoch [21/100], Loss: 6.8907
Epoch [22/100], Loss: 6.8900
Epoch [23/100], Loss: 6.8884
Epoch [24/100], Loss: 6.8883
Epoch [25/100], Loss: 6.8880
Epoch [26/100], Loss: 6.8878
Epoch [27/100], Loss: 6.8876
Epoch [28/100], Loss: 6.8881
Epoch [29/100], Loss: 6.8873
Epoch [30/100], Loss: 6.8866
Epoch [31/100], Loss: 6.8883
Epoch [32/100], Loss: 6.8870
Epoch [33/100], Loss: 6.8872
Epoch [34/100], Loss: 6.8867
Epoch [35/100], Loss: 6