<a href="https://colab.research.google.com/github/VicentePina7210/DataMiningCleaningExercise/blob/main/Copy_of_CNN_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
import torch
import numpy as np
import requests
from PIL import Image
import matplotlib.pyplot as plt
from io import BytesIO
from scipy import ndimage

RuntimeError: Only a single TORCH_LIBRARY can be used to register the namespace triton; please put all of your definitions in a single TORCH_LIBRARY block.  If you were trying to specify implementations, consider using TORCH_LIBRARY_IMPL (which can be duplicated).  If you really intended to define operators for a single namespace in a distributed way, you can use TORCH_LIBRARY_FRAGMENT to explicitly indicate this.  Previous registration of TORCH_LIBRARY was registered at /dev/null:2623; latest registration was registered at /dev/null:2623

## Convolution

In [None]:
# Load image from url
img_url = "https://img.jamesedition.com/listing_images/2024/12/25/12/13/27/23579aec-8f1b-4361-b43c-417a08b91137/je/1100xxs.jpg"
res = requests.get(img_url)
img_arr = np.array(Image.open(BytesIO(res.content)))
img_arr = img_arr[:,:,0] / 255.0  # Keep only the first channel (red for an RGB image) and scale to [0, 1]
plt.imshow(img_arr)

In [None]:
# Downsample image
new_img = ndimage.zoom(img_arr, 0.25)
plt.imshow(new_img)

In [None]:
# Convolve image
kernel = np.array([
    [0, -2, -1],
    [2, 0, -2],
    [1, 2, 0]
])
plt.imshow(ndimage.convolve(new_img, kernel))
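
As a rough illustration of what `ndimage.convolve` computes, the sketch below applies the same kernel to a tiny hand-made image using plain Python lists. It performs a true convolution (the kernel is flipped in both axes before the multiply-and-sum) and only evaluates positions where the kernel fully overlaps the image, so it sidesteps the border handling that SciPy performs. `convolve_valid` and `tiny` are hypothetical names used only for this example.

```python
def convolve_valid(image, kernel):
    """Convolve a 2D list with a 2D kernel, keeping only fully overlapping positions."""
    kh, kw = len(kernel), len(kernel[0])
    # True convolution flips the kernel in both axes before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(len(image) - kh + 1):
        out_row = []
        for c in range(len(image[0]) - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += flipped[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out

kernel = [
    [0, -2, -1],
    [2, 0, -2],
    [1, 2, 0],
]

# Hypothetical 4x4 test image: dark on the left, bright on the right.
tiny = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

result = convolve_valid(tiny, kernel)
print(result)
```

Because the kernel entries sum to zero, a perfectly flat region maps to 0 and only changes in intensity produce a response.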

## Convolutional Neural Net

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

In [None]:
# Load MNIST dataset with transformations
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainset = torch.utils.data.Subset(trainset, range(1000))
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testset = torch.utils.data.Subset(testset, range(1000))
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

In [None]:
# Define CNN model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 5, padding=2)  # Conv layer (input: 1 channel, output: 16 filters, 5x5 kernel)
        self.conv2 = nn.Conv2d(16, 32, 5, padding=2) # Second conv layer (16 -> 32 filters, 5x5 kernel)
        self.fc1 = nn.Linear(32 * 7 * 7, 128)  # Fully connected layer
        self.fc2 = nn.Linear(128, 10)  # Output layer (10 classes for digits)
        self.pool = nn.MaxPool2d(2, 2)  # 2x2 max pooling
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))  # Conv -> ReLU -> Pool
        x = self.pool(self.relu(self.conv2(x)))  # Conv -> ReLU -> Pool
        x = x.view(-1, 32 * 7 * 7)  # Flatten for FC layers
        x = self.relu(self.fc1(x))  # Fully connected layer -> ReLU
        x = self.fc2(x)  # Output layer
        return x
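
The `32 * 7 * 7` input size of `fc1` follows from the standard output-size formula for convolution and pooling, floor((n + 2p - k) / s) + 1. The sketch below traces one spatial dimension of a 28x28 MNIST image through the two conv/pool stages. `out_size` and `trace` are hypothetical helpers written for this illustration, not part of PyTorch.

```python
def out_size(n, kernel, padding=0, stride=1):
    """Spatial size after a conv or pooling layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

def trace(n, conv_kernel, conv_padding):
    """Follow one spatial dimension through conv -> pool -> conv -> pool."""
    sizes = [n]
    for _ in range(2):
        n = out_size(n, conv_kernel, conv_padding)  # convolution, stride 1
        sizes.append(n)
        n = out_size(n, 2, stride=2)                # 2x2 max pooling, stride 2
        sizes.append(n)
    return sizes

print(trace(28, conv_kernel=5, conv_padding=2))  # the model above
print(trace(28, conv_kernel=3, conv_padding=1))
print(trace(28, conv_kernel=8, conv_padding=4))
```

With an odd kernel size k and padding (k - 1) / 2, the convolution preserves the spatial size, so both 3x3 and 5x5 kernels end at 7x7. An even kernel such as 8 with padding 4 grows each map by one pixel, but the floor in the pooling step absorbs it, so the flattened size stays at 32 * 7 * 7.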

In [None]:
# Initialize model, loss function, and optimizer
model = CNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
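
To get a feel for how model size scales with kernel size, filter count, and pooling, the sketch below counts learnable parameters by hand. `conv_params`, `linear_params`, and `cnn_params` are hypothetical helpers. They assume every layer has a bias term, which is the PyTorch default for `nn.Conv2d` and `nn.Linear`, and that the convolutions preserve spatial size.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus one bias per output channel for a square-kernel conv layer."""
    return out_ch * (in_ch * kernel * kernel + 1)

def linear_params(in_features, out_features):
    """Weights plus one bias per output unit for a fully connected layer."""
    return out_features * (in_features + 1)

def cnn_params(kernel=5, c1=16, c2=32, spatial=7):
    """Total parameters for conv1 -> conv2 -> fc1 (128 units) -> fc2 (10 classes)."""
    return (
        conv_params(1, c1, kernel)
        + conv_params(c1, c2, kernel)
        + linear_params(c2 * spatial * spatial, 128)
        + linear_params(128, 10)
    )

print(cnn_params())              # the model above
print(cnn_params(kernel=3))      # smaller kernels
print(cnn_params(c1=32, c2=64))  # twice as many filters
print(cnn_params(spatial=28))    # no pooling: feature maps stay 28x28
```

Under these assumptions most of the parameters sit in `fc1`, and removing both pooling layers inflates that layer roughly 16-fold. That would be consistent with such a model overfitting more easily on only 1,000 training images.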

In [None]:
# Training loop
for epoch in range(5):  # Train for 5 epochs
    for images, labels in trainloader:
        optimizer.zero_grad()  # Reset gradients
        outputs = model(images)  # Forward pass
        loss = criterion(outputs, labels)  # Compute loss
        loss.backward()  # Backpropagation
        optimizer.step()  # Update weights

    print(f"Epoch {epoch+1}, Loss: {loss.item()}")

# Testing loop
correct, total = 0, 0
with torch.no_grad():
    for images, labels in testloader:
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Test Accuracy: {100 * correct / total:.2f}%")

In [None]:
# Get the weights of the first convolutional layer
filters = model.conv1.weight.data  # Shape: (16, 1, 5, 5) -> 16 filters of size 5x5

# Convert to numpy for visualization
filters = filters.squeeze().numpy()  # Remove the singleton dimension

# Plot the filters
fig, axes = plt.subplots(4, 4, figsize=(6, 6))  # 4x4 grid for 16 filters
for i, ax in enumerate(axes.flat):
    ax.imshow(filters[i], cmap="gray")  # Show filter as grayscale
    ax.axis("off")  # Remove axis ticks

plt.suptitle("Learned Filters in First Conv Layer", fontsize=14)
plt.show()

In [2]:
import torch
import matplotlib.pyplot as plt

# Function to extract and visualize feature maps
def visualize_feature_maps(model, image):
    model.eval()  # Set model to evaluation mode
    layers = [model.conv1, model.conv2]  # Choose layers to visualize

    activations = []
    x = image.unsqueeze(0)  # Add batch dimension

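    # Forward pass through selected layers, mirroring CNN.forward (Conv -> ReLU -> Pool)
    with torch.no_grad():
        for layer in layers:
            x = torch.relu(layer(x))  # Conv -> ReLU
            activations.append(x)  # Record the feature maps before pooling
            x = model.pool(x)  # Pool so the next conv layer sees the same input size as in training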

    # Plot original image
    plt.figure(figsize=(4, 4))
    plt.imshow(image.squeeze().numpy(), cmap="gray")
    plt.title("Original Input Image")
    plt.axis("off")
    plt.show()

    # Plot feature maps for each layer
    for i, activation in enumerate(activations):
        num_filters = activation.shape[1]  # Number of channels (filters)
        fig, axes = plt.subplots(1, num_filters, figsize=(num_filters * 2, 2))

        if num_filters == 1:
            axes = [axes]  # Ensure axes is iterable

        for j in range(num_filters):
            axes[j].imshow(activation[0, j].detach().numpy(), cmap="gray")
            axes[j].axis("off")

        plt.suptitle(f"Feature Maps After Layer {i+1}")
        plt.show()

# Get a sample image from test loader
dataiter = iter(testloader)
images, labels = next(dataiter)
sample_image = images[0]  # Take first image

# Visualize feature maps
visualize_feature_maps(model, sample_image)



## Questions
1. How does changing the kernel size in the convolutional layers affect accuracy?
Increasing the kernel size to 5 (with padding 2) improved test accuracy. Increasing it further, to 8 with padding 4, decreased accuracy. The first increase likely let the model become more complex, but increasing it too much could have led the model to underfit and fail to pick up fine patterns.

2. What happens if you increase or decrease the number of filters in each convolutional layer?
Increasing the number of filters in the layers improved accuracy, but the code took longer to run.
Decreasing the number of filters made the code run faster but cost some accuracy.

3. What is the effect of adding an extra convolutional layer?
The model performs better with an extra layer. This is likely because the added layer lets the model become more complex.

4. What happens if you remove the pooling layers?
Removing the pooling layers gave mixed results from run to run: sometimes I got high accuracy and other times relatively low accuracy. It seems to make the model more prone to overfitting.

5. How do you interpret the learned filter patterns? Are they understandable? Why or why not?
For the most part they are interpretable. They mostly make sense and give an overall generalization of the main features and key points of the number, capturing its defining elements. However, a few don't seem to capture enough.

6. Do you think the learned filters could be more understandable for a different dataset?
I would say yes. Depending on the number and the image used, it could come close. The font could be different, and the CNN would compensate for that with its filters.


7. Why does increasing the kernel size impact the receptive field of a convolutional layer?
Increasing the kernel size directly affects what the CNN sees, because each output value is computed from a larger patch of the input. A larger kernel lets the CNN see more at once, so the network can gain a better general understanding.
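
The receptive field of a stack of layers can be computed with the usual recurrence: each layer with kernel size k widens the field by (k - 1) times the product of the strides of all earlier layers. The following is a minimal sketch, assuming square kernels and no dilation. `receptive_field` and `cnn_layers` are hypothetical helpers, and the layer list mirrors the conv/pool stack defined in the model earlier.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    `layers` is a list of (kernel_size, stride) pairs, ordered input to output.
    """
    field = 1  # a single input pixel sees only itself
    jump = 1   # distance in input pixels between neighboring units
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field

def cnn_layers(conv_kernel):
    """conv -> 2x2 pool -> conv -> 2x2 pool, as (kernel, stride) pairs."""
    return [(conv_kernel, 1), (2, 2), (conv_kernel, 1), (2, 2)]

for k in (3, 5, 8):
    print(k, receptive_field(cnn_layers(k)))
```

Under these assumptions, each unit after the second pooling layer covers a 16x16 patch of the 28x28 input with 5x5 kernels, compared with 10x10 for 3x3 kernels.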