# Exercise 06 Convolutional Neural Networks

In this exercise, write Python code that meets the requirements of each question. The following example is for reference:

- Sample Question: Write a program that takes the user's name as input and prints "Hello, [name]!" where [name] is the user's input.

- Potential Answer:

```python
name = input("Enter your name: ")
print("Hello, " + name + "!")
```
- If you enter 'David', the code will output 'Hello, David!', and this will satisfy the requirements.

## Attention
- Generally, there will be multiple answers for one question and you don't have to strictly follow the instructions in the tutorial, as long as you can make the output of the code meet the requirements of the question.
- If possible, strive to make your code concise and avoid excessive reliance on less commonly used libraries.
- You may need to search for information on the Internet to complete the exercise.

First, the following code imports the packages used in this exercise and selects a suitable device to accelerate the training and testing of neural networks.

In [None]:
import matplotlib.pyplot as plt

import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

device = (
    "cuda" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

## Question 01 (Dataset and Dataloader):

We are using the **CIFAR10** dataset for this exercise. Here is some basic information about this dataset:

- CIFAR-10 dataset consists of 60,000 images divided into 10 classes.
- Each class contains 6000 images, 5000 for training and 1000 for testing.
- The images are colored and of size 32x32 pixels.
- The 10 different classes represent airplane, car, bird, cat, deer, dog, frog, horse, ship, and truck.
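
The dataset stores each label as an integer index rather than a class name. The following minimal sketch shows how such an index can be mapped back to a name, for example when titling plots. The list `CIFAR10_CLASSES` and the helper `label_to_name` are illustrative names, not part of the exercise code. The order follows the standard CIFAR-10 label order, where the "car" class is called "automobile". With torchvision the same list should also be available as `train_dataset.classes` once the dataset is loaded.

```python
# CIFAR-10 class names in label-index order (0..9).
# Note: the standard name for the "car" class is "automobile".
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def label_to_name(label: int) -> str:
    """Map an integer CIFAR-10 label to its human-readable class name."""
    return CIFAR10_CLASSES[label]


print(label_to_name(3))  # cat
```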

As CIFAR10 is one of the most commonly used baseline datasets in computer vision, the `torchvision.datasets` module provides quick access to it.

The `torchvision.datasets` module contains Dataset objects for many real-world vision datasets, including MNIST, CIFAR, etc. In this exercise, we use the CIFAR10 dataset.

### Question 01.1 (Dataset and Dataloader Construction, Batch Size)
Choose an appropriate batch size, obtain the CIFAR10 dataset through the PyTorch API, and build the data loaders. The code for downloading the data and building the data loaders is already provided; you only need to set `BATCH_SIZE`. Write your answer in the following code frame:

**The code may take some time to download the first time it is executed.**

In [None]:
BATCH_SIZE = 

# Loading CIFAR10 dataset
train_dataset = datasets.CIFAR10(root='./data/', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = datasets.CIFAR10(root='./data/', train=False, transform=transforms.ToTensor(), download=True)

# Only use a subset of the dataset for training and testing
train_subset = Subset(train_dataset, list(range(500)))
test_subset = Subset(test_dataset, list(range(100)))

# Data Loader (Input Pipeline)
train_loader = DataLoader(dataset=train_subset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)
test_loader = DataLoader(dataset=test_subset, batch_size=BATCH_SIZE, shuffle=False, num_workers=0)
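
The batch size determines how many optimization steps one epoch contains. As a rough aid for choosing `BATCH_SIZE`, the sketch below computes the number of batches per epoch for the 500-sample training subset and the 100-sample test subset used above. The helper `batches_per_epoch` and the candidate values 16, 32 and 64 are illustrative assumptions, not recommended answers. The calculation assumes the `DataLoader` default `drop_last=False`, where the last, smaller batch is kept.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches a DataLoader yields per epoch for the given sizes."""
    if drop_last:
        return num_samples // batch_size      # incomplete final batch is discarded
    return math.ceil(num_samples / batch_size)  # incomplete final batch is kept


# Hypothetical candidate batch sizes, evaluated on the subset sizes used above
for bs in (16, 32, 64):
    print(bs, batches_per_epoch(500, bs), batches_per_epoch(100, bs))
```

For these candidates the training subset yields 32, 16 and 8 batches per epoch respectively, so a larger batch size means fewer but less noisy parameter updates per epoch.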

### Question 01.2 (Check)
Run the cell below. If your datasets are named `train_dataset` and `test_dataset` and were built correctly, the cell should output nothing.

In [3]:
assert len(train_dataset) == 50000, f"Expected 50000 train samples, got {len(train_dataset)}"
assert len(test_dataset) == 10000, f"Expected 10000 test samples, got {len(test_dataset)}"

### Question 01.3 (Data Visualization, Optional)
Please visualize the first batch of data in the training data loader based on the code in the tutorial. Most of the required code has been written for you. At line 14, you need to permute the image dimensions, because different APIs expect the image channels in a different position.

You may refer to [https://pytorch.org/docs/stable/generated/torch.permute.html](https://pytorch.org/docs/stable/generated/torch.permute.html) to learn how to use `torch.permute()`.
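
To illustrate what `permute` does in general, here is a small sketch on a toy tensor whose shape and dimension order are chosen arbitrarily (they are not the answer to this question). Each argument of `permute` names the old dimension that should end up at that position, and the values themselves are only re-indexed, not changed.

```python
import torch

# Toy tensor with three dimensions of different sizes
x = torch.arange(2 * 3 * 5).reshape(2, 3, 5)   # shape (2, 3, 5)

# New dim 0 is old dim 2, new dim 1 is old dim 0, new dim 2 is old dim 1
y = x.permute(2, 0, 1)                          # shape (5, 2, 3)

print(x.shape, y.shape)
# Elements are only re-indexed: y[k, i, j] == x[i, j, k]
print(y[4, 1, 2].item() == x[1, 2, 4].item())   # True
```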

In [None]:
# Get the first batch of the training data
dataiter = iter(train_loader)
images, labels = next(dataiter)

# Define the figure
grid_size = int(BATCH_SIZE**0.5)
rows, cols = grid_size, (BATCH_SIZE + grid_size - 1) // grid_size
fig, axes = plt.subplots(rows, cols, figsize=(12, 12))
axes = axes.flatten()

# Plot the images in the batch and their corresponding labels
for i, ax in enumerate(axes):
    if i < BATCH_SIZE:
        ax.imshow(images[i].squeeze().permute(__, __, __))  # permute to convert (Channel, Height, Width) to (Height, Width, Channel)
        ax.set_title(labels[i].item(), fontsize=8)
        ax.axis('off')
    else:
        ax.axis('off')  # Hide the empty subplot

plt.tight_layout()
plt.show()

## Question 02 (CNN)
### Question 02.1 (Model Definition):
Fill in the appropriate code in the specified location in the cell below to build a convolutional neural network for image classification tasks. *You need to define the layers used in the network and the process of forward propagation of data between layers.* The model structure is:
- **conv1**: A 2d convolutional layer that receives 3 channels of input and 16 channels of output, with a convolutional kernel size of 3*3, a stride of 1, and a padding of 0;
- **maxpooling3**: A 2d maxpooling layer with a kernel size of 3*3;
- **conv2**: A 2d convolutional layer that receives 16 channels of input and 32 channels of output, with a convolutional kernel size of 2*2, a stride of 2, and a padding of 1;
- **conv3**: A 2d convolutional layer that receives 32 channels of input and 64 channels of output, with a convolutional kernel size of 3*3, a stride of 1, and padding to keep the feature map size constant;
- **maxpooling2**: A 2d maxpooling layer with a kernel size of 2*2;
- **flatten**: A layer that flattens the feature map of the convolutional layer into a one-dimensional vector;
- **fc1**: A fully connected layer that transforms a 64\*3\*3 dimensional vector into a 10 dimensional vector.
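
The 64\*3\*3 input size of the fully connected layer follows from the layer list above. The sketch below traces the spatial size of a 32x32 input through the network using the usual output-size formula `floor((size + 2*padding - kernel) / stride) + 1`. The helper `conv_out` is an illustrative name, and the calculation assumes that each max-pooling layer uses a stride equal to its kernel size, which is the `nn.MaxPool2d` default.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                              # CIFAR10 images are 32x32
size = conv_out(size, kernel=3, stride=1, padding=0)   # conv1       -> 30
size = conv_out(size, kernel=3, stride=3)              # maxpooling3 -> 10
size = conv_out(size, kernel=2, stride=2, padding=1)   # conv2       -> 6
# conv3 keeps the feature map size constant, so size stays 6
size = conv_out(size, kernel=2, stride=2)              # maxpooling2 -> 3

print(size, 64 * size * size)  # 3 576
```

If you later try other model structures, the same arithmetic gives the `in_features` value needed for the first fully connected layer.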

**Note: We have already written some codes for you. You just need to fill in the appropriate parameters or add a few lines of code in the specified positions.**

##### Attention for this question:
- Please first complete this question based on the model structure given above. And use this model to complete the subsequent exercises in Question02.
- Once you have completed Question02 using the model structure given above, we encourage you to **try as many different model structures as possible to find the best classifier**.

In [None]:
# Create CNN Model
class CNN_Model(nn.Module):
    def __init__(self):
        super(CNN_Model, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3 , out_channels=16, kernel_size=3, stride=1, padding=0)
        # write your code here to define self.conv2
        self.conv2 = 
        # end of your code
        self.conv3 = nn.Conv2d(in_channels=__, out_channels=__, kernel_size=_, stride=_, padding="same")  # fill in the missing parameters; padding="same" keeps the output feature map the same size as the input
        self.maxpooling2 = nn.MaxPool2d(kernel_size=2)
        # write your code here to define self.maxpooling3
        self.maxpooling3 = 
        # end of your code
        self.fc1 = nn.Linear(64 * 3 * 3, 10)
    
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.maxpooling3(x)
        # write your code here to let the data pass relu-activated conv2, relu-activated conv3 and maxpooling2 without activation sequentially



        # end of your code
        x = nn.Flatten()(x)
        x = self.fc1(x)
        return x


model = CNN_Model().to(device)
# (Optional) If you haven't installed torchinfo, you can skip the code below in this cell.
# The TAs recommend using torchinfo here, as it helps you check that the sizes are aligned
# between layers, especially between the last conv layer and the first fully connected layer.
from torchinfo import summary
summary(model, input_size=(BATCH_SIZE, 3, 32, 32), device=device)

### Question 02.2 (Optimizer, Loss function, Hyperparameters)

Please select the appropriate optimizer and loss function for the model. 

**Try to modify the parameters including learning rate, optimization algorithm, number of training rounds, etc. based on the tutorial to achieve better model performance.**
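
As a reminder of how an optimizer and a loss function interact, the sketch below performs a single optimization step on a toy linear classifier. Everything in it is a stand-in chosen for illustration: the toy model, the random data, the choice of `torch.optim.SGD`, and the learning rate `0.1` are assumptions, not the recommended answer for this question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins: a linear classifier with 4 input features and 3 classes
toy_model = nn.Linear(4, 3)
toy_x = torch.randn(8, 4)                 # batch of 8 random samples
toy_y = torch.randint(0, 3, (8,))         # random integer class labels

toy_loss_fn = nn.CrossEntropyLoss()       # expects raw logits and class indices
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)  # lr is a placeholder value

before = toy_model.weight.detach().clone()

toy_optimizer.zero_grad()                 # clear gradients from any previous step
toy_loss = toy_loss_fn(toy_model(toy_x), toy_y)
toy_loss.backward()                       # compute gradients of the loss
toy_optimizer.step()                      # update the parameters

# The loss is a scalar and the weights have changed after the step
print(toy_loss.item(), torch.equal(before, toy_model.weight.detach()))
```

Other optimizers in `torch.optim`, such as `Adam`, are constructed the same way from `model.parameters()` and a learning rate.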

In [6]:
# define hyperparameters
LR = 
optimizer = 
loss = 

### Question 02.3 (Training Loop and Testing Loop)
Please complete the `train` function and the `test` function.

- The `train` function receives dataloader, model, loss function, and optimizer as parameters, completes the training of an epoch, including forward propagation, loss function calculation, and back propagation, and outputs training status information at appropriate times during the training process.
- The `test` function takes dataloader, model, and loss function as parameters, completes the test of the entire test dataloader, including forward propagation, loss function calculation, prediction accuracy calculation, and outputs the loss and accuracy on the entire test set.
- Based on the tutorial, add code so that `train` returns the average training loss over all batches and `test` returns the average loss and the accuracy on the test set, so that these values can be recorded after each epoch.
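
The accuracy calculation mentioned above can be sketched on a hand-written example. The logits and labels below are made-up values for illustration only. In an actual `test` function the logits would come from the model, typically after calling `model.eval()` and inside a `torch.no_grad()` block.

```python
import torch

# Made-up logits for 4 samples and 3 classes, with their true labels
logits = torch.tensor([[2.0, 0.1, -1.0],
                       [0.3, 0.2, 1.5],
                       [0.0, 3.0, 0.5],
                       [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 2, 1, 2])

predictions = logits.argmax(dim=1)                    # index of the largest logit per sample
num_correct = (predictions == labels).sum().item()    # count matching predictions
accuracy = num_correct / labels.size(0)

print(predictions.tolist(), accuracy)  # [0, 2, 1, 0] 0.75
```

When accumulating over several batches, summing `num_correct` and dividing once by the dataset size avoids bias from a smaller final batch.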

In [7]:
def train(dataloader, model, loss_fn, optimizer):
    # write your code here
    
    # end of your code

def test(dataloader, model, loss_fn):
    # write your code here
    
    # end of your code

### Question 02.4 (Training and Testing)

Use the function you defined earlier and select the appropriate number of training epochs to complete training and testing. Record the train loss and validation loss and validation accuracy for each epoch. Then, plot them on a graph together.

In [None]:
epochs = 20
train_loss, test_loss, test_accuracy = [], [], []
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    # write your code here
    
    # end of your code
    

# Result Visualization
# write your code here
    
# end of your code

### Question 02.5 (Alignment between Layers)

A convolutional neural network is defined below, but some of the parameters of some layers are missing. Please try to complete the network. 

Modify and run the following cell. If your results are correct, the following cell should not produce runtime errors.
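
One general way to check the alignment between layers is to pass a dummy input through the convolutional part of a network and inspect the resulting shape. The sketch below does this for a small toy stack that is deliberately different from the network in this question. The layer sizes and the 1x28x28 input are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Toy feature extractor, unrelated to the exercise network
probe_layers = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding="same"),  # keeps the 28x28 size, 8 channels
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),                     # halves the size to 14x14
    nn.Flatten(),                                    # 8 * 14 * 14 = 1568 features
)

with torch.no_grad():
    probe_out = probe_layers(torch.rand(1, 1, 28, 28))  # one dummy image

# The second dimension is the in_features a following nn.Linear would need
print(probe_out.shape)  # torch.Size([1, 1568])
```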

In [None]:
# You should only modify the values of the following constants.
INPUT_CHANNELS = 
CONV1_OUT_CHANNELS = 
CONV3_IN_CHANNELS = 
FC1_IN_FEATURES = 
FC2_IN_FEATURES = 
MODEL_OUTPUT_SIZE = 
# Your modification should end before this comment. 
# But you may still need to read the rest of the code in this cell to get the information you need to complete the exercise.

# Code below this comment should not be modified.
class UncompletedCNN(nn.Module):
    def __init__(self):
        super(UncompletedCNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=INPUT_CHANNELS, out_channels=CONV1_OUT_CHANNELS, kernel_size=3, padding="same")
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=64, kernel_size=3, padding="same")
        self.conv3 = nn.Conv2d(in_channels=CONV3_IN_CHANNELS, out_channels=64, kernel_size=3, padding="same")
        self.maxpooling = nn.MaxPool2d(kernel_size=2)
        self.fc1 = nn.Linear(FC1_IN_FEATURES, 256)
        self.fc2 = nn.Linear(FC2_IN_FEATURES, MODEL_OUTPUT_SIZE)
    
    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.maxpooling(x)
        x = F.relu(self.conv2(x))
        x = self.maxpooling(x)
        x = F.relu(self.conv3(x))
        x = self.maxpooling(x)
        x = nn.Flatten()(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x


completed_model = UncompletedCNN()
try:
    model_output = completed_model(torch.rand(1, 3, 32, 32))
    loss_fn = nn.CrossEntropyLoss()
    loss = loss_fn(model_output, torch.rand(1, 10))
    print("Your answer is correct :-)")
except Exception as e:
    raise AssertionError("Sorry your answer is wrong :( You can check the error messages above to help you locate your mistake(s).") from e

## Question 03 (ResNet)

Please build a ResNet-18 model based on the code in the tutorial or use `torchvision.models.resnet18` to classify the CIFAR-10 dataset. Modify the relevant parameters from the tutorial such as learning rate, batch size, optimizer, number of epochs, loss function, etc., to achieve a better model performance. Record the train loss and validation loss and accuracy for each epoch. Then, plot them on a graph together.

In [25]:
class ResNet18(nn.Module):
    def __init__(self):
        

    def forward(self, x):
        

resNet = ResNet18().to(device)

LR = 
loss = 
optimizer = 

In [None]:
epochs = 
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_loader, resNet, loss, optimizer)
    test(test_loader, resNet, loss)
print("Done!")

## Question 04 (GNN)
If you haven't installed torch_geometric yet, you can install it with `pip install torch_geometric`.

In [None]:
%pip install torch_geometric

The following code is extracted from the tutorial and helps you prepare the data. Copy and run it first, then continue with the question.

```python
from sklearn.metrics import precision_recall_fscore_support
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='./data/Cora', name='Cora')
```

Extract the data, the number of features on the node, and the number of categories of the node from the dataset, and output the number of features and the number of categories. Write your answer in the following code frame:

In [None]:
from sklearn.metrics import precision_recall_fscore_support
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='./data/Cora', name='Cora')
data, num_node_features, num_classes = dataset[0].to(device), dataset.num_node_features, dataset.num_classes
print(num_node_features, num_classes)

Please define a graph convolutional network with two graph convolutional layers. 

- The first graph convolutional layer takes the number of node features as input channels and outputs 128 channels. The second convolutional layer takes 128 channels as input and outputs channels equal to the number of node categories for classification.
- Both convolutional layers are activated using the sigmoid function.
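
To make the idea of a graph convolutional layer more concrete, the sketch below computes one layer by hand with dense matrices in plain PyTorch, using the symmetric normalization with self-loops from the original GCN formulation: `H = sigmoid(D^{-1/2} (A + I) D^{-1/2} X W)`. The 3-node graph, the features and the random weights are made up for illustration. `GCNConv` works on a sparse `edge_index` and also has a bias term, so this is a conceptual sketch rather than a replacement for it.

```python
import torch

torch.manual_seed(0)

# Toy undirected graph with 3 nodes and edges 0-1 and 1-2 (made-up example)
A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
X = torch.tensor([[1., 0.],
                  [0., 1.],
                  [1., 1.]])              # 2 features per node
W = torch.randn(2, 4)                     # random weights mapping 2 -> 4 channels

A_hat = A + torch.eye(3)                  # add self-loops
deg = A_hat.sum(dim=1)                    # node degrees including self-loops
D_inv_sqrt = torch.diag(deg.pow(-0.5))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization

H = torch.sigmoid(A_norm @ X @ W)         # aggregate neighbors, transform, activate
print(A_norm)
print(H.shape)  # torch.Size([3, 4])
```

Each row of `H` mixes the features of a node with those of its neighbors, which is why stacking two such layers lets information travel two hops in the graph.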

Write your answer in the following code frame:

In [6]:
class GCN(nn.Module):
    def __init__(self, features, hidden_dimension, classes):
       

    def forward(self, data):
        

hidden_dimension = 
model = GCN(num_node_features, hidden_dimension, num_classes).to(device)

Select appropriate hyperparameters for training, record the metrics on the validation set during training and plot them on a figure. Write your answer in the following code frame:

In [None]:
learn_rate = 
weight_decay = 
optimizer = 
loss_fn = 

precisions, recalls, f1s, losses = [], [], [], []
for epoch in range(200):
    model.train()
    optimizer.zero_grad()
    out = model(data)
    loss = loss_fn(out[data.train_mask], data.y[data.train_mask])
    losses.append(loss.item())
    loss.backward()
    optimizer.step()

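    model.eval()
    with torch.no_grad():
        out = model(data)  # recompute in eval mode so the metrics do not reuse the training-mode output
    _, predicted_val = torch.max(out[data.val_mask], dim=1)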
    predicted_val = predicted_val.cpu().detach().numpy()
    precision_val, recall_val, f1_val, _ = precision_recall_fscore_support(data.y[data.val_mask].cpu().detach().numpy(), 
                                                                           predicted_val, average='macro', zero_division=0)
    precisions.append(precision_val)
    recalls.append(recall_val)
    f1s.append(f1_val)
    print(f"Epoch {epoch:<3d} | precision_val: {precision_val:.4f}, recall_val: {recall_val:.4f}, f1_val: {f1_val:.4f}, loss: {loss.item():.4f}")

_, predicted_test = torch.max(out[data.test_mask], dim=1)
predicted_test = predicted_test.cpu().detach().numpy()
precision_test, recall_test, f1_test, _ = precision_recall_fscore_support(data.y[data.test_mask].cpu().detach().numpy(), 
                                                                        predicted_test, average='macro', zero_division=0)
print(f"Final Test precision: {precision_test:.4f}, recall: {recall_test:.4f}, f1: {f1_test:.4f}")

epochs = range(1, len(precisions)+1)
plt.figure(figsize=(10, 8))
plt.plot(epochs, precisions, 'g', label='Precision')
plt.plot(epochs, recalls, 'r', label='Recall')
plt.plot(epochs, f1s, 'm', label='F1')
plt.plot(epochs, losses, 'b', label='Loss')
plt.title('Training And Validation Metrics')
plt.grid()
plt.xlabel('Epochs')
plt.ylabel('Metrics')
plt.legend()
plt.show()