**Challenge: Implement a Multiclass Classification Neural Network using PyTorch**

Objective:
Build a neural network using PyTorch to predict handwritten digits of MNIST.

Steps:

1. **Data Preparation**: Load the MNIST dataset using ```torchvision.datasets.MNIST```. Standardize/normalize the features. Split the dataset into training and testing sets using, for example, ```sklearn.model_selection.train_test_split()```. **Bonus scores**: *use PyTorch's built-in* ```DataLoader``` *to split the dataset*.

2. **Neural Network Architecture**: Define a simple feedforward neural network using PyTorch's ```nn.Module```. Design the input layer to match the number of features in the MNIST dataset and the output layer to have as many neurons as there are classes (10). You can experiment with the number of hidden layers and neurons to optimize the performance. **Bonus scores**: *Make your architecture flexible enough to have as many hidden layers as the user wants, and use hyperparameter optimization to select the best number of hidden layers.*

3. **Loss Function and Optimizer**: Choose an appropriate loss function for multiclass classification. Select an optimizer, like SGD (Stochastic Gradient Descent) or Adam.

4. **Training**: Write a training loop to iterate over the dataset.
Forward pass the input through the network, calculate the loss, and perform backpropagation. Update the weights of the network using the chosen optimizer.

5. **Testing**: Evaluate the trained model on the test set. Calculate the accuracy of the model.

6. **Optimization**: Experiment with hyperparameters (learning rate, number of epochs, etc.) to optimize the model's performance. Consider adjusting the neural network architecture for better results. **Notice that you can't use the optimization algorithms from scikit-learn that we saw in lab1: e.g.,** ```GridSearchCV```.
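
As an illustration of step 1, the sketch below holds out a validation subset with `torch.utils.data.random_split` and batches both parts with `DataLoader`. It uses a small random `TensorDataset` as a stand-in for MNIST so that it runs without downloading anything; the sizes, the 80/20 ratio, the batch size, and the seed are assumptions chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in data (assumption): 100 fake 1 x 28 x 28 images with labels in 0..9
images = torch.randn(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

# Hold out 20% for validation; the seeded generator makes the split reproducible
generator = torch.Generator().manual_seed(0)
train_subset, val_subset = random_split(dataset, [80, 20], generator=generator)

# Shuffle only the training batches; evaluation order does not matter
train_loader = DataLoader(train_subset, batch_size=16, shuffle=True)
val_loader = DataLoader(val_subset, batch_size=16, shuffle=False)

first_inputs, first_labels = next(iter(train_loader))
print(len(train_subset), len(val_subset), tuple(first_inputs.shape))
```

With the real dataset, the same call could split the 60000 training images into a training part and a validation part, leaving the official test set untouched for the final evaluation.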

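For the bonus in step 2, one possible way to make the depth configurable is to build the layers in a loop from a list of hidden sizes. The class below is a minimal sketch, not a reference solution; the name `FlexibleMLP`, its parameters, and the default sizes are assumptions introduced here.

```python
import torch
import torch.nn as nn


class FlexibleMLP(nn.Module):
    """Feedforward classifier with a configurable list of hidden layer sizes."""

    def __init__(self, in_features=28 * 28, hidden_sizes=(128, 64), n_classes=10):
        super().__init__()
        layers = [nn.Flatten()]  # 1 x 28 x 28 image -> 784 features
        previous = in_features
        for size in hidden_sizes:
            layers.append(nn.Linear(previous, size))
            layers.append(nn.ReLU())
            previous = size
        # Output raw logits; nn.CrossEntropyLoss applies log-softmax itself
        layers.append(nn.Linear(previous, n_classes))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


# Three hidden layers, applied to a batch of 4 fake images
model = FlexibleMLP(hidden_sizes=(256, 128, 64))
logits = model(torch.randn(4, 1, 28, 28))
print(logits.shape)
```

Passing a different `hidden_sizes` tuple changes the depth without touching the class, which makes the number of hidden layers easy to treat as a hyperparameter.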

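Since `GridSearchCV` is off limits in step 6, one option is a hand-written grid search: enumerate every combination with `itertools.product`, score each one, and keep the best. In the sketch below, `manual_grid_search`, `grid`, and `fake_score` are hypothetical names; `fake_score` is a toy stand-in for a function that would train a model with the given settings and return its validation accuracy.

```python
import itertools


def manual_grid_search(param_grid, evaluate):
    """Try every combination in param_grid and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def fake_score(lr, n_hidden_layers):
    # Toy stand-in (assumption): peaks at lr=0.001 and 2 hidden layers.
    # A real version would train a model and return its validation accuracy.
    return -abs(lr - 0.001) - 0.01 * abs(n_hidden_layers - 2)


grid = {"lr": [0.1, 0.01, 0.001], "n_hidden_layers": [1, 2, 3]}
best_params, best_score = manual_grid_search(grid, fake_score)
print(best_params, best_score)
```

Scoring each combination on a validation split rather than on the test set keeps the final test accuracy an honest estimate.
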
In [3]:
# insert code here
import numpy as np
import torch
import torchvision.datasets as datasets
import torchvision.transforms
import torch.nn as nn
import torch.optim as optim

transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),  
    torchvision.transforms.Normalize((0.5,), (0.5,))  # Normalize pixel values to the range [-1, 1]
])

trainset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
testset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)  # no need to shuffle for evaluation

print(len(trainset))
print(len(testset))

class MNISTConvNet(nn.Module):
    def __init__(self):
        super().__init__()

        self.conv1= nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, 
                              stride=1, padding=1) 
        self.act1=nn.ReLU()
        
        self.conv2= nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, stride=1, padding=1)
        self.pool2=nn.MaxPool2d(kernel_size=2)
        self.act2= nn.ReLU()

        # after pooling: (3, 14, 14) -> 3 * 14 * 14 = 588 features
        self.flat= nn.Flatten()
        self.fc3= nn.Linear(588, 512)
        self.act3= nn.ReLU()

        self.fc4= nn.Linear(512, 10)

    def forward(self, x):
        # input 1 x 28 x 28, output 3 x 28 x 28
        x=self.conv1(x)
        x=self.act1(x)
        # input 3 x 28 x 28, output 3 x 28 x 28
        x=self.conv2(x)
        x=self.act2(x)
        # input 3 x 28 x 28, output 3 x 14 x 14
        x=self.pool2(x)
        # flatten 3 x 14 x 14 into 588 features
        x=self.flat(x)
        x=self.fc3(x)
        x=self.act3(x)
        x=self.fc4(x)
        return x

model = MNISTConvNet()
print(model)

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

n_epochs = 100
# The batch size (64) is set on the DataLoaders above

for epoch in range(n_epochs):
    losses = []
    # Loop over the dataset in batches
    for inputs, labels in trainloader:
        # forward, backward, and then weight update
        y_pred = model(inputs)
        loss = loss_fn(y_pred, labels)
        losses.append(loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f'Epoch {epoch + 1} --> loss = {np.mean(losses)}')

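# Evaluate on the test set; gradients are not needed here
model.eval()
correct = 0
count = 0
with torch.no_grad():
    for inputs, labels in testloader:
        y_pred = model(inputs)
        correct += (torch.argmax(y_pred, 1) == labels).sum().item()
        count += len(labels)
# Divide once, after all batches have been counted
acc = correct / count
print("Model accuracy on the test set: %.2f%%" % (acc * 100))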



60000
10000
Epoch 1 --> loss = 0.3032480383555947


KeyboardInterrupt: 