# MLP

As a first step, we will implement a multilayer perceptron (MLP) and test it on the classic iris dataset.

1. Load the iris dataset and split it into train and test sets 
> Check the previous practical sessions!

2. Build a multilayer perceptron with default parameters and evaluate its accuracy on the test set.

3. Plot the loss curve by using the `loss_curve_` attribute of `MLPClassifier`

4. Set `max_iter` to 100 and check how the loss curve and the output change.


In [None]:
from sklearn.datasets import load_iris
...
X_train, X_test, y_train, y_test = ...


In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
X,y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)


In [None]:
from sklearn.neural_network import MLPClassifier
clf = 
...
print("Accuracy:", clf.score(X_test, y_test))


In [None]:
from sklearn.neural_network import MLPClassifier
clf = MLPClassifier(random_state=42,verbose=True,max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", clf.score(X_test, y_test))


In [None]:
from matplotlib import pyplot as plt
...

In [None]:
from matplotlib import pyplot as plt
plt.plot(clf.loss_curve_)
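
Step 4 can be explored with a sketch along the following lines. It assumes the same split as above; the names `clf_short` and `stopped_early` are illustrative and not part of the original notebook. scikit-learn emits a `ConvergenceWarning` when the optimizer reaches `max_iter` before converging, so the sketch records warnings instead of printing them. Whether the warning appears depends on the run.

```python
import warnings

from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# Cap training at 100 iterations and record any warning raised during fit.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clf_short = MLPClassifier(random_state=42, max_iter=100)
    clf_short.fit(X_train, y_train)

# True if the optimizer hit max_iter before reaching the tolerance.
stopped_early = any(issubclass(w.category, ConvergenceWarning) for w in caught)

print("Iterations run:", clf_short.n_iter_)
print("ConvergenceWarning raised:", stopped_early)
print("Final training loss:", clf_short.loss_curve_[-1])
print("Accuracy:", clf_short.score(X_test, y_test))
```

Plotting `clf_short.loss_curve_` next to the curve of the longer run shows where the shorter training was cut off.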

Now we will apply our MLP to a more complex dataset made of images of handwritten digits. A simplified version of this dataset is provided by the `load_digits` function of the `sklearn.datasets` module.

1. Run a simple MLP classifier on the data. What is the default architecture used by sklearn?
2. Modify your MLP to have 5 hidden layers with the following dimensions:
    * 32 neurons
    * 64 neurons
    * 128 neurons
    * 64 neurons
    * 32 neurons
3. What do you observe about the learning process and the results?

In [None]:
from sklearn.datasets import load_digits
X,y = load_digits(return_X_y=True)
...

In [None]:
from sklearn.datasets import load_digits
X,y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
clf = MLPClassifier(hidden_layer_sizes=[32,64,128,64,32],random_state=42,verbose=True,max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", clf.score(X_test, y_test))
plt.plot(clf.loss_curve_)
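
One way to answer the architecture question is to inspect the estimator itself. The sketch below reads `hidden_layer_sizes` from a default `MLPClassifier` and the shapes of the fitted weight matrices in `coefs_`. The short `max_iter=20` is only an assumption to keep the run fast, and `default_clf` and `deep_shapes` are illustrative names.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

# Default architecture: a single hidden layer.
default_clf = MLPClassifier(random_state=42, max_iter=20)
print("Default hidden_layer_sizes:", default_clf.hidden_layer_sizes)

# Fit briefly (assumption: 20 iterations are enough to inspect the shapes)
# and silence the convergence warning this short run is expected to raise.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    default_clf.fit(X, y)

# One weight matrix per layer transition: input -> hidden -> output.
print("Weight shapes:", [w.shape for w in default_clf.coefs_])

# Weight shapes implied by the 5-hidden-layer architecture, without fitting.
layer_sizes = [X.shape[1], 32, 64, 128, 64, 32, len(set(y))]
deep_shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))
print("Deep MLP weight shapes:", deep_shapes)
```

The number of weights grows with every additional layer, which is worth keeping in mind when comparing the two loss curves.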

# GNN

In this second part, we will use a Graph Neural Network (GNN), implemented in its simplest form. 
All the code is provided, since the implementation is a bit more complex than with `sklearn`. Nonetheless, take the time to understand the code and identify the components and steps of a Graph Neural Network.

To make it work, install `torch` (check the command in Readme.md) and copy the `greycdata` folder available on Universitice into the same folder as the notebook.

In [None]:
from tqdm import tqdm 
import torch
from greycdata.datasets import GreycDataset

# Loading the Acyclic dataset
dataset = GreycDataset(name='Acyclic',root='data/Acyclic')

In [None]:
# Split data
dataset = dataset.shuffle()
ratio_train = .9
size_train = int(len(dataset)*ratio_train)
size_test = len(dataset)-size_train
train_dataset = dataset[:size_train]
test_dataset = dataset[size_train:]

print(f'Number of training graphs: {len(train_dataset)}')
print(f'Number of test graphs: {len(test_dataset)}')

In [None]:
# Wrap the datasets in loaders that yield mini-batches of graphs
from torch_geometric.loader import DataLoader
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)


In [None]:
from torch.nn import Linear
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.nn import global_add_pool

# Creation of a basic GCN model. 
class MyGCN(torch.nn.Module):
    def __init__(self, input_channels,hidden_channels):
        super(MyGCN, self).__init__()
        self.conv1 = GCNConv(input_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, hidden_channels)
        self.lin = Linear(hidden_channels, 1)

    def forward(self, x, edge_index, batch):
        # Convolution layers
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = self.conv2(x, edge_index)
        
        # Read out (pooling) layer
        x = global_add_pool(x, batch) 
        x = self.lin(x)
        return x

model = MyGCN(input_channels=dataset.num_features,hidden_channels=64)
print(model)
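
To see what the model above does to a single graph, here is a minimal NumPy sketch of one graph convolution followed by a sum readout. It assumes the symmetric normalization with self-loops documented for `GCNConv`, and it omits the bias term and batching. The function `gcn_layer` and the toy path graph are illustrative and not part of the notebook.

```python
import numpy as np


def gcn_layer(x, adj, weight):
    """One simplified graph convolution: D^-1/2 (A + I) D^-1/2 X W (no bias)."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # node degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    return norm_adj @ x @ weight


# Toy graph: a path 0 - 1 - 2 with one-hot node features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.eye(3)

rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 4))                 # 3 input channels -> 4 hidden channels

h = np.maximum(gcn_layer(x, adj, w1), 0.0)   # convolution followed by ReLU
graph_embedding = h.sum(axis=0)              # sum readout: one vector per graph

print("Node embeddings shape:", h.shape)
print("Graph embedding shape:", graph_embedding.shape)
```

Because the readout sums over nodes, the graph embedding does not depend on the order in which the nodes are numbered, which is the property a graph-level prediction needs.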

In [None]:
model = MyGCN(input_channels=dataset.num_features,hidden_channels=128)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # Optimizer that performs the gradient descent updates
criterion = torch.nn.MSELoss(reduction="sum")  # Loss: sum of squared errors over each training batch

def my_mse(gt,pred):
    """
    Compute the sum of squared errors between gt and pred
    """
    return ((gt-pred)**2).sum()

def train():
    model.train()
    loss_epoch = 0.0
    for data in train_loader:  # Iterate in batches over the training dataset.
        out = model(data.x, data.edge_index, data.batch)  # Perform a single forward pass.
        loss = criterion(out, data.y.reshape(-1,1))  # Compute the loss.
        loss.backward()  # Derive gradients.
        optimizer.step()  # Update parameters based on gradients.
        optimizer.zero_grad()  # Clear gradients.
        loss_epoch += loss.item()  # Accumulate the batch loss once per batch.
    return loss_epoch

def test(loader):
    model.eval()
    sse = 0.0
    with torch.no_grad():  # No gradients are needed for evaluation.
        for data in loader:  # Iterate in batches over the given dataset.
            out = model(data.x, data.edge_index, data.batch)
            sse += my_mse(data.y.reshape(-1,1), out).item()
    return sse

losses=[]

for epoch in tqdm(range(1, 100)):
    loss = train()
    losses.append(loss)

In [None]:
import matplotlib.pyplot as plt
plt.plot(losses)
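
The `test` function defined above returns a sum of squared errors, which grows with the number of graphs. A hedged sketch of how it could be turned into a root mean squared error (RMSE), which is comparable across the train and test sets, is given below. The helper `rmse_from_sse` and the toy values are illustrative and not part of the notebook.

```python
import math


def rmse_from_sse(sse, n_samples):
    """Convert a sum of squared errors into a root mean squared error."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return math.sqrt(sse / n_samples)


# Toy values standing in for targets and model predictions.
targets = [1.0, 2.0, 3.0]
predictions = [1.5, 2.0, 2.0]

# Sum of squared errors: 0.25 + 0.0 + 1.0 = 1.25
sse = sum((t - p) ** 2 for t, p in zip(targets, predictions))
rmse = rmse_from_sse(sse, len(targets))
print("SSE:", sse)
print("RMSE:", rmse)
```

In the notebook, the same helper could be called as `rmse_from_sse(float(test(test_loader)), len(test_dataset))` after training, and likewise with `train_loader` and `train_dataset`.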