# Problem definition

__Problem 1:__ Design model architecture from scratch using any technique and algorithm
covered in the course. Play with layers, and hyperparameters, and report the results of
experiments.
You will have to submit a written report (.pdf) of 2 pages containing the following and
the code:
- A paragraph including the architecture details of the best model you
managed to conceive, and the training process used to train it, as well as
its numerical performance;
- A global discussion including:
     - Your analysis of the trade-offs between the performance of the final
model, the time it takes to train it, and the time needed to tune all the
hyper-parameters;<br>
     - For each relevant parameter (the various architectural choices, the
optimizer configuration, the training process details, etc...), a paragraph
explaining what you found about how it impacts the training process and
the performance of the model

## Data preparation
### About data (from [KAGGLE](https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database))

This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

In [1]:
import torch
from torch.utils.data import DataLoader, Dataset, random_split
from torch import nn

import pandas as pd
from sklearn.preprocessing import LabelEncoder

In [64]:
class NNDataset(Dataset):

    def __init__(self, path):
        df = pd.read_csv(path)

        self.X, self.y = df.values[:, :-1], df.values[:, -1]

        self.X = self.X.astype('float32')
        self.y = LabelEncoder().fit_transform(self.y)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, ind):
        return [self.X[ind], self.y[ind]]

    def split(self, test_size=.2):
        test_size = int(test_size * len(self.X))
        train_size = len(self.X) - test_size

        return random_split(self, [train_size, test_size])

    def data_loader(self, batch_size=64, shuffle=True, test_size=0.2):
        train, test = self.split(test_size=test_size)

        train_dl = DataLoader(train, batch_size=batch_size, shuffle=shuffle)
        test_dl = DataLoader(test, batch_size=batch_size, shuffle=shuffle)

        return train_dl, test_dl





In [187]:
# data loaders
dataset = NNDataset("Data/diabetes.csv")
train_loader,test_loader = dataset.data_loader(batch_size=128,shuffle=True,test_size=0.2)

## Model architecture

In [188]:
from torch.nn import Module
from torch.nn import ReLU,LeakyReLU,Sigmoid,Linear,CrossEntropyLoss
from torch.nn.init import kaiming_uniform_,xavier_uniform_
from torch import optim
from sklearn.metrics import accuracy_score

In [189]:
class CustomNN_1(Module):
    
    def __init__(self,n_inputs):
        
        super(CustomNN_1,self).__init__()
        
        # layer 1
        self.linear1 = Linear(n_inputs,32)
        kaiming_uniform_(self.linear1.weight,nonlinearity='relu')
        self.act1 = ReLU()
        
        # layer 2
        self.linear2 = Linear(32,16)
        kaiming_uniform_(self.linear2.weight,nonlinearity='relu')
        self.act2 = ReLU()
        
        # output layer 
        self.linear3 = Linear(16,2)
        kaiming_uniform_(self.linear3.weight,nonlinearity='sigmoid')
        self.act3 = Sigmoid()

    def forward(self,X):
        
#         X = torch.flatten(X,1)
#         X = X.view(X.size(0), -1)

        
        # layer 1
        output = self.linear1(X)
        output = self.act1(output)
        
        # layer 2
        output = self.linear2(output)
        output = self.act2(output)
        
        # layer 3
        output = self.linear3(output)
        output = self.act3(output)
        
        return output 
    
    def binary(self,output):
        return round(output)
    
    
def train_nn(model,device,train_dl,optimizer,loss_function,epoch):
    
    for batch,(X,y) in enumerate(train_dl):
        
        X,y = X.to(device),y.to(device)
        # get output
        output = model(X)
        
        # calculate loss
        loss = loss_function(output,y)
        
        optimizer.zero_grad()
        # optimize
        loss.backward()
        optimizer.step()
        if batch % 4 == 0:

            
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch * len(X), len(train_dl.dataset),
                100. * batch / len(train_dl), loss.item()))
            
            
def test_nn(device,model,test_dl,epoch,loss_function):
    
    correct = 0
    test_loss = 0
    
    with torch.no_grad():
        
        for batch,(X,y) in enumerate(test_dl):
            
            X,y = X.to(device),y.to(device)
            
            output = model(X)
            prediction = model.binary(output)
            
            # calculate loss
            loss = loss_function(output,y)
            
            
            corret

    accuracy = accuracy_score(y,prediction)
            
    print(f"\nEpoch: {epoch}, Test accuracy: {accuracy}\nTest loss\n")
            

In [190]:
model_1 = CustomNN_1(8)
print(model_1)

CustomNN_1(
  (linear1): Linear(in_features=8, out_features=32, bias=True)
  (act1): ReLU()
  (linear2): Linear(in_features=32, out_features=16, bias=True)
  (act2): ReLU()
  (linear3): Linear(in_features=16, out_features=2, bias=True)
  (act3): Sigmoid()
)


As we can see the model at first has two hidden layer with __ReLU__ activation function for both and output layer with __Sigmoid__ activation function. I am going to train this model using __Stochastic Gradient Gescent__ as optimizer with learning rate = 0.1 and __Cross Entropy__ as a loss function to train this model. As performence metric I am using accuracy score.

In [191]:
optimizer = optim.SGD(model_1.parameters(),lr=0.1)

In [192]:
for epoch in range(100):
    train_nn(model=model_1,device='cpu',train_dl=train_loader,optimizer=optimizer,loss_function=CrossEntropyLoss(),epoch=epoch)
    test_nn(model=model_1,device='cpu',test_dl=test_loader,loss_function=CrossEntropyLoss(),epoch=epoch)


Epoch: 0, Test accuracy: 0.48



Epoch: 1, Test accuracy: 0.68



Epoch: 2, Test accuracy: 0.64



Epoch: 3, Test accuracy: 0.76



Epoch: 4, Test accuracy: 0.8



Epoch: 5, Test accuracy: 0.68



Epoch: 6, Test accuracy: 0.68



Epoch: 7, Test accuracy: 0.76



Epoch: 8, Test accuracy: 0.72



Epoch: 9, Test accuracy: 0.84



Epoch: 10, Test accuracy: 0.68



Epoch: 11, Test accuracy: 0.64



Epoch: 12, Test accuracy: 0.64



Epoch: 13, Test accuracy: 0.6



Epoch: 14, Test accuracy: 0.68



Epoch: 15, Test accuracy: 0.6



Epoch: 16, Test accuracy: 0.8



Epoch: 17, Test accuracy: 0.48



Epoch: 18, Test accuracy: 0.8



Epoch: 19, Test accuracy: 0.72



Epoch: 20, Test accuracy: 0.72



Epoch: 21, Test accuracy: 0.56



Epoch: 22, Test accuracy: 0.56



Epoch: 23, Test accuracy: 0.56



Epoch: 24, Test accuracy: 0.76



Epoch: 25, Test accuracy: 0.8



Epoch: 26, Test accuracy: 0.48



Epoch: 27, Test accuracy: 0.68



Epoch: 28, Test accuracy: 0.8



Epoch: 29, Test accuracy: 0.8



Epoch: 66, Test accuracy: 0.68



Epoch: 67, Test accuracy: 0.72



Epoch: 68, Test accuracy: 0.76



Epoch: 69, Test accuracy: 0.72



Epoch: 70, Test accuracy: 0.68



Epoch: 71, Test accuracy: 0.64



Epoch: 72, Test accuracy: 0.72



Epoch: 73, Test accuracy: 0.68



Epoch: 74, Test accuracy: 0.68



Epoch: 75, Test accuracy: 0.68



Epoch: 76, Test accuracy: 0.72



Epoch: 77, Test accuracy: 0.68



Epoch: 78, Test accuracy: 0.76



Epoch: 79, Test accuracy: 0.84



Epoch: 80, Test accuracy: 0.72



Epoch: 81, Test accuracy: 0.72



Epoch: 82, Test accuracy: 0.72



Epoch: 83, Test accuracy: 0.72



Epoch: 84, Test accuracy: 0.68



Epoch: 85, Test accuracy: 0.64



Epoch: 86, Test accuracy: 0.68



Epoch: 87, Test accuracy: 0.68



Epoch: 88, Test accuracy: 0.64



Epoch: 89, Test accuracy: 0.68



Epoch: 90, Test accuracy: 0.72



Epoch: 91, Test accuracy: 0.68



Epoch: 92, Test accuracy: 0.64



Epoch: 93, Test accuracy: 0.68



Epoch: 94, Test accuracy: 0.72



Epoch: 95, Te

In [15]:
import pandas as pd

In [16]:
df = pd.read_csv("Data/diabetes.csv")

In [17]:
df.head()

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
0,6,148,72,35,0,33.6,0.627,50,1
1,1,85,66,29,0,26.6,0.351,31,0
2,8,183,64,0,0,23.3,0.672,32,1
3,1,89,66,23,94,28.1,0.167,21,0
4,0,137,40,35,168,43.1,2.288,33,1
