# SIB - Portfolio of Machine Learning Algorithms

## Exercise 16: Build, train and evaluate a neural network

Build, train and evaluate a neural network (NN) according to the following instructions:

- The training dataset has 32 features.
- The task is binary classification.
- Use the SGD optimizer.
- Use the BinaryCrossEntropy loss.
- Use the accuracy metric.
- The model should contain:
  - Dense layer 1
  - ReLU activation layer 1
  - Dense layer 2
  - ReLU activation layer 2
  - Output Dense layer
  - Sigmoid activation layer
- Each dense layer should halve the number of units (except the last one).
- Train the NN for 100 epochs, with a batch size of 16 and a learning rate of 0.01.
- Test the model with k-fold cross-validation (hint: use the functions implemented in class 8).
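
The halving rule above fixes the layer widths once the input size is known. As a minimal sketch (the helper `halving_layer_sizes` is hypothetical and not part of the `si` package), the sizes for 32 input features can be derived as:

```python
def halving_layer_sizes(n_features: int, n_hidden: int = 2, n_outputs: int = 1) -> list:
    """Return the unit count of each dense layer.

    Every hidden layer has half the units of its input; the final
    (output) layer is exempt from the rule and has n_outputs units.
    """
    sizes = []
    units = n_features
    for _ in range(n_hidden):
        units = units // 2  # halve the width at each hidden layer
        sizes.append(units)
    sizes.append(n_outputs)  # single unit for binary classification
    return sizes


layer_sizes = halving_layer_sizes(32)
print(layer_sizes)  # [16, 8, 1]
```

These are the `n_units` values that the three dense layers of the exercise should receive.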

In [None]:
# Dataset and NN
import numpy as np
from si.data.dataset import Dataset
from si.neural_networks.neural_network import NeuralNetwork

# Optimizer
from si.neural_networks.optimizers import SGD

# Loss function
from si.neural_networks.losses import BinaryCrossEntropy

# Metric
from si.metrics.accuracy import accuracy

# Layers and activations
from si.neural_networks.layers import DenseLayer
from si.neural_networks.activation import ReLUActivation, SigmoidActivation

# Evaluation
from si.model_selection.cross_validate import k_fold_cross_validation
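
For reference, the loss and metric imported above can be sketched in plain NumPy. This is an assumption about what `BinaryCrossEntropy` and `accuracy` compute, using the textbook mean-reduced definitions; the function names below are illustrative, and the `si` implementations may differ in reduction or thresholding.

```python
import numpy as np


def binary_cross_entropy(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    # Clip probabilities so that log(0) never occurs
    p = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


def accuracy_from_probabilities(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    labels = (y_pred >= threshold).astype(int)
    return float(np.mean(labels == y_true))


y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.1, 0.8, 0.3])
print(binary_cross_entropy(y_true, y_prob))
print(accuracy_from_probabilities(y_true, y_prob))  # 1.0
```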

In [None]:
# Generate the dataset 
np.random.seed(42)
X = np.random.rand(1000, 32)  # 1000 samples, 32 features
y = np.random.randint(0, 2, size=(1000,))  # Binary labels (0,1)

dataset = Dataset(X, y)

In [None]:
# Inspect the dataset
data = dataset.to_dataframe()
print(f"The dataset has the following dimensions: {data.shape}")
print(f"Target unique values: {data['y'].unique()}")
data.head()

The dataset has the following dimensions: (1000, 33)
Target unique values: [0 1]


| | feat_0 | feat_1 | feat_2 | feat_3 | feat_4 | feat_5 | feat_6 | feat_7 | feat_8 | feat_9 | ... | feat_23 | feat_24 | feat_25 | feat_26 | feat_27 | feat_28 | feat_29 | feat_30 | feat_31 | y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.37454 | 0.950714 | 0.731994 | 0.598658 | 0.156019 | 0.155995 | 0.058084 | 0.866176 | 0.601115 | 0.708073 | ... | 0.366362 | 0.45607 | 0.785176 | 0.199674 | 0.514234 | 0.592415 | 0.04645 | 0.607545 | 0.170524 | 0 |
| 1 | 0.065052 | 0.948886 | 0.965632 | 0.808397 | 0.304614 | 0.097672 | 0.684233 | 0.440152 | 0.122038 | 0.495177 | ... | 0.921874 | 0.088493 | 0.195983 | 0.045227 | 0.32533 | 0.388677 | 0.271349 | 0.828738 | 0.356753 | 1 |
| 2 | 0.280935 | 0.542696 | 0.140924 | 0.802197 | 0.074551 | 0.986887 | 0.772245 | 0.198716 | 0.005522 | 0.815461 | ... | 0.637557 | 0.887213 | 0.472215 | 0.119594 | 0.713245 | 0.760785 | 0.561277 | 0.770967 | 0.493796 | 0 |
| 3 | 0.522733 | 0.427541 | 0.025419 | 0.107891 | 0.031429 | 0.63641 | 0.314356 | 0.508571 | 0.907566 | 0.249292 | ... | 0.539342 | 0.80744 | 0.896091 | 0.318003 | 0.110052 | 0.227935 | 0.427108 | 0.818015 | 0.860731 | 1 |
| 4 | 0.006952 | 0.510747 | 0.417411 | 0.222108 | 0.119865 | 0.337615 | 0.94291 | 0.323203 | 0.518791 | 0.703019 | ... | 0.239562 | 0.144895 | 0.489453 | 0.98565 | 0.242055 | 0.672136 | 0.76162 | 0.237638 | 0.728216 | 1 |


In [None]:
net = NeuralNetwork(
    epochs=100,
    batch_size=16,
    optimizer=SGD,
    learning_rate=0.01,
    verbose=True,
    loss=BinaryCrossEntropy,
    metric=accuracy,
)

# Add the layers; each hidden dense layer halves the number of units
n_features = dataset.X.shape[1]
# Dense layer 1: 32 -> 16 units
net.add(DenseLayer(n_units=16, input_shape=(n_features,)))
# ReLU activation 1
net.add(ReLUActivation())
# Dense layer 2: 16 -> 8 units
net.add(DenseLayer(n_units=8))
# ReLU activation 2
net.add(ReLUActivation())
# Output dense layer: 8 -> 1 unit
net.add(DenseLayer(n_units=1))
# Output activation (sigmoid)
net.add(SigmoidActivation())

<si.neural_networks.neural_network.NeuralNetwork at 0x2598760bdd0>
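
To make the shapes in this architecture concrete, here is a minimal NumPy sketch of a forward pass through a 32 → 16 → 8 → 1 network with ReLU hidden activations and a sigmoid output. It is an illustration only and does not use the `si` classes: the weight initialization, the variable names, and the `forward` helper are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Input width followed by the width of each dense layer
layer_sizes = [32, 16, 8, 1]

# One weight matrix and one bias row per dense layer (small random init, assumed)
weights = [
    rng.normal(0.0, 0.01, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
biases = [np.zeros((1, n_out)) for n_out in layer_sizes[1:]]


def forward(X):
    """Apply Dense -> ReLU twice, then Dense -> Sigmoid."""
    a = X
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        is_output = i == len(weights) - 1
        a = sigmoid(z) if is_output else relu(z)
    return a


batch = rng.random((16, 32))  # one mini-batch of 16 samples
probs = forward(batch)
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(probs.shape)  # (16, 1)
print(n_params)     # 673
```

The network has 673 trainable parameters (528 + 136 + 9), which is small relative to the 1000 samples in the dataset.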

In [None]:
# k-fold cross-validation
k = 5
scores = k_fold_cross_validation(net, dataset, scoring=accuracy, cv=k, seed=42)

Epoch 1/100 - loss: 638.7653 - accuracy: 0.8488
Epoch 2/100 - loss: 391.2232 - accuracy: 0.8550
Epoch 3/100 - loss: 128.9248 - accuracy: 0.9300
Epoch 4/100 - loss: 170.7639 - accuracy: 0.9163
Epoch 5/100 - loss: 260.9634 - accuracy: 0.8912
Epoch 6/100 - loss: 140.4476 - accuracy: 0.9187
Epoch 7/100 - loss: 115.5343 - accuracy: 0.9425
Epoch 8/100 - loss: 80.9696 - accuracy: 0.9537
Epoch 9/100 - loss: 104.6442 - accuracy: 0.9463
Epoch 10/100 - loss: 95.2593 - accuracy: 0.9475
Epoch 11/100 - loss: 80.4367 - accuracy: 0.9600
Epoch 12/100 - loss: 58.1357 - accuracy: 0.9675
Epoch 13/100 - loss: 56.2788 - accuracy: 0.9663
Epoch 14/100 - loss: 91.3565 - accuracy: 0.9550
Epoch 15/100 - loss: 60.9863 - accuracy: 0.9688
Epoch 16/100 - loss: 57.4622 - accuracy: 0.9663
Epoch 17/100 - loss: 54.0557 - accuracy: 0.9712
Epoch 18/100 - loss: 53.8685 - accuracy: 0.9700
Epoch 19/100 - loss: 54.8896 - accuracy: 0.9712
Epoch 20/100 - loss: 53.7017 - accuracy: 0.9712
Epoch 21/100 - loss: 52.5525 - accuracy: 
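
The splitting logic behind k-fold cross-validation can be sketched as follows. This is a generic illustration under stated assumptions, not the `si` implementation: `k_fold_indices` is a hypothetical helper, and `k_fold_cross_validation` may shuffle or partition the data differently.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=None):
    """Yield (train_idx, test_idx) pairs for k shuffled, disjoint folds."""
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(n_samples)
    folds = np.array_split(permutation, k)
    for i in range(k):
        test_idx = folds[i]
        # Every fold except the i-th one is used for training
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


for train_idx, test_idx in k_fold_indices(1000, k=5, seed=42):
    # Number of mini-batches per epoch for a batch size of 16
    batches_per_epoch = int(np.ceil(len(train_idx) / 16))
    print(len(train_idx), len(test_idx), batches_per_epoch)  # 800 200 50
```

With 1000 samples and `k = 5`, each fold trains on 800 samples and tests on the remaining 200, so every epoch in the log above would process 50 mini-batches of 16 samples under this scheme.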

In [None]:
print(f"Scores per fold: {scores}")
print(f"Mean Accuracy: {np.mean(scores):.2f}")

Scores per fold: [np.float64(0.91), np.float64(0.95), np.float64(0.98), np.float64(0.965), np.float64(0.97)]
Mean Accuracy: 0.95
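
Reporting the spread across folds alongside the mean gives a better picture of how stable the estimate is. A short sketch using the per-fold scores printed above (the variable names are illustrative):

```python
import numpy as np

# Per-fold accuracies copied from the output above
fold_scores = np.array([0.91, 0.95, 0.98, 0.965, 0.97])

mean_score = fold_scores.mean()
std_score = fold_scores.std(ddof=1)  # sample standard deviation across folds

print(
    f"Accuracy: {mean_score:.3f} +/- {std_score:.3f} "
    f"(min {fold_scores.min():.3f}, max {fold_scores.max():.3f})"
)
```

Converting the scores with `[float(s) for s in scores]` before printing would also avoid the `np.float64(...)` wrappers in the per-fold output.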


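The mean accuracy across the 5 folds is about 0.95, with per-fold scores between 0.91 and 0.98. Note, however, that the labels in this notebook were generated at random, independently of the features, so the expected accuracy on unseen data is about 0.50. A cross-validated score this far above chance should therefore be verified (for example, by confirming that the model is re-initialized for each fold and that the logged output was produced with this dataset) before concluding that the model generalizes well.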
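
A useful reference point for any reported accuracy is the majority-class baseline. The sketch below regenerates the dataset with the same NumPy calls used earlier in this notebook and measures that baseline, together with the strongest linear association between any single feature and the labels; the variable names are illustrative.

```python
import numpy as np

# Same data generation as in the notebook
np.random.seed(42)
X = np.random.rand(1000, 32)
y = np.random.randint(0, 2, size=(1000,))

# Accuracy obtained by always predicting the most frequent class
majority_class = np.bincount(y).argmax()
baseline_accuracy = float(np.mean(y == majority_class))

# Largest absolute Pearson correlation between a feature and the labels
max_abs_corr = max(
    abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])
)

print(f"Majority-class baseline accuracy: {baseline_accuracy:.3f}")
print(f"Largest |correlation| between a feature and y: {max_abs_corr:.3f}")
```

Because `y` is drawn independently of `X`, the baseline should sit close to 0.5 and the correlations close to zero, which is the level a model's held-out accuracy should be compared against.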