## Exercise 16: Build, train and evaluate a neural network
• Build, train and evaluate a NN based on the following instructions:
- The training dataset has 32 features
- The task is binary classification
- Use the SGD optimizer
- Use the BinaryCrossEntropy loss
- Use the accuracy metric
- The model should contain:
    - Dense layer 1
    - ReLU activation layer 1
    - Dense layer 2
    - ReLU activation layer 2
    - Output Dense layer
    - Sigmoid activation layer
- The dense layers should reduce the number of units to half (except the last one)
- Train the NN for 100 epochs with a batch size of 16 and a learning rate of 0.01.
- Test the model with k-fold cross-validation (hint: use the functions implemented in class 8)
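
The halving rule in the instructions fixes the layer widths once the input size is known. A minimal sketch of that arithmetic, where the helper name `halving_layer_sizes` is hypothetical and not part of the `si` package:

```python
def halving_layer_sizes(n_features: int, n_hidden: int = 2, n_outputs: int = 1) -> list[int]:
    """Return the unit count of each dense layer.

    Every hidden layer has half the units of its input; the output
    layer is exempt from the rule and has `n_outputs` units.
    """
    sizes = []
    units = n_features
    for _ in range(n_hidden):
        units //= 2          # halve the width relative to the previous layer
        sizes.append(units)
    sizes.append(n_outputs)  # output layer: one unit for binary classification
    return sizes


layer_sizes = halving_layer_sizes(32)
print(layer_sizes)  # [16, 8, 1]
```

With 32 input features this gives dense layers of 16, 8 and 1 units.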

In [3]:
# Dataset and neural network
import numpy as np
from si.data.dataset import Dataset
from si.neural_networks.neural_network import NeuralNetwork

# Model evaluation
from si.model_selection.cross_validate import k_fold_cross_validation

# Layers, activations and metric
from si.neural_networks.activation import ReLUActivation, SigmoidActivation
from si.neural_networks.layers import DenseLayer
from si.metrics.accuracy import accuracy

# Loss function
from si.neural_networks.losses import BinaryCrossEntropy

# Optimizer
from si.neural_networks.optimizers import SGD

In [4]:
# Generate a random dataset: 100,000 samples, 32 features, 2 classes
data = Dataset.from_random(n_samples=100000, n_features=32, n_classes=2)
data.to_dataframe()

Unnamed: 0,feat_0,feat_1,feat_2,feat_3,feat_4,feat_5,feat_6,feat_7,feat_8,feat_9,...,feat_23,feat_24,feat_25,feat_26,feat_27,feat_28,feat_29,feat_30,feat_31,y
0,0.976547,0.241427,0.059132,0.151807,0.014832,0.452038,0.762461,0.743860,0.096513,0.213577,...,0.006081,0.775857,0.202235,0.759902,0.853912,0.204662,0.300406,0.159902,0.910484,1
1,0.068959,0.016315,0.190674,0.189777,0.001352,0.082869,0.954913,0.547682,0.082416,0.446555,...,0.188047,0.121498,0.326999,0.207931,0.725105,0.408893,0.934744,0.415978,0.477192,0
2,0.304785,0.751216,0.913070,0.822778,0.348765,0.580308,0.785550,0.080980,0.560385,0.467549,...,0.509677,0.579208,0.162089,0.616228,0.196515,0.967045,0.381648,0.431506,0.839992,1
3,0.328912,0.886526,0.735393,0.255947,0.690713,0.100076,0.357575,0.607811,0.101959,0.099431,...,0.297868,0.266689,0.192673,0.702141,0.547405,0.120180,0.269602,0.725568,0.425203,0
4,0.644986,0.182089,0.347033,0.663750,0.553099,0.639165,0.493641,0.099599,0.200863,0.672146,...,0.356915,0.607363,0.122069,0.166304,0.095662,0.136576,0.545939,0.342823,0.264331,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
99995,0.835092,0.140036,0.988933,0.752060,0.491575,0.084071,0.353827,0.039228,0.053449,0.593679,...,0.169298,0.274122,0.265072,0.858295,0.214300,0.059033,0.201833,0.578288,0.921414,1
99996,0.021276,0.261638,0.100656,0.509651,0.452341,0.118495,0.192924,0.556896,0.454364,0.097051,...,0.533293,0.023582,0.636987,0.359366,0.599528,0.212663,0.875134,0.011058,0.761540,1
99997,0.827696,0.751958,0.663910,0.873120,0.440279,0.134576,0.229134,0.168398,0.672684,0.999843,...,0.215184,0.939253,0.739134,0.856051,0.659241,0.970376,0.437966,0.681470,0.995774,1
99998,0.682615,0.601032,0.032177,0.305272,0.096027,0.395401,0.638515,0.427179,0.631607,0.157155,...,0.059058,0.199096,0.500800,0.964146,0.683166,0.150976,0.420088,0.017963,0.457499,1


In [9]:
# Initialize the neural network with the training settings from the exercise
model = NeuralNetwork(
    epochs=100,
    batch_size=16,
    optimizer=SGD,
    learning_rate=0.01,
    loss=BinaryCrossEntropy,
    metric=accuracy,
    verbose=True
)

In [11]:
# Add layers to the model (32 -> 16 -> 8 -> 1)
model.add(DenseLayer(16, (32,)))  # First dense layer with 16 neurons (32 input features)
model.add(ReLUActivation())       # ReLU activation for non-linearity
model.add(DenseLayer(8))          # Second dense layer with 8 neurons
model.add(ReLUActivation())       # ReLU activation for non-linearity
model.add(DenseLayer(1))          # Final dense layer for binary classification
model.add(SigmoidActivation())    # Sigmoid activation for probability output

<si.neural_networks.neural_network.NeuralNetwork at 0x2a3eccdc560>
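
To make the shapes in this stack concrete, here is a plain NumPy sketch of the same 32 -> 16 -> 8 -> 1 forward pass. It is an illustration only: it does not use the `si` classes, and the weight initialization (small Gaussian values, zero biases) is an assumption rather than what `DenseLayer` does.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Layer widths: input, two hidden layers (each half the previous), output
sizes = [32, 16, 8, 1]

# One (weights, bias) pair per dense layer; the initialization is an assumption
params = [
    (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]


def forward(x):
    """Dense -> ReLU -> Dense -> ReLU -> Dense -> Sigmoid."""
    h = x
    last = len(params) - 1
    for i, (w, b) in enumerate(params):
        h = h @ w + b                            # dense layer
        h = sigmoid(h) if i == last else relu(h)  # activation
    return h


batch = rng.random((16, 32))  # one batch of 16 samples with 32 features
probs = forward(batch)
n_params = sum(w.size + b.size for w, b in params)

print(probs.shape)  # (16, 1)
print(n_params)     # 673
```

Each sample ends up as a single probability in (0, 1), which is what the binary cross-entropy loss expects.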

In [12]:
# Evaluate the model with 5-fold cross-validation
cv_results = k_fold_cross_validation(
    model=model,
    dataset=data,
    cv=5
)

Epoch 1/100 - loss: 55523.7718 - accuracy: 0.5026
Epoch 2/100 - loss: 55499.0861 - accuracy: 0.5020
Epoch 3/100 - loss: 55498.9888 - accuracy: 0.4991
Epoch 4/100 - loss: 55494.6496 - accuracy: 0.5029
Epoch 5/100 - loss: 55498.1511 - accuracy: 0.5006
Epoch 6/100 - loss: 55490.7595 - accuracy: 0.5005
Epoch 7/100 - loss: 55494.8306 - accuracy: 0.5014
Epoch 8/100 - loss: 55492.1506 - accuracy: 0.5000
Epoch 9/100 - loss: 55481.2750 - accuracy: 0.5037
Epoch 10/100 - loss: 55488.7525 - accuracy: 0.5033
Epoch 11/100 - loss: 55483.5297 - accuracy: 0.5035
Epoch 12/100 - loss: 55476.4954 - accuracy: 0.5055
Epoch 13/100 - loss: 55473.8080 - accuracy: 0.5047
Epoch 14/100 - loss: 55471.5820 - accuracy: 0.5064
Epoch 15/100 - loss: 55478.3271 - accuracy: 0.5034
Epoch 16/100 - loss: 55470.2573 - accuracy: 0.5067
Epoch 17/100 - loss: 55467.2764 - accuracy: 0.5053
Epoch 18/100 - loss: 55454.0212 - accuracy: 0.5072
Epoch 19/100 - loss: 55444.0641 - accuracy: 0.5094
Epoch 20/100 - loss: 55466.3908 - accura
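
For reference, the splitting logic behind k-fold cross-validation can be sketched as follows. This is a generic illustration, not the `si` implementation of `k_fold_cross_validation`; the function name `k_fold_indices` and its `seed` parameter are hypothetical.

```python
import numpy as np


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)  # shuffle once before splitting
    folds = np.array_split(indices, k)    # k folds of (nearly) equal size
    for i in range(k):
        test_idx = folds[i]               # fold i is held out for testing
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


splits = list(k_fold_indices(100_000, k=5))
print([(len(train), len(test)) for train, test in splits])
```

With 100,000 samples and 5 folds, each round trains on 80,000 samples and tests on the remaining 20,000, and every sample is used for testing exactly once.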

In [13]:
print(f"Accuracy: {np.mean(cv_results):.2f} (+/- {np.std(cv_results):.2f})")

Accuracy: 0.51 (+/- 0.00)


In [14]:
print("Cross-validation scores:", cv_results)

Cross-validation scores: [0.50425, 0.50525, 0.5124, 0.50795, 0.50865]


In [15]:
print("Average cross-validation score:", np.mean(cv_results))

Average cross-validation score: 0.5077


The model reaches an average accuracy of approximately 0.51 (0.5077), with little variation between the cross-validation folds: the five scores range from 0.504 to 0.512. The model therefore failed to identify meaningful patterns in the data, and its predictions are close to random guessing.

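This behavior is expected, because the dataset was generated randomly, with no inherent relationship between the features and the target classes. The training log shows the same picture: over the epochs displayed, the loss barely moves (from about 55524 to about 55466) and the training accuracy stays between 0.499 and 0.510. In a situation like this, even the most sophisticated learning algorithm cannot beat chance on unseen data, as there is no real pattern to learn from.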
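
The loss values in the training log can be checked against the chance level. A classifier that outputs 0.5 for every sample has a binary cross-entropy of ln 2 per sample. The sketch below rests on two assumptions that the notebook does not confirm: that each fold trains on 80,000 samples (4/5 of the dataset), and that the logged loss is a sum over those samples rather than a mean.

```python
import math

# Assumption: with cv=5, each training split holds 4/5 of the 100,000 samples
n_train = 100_000 * 4 // 5

# Binary cross-entropy of always predicting p = 0.5: -ln(0.5) = ln(2)
chance_bce_per_sample = math.log(2)

# Assumption: the logged loss is summed over the training samples
chance_bce_summed = n_train * chance_bce_per_sample

reported_epoch_1 = 55523.7718  # "Epoch 1/100" loss from the log above
reported_per_sample = reported_epoch_1 / n_train

print(f"{chance_bce_per_sample:.4f}")  # 0.6931
print(f"{chance_bce_summed:.1f}")      # 55451.8
print(f"{reported_per_sample:.4f}")    # 0.6940
```

Under these assumptions, the logged losses sit almost exactly at the chance-level value, which is consistent with a network that outputs probabilities near 0.5 for every sample.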

Furthermore, the standard deviation across folds is very small. It prints as 0.00 only because it is rounded to two decimals, and the five scores differ from each other by less than 0.01. The model therefore behaves consistently across folds, but, as expected, this consistency reflects the absence of signal in every fold rather than any ability to generalize or identify patterns.
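
The spread between folds can be quantified from the scores printed above and compared with the sampling noise expected from pure guessing. The sketch assumes 20,000 test samples per fold (100,000 samples split into 5 folds) and uses the binomial standard deviation sqrt(p(1 - p) / n) with p = 0.5 as the reference.

```python
import numpy as np

# Scores copied from the cross-validation output above
cv_scores = np.array([0.50425, 0.50525, 0.5124, 0.50795, 0.50865])

# Assumption: each test fold holds 1/5 of the 100,000 samples
fold_size = 100_000 // 5

observed_std = cv_scores.std()                # population std, as np.std computes it
chance_std = np.sqrt(0.5 * 0.5 / fold_size)   # binomial noise for random guessing

print(f"mean accuracy: {cv_scores.mean():.4f}")  # 0.5077
print(f"observed std:  {observed_std:.4f}")      # 0.0029
print(f"chance std:    {chance_std:.4f}")        # 0.0035
```

The observed fold-to-fold spread is about the size that sampling noise alone would produce on test folds of this size, so the small standard deviation says nothing about the quality of the model.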

This experiment is a good practical demonstration that machine learning models depend on structure in the data to make meaningful predictions. When the labels are purely random, a model has no basis for separating the classes, and its performance is equivalent to random guessing.
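
The contrast between random and structured labels can be sketched with a toy experiment. It is independent of the `si` package and of the network above: it uses a plain NumPy logistic regression, and the labelling rule for the structured case (class 1 when the centred features sum to more than zero) is an arbitrary choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)


def train_logistic(X, y, lr=0.5, n_iter=300):
    """Fit a logistic regression with full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                             # d(BCE)/d(logit) per sample
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b


def holdout_accuracy(X, y, w, b):
    """Fraction of samples whose predicted class matches the label."""
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))


# Same features for both cases: 3,000 samples, 32 centred uniform features
X = rng.random((3000, 32)) - 0.5
y_random = rng.integers(0, 2, size=3000)        # labels unrelated to X
y_structured = (X.sum(axis=1) > 0).astype(int)  # labels determined by X

X_train, X_test = X[:2000], X[2000:]
results = {}
for name, y in [("random", y_random), ("structured", y_structured)]:
    w, b = train_logistic(X_train, y[:2000])
    results[name] = holdout_accuracy(X_test, y[2000:], w, b)

print(results)
```

With random labels the held-out accuracy should stay near 0.5, while the structured labels let even this simple model score far above chance.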