### Experiment 1 - Build an Artificial Neural Network for a binary classification task using the back-propagation algorithm and test it on an appropriate dataset

This experiment uses the Bank Note Authentication UCI dataset, which can be downloaded from https://www.kaggle.com/datasets/ritesaluja/bank-note-authentication-uci-data

#### Installing Necessary Libraries globally

In [None]:
!pip install numpy scikit-learn tensorflow pandas

#### Importing Necessary Libraries

In [75]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix, precision_score, recall_score, f1_score

#### Loading the dataset

In [48]:
df = pd.read_csv('BankNoteAuthentication.csv')

In [49]:
df.head(10)

,variance,skewness,curtosis,entropy,class
0,3.6216,8.6661,-2.8073,-0.44699,0
1,4.5459,8.1674,-2.4586,-1.4621,0
2,3.866,-2.6383,1.9242,0.10645,0
3,3.4566,9.5228,-4.0112,-3.5944,0
4,0.32924,-4.4552,4.5718,-0.9888,0
5,4.3684,9.6718,-3.9606,-3.1625,0
6,3.5912,3.0129,0.72888,0.56421,0
7,2.0922,-6.81,8.4636,-0.60216,0
8,3.2032,5.7588,-0.75345,-0.61251,0
9,1.5356,9.1772,-2.2718,-0.73535,0


In [50]:
df.tail(10)

,variance,skewness,curtosis,entropy,class
1362,-2.1668,1.5933,0.045122,-1.678,1
1363,-1.1667,-1.4237,2.9241,0.66119,1
1364,-2.8391,-6.63,10.4849,-0.42113,1
1365,-4.5046,-5.8126,10.8867,-0.52846,1
1366,-2.41,3.7433,-0.40215,-1.2953,1
1367,0.40614,1.3492,-1.4501,-0.55949,1
1368,-1.3887,-4.8773,6.4774,0.34179,1
1369,-3.7503,-13.4586,17.5932,-2.7771,1
1370,-3.5637,-8.3827,12.393,-1.2823,1
1371,-2.5419,-0.65804,2.6842,1.1952,1


In [51]:
df.shape

(1372, 5)

#### Key note: the data is not shuffled. Rows with class label '0' come first and rows with label '1' come later, so shuffle the data first

In [58]:
df = df.sample(frac=1, random_state=42).reset_index(drop=True)

In [59]:
df.head()

,variance,skewness,curtosis,entropy,class
0,0.46901,-0.63321,7.3848,0.36507,0
1,-1.9116,-6.1603,5.606,0.48533,1
2,-2.343,12.9516,3.3285,-5.9426,0
3,-2.5373,-6.959,8.8054,1.5289,1
4,0.3292,-4.4552,4.5718,-0.9888,0


In [60]:
df.tail()

,variance,skewness,curtosis,entropy,class
1367,-1.9551,-6.9756,5.5383,-0.12889,1
1368,-1.5228,-6.4789,5.7568,0.87325,1
1369,3.7635,2.7811,0.66119,0.34179,0
1370,-2.564,-1.7051,1.5026,0.32757,1
1371,-1.6988,-7.1163,5.7902,0.16723,1


In [61]:
class_counts = df['class'].value_counts()
print(class_counts)

class
0    762
1    610
Name: count, dtype: int64
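The class counts above also give a useful reference point for the accuracy figures reported later. The short sketch below (variable names are illustrative, counts are copied from the output above) computes the accuracy a trivial model would get by always predicting the majority class.

```python
# Class counts copied from the value_counts() output above.
counts = {0: 762, 1: 610}

total = sum(counts.values())
majority_class = max(counts, key=counts.get)

# Accuracy of a model that always predicts the majority class.
baseline_accuracy = counts[majority_class] / total

print(f"Total samples: {total}")
print(f"Majority class: {majority_class}")
print(f"Majority-class baseline accuracy: {baseline_accuracy:.4f}")
```

Any trained model should clearly beat this baseline of roughly 0.56 before its accuracy is considered meaningful.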


In [62]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1372 entries, 0 to 1371
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   variance  1372 non-null   float64
 1   skewness  1372 non-null   float64
 2   curtosis  1372 non-null   float64
 3   entropy   1372 non-null   float64
 4   class     1372 non-null   int64  
dtypes: float64(4), int64(1)
memory usage: 53.7 KB


In [63]:
df.isnull().sum()

variance    0
skewness    0
curtosis    0
entropy     0
class       0
dtype: int64

In [64]:
df.describe()

,variance,skewness,curtosis,entropy,class
count,1372.0,1372.0,1372.0,1372.0,1372.0
mean,0.433735,1.922353,1.397627,-1.191657,0.444606
std,2.842763,5.869047,4.31003,2.101013,0.497103
min,-7.0421,-13.7731,-5.2861,-8.5482,0.0
25%,-1.773,-1.7082,-1.574975,-2.41345,0.0
50%,0.49618,2.31965,0.61663,-0.58665,0.0
75%,2.821475,6.814625,3.17925,0.39481,1.0
max,6.8248,12.9516,17.9274,2.4495,1.0


#### Splitting features and labels

In [65]:
X = df[['variance', 'skewness', 'curtosis', 'entropy']].values # or use X = df.iloc[:, :-1].values 
y = df['class'].values # or use y = df.iloc[:, -1].values 

#### Train-test split

In [66]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

#### Standardize the data

In [67]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
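To make the standardization step concrete, here is a minimal NumPy sketch of what `fit_transform` and `transform` compute. It uses hypothetical toy data rather than the bank note features, and the `*_demo` names are assumptions made for illustration. The key point is that the mean and standard deviation come from the training rows only and are then reused for the test rows.

```python
import numpy as np

# Hypothetical toy data (not the bank note features): two features on different scales.
X_train_demo = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
X_test_demo = np.array([[2.5, 250.0], [5.0, 500.0]])

# "fit": compute statistics from the training rows only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population std (ddof=0), the convention StandardScaler uses

# "transform": apply the same training statistics to both splits.
X_train_scaled = (X_train_demo - mean) / std
X_test_scaled = (X_test_demo - mean) / std

print(X_train_scaled.mean(axis=0))  # approximately 0 for each feature
print(X_train_scaled.std(axis=0))   # approximately 1 for each feature
print(X_test_scaled)                # test rows are not guaranteed to have mean 0 / std 1
```

Fitting the scaler on the training split alone keeps information about the test set from leaking into preprocessing.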

#### Build the model

In [68]:
model = Sequential([
    Input(shape=(4,)),              # 4 input features
    Dense(10, activation='relu'),   # first hidden layer
    Dense(5, activation='relu'),    # second hidden layer
    Dense(1, activation='sigmoid')  # output: probability of class 1
])
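As a quick sanity check on the size of this network, the sketch below counts the trainable parameters of fully connected layers by hand. The helper `dense_param_count` is a hypothetical name introduced here, not part of Keras. Each `Dense` layer has one weight per input-output pair plus one bias per unit.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by the units of each Dense layer.
    """
    counts = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        counts.append(n_in * n_out + n_out)  # weights + biases
    return counts


per_layer = dense_param_count([4, 10, 5, 1])
print(per_layer)       # [50, 55, 6]
print(sum(per_layer))  # 111
```

The total of 111 parameters should match what `model.summary()` reports for the model defined above.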

#### Compile the model

In [69]:
model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='binary_crossentropy',
    metrics=['accuracy']
)
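Keras performs back-propagation internally once training starts, so the algorithm itself never appears in the notebook. The following is a rough NumPy sketch of the idea for the same 4 -> 10 -> 5 -> 1 layout, under several assumptions: it uses hypothetical synthetic data, plain full-batch gradient descent instead of Adam, and made-up names (`forward`, `bce`, `W`, `b`). It is not how Keras is implemented, only an illustration of the forward pass, the binary cross-entropy loss, and the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 4 features, label is 1 when the feature sum is positive.
X = rng.normal(size=(200, 4))
y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)

# Small random weights and zero biases for a 4 -> 10 -> 5 -> 1 network.
sizes = [4, 10, 5, 1]
W = [rng.normal(scale=0.5, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((1, n)) for n in sizes[1:]]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X):
    """Return the activations of every layer, with the input as the first entry."""
    a1 = np.maximum(0.0, X @ W[0] + b[0])   # ReLU hidden layer 1
    a2 = np.maximum(0.0, a1 @ W[1] + b[1])  # ReLU hidden layer 2
    a3 = sigmoid(a2 @ W[2] + b[2])          # predicted probability of class 1
    return [X, a1, a2, a3]


def bce(y_true, p, eps=1e-7):
    """Mean binary cross-entropy, with clipping to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))


learning_rate = 0.1
losses = []
for epoch in range(200):
    acts = forward(X)
    losses.append(bce(y, acts[-1]))

    # Backward pass. For sigmoid + binary cross-entropy,
    # the gradient of the mean loss w.r.t. the output pre-activation is (p - y) / n.
    delta = (acts[-1] - y) / len(X)
    for layer in (2, 1, 0):
        grad_W = acts[layer].T @ delta
        grad_b = delta.sum(axis=0, keepdims=True)
        if layer > 0:
            # Push the error back through the (not yet updated) weights,
            # then through the ReLU derivative of the previous layer.
            delta = (delta @ W[layer].T) * (acts[layer] > 0)
        W[layer] -= learning_rate * grad_W
        b[layer] -= learning_rate * grad_b

print(f"loss before training: {losses[0]:.4f}")
print(f"loss after training:  {losses[-1]:.4f}")
```

The loss printed after training should be lower than the initial one, which is the same behaviour the Keras training log shows through its falling `loss` values.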

#### Train the model

In [70]:
history = model.fit(X_train, y_train, epochs=50, validation_split=0.2, verbose=1)

Epoch 1/50
28/28 ━━━━━━━━━━━━━━━━━━━━ 2s 18ms/step - accuracy: 0.5531 - loss: 0.7845 - val_accuracy: 0.5682 - val_loss: 0.7389
Epoch 2/50
28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.5464 - loss: 0.7019 - val_accuracy: 0.6000 - val_loss: 0.6565
Epoch 3/50
28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.5836 - loss: 0.6373 - val_accuracy: 0.6909 - val_loss: 0.5925
Epoch 4/50
28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.7199 - loss: 0.5506 - val_accuracy: 0.8273 - val_loss: 0.5362
Epoch 5/50
28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.7966 - loss: 0.5055 - val_accuracy: 0.8182 - val_loss: 0.4904
Epoch 6/50
28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - accuracy: 0.8358 - loss: 0.4507 - val_accuracy: 0.8409 - val_loss: 0.4474
Epoch 7/50
...

#### Evaluate the model

In [71]:
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test Loss: {loss:.4f}, Test Accuracy: {accuracy:.4f}")

9/9 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.9979 - loss: 0.0189
Test Loss: 0.0219, Test Accuracy: 0.9964


#### Make Predictions

In [72]:
y_pred = model.predict(X_test)        # sigmoid outputs in [0, 1], shape (n_samples, 1)
y_pred = (y_pred > 0.5).astype(int)   # threshold at 0.5 to get class labels 0/1

9/9 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step


#### Calculate Accuracy

In [73]:
final_accuracy = accuracy_score(y_test, y_pred)
print(f"Final Test Accuracy: {final_accuracy:.4f}")

Final Test Accuracy: 0.9964


In [76]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      0.99      1.00       163
           1       0.99      1.00      1.00       112

    accuracy                           1.00       275
   macro avg       1.00      1.00      1.00       275
weighted avg       1.00      1.00      1.00       275



In [78]:
print(confusion_matrix(y_test, y_pred))

[[162   1]
 [  0 112]]
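The figures in the classification report can be recovered by hand from this confusion matrix. The sketch below copies the matrix from the output above and assumes scikit-learn's convention that rows are true classes and columns are predicted classes, with class 1 treated as the positive class.

```python
# Confusion matrix copied from the output above:
# rows are true classes (0, 1), columns are predicted classes (0, 1).
cm = [[162, 1],
      [0, 112]]

tn, fp = cm[0]
fn, tp = cm[1]

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of the notes predicted as class 1, the share that truly are class 1
recall = tp / (tp + fn)     # of the true class 1 notes, the share that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The single misclassification is a class 0 note predicted as class 1, which is why precision for class 1 is slightly below 1 while its recall is exactly 1.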


My observations:
- Does tuning hyperparameters (neurons, layers, learning rate, epochs) improve model accuracy?

Based on my observations across different datasets, it depends. These hyperparameters need to be adjusted systematically.
1) Neurons: too many neurons can lead to overfitting (excellent on training data, poor on test data); too few can lead to underfitting (poor performance on both training and test data). Solution --> start with a moderate number of neurons.
2) Layers: too many layers increase the risk of vanishing gradients (early layers stop learning); too few layers can leave the model unable to capture complex relationships. Solution --> for simpler datasets, 2-3 layers might be sufficient; add layers as the complexity of the data increases.
3) Epochs: more epochs give the model more opportunities to learn, but too many epochs can lead to overfitting and too few can lead to underfitting.
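One common way to handle the epoch trade-off in point 3 is early stopping: watch the validation loss and stop once it has not improved for a set number of epochs. The sketch below shows only the stopping rule, in plain Python, on a hypothetical loss curve; `early_stopping_epoch` and `demo_losses` are names made up for this illustration. Keras offers the same idea through its `EarlyStopping` callback (with parameters such as `monitor` and `patience`), which could be passed to `model.fit` instead of hand-rolling the logic.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based, for per-epoch validation losses.

    Training stops once the loss has failed to improve by more than min_delta
    for `patience` consecutive epochs. If that never happens, the last epoch is returned.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation-loss curve: improves at first, then rises as the model overfits.
demo_losses = [0.70, 0.55, 0.42, 0.35, 0.33, 0.34, 0.36, 0.39, 0.45]
stop_epoch, best_epoch = early_stopping_epoch(demo_losses, patience=3)
print(stop_epoch, best_epoch)  # 7 4
```

Here training would stop at epoch index 7, and the weights from epoch index 4 (the lowest validation loss) would be the ones worth keeping.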