# Description of the ANN Architecture with Keras Tuner
#### The function build_model defines a dynamic Artificial Neural Network (ANN) architecture for binary classification. It uses Keras Tuner to tune hyperparameters across multiple layers and optimize the model's performance. Each component is described below.

In [36]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
!pip install keras-tuner
import keras_tuner as kt




In [37]:
# Load the dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
columns = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness", 
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"
]
data = pd.read_csv(url, names=columns)

In [38]:
# Prepare the data
X = data.iloc[:, :-1].values  # Features
y = data.iloc[:, -1].values   # Target

In [39]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [40]:
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
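
The scaler is fitted on the training split only and then reused for the test split, so no test-set statistics leak into training. A minimal pure-Python sketch of that idea follows; the helper names and the sample values are made up for illustration and are not part of the notebook.

```python
import statistics

def fit_standardizer(column):
    """Compute mean and population std from training values only."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population std (ddof=0), as StandardScaler uses
    if std == 0:
        std = 1.0  # avoid division by zero for constant features
    return mean, std

def standardize(column, mean, std):
    """Apply previously fitted statistics to any column of values."""
    return [(x - mean) / std for x in column]

# Made-up values standing in for one feature column
train_values = [85.0, 100.0, 115.0]
test_values = [130.0]

mean, std = fit_standardizer(train_values)             # fit on train only
train_scaled = standardize(train_values, mean, std)
test_scaled = standardize(test_values, mean, std)      # reuse train statistics
```

The test value is scaled with the training mean and standard deviation, which mirrors calling `fit_transform` on `X_train` and `transform` on `X_test`.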

## 1. Input Layer
1. The input layer acts as the entry point of the model, where features from the dataset are fed into the network. In the code, it is the first Dense layer, which receives the features directly.
2. The number of neurons (units) in this layer is a tunable parameter, ranging from 8 to 128 in steps of 8. This flexibility lets the tuner explore configurations that balance model capacity against the risk of overfitting.
3. The activation function is another tunable parameter, which can be relu, tanh, or sigmoid. These functions introduce non-linearity and help the model learn complex patterns.
4. The input shape is fixed and depends on the number of features in the training dataset (X_train.shape[1]).
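
To get a feel for the size of this part of the search space, the tunable ranges above can be enumerated directly. This is only a counting sketch; the variable names are illustrative.

```python
# Units: 8 to 128 inclusive, in steps of 8
units_choices = list(range(8, 128 + 1, 8))
activation_choices = ["relu", "tanh", "sigmoid"]

# Every (units, activation) pair is one possible first-layer configuration
first_layer_configs = len(units_choices) * len(activation_choices)
print(len(units_choices), first_layer_configs)  # 16 48
```

Even this single layer has 48 possible configurations, which is why the tuner samples the space rather than trying every combination.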

## 2. Hidden Layers
1. Hidden layers are responsible for feature extraction and pattern recognition.
2. The number of hidden layers is tunable between 1 and 3, allowing for customization based on the complexity of the data.
3. Each hidden layer has:
    - Number of Neurons: Tunable between 8 and 128 in steps of 8, enabling exploration of layer width.
    - Activation Function: Chosen from relu, tanh, or sigmoid to adjust how the neurons transform their inputs.
    - Dropout: A regularization technique to prevent overfitting. The dropout rate is tunable between 0.0 and 0.5 in steps of 0.1, where a higher dropout rate means more neurons are randomly disabled during training.
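
The effect of the dropout rate can be sketched in plain Python. The function below is a simplified illustration of inverted dropout (surviving activations are rescaled by 1 / (1 - rate)), not the Keras implementation itself; the function name and the `rng` argument are assumptions made for this example.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the survivors."""
    if rate == 0.0:
        return list(activations)  # rate 0.0 leaves the layer output unchanged
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]

rng = random.Random(0)  # seeded so the sketch is repeatable
out = dropout([1.0] * 10, rate=0.5, rng=rng)
unchanged = dropout([1.0, 2.0], rate=0.0, rng=rng)
```

With `rate=0.5`, each output is either 0.0 (dropped) or 2.0 (kept and rescaled), so the expected value of each activation stays the same. Dropout is only active during training; at inference time the layer passes values through unchanged.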

## 3. Output Layer
1. The output layer has a single neuron since the task is binary classification.
2. A sigmoid activation function is used to convert the output into a probability value between 0 and 1, representing the likelihood of the positive class.
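
As a small illustration of how the sigmoid maps any real-valued score to a value between 0 and 1, here is a numerically stable pure-Python version. It is a sketch for intuition, not the code Keras runs.

```python
import math

def sigmoid(z):
    """Map a real-valued score to the open interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form that avoids overflow in exp
    e = math.exp(z)
    return e / (1.0 + e)

p_neutral = sigmoid(0.0)    # a score of 0 maps to probability 0.5
p_positive = sigmoid(3.0)   # large positive scores approach 1
p_negative = sigmoid(-3.0)  # large negative scores approach 0
```

Scores of equal magnitude and opposite sign give probabilities that sum to 1, which is why a score of 0 corresponds to an even 0.5.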

## 4. Model Compilation
1. Optimizer: The model uses the Adam optimizer, a popular choice for its ability to handle sparse gradients and adapt learning rates during training. The learning rate is tunable and is chosen from a discrete set of values: 0.001, 0.005, 0.05, 0.1, 0.2, and 0.5.
2. Loss Function: Binary crossentropy is used as it is the standard for binary classification tasks. It measures the difference between predicted probabilities and actual labels.
3. Metrics: The model tracks accuracy, providing a straightforward measure of performance during training and validation.
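
To make the loss concrete, the following sketch computes binary crossentropy by hand for a few predictions. The clipping constant `eps` is an assumption added to keep the logarithm finite; it is not taken from the notebook.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
undecided = binary_crossentropy([1, 0], [0.5, 0.5])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
```

Confident correct predictions yield a small loss, a prediction of 0.5 yields log(2), and confident wrong predictions are penalized most heavily.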

In [41]:
# Define a function for building the model (used by Keras Tuner)
def build_model(hp):
    model = Sequential()
    
    # Input layer
    model.add(Dense(
        units=hp.Int('units_input', min_value=8, max_value=128, step=8),
        activation=hp.Choice('activation_input', ['relu', 'tanh', 'sigmoid']),
        input_shape=(X_train.shape[1],)
    ))
    
    # Hidden layers
    for i in range(hp.Int('num_hidden_layers', 1, 3)):
        model.add(Dense(
            units=hp.Int(f'units_{i}', min_value=8, max_value=128, step=8),
            activation=hp.Choice(f'activation_{i}', ['relu', 'tanh', 'sigmoid'])
        ))
        model.add(Dropout(hp.Float(f'dropout_{i}', min_value=0.0, max_value=0.5, step=0.1)))
    
    # Output layer
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model
    model.compile(
        optimizer=Adam(learning_rate=hp.Choice('learning_rate', [0.001, 0.005, 0.05, 0.1, 0.2, 0.5])),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model

In [42]:
# Initialize Keras Tuner
tuner = kt.RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,  # Number of different hyperparameter combinations to try
    executions_per_trial=1,  # Number of models to train per combination
    directory='tuner_results',
    project_name='diabetes_ann_tuning'
)

Reloading Tuner from tuner_results\diabetes_ann_tuning\tuner0.json
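
Conceptually, a random search draws a handful of configurations from the search space and keeps the one with the best score. The sketch below mimics that loop in plain Python over the same hyperparameter names as build_model. It is a simplified illustration, not how Keras Tuner is implemented, and `toy_score` is a hypothetical stand-in for training a model and reading its validation accuracy.

```python
import random

UNITS = range(8, 129, 8)
ACTIVATIONS = ["relu", "tanh", "sigmoid"]
DROPOUTS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
LEARNING_RATES = [0.001, 0.005, 0.05, 0.1, 0.2, 0.5]

def sample_config(rng):
    """Draw one random hyperparameter combination from the search space."""
    config = {
        "units_input": rng.choice(UNITS),
        "activation_input": rng.choice(ACTIVATIONS),
        "num_hidden_layers": rng.randint(1, 3),  # inclusive on both ends
        "learning_rate": rng.choice(LEARNING_RATES),
    }
    for i in range(config["num_hidden_layers"]):
        config[f"units_{i}"] = rng.choice(UNITS)
        config[f"activation_{i}"] = rng.choice(ACTIVATIONS)
        config[f"dropout_{i}"] = rng.choice(DROPOUTS)
    return config

def random_search(score_fn, max_trials, seed=0):
    """Evaluate `max_trials` random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        config = sample_config(rng)
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_score(config):
    # Hypothetical stand-in for validation accuracy; a real trial trains a model.
    return -abs(config["units_input"] - 32)

best_config, best_score = random_search(toy_score, max_trials=5)
```

With only five trials, the search covers a very small part of the space, so results can vary noticeably between runs. The log line above also shows the tuner reloading earlier results from tuner_results; passing `overwrite=True` when constructing the tuner should start a fresh search instead.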


In [43]:
# Search for the best hyperparameters
tuner.search(X_train, y_train, validation_split=0.2, epochs=50, batch_size=32, verbose=2)

In [44]:
# Get the best hyperparameters and model
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(f"""
The optimal number of units in the first layer is {best_hps.get('units_input')}.
The optimal activation function for the input layer is {best_hps.get('activation_input')}.
The optimal learning rate is {best_hps.get('learning_rate')}.
""")


The optimal number of units in the first layer is 32.
The optimal activation function for the input layer is sigmoid.
The optimal learning rate is 0.01.



In [45]:
# Train the best model
best_model = tuner.hypermodel.build(best_hps)
history = best_model.fit(X_train, y_train, validation_split=0.2, epochs=50, batch_size=32, verbose=2)



Epoch 1/50
16/16 - 5s - 321ms/step - accuracy: 0.6110 - loss: 0.7217 - val_accuracy: 0.7398 - val_loss: 0.5830
Epoch 2/50
16/16 - 0s - 13ms/step - accuracy: 0.7088 - loss: 0.5798 - val_accuracy: 0.7398 - val_loss: 0.5215
Epoch 3/50
16/16 - 0s - 11ms/step - accuracy: 0.7556 - loss: 0.5152 - val_accuracy: 0.7886 - val_loss: 0.4827
Epoch 4/50
16/16 - 0s - 10ms/step - accuracy: 0.7658 - loss: 0.4857 - val_accuracy: 0.7805 - val_loss: 0.4755
Epoch 5/50
16/16 - 0s - 13ms/step - accuracy: 0.7699 - loss: 0.4862 - val_accuracy: 0.7724 - val_loss: 0.4927
Epoch 6/50
16/16 - 0s - 14ms/step - accuracy: 0.7637 - loss: 0.4736 - val_accuracy: 0.7724 - val_loss: 0.4792
Epoch 7/50
16/16 - 0s - 12ms/step - accuracy: 0.7576 - loss: 0.4797 - val_accuracy: 0.7480 - val_loss: 0.5176
Epoch 8/50
16/16 - 0s - 18ms/step - accuracy: 0.7780 - loss: 0.4894 - val_accuracy: 0.7642 - val_loss: 0.4824
Epoch 9/50
16/16 - 0s - 8ms/step - accuracy: 0.7597 - loss: 0.4748 - val_accuracy: 0.7398 - val_loss: 0.5747
Epoch 10/5

In [46]:
# Evaluate the model on the test set
test_loss, test_accuracy = best_model.evaluate(X_test, y_test, verbose=0)
print(f"Test Accuracy: {test_accuracy * 100:.2f}%")

Test Accuracy: 77.27%
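
The reported accuracy is the fraction of test samples whose predicted class matches the label. The sketch below shows that calculation, assuming predicted probabilities are turned into classes with a 0.5 threshold; the labels and probabilities are made-up values, not results from this notebook.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(1 for y, p in zip(y_true, y_prob) if int(p > threshold) == y)
    return correct / len(y_true)

# Made-up labels and predicted probabilities for illustration
labels = [1, 0, 1, 0]
probs = [0.8, 0.3, 0.4, 0.1]

acc = binary_accuracy(labels, probs)
print(f"Accuracy: {acc * 100:.2f}%")  # Accuracy: 75.00%
```

Here the third sample is misclassified because its probability of 0.4 falls below the threshold, so three of four predictions are correct.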
