# Neural Networks

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the dataset (assuming the data is available as a CSV file named 'synthetic_data.csv')
# Replace 'synthetic_data.csv' with the actual path to your dataset if it's different.
try:
    df = pd.read_csv('synthetic_data.csv')
except FileNotFoundError:
    # If the file is not found, generate a synthetic dataset
    from sklearn.datasets import make_classification
    X, y = make_classification(n_samples=10000, n_features=50, n_informative=15, n_redundant=5, random_state=42)
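    # Give the generated columns string names so the frame does not mix integer and string labels
    df = pd.DataFrame(X, columns=[f'feature_{i}' for i in range(X.shape[1])])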
    df['target'] = y

# Separate features and target variable
X = df.drop('target', axis=1)
y = df['target']

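# Split the data into training (70%), validation (15%), and test (15%) sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Scale the features. The scaler is fitted on the training set only, so that no
# statistics from the validation or test sets leak into training, and is then
# applied to all three sets. Wrapping the results in DataFrames keeps column names.
scaler = StandardScaler()
X_train = pd.DataFrame(scaler.fit_transform(X_train), columns=X.columns, index=X_train.index)
X_val = pd.DataFrame(scaler.transform(X_val), columns=X.columns, index=X_val.index)
X_test = pd.DataFrame(scaler.transform(X_test), columns=X.columns, index=X_test.index)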

print("Data loaded and prepared successfully.")
print("Training data shape:", X_train.shape)
print("Validation data shape:", X_val.shape)
print("Test data shape:", X_test.shape)

### Neural Network Model Definition

We will define a simple sequential neural network model using TensorFlow's Keras API.

The model will consist of:
- An input layer that matches the number of features in our scaled data.
- One or more hidden layers with a specified number of neurons and an activation function (e.g., ReLU).
- An output layer with a single neuron and a sigmoid activation function, suitable for binary classification.
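
To make the three bullet points concrete, here is a minimal NumPy sketch of the forward pass through such a network. It is not the Keras implementation: the weights are random, the helper names (`init_params`, `forward`, `count_params`) are made up for illustration, and the layer sizes `[50, 128, 64, 1]` assume the 50-feature fallback dataset and the 128/64 hidden units used in the next cell.

```python
import numpy as np

def relu(z):
    """ReLU activation: negative values become zero."""
    return np.maximum(z, 0.0)

def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(layer_sizes, seed=0):
    """Create one (weights, bias) pair per dense layer with small random weights."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = rng.normal(0.0, 0.1, size=(n_in, n_out))
        bias = np.zeros(n_out)
        params.append((weights, bias))
    return params

def forward(x, params):
    """Hidden layers use ReLU; the final layer uses a sigmoid."""
    activation = x
    for weights, bias in params[:-1]:
        activation = relu(activation @ weights + bias)
    weights, bias = params[-1]
    return sigmoid(activation @ weights + bias)

def count_params(params):
    """Total number of trainable values (weights plus biases)."""
    return sum(weights.size + bias.size for weights, bias in params)

layer_sizes = [50, 128, 64, 1]  # assumption: 50 input features
params = init_params(layer_sizes)
batch = np.random.default_rng(1).normal(size=(4, 50))  # four made-up input rows
probs = forward(batch, params)

print(probs.shape)           # (4, 1)
print(count_params(params))  # 14849
```

Each row of `probs` is a value between 0 and 1 that can be read as the predicted probability of the positive class. Under the same 50-feature assumption, the parameter count should match the total reported by `model.summary()`.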

In [None]:
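import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# Define the model: two ReLU hidden layers and a sigmoid output for binary classification.
# An explicit Input layer is used instead of the `input_shape` argument on the first
# Dense layer, which newer Keras versions warn about.
model = Sequential([
    Input(shape=(X_train.shape[1],)),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')
])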

# Compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()

### Model Training

Now, we will train the neural network model using the training data (`X_train`, `y_train`) and evaluate its performance on the validation data (`X_val`, `y_val`) during the training process.
We will use the `fit` method of the Keras model, specifying the number of epochs and the batch size.
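
The number of epochs and the batch size together determine how many weight updates the optimizer performs. The sketch below works this out, assuming a training set of 7,000 rows (70% of the 10,000-row fallback dataset) and that the last, smaller batch of each epoch is still used. The function name `gradient_updates` is made up for illustration.

```python
import math

def gradient_updates(n_samples, batch_size, epochs):
    """Return (steps per epoch, total optimizer steps) for mini-batch training."""
    steps_per_epoch = math.ceil(n_samples / batch_size)  # the last batch may be partial
    return steps_per_epoch, steps_per_epoch * epochs

# Assumed values: 7,000 training rows, batch size 32, 20 epochs
steps, total = gradient_updates(n_samples=7000, batch_size=32, epochs=20)
print(steps, total)  # 219 4380
```

With these values, the progress bar for each epoch should count up to 219 steps.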

In [None]:
# Define training parameters
epochs = 20
batch_size = 32

# Train the model
history = model.fit(X_train, y_train,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=(X_val, y_val),
                    verbose=1)

### Model Evaluation

After training, we evaluate the model's performance on the test dataset (`X_test`, `y_test`) to get an unbiased estimate of its performance on new, unseen data. We will use the following metrics:

- **Accuracy**: The proportion of correctly classified instances.
- **Precision**: The proportion of true positive predictions among all positive predictions.
- **Recall**: The proportion of true positive predictions among all actual positive instances.
- **F1-score**: The harmonic mean of precision and recall, providing a balanced measure.
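
The four definitions above can be written out directly from the counts of true/false positives and negatives. This is a plain-Python sketch on made-up labels, meant only to show the arithmetic. The notebook itself uses the scikit-learn functions.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up example labels: 3 TP, 2 TN, 2 FP, 1 FN
y_true_example = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred_example = [1, 0, 1, 0, 1, 1, 1, 0]
metrics = binary_metrics(y_true_example, y_pred_example)
print({name: round(value, 4) for name, value in metrics.items()})
```

Here precision is 3/5 and recall is 3/4, so the F1-score (their harmonic mean) falls between the two, at about 0.667.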

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Evaluate the model on the test data
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)

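# Predict class labels by thresholding the predicted probabilities at 0.5;
# ravel() flattens the (n_samples, 1) output to match the shape of y_test
y_pred = (model.predict(X_test, verbose=0) > 0.5).astype("int32").ravel()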

# Calculate other metrics
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print(f"Test Accuracy: {accuracy:.4f}")
print(f"Test Precision: {precision:.4f}")
print(f"Test Recall: {recall:.4f}")
print(f"Test F1-score: {f1:.4f}")

### Hyperparameter Tuning

Hyperparameter tuning is the process of finding the optimal set of hyperparameters for a machine learning model. Hyperparameters are parameters that are not learned from the data but are set prior to the training process. Examples include the number of layers, the number of neurons in each layer, the learning rate, and the batch size.

Tuning these hyperparameters can significantly impact the model's performance. We will use Keras Tuner to automate this process and find the best hyperparameters for our neural network.

In [None]:
!pip install keras_tuner -q

### Define the Hypermodel

We will define a function that creates a Keras model with hyperparameters to be tuned. In this example, we will tune the number of neurons in the hidden layers and the learning rate of the optimizer.
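
Before running a search, it helps to know how large the search space is. The sketch below counts the combinations, assuming the ranges used in the hypermodel cell that follows: 32 to 512 units in steps of 32 for each hidden layer, and three candidate learning rates. It also assumes the 10-trial budget configured for the tuner.

```python
# Candidate values, assuming the ranges from the hypermodel definition
units_choices = list(range(32, 512 + 1, 32))  # 32, 64, ..., 512
learning_rates = [1e-2, 1e-3, 1e-4]

# Two independently tuned hidden layers times the learning-rate choices
n_combinations = len(units_choices) ** 2 * len(learning_rates)

max_trials = 10  # assumed search budget
coverage = max_trials / n_combinations

print(len(units_choices))  # 16
print(n_combinations)      # 768
print(f"{coverage:.1%}")   # 1.3%
```

With this budget, a random search samples only a small fraction of the grid, so its result is the best configuration among those tried, not a guaranteed optimum.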

In [None]:
from tensorflow.keras.layers import Input
from tensorflow.keras.optimizers import Adam
import keras_tuner as kt

def build_hypermodel(hp):
    model = Sequential()
    model.add(Input(shape=(X_train.shape[1],)))
    # Tune the number of units in the first Dense layer
    model.add(Dense(units=hp.Int('num_units_1', min_value=32, max_value=512, step=32), activation='relu'))
    # Tune the number of units in the second Dense layer
    model.add(Dense(units=hp.Int('num_units_2', min_value=32, max_value=512, step=32), activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    # Tune the learning rate for the optimizer
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    model.compile(optimizer=Adam(learning_rate=hp_learning_rate),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Instantiate the tuner
tuner = kt.RandomSearch(
    build_hypermodel,
    objective='val_accuracy',
    max_trials=10,  # Number of different hyperparameter combinations to try
    executions_per_trial=2, # Number of times to train the model per trial
    directory='my_dir', # Directory to store the results
    project_name='intro_to_kt')

### Run the Hyperparameter Search

We will now run the random search to find the best hyperparameters for our model. The tuner will train the model multiple times with different hyperparameter combinations and keep track of the best-performing ones based on the validation accuracy.
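
The idea behind random search can be sketched in a few lines of plain Python. This is not how Keras Tuner is implemented. `toy_validation_score` is a hypothetical stand-in for "train the model and return its validation accuracy", and its formula is arbitrary. The sketch also does not model repeated executions per trial.

```python
import math
import random

UNITS = list(range(32, 512 + 1, 32))
LEARNING_RATES = [1e-2, 1e-3, 1e-4]

def sample_config(rng):
    """Draw one hyperparameter combination uniformly at random."""
    return {
        "num_units_1": rng.choice(UNITS),
        "num_units_2": rng.choice(UNITS),
        "learning_rate": rng.choice(LEARNING_RATES),
    }

def toy_validation_score(config):
    """Hypothetical score; a real search would train and validate a model here."""
    return -(
        abs(config["num_units_1"] - 256) / 512
        + abs(config["num_units_2"] - 128) / 512
        + abs(math.log10(config["learning_rate"]) + 3)
    )

def random_search(score_fn, max_trials, seed=0):
    """Score `max_trials` random configurations and return the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        config = sample_config(rng)
        trials.append((score_fn(config), config))
    best_score, best_config = max(trials, key=lambda trial: trial[0])
    return best_config, best_score, trials

best_config, best_score, trials = random_search(toy_validation_score, max_trials=10)
print(best_config, round(best_score, 3))
```

The loop keeps every `(score, config)` pair, which mirrors the way the tuner records each trial so that the best ones can be retrieved afterwards.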

In [None]:
tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val))

### Get the Best Hyperparameters and Model

After the search is complete, we can retrieve the best hyperparameters found by the tuner and the corresponding best model.

In [None]:
# Get the best hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

print(f"""
The best number of units found for the first hidden layer is {best_hps.get('num_units_1')}.
The best number of units found for the second hidden layer is {best_hps.get('num_units_2')}.
The best learning rate found for the optimizer is {best_hps.get('learning_rate')}.
""")

# Get the best model
best_model = tuner.get_best_models(num_models=1)[0]

### Feature Importance

Understanding the importance of different features in a machine learning model can provide valuable insights into the data and the model's decision-making process. For neural networks, which are often considered black boxes, techniques like SHAP (SHapley Additive exPlanations) can help shed light on how each feature contributes to the output.

SHAP values are based on Shapley values from cooperative game theory: a feature's SHAP value is its average marginal contribution to the prediction, taken over all possible orderings in which the features can be added. By calculating SHAP values for our model, we can identify the features that have the biggest impact on the prediction outcomes.
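
For a handful of features, this definition can be evaluated exactly by enumerating every ordering. The sketch below does that for a made-up three-feature function, `toy_model`, and represents "absent" features by a single baseline row. This is a simplification: the `shap` library averages over a background dataset and uses approximations, because exact enumeration grows factorially with the number of features.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x`, relative to `baseline`."""
    n_features = len(x)
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        current = list(baseline)        # start with every feature "absent"
        previous_output = predict(current)
        for i in order:
            current[i] = x[i]           # switch feature i to its actual value
            output = predict(current)
            totals[i] += output - previous_output  # marginal contribution of i
            previous_output = output
    return [total / len(orderings) for total in totals]

def toy_model(features):
    """Made-up model with an interaction between the first and third feature."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + a * c

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # [3.5, 2.0, 1.5]
```

The values sum to `toy_model(x) - toy_model(baseline) = 7.0`, which is the additivity property that gives SHAP its name. The interaction term `a * c` contributes 3.0 in total and is split evenly between the first and third features.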

In [None]:
!pip install shap -q

### Calculate and Visualize SHAP Values

We will use the `shap` library to calculate the SHAP values for our best model. Since our model is a Keras Sequential model, we can use `shap.DeepExplainer` or `shap.KernelExplainer`. `DeepExplainer` is typically faster for deep learning models, but it supports only certain layer types and framework versions. `KernelExplainer` is model-agnostic: it needs only a prediction function, but it is slower because it evaluates the model on many perturbed inputs. We'll use `KernelExplainer` for broader applicability.

We'll then visualize the SHAP values to understand which features have the most impact on the model's output. A summary plot is a good way to see the overall feature importance.
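
A common way to reduce a matrix of SHAP values to one global importance score per feature is to average the absolute values over the samples. The sketch below does this on a made-up 3x3 matrix with hypothetical feature names. It is meant to show the aggregation idea, not to reproduce the plotting code of the `shap` library.

```python
def rank_features_by_mean_abs(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value (rows = samples, columns = features)."""
    n_samples = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_samples
        for j, name in enumerate(feature_names)
    }
    # Highest mean |SHAP| first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up SHAP values for three samples and three hypothetical features
toy_shap = [
    [0.1, -0.4, 0.0],
    [-0.2, 0.5, 0.1],
    [0.3, -0.3, -0.1],
]
toy_names = ["feature_0", "feature_1", "feature_2"]

ranking = rank_features_by_mean_abs(toy_shap, toy_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values matters here: `feature_1` pushes predictions both up and down, and a plain mean would let those contributions cancel out.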

In [None]:
import shap
import numpy as np

# Select a background dataset for SHAP - a small sample of the training data is usually sufficient
# Using X_train is fine as it is already scaled
background = X_train.sample(100, random_state=42)

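# Create a SHAP explainer object
# KernelExplainer only needs a prediction function. Wrapping predict so that it
# returns a 1-D array of probabilities keeps the SHAP output a single 2-D array
# and silences the Keras progress bar on every call.
def predict_positive_proba(data):
    return best_model.predict(data, verbose=0).ravel()

explainer = shap.KernelExplainer(predict_positive_proba, background)

# Calculate SHAP values for a sample of the test data
# We'll use a small sample for demonstration purposes
X_test_sample = X_test.sample(100, random_state=42)
shap_values = explainer.shap_values(X_test_sample)

# Depending on the shap version, the result may still be a list (one array per
# output) or carry a trailing output axis; reduce it to (n_samples, n_features)
if isinstance(shap_values, list):
    shap_values = shap_values[0]
shap_values = np.asarray(shap_values)
if shap_values.ndim == 3:
    shap_values = shap_values[:, :, 0]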

# Visualize the feature importance
# The summary plot shows the impact of each feature on the model output
# Pass feature names explicitly
shap.summary_plot(shap_values, X_test_sample, feature_names=X_test_sample.columns.tolist())

### Comprehensive Discussion

In this notebook, we have built and evaluated a neural network model for a binary classification task. We started by loading and preparing the data, including scaling the features. We then defined a sequential neural network model with two dense hidden layers and an output layer with a sigmoid activation function.

We trained the model using the Adam optimizer and binary cross-entropy loss, monitoring the accuracy on a validation set during training. After training, we evaluated the model on a separate test set and calculated key classification metrics: accuracy, precision, recall, and F1-score.

To improve the model's performance, we performed hyperparameter tuning using Keras Tuner's Random Search. We tuned the number of neurons in the hidden layers and the learning rate of the optimizer. The tuner identified the best combination of these hyperparameters based on the validation accuracy.

Furthermore, we explored feature importance using SHAP values to understand which features had the most significant impact on the model's predictions. This provided valuable insights into the data and the model's decision-making process.

Finally, we visualize the tuned model's performance on the test set with a confusion matrix, which shows the number of true positives, true negatives, false positives, and false negatives.
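
To make the layout of the matrix concrete, here is a plain-Python sketch that tallies the four counts for made-up 0/1 labels. Rows correspond to the true label and columns to the predicted label, which is the same layout that scikit-learn's `confusion_matrix` uses for binary labels.

```python
def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels (rows: true, columns: predicted)."""
    matrix = [[0, 0], [0, 0]]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix

# Made-up labels for illustration
true_labels = [0, 0, 1, 1, 1, 0]
predicted_labels = [0, 1, 1, 0, 1, 0]

cm_example = confusion_counts(true_labels, predicted_labels)
print(cm_example)  # [[2, 1], [1, 2]]
```

In the heatmap, the diagonal cells therefore hold the correct predictions and the off-diagonal cells hold the two kinds of error.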

In [None]:
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# Get predictions for the test set
y_pred = (best_model.predict(X_test) > 0.5).astype("int32")

# Calculate the confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize the confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix')
plt.show()