<a href="https://colab.research.google.com/github/Rushil-K/Deep-Learning/blob/main/ANN/nmrk2627_ANN_DLM_project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Deep Learning Project 1 : Artificial Neural Networks**

Contributors:
- Rushil Kohli
- Navneet Mittal


# **Executive Summary: ANN Model for Conversion Rate Prediction**

## **Project Overview**

The goal of this project is to predict the conversion rate of customers based on a dataset of 1 million records. The primary objective is to build a predictive model using Artificial Neural Networks (ANN) to determine whether a customer will convert (i.e., make a purchase or take a desired action) based on various customer features. The model was then deployed using **Streamlit**, allowing for real-time predictions and insights through an interactive dashboard.

This report provides a comprehensive overview of the entire process, from data curation to preprocessing, model training, and deployment.

---

## **Dataset Description**

The dataset contains 1 million entries and 8 columns, with the following features:

1. **CustomerID**: A unique identifier for each customer.

2. **Age**: The age of the customer.

3. **Gender**: The gender of the customer (categorical; stored as the strings `Male` and `Female` in the raw data and encoded numerically during preprocessing).

4. **Income**: The income of the customer.

5. **Purchases**: The number of purchases made by the customer.

6. **Clicks**: The number of clicks made by the customer on advertisements or products.

7. **Spent**: The amount of money spent by the customer.

8. **Converted**: The binary target variable indicating if the customer converted (1) or not (0).

The target variable, **Converted**, is a binary indicator of whether a customer converted; the remaining columns describe customer demographics and interactions. The dataset is imbalanced: non-converted customers make up roughly 70% of the held-out test split reported in the analysis cells (139,826 of 200,000 rows).

---

## **Steps Taken**

### **1\. Data Curation**

The dataset, containing 1 million records, was curated to ensure data quality and completeness. This involved ensuring that there were no missing values and that the data was clean for model training. The dataset contained both categorical (e.g., **Gender**) and continuous (e.g., **Income**, **Purchases**) variables.

Given the large size of the dataset, careful consideration was given to how the data could be processed efficiently for model training. Additionally, class imbalance in the **Converted** target variable was acknowledged and addressed during preprocessing.
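The completeness and imbalance checks described above can be sketched as follows. This is a minimal illustration, not the project's curation code: it uses only the five rows visible in the notebook's `head()` output, and the variable names are assumptions.

```python
import pandas as pd

# First five rows as shown in the notebook's head() output.
sample = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5],
    "Age": [41, 43, 43, 35, 23],
    "Gender": ["Female", "Male", "Female", "Female", "Female"],
    "Income": [52618.0, 53114.0, 96145.0, 92590.0, 69262.0],
    "Purchases": [26, 3, 4, 10, 14],
    "Clicks": [67, 14, 78, 13, 62],
    "Spent": [2434.0, 2937.0, 2076.0, 1437.0, 1675.0],
    "Converted": [0, 0, 0, 1, 1],
})

# Count missing values per column; all zeros means the data is complete.
missing_per_column = sample.isna().sum()

# Identifiers should be unique, otherwise rows may be duplicated.
ids_unique = sample["CustomerID"].is_unique

# Share of each class in the target, to quantify the imbalance.
class_share = sample["Converted"].value_counts(normalize=True)

print(missing_per_column)
print("unique ids:", ids_unique)
print(class_share)
```

On the full DataFrame the same three checks would run unchanged; only the variable holding the data differs.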

### **2\. Data Preprocessing**

Data preprocessing is a critical step in preparing the dataset for machine learning, particularly when dealing with large datasets. The following preprocessing steps were performed:

* **Encoding Categorical Features**:

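  * The **Gender** feature, which is categorical, was encoded into numeric values using **OrdinalEncoder**, with 0 for Male and 1 for Female. Because `OrdinalEncoder` sorts string categories alphabetically by default (which would assign 0 to Female), this mapping requires passing the category order explicitly.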

* **Handling Class Imbalance**:

  * Since the target variable **Converted** was imbalanced (more non-converted customers than converted), **SMOTE (Synthetic Minority Over-sampling Technique)** was applied to oversample the minority class and balance the dataset.

* **Standardization**:

  * To ensure that all continuous variables had similar ranges and to improve the efficiency of training, **StandardScaler** was used to standardize features such as **Age**, **Income**, **Purchases**, **Clicks**, and **Spent**.

* **Train-Test Split**:

  * The data was split into a training set (80%) and a test set (20%) using **train\_test\_split** to ensure that the model could be tested on unseen data.

* **Class Weights**:

  * Class weights were computed using **compute\_class\_weight** to account for the class imbalance and ensure the model did not favor the majority class (non-converted customers).
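The preprocessing steps listed above can be sketched as follows. This is an illustration on a tiny made-up DataFrame, not the project's actual pipeline: the row values beyond the first five, the `stratify` argument, and `random_state=0` are assumptions. SMOTE (from `imbalanced-learn`) is left out so the sketch depends on scikit-learn only.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OrdinalEncoder, StandardScaler
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical toy data using a subset of the dataset's column names.
df = pd.DataFrame({
    "Age": [41, 43, 43, 35, 23, 52, 30, 61, 28, 47],
    "Gender": ["Female", "Male", "Female", "Female", "Female",
               "Male", "Male", "Female", "Male", "Male"],
    "Income": [52618.0, 53114.0, 96145.0, 92590.0, 69262.0,
               40100.0, 75000.0, 58000.0, 61000.0, 83000.0],
    "Converted": [0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
})

X = df.drop(columns="Converted")
y = df["Converted"]

# Explicit category order so that Male -> 0 and Female -> 1.
encoder = OrdinalEncoder(categories=[["Male", "Female"]])
X["Gender"] = encoder.fit_transform(X[["Gender"]]).ravel()

# Cast numeric columns to float so scaled values can be written back cleanly.
numeric_cols = ["Age", "Income"]
X[numeric_cols] = X[numeric_cols].astype("float64")

# 80/20 split; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
X_train, X_test = X_train.copy(), X_test.copy()

# Fit the scaler on the training set only, then reuse it for the test set.
scaler = StandardScaler()
X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])

# "balanced" weights: n_samples / (n_classes * samples_in_class).
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}
print(class_weight)
```

Note that if the training set has already been balanced by oversampling, the "balanced" weights come out equal for both classes, so the two techniques overlap rather than compound.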

---

### **3\. Model Training**

Once the data was preprocessed, the model training began. The following steps were taken during the training process:

* **Model Architecture**:

  * A **Sequential** ANN model was built using **TensorFlow** and **Keras**. The architecture consisted of an input layer followed by several dense layers with **ReLU** activation functions, and dropout layers to prevent overfitting.

  * The final output layer was a single neuron with a **sigmoid** activation function to classify the data into two classes: converted (1) and not converted (0).

* **Hyperparameter Selection**:

  * Various hyperparameters, such as **epochs**, **learning rate**, **optimizer**, **activation function**, **dropout rate**, and **number of neurons per layer**, were selected through the Streamlit UI.

  * The optimizer could be set to **Adam**, **SGD**, or **RMSprop**, and the learning rate to `0.01`, `0.001`, or `0.0001`.

  * **Dropout rates** from 0.1 to 0.5 and between 2 and 5 **dense layers** were available for experimentation.

* **Training**:

  * The model was trained using the preprocessed training data, with a **batch size of 128** and **epochs** selected based on user input through the Streamlit interface. The model also incorporated **class weights** to address class imbalance.

  * **Early stopping** was not used, but the model was trained for a specified number of epochs with **validation split** set to 0.2 to monitor performance during training.
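The architecture and hyperparameter choices described above could be expressed roughly as below. This is a sketch under stated assumptions, not the dashboard's code: `HyperParams`, `build_model`, the default values for `units` and `epochs`, and the use of one shared width for every hidden layer are all hypothetical. TensorFlow is imported inside the builder so the configuration class can be used without it.

```python
from dataclasses import dataclass

# Options as listed in the text above; the constant names are illustrative.
LEARNING_RATES = (0.01, 0.001, 0.0001)
OPTIMIZERS = ("adam", "sgd", "rmsprop")


@dataclass(frozen=True)
class HyperParams:
    n_dense_layers: int = 3       # 2 to 5 hidden dense layers
    units: int = 64               # neurons per hidden layer (assumed default)
    dropout_rate: float = 0.3     # 0.1 to 0.5
    optimizer: str = "adam"
    learning_rate: float = 0.001
    epochs: int = 10              # assumed default
    batch_size: int = 128

    def __post_init__(self):
        # Reject values outside the ranges offered for experimentation.
        if not 2 <= self.n_dense_layers <= 5:
            raise ValueError("n_dense_layers must be between 2 and 5")
        if not 0.1 <= self.dropout_rate <= 0.5:
            raise ValueError("dropout_rate must be between 0.1 and 0.5")
        if self.optimizer not in OPTIMIZERS:
            raise ValueError(f"optimizer must be one of {OPTIMIZERS}")
        if self.learning_rate not in LEARNING_RATES:
            raise ValueError(f"learning_rate must be one of {LEARNING_RATES}")


def build_model(n_features: int, hp: HyperParams):
    """Build and compile a dense binary classifier from the hyperparameters."""
    from tensorflow import keras  # imported lazily; only needed to build

    optimizer_classes = {
        "adam": keras.optimizers.Adam,
        "sgd": keras.optimizers.SGD,
        "rmsprop": keras.optimizers.RMSprop,
    }
    model = keras.Sequential([keras.Input(shape=(n_features,))])
    for _ in range(hp.n_dense_layers):
        model.add(keras.layers.Dense(hp.units, activation="relu"))
        model.add(keras.layers.Dropout(hp.dropout_rate))
    # Single sigmoid unit: output is the predicted probability of conversion.
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(
        loss="binary_crossentropy",
        optimizer=optimizer_classes[hp.optimizer](learning_rate=hp.learning_rate),
        metrics=["accuracy"],
    )
    return model
```

Training would then pass `epochs=hp.epochs`, `batch_size=hp.batch_size`, `validation_split=0.2`, and the class-weight dictionary to `model.fit`.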

---

### **4\. Model Evaluation**

After training, the model was evaluated using the following metrics:

* **Test Accuracy**: The accuracy of the model on the test dataset was calculated to evaluate how well the model generalized to unseen data.

* **Test Loss**: The binary cross-entropy loss on the test dataset was measured to assess the model's error rate.
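For reference, binary cross-entropy over $N$ examples with labels $y_i \in \{0, 1\}$ and predicted probabilities $\hat{p}_i$ is conventionally defined as below. The symbols are introduced here for illustration and do not appear in the original notebook.

$$
\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log \hat{p}_i + (1 - y_i)\log\left(1 - \hat{p}_i\right) \right]
$$

A model that outputs $\hat{p}_i = 0.5$ for every example scores $\mathcal{L} = \ln 2 \approx 0.693$ regardless of the labels, so a loss that settles near that value suggests the network has not learned to separate the classes.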

In addition to these metrics, the following evaluation tools were used:

* **Training Performance Plots**:

  * **Accuracy** and **Loss** curves were plotted over epochs to visualize the model’s learning performance. This allowed the identification of any overfitting or underfitting trends.

* **Confusion Matrix**:

  * A confusion matrix was created to visualize how well the model predicted each class (converted or not converted). This provided insight into false positives and false negatives.

* **Classification Report**:

  * A detailed classification report was generated, showing **precision**, **recall**, and **F1-score** for both the converted and non-converted classes.
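The evaluation tools above can be sketched with scikit-learn as follows. The labels and probabilities are made-up placeholders, not project results; the majority-class baseline is an addition for comparison and is not mentioned in the original text.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, log_loss)

# Hypothetical labels and predicted probabilities for ten customers.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
y_prob = np.array([0.2, 0.6, 0.1, 0.7, 0.3, 0.4, 0.8, 0.9, 0.35, 0.65])

# Threshold the sigmoid outputs at 0.5 to obtain class labels.
y_pred = (y_prob > 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)
loss = log_loss(y_true, y_prob)          # binary cross-entropy
cm = confusion_matrix(y_true, y_pred)    # rows: true class, columns: predicted
tn, fp, fn, tp = cm.ravel()

# Accuracy of always predicting the most frequent class, for comparison.
baseline = max(np.mean(y_true == 0), np.mean(y_true == 1))

print(f"accuracy={accuracy:.2f}  baseline={baseline:.2f}  loss={loss:.3f}")
print(cm)
print(classification_report(y_true, y_pred, zero_division=0))
```

Comparing accuracy against the majority-class baseline matters on imbalanced data: in this toy case the classifier scores below the baseline even though it recovers most of the positive class.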

---

### **5\. Model Interpretation**

To interpret the model's predictions and understand feature importance:

* **SHAP (SHapley Additive exPlanations)**:

  * **SHAP values** were calculated for a subset of the test data to explain the model’s predictions. The **summary plot** displayed which features had the most significant impact on conversion predictions.

  * The **mean absolute SHAP values** were calculated to rank features by their importance.
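Ranking features by mean absolute SHAP value can be sketched as below. The SHAP matrix here is a random placeholder standing in for the output of a SHAP explainer, so the resulting order says nothing about the real model; the feature list is assumed from the dataset description.

```python
import numpy as np

feature_names = ["Age", "Gender", "Income", "Purchases", "Clicks", "Spent"]

# Placeholder SHAP matrix: one row per explained sample, one column per feature.
# In practice this would come from a SHAP explainer; random values are used here.
rng = np.random.default_rng(0)
shap_values = rng.normal(scale=[0.1, 0.05, 0.3, 0.2, 0.4, 0.25], size=(200, 6))

# Global importance: mean of |SHAP value| over samples, per feature.
mean_abs = np.abs(shap_values).mean(axis=0)

# Sort features from most to least important.
order = np.argsort(mean_abs)[::-1]
ranking = [(feature_names[i], float(mean_abs[i])) for i in order]

for name, score in ranking:
    print(f"{name:<10} {score:.4f}")
```

Taking the absolute value before averaging prevents positive and negative contributions from cancelling, which is why this statistic is used for global ranking.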

---

### **6\. Dashboard Creation**

After training and evaluation, the model was deployed through a **Streamlit** dashboard. The dashboard provides an interactive interface for users to:

* **Train the Model**:

  * The user can adjust hyperparameters (e.g., learning rate, number of layers, dropout rate) and retrain the model on the data.

* **Visualize Model Performance**:

  * The dashboard displays various performance metrics like **test accuracy**, **test loss**, and **confusion matrix**.

  * **Training performance plots** for accuracy and loss over epochs are shown to visualize the model’s learning.

* **View Evaluation Metrics**:

  * The **classification report** and **feature importance** are available for deeper insights into the model’s behavior.

* **Feature Importance**:

  * The **SHAP values** are displayed, indicating the most influential features in the conversion prediction.

The Streamlit dashboard enables users to easily interact with the trained model and explore the results visually.

---

## **Tech Stack**

* **Google Colab**: Used for data preprocessing, model training, and experimentation.

* **TensorFlow** and **Keras**: Used to build and train the Artificial Neural Network (ANN).

* **Streamlit**: Deployed the model and created an interactive dashboard for visualization and prediction.

* **Scikit-learn**: Utilized for data preprocessing, model evaluation, and performance metrics such as confusion matrix and classification report.

* **SHAP**: Used to interpret the model’s predictions and identify feature importance.

* **SMOTE**: Applied to address the class imbalance by oversampling the minority class.

* **gdown**: Used to download the dataset from Google Drive.

---

## **Conclusion**

This project demonstrates how an Artificial Neural Network (ANN) can be applied to predicting customer conversion from customer features. Class imbalance was addressed during preprocessing, and the model was trained with hyperparameters selected through the dashboard. The model achieved an accuracy of around **50-65%**. Because roughly 70% of customers in the held-out data are non-converters, a classifier that always predicts "not converted" would score about 70%, so this result is best treated as a starting point for further work rather than as a reliable predictor.

The interactive **Streamlit dashboard** allows for easy exploration of the model’s performance, evaluation metrics, and feature importance. This project provides valuable insights into customer behavior, and further optimization could improve model performance, making it a practical tool for marketing teams looking to predict customer conversion rates.

### Analysis

In [1]:
# Import necessary libraries
import os
import requests
import io
from io import StringIO
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

In [2]:
# Replace with your actual file ID
file_id = '1OPmMFUQmeZuaiYb0FQhwOMZfEbVrWKEK'

# Construct the URL for direct download (using export)
url = f'https://drive.google.com/uc?export=download&id={file_id}'

# Fetch the data using requests

response = requests.get(url)
response.raise_for_status()  # Raise an exception for bad responses

# Read the data into a pandas DataFrame using StringIO
# Specify encoding if needed, e.g., encoding='latin1' or encoding='utf-8'
nmrk2627_df = pd.read_csv(StringIO(response.text), encoding='utf-8')

# Display the head of the dataframe to verify data loading.
display(nmrk2627_df.head())

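|   | CustomerID | Age | Gender | Income | Purchases | Clicks | Spent | Converted |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 41 | Female | 52618.0 | 26 | 67 | 2434.0 | 0 |
| 1 | 2 | 43 | Male | 53114.0 | 3 | 14 | 2937.0 | 0 |
| 2 | 3 | 43 | Female | 96145.0 | 4 | 78 | 2076.0 | 0 |
| 3 | 4 | 35 | Female | 92590.0 | 10 | 13 | 1437.0 | 1 |
| 4 | 5 | 23 | Female | 69262.0 | 14 | 62 | 1675.0 | 1 |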


In [3]:
nmrk2627_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 8 columns):
 #   Column      Non-Null Count    Dtype  
---  ------      --------------    -----  
 0   CustomerID  1000000 non-null  int64  
 1   Age         1000000 non-null  int64  
 2   Gender      1000000 non-null  object 
 3   Income      1000000 non-null  float64
 4   Purchases   1000000 non-null  int64  
 5   Clicks      1000000 non-null  int64  
 6   Spent       1000000 non-null  float64
 7   Converted   1000000 non-null  int64  
dtypes: float64(2), int64(5), object(1)
memory usage: 61.0+ MB


## **ANALYSIS**

Adam Optimizer

In [5]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from imblearn.over_sampling import SMOTE
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam, RMSprop
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# 1. Data Preprocessing and Feature Engineering:

# Assuming 'nmrk2627_df' is your DataFrame
X = nmrk2627_df.drop('Converted', axis=1)
y = nmrk2627_df['Converted']

# One-hot encode 'Gender'
X = pd.get_dummies(X, columns=['Gender'], drop_first=True)

# Split data into train, validation, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=552627)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=552627)

# Standardize numerical features for training set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)

# Standardize validation and test sets using training set's statistics
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)

# Handle class imbalance using SMOTE only on training set
smote = SMOTE(random_state=552627)  # Using consistent random state
X_train, y_train = smote.fit_resample(X_train, y_train)

# Add polynomial features (degree=2)
poly = PolynomialFeatures(degree=2)
X_train = poly.fit_transform(X_train)
X_val = poly.transform(X_val)
X_test = poly.transform(X_test)


# Calculate class weights
class_weights = compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weights_dict = {i: weight for i, weight in enumerate(class_weights)}

# 2. Model Building and Training:

model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(X_train.shape[1],)))
model.add(BatchNormalization())  # Added Batch Normalization
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())  # Added Batch Normalization
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

# Experiment with optimizers and learning rates
optimizer = Adam(learning_rate=0.0001)
# optimizer = RMSprop(learning_rate=0.001)

model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])

# Use Early Stopping to prevent overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)  # Increased patience

# Save the best model during training
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy', save_best_only=True)

# Train the model with a reasonable number of epochs and class weights
model.fit(X_train, y_train, epochs=10, batch_size=256,
          validation_data=(X_val, y_val),
          callbacks=[early_stopping, model_checkpoint],
          class_weight=class_weights_dict)  # Add class_weight


# 3. Prediction and Evaluation:

y_pred = model.predict(X_test)
y_pred_classes = (y_pred > 0.5).astype(int)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred_classes)
print(f"Accuracy: {accuracy}")

# Generate classification report
print(classification_report(y_test, y_pred_classes))

# Generate confusion matrix
print(confusion_matrix(y_test, y_pred_classes))

Epoch 1/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 17s 4ms/step - accuracy: 0.5011 - loss: 0.8540 - val_accuracy: 0.3642 - val_loss: 0.7098
Epoch 2/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 17s 3ms/step - accuracy: 0.5040 - loss: 0.6996 - val_accuracy: 0.3770 - val_loss: 0.6977
Epoch 3/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 11s 3ms/step - accuracy: 0.5030 - loss: 0.6933 - val_accuracy: 0.3901 - val_loss: 0.6977
Epoch 4/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 10s 3ms/step - accuracy: 0.5045 - loss: 0.6931 - val_accuracy: 0.4037 - val_loss: 0.6967
Epoch 5/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 11s 3ms/step - accuracy: 0.5059 - loss: 0.6931 - val_accuracy: 0.3923 - val_loss: 0.6968
Epoch 6/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 17s 5ms/step - accuracy: 0.5066 - loss: 0.6930 - val_accuracy: 0.3827 - val_loss: 0.6971
Epoch 7/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 13s 3ms/step - accuracy: 0.5057 - loss: 0.6931 - val_accuracy: 0.4060 - val_loss: 0.6962
Epoch 8/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 10s 3ms/step - accuracy: 0.5073 - loss: 0.6929 - val_accuracy: 0.3800 - val_loss: 0.6981
Epoch 9/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 13s 4ms/step - accuracy: 0.5085 - loss: 0.6929 - val_accuracy: 0.3858 - val_loss: 0.6977
Epoch 10/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 11s 3ms/step - accuracy: 0.5086 - loss: 0.6929 - val_accuracy: 0.3831 - val_loss: 0.6987
6250/6250 ━━━━━━━━━━━━━━━━━━━━ 9s 1ms/step
Accuracy: 0.40476
              precision    recall  f1-score   support

           0       0.70      0.26      0.38    139826
           1       0.30      0.73      0.43     60174

    accuracy  

In [12]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from imblearn.over_sampling import SMOTE
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# 1. Data Preprocessing and Feature Engineering:

# Assuming 'nmrk2627_df' is your DataFrame
X = nmrk2627_df.drop('Converted', axis=1)
y = nmrk2627_df['Converted']

# One-hot encode 'Gender'
encoder = OneHotEncoder(sparse_output=False, handle_unknown='ignore')  # Create OneHotEncoder
encoded_gender = encoder.fit_transform(X[['Gender']])  # Fit and transform Gender column
gender_df = pd.DataFrame(encoded_gender, columns=encoder.get_feature_names_out(['Gender']))  # Create DataFrame
X = X.drop('Gender', axis=1)  # Drop original Gender column
X = pd.concat([X, gender_df], axis=1)  # Concatenate encoded Gender

# Split data into train, validation, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=552627)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=552627)

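# Standardize numerical features for training set.
# Note: CustomerID is not listed here, so this identifier column reaches the
# network unscaled; that likely contributes to the very large first-epoch loss.
numerical_features = ['Age', 'Income', 'Purchases', 'Clicks', 'Spent']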
scaler = StandardScaler()
X_train[numerical_features] = scaler.fit_transform(X_train[numerical_features])

# Standardize validation and test sets using training set's statistics
X_val[numerical_features] = scaler.transform(X_val[numerical_features])
X_test[numerical_features] = scaler.transform(X_test[numerical_features])

# Handle class imbalance using SMOTE only on training set
smote = SMOTE(random_state=552627)  # Using consistent random state
X_train, y_train = smote.fit_resample(X_train, y_train)

# Calculate class weights
class_weights = compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weights_dict = {i: weight for i, weight in enumerate(class_weights)}

# 2. Model Building and Training:

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dropout(0.3))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Use Early Stopping to prevent overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Save the best model during training
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy', save_best_only=True)

# Train the model and store the history
history = model.fit(X_train, y_train, epochs=10, batch_size=256,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stopping, model_checkpoint],
                    class_weight=class_weights_dict)  # Add class_weight


# 3. Prediction and Evaluation:

y_pred = model.predict(X_test)
y_pred_classes = (y_pred > 0.5).astype(int)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred_classes)
print(f"Accuracy: {accuracy}")

# Generate classification report
print(classification_report(y_test, y_pred_classes))

# Generate confusion matrix
print(confusion_matrix(y_test, y_pred_classes))

Epoch 1/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 17s 5ms/step - accuracy: 0.4996 - loss: 961.6132 - val_accuracy: 0.7001 - val_loss: 0.6915
Epoch 2/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 15s 3ms/step - accuracy: 0.4986 - loss: 0.8123 - val_accuracy: 0.2999 - val_loss: 0.6956
Epoch 3/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 10s 3ms/step - accuracy: 0.5000 - loss: 0.7450 - val_accuracy: 0.7001 - val_loss: 0.6926
Epoch 4/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 9s 3ms/step - accuracy: 0.4999 - loss: 0.7053 - val_accuracy: 0.2999 - val_loss: 0.6934
Epoch 5/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 11s 3ms/step - accuracy: 0.5000 - loss: 0.6989 - val_accuracy: 0.7001 - val_loss: 0.6923
Epoch 6/10
3276/3276 ━━━━━━━━━━━━━━━━━━━━ 10s 3ms/step - accuracy: 0.498


In [15]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from imblearn.over_sampling import ADASYN
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# 1. Data Preprocessing and Feature Engineering:

# Assuming 'nmrk2627_df' is your DataFrame
X = nmrk2627_df.drop('Converted', axis=1)
y = nmrk2627_df['Converted']

# One-hot encode 'Gender'
encoder = OneHotEncoder(sparse_output=False, handle_unknown='ignore')  # Create OneHotEncoder
encoded_gender = encoder.fit_transform(X[['Gender']])  # Fit and transform Gender column
gender_df = pd.DataFrame(encoded_gender, columns=encoder.get_feature_names_out(['Gender']))  # Create DataFrame
X = X.drop('Gender', axis=1)  # Drop original Gender column
X = pd.concat([X, gender_df], axis=1)  # Concatenate encoded Gender

# Split data into train, validation, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=552627)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=552627)

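# Standardize numerical features for training set.
# Note: CustomerID is not listed here, so this identifier column reaches the
# network unscaled; that likely contributes to the very large first-epoch loss.
numerical_features = ['Age', 'Income', 'Purchases', 'Clicks', 'Spent']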
scaler = StandardScaler()
X_train[numerical_features] = scaler.fit_transform(X_train[numerical_features])

# Standardize validation and test sets using training set's statistics
X_val[numerical_features] = scaler.transform(X_val[numerical_features])
X_test[numerical_features] = scaler.transform(X_test[numerical_features])

# Replace SMOTE with ADASYN
adasyn = ADASYN(random_state=552627)
X_train, y_train = adasyn.fit_resample(X_train, y_train)

# Calculate class weights
class_weights = compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weights_dict = {i: weight for i, weight in enumerate(class_weights)}

# 2. Model Building and Training:

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dropout(0.3))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Use Early Stopping to prevent overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Save the best model during training
model_checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy', save_best_only=True)

# Train the model and store the history
history = model.fit(X_train, y_train, epochs=10, batch_size=256,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stopping, model_checkpoint],
                    class_weight=class_weights_dict)  # Add class_weight


# 3. Prediction and Evaluation:

y_pred = model.predict(X_test)
y_pred_classes = (y_pred > 0.5).astype(int)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred_classes)
print(f"Accuracy: {accuracy}")

# Generate classification report
print(classification_report(y_test, y_pred_classes))

# Generate confusion matrix
print(confusion_matrix(y_test, y_pred_classes))

Epoch 1/10
3397/3397 ━━━━━━━━━━━━━━━━━━━━ 14s 3ms/step - accuracy: 0.4992 - loss: 1340.6409 - val_accuracy: 0.7001 - val_loss: 0.6931
Epoch 2/10
3397/3397 ━━━━━━━━━━━━━━━━━━━━ 18s 3ms/step - accuracy: 0.4975 - loss: 0.9185 - val_accuracy: 0.2999 - val_loss: 0.6955
Epoch 3/10
3397/3397 ━━━━━━━━━━━━━━━━━━━━ 21s 3ms/step - accuracy: 0.5071 - loss: 0.7390 - val_accuracy: 0.7001 - val_loss: 0.6929
Epoch 4/10
3397/3397 ━━━━━━━━━━━━━━━━━━━━ 20s 3ms/step - accuracy: 0.4990 - loss: 0.7176 - val_accuracy: 0.2999 - val_loss: 0.6948
Epoch 5/10
3397/3397 ━━━━━━━━━━━━━━━━━━━━ 11s 3ms/step - accuracy: 0.5072 - loss: 0.6948 - val_accuracy: 0.7001 - val_loss: 0.6926
Epoch 6/10
3397/3397 ━━━━━━━━━━━━━━━━━━━━ 20s 3ms/step - accuracy: 0.
