# Live Demo: Predict Customer Purchase Behavior
This demo introduces the class to the programmatic implementation of artificial neural networks. <br>
It uses the Predict Customer Purchase Behavior Dataset from Kaggle (https://www.kaggle.com/datasets/rabieelkharoua/predict-customer-purchase-behavior-dataset), which provides information about customers and their purchase behavior on a fictional shopping website. <br>
The goal of this demo is to train a neural network on this data so that it can predict whether a customer will make a purchase on the website, based on their previous interactions with it.


## 1. Preparation 
First, we import all the modules used in the demo. <br>
We will build the neural network using scikit-learn. <br>
Pandas is used for handling and manipulating the data.


In [46]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, classification_report

## 2. Investigating the Data
Next, we will load the data and explore its content using Pandas.

In [None]:
# Step 1: Load the Dataset
file_path = 'customer_purchase_data.csv' 
data = pd.read_csv(file_path)

# Step 2: Exploratory Data Analysis (EDA)
print(data.head())

The dataset has 9 columns: 8 input columns, which will be called __features__ from now on, and 1 target column. <br>

The columns are:
- Age: Customer's age
- Gender: Customer's gender (0: Male, 1: Female)
- Annual Income: Annual income of the customer in dollars
- Number of Purchases: Total number of purchases made by the customer
- Product Category: Category of the purchased product (0: Electronics, 1: Clothing, 2: Home Goods, 3: Beauty, 4: Sports)
- Time Spent on Website: Time spent by the customer on the website in minutes
- Loyalty Program: Whether the customer is a member of the loyalty program (0: No, 1: Yes)
- Discounts Availed: Number of discounts availed by the customer (range: 0-5)
- PurchaseStatus (Target Variable): Whether the customer made a purchase (0: No, 1: Yes)

PurchaseStatus is the column that we want our neural network to predict. 
The distribution of records within the dataset is as follows:
- 0 (No Purchase): 48%
- 1 (Purchase): 52%
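
The class shares listed above can be checked directly with pandas. The sketch below uses a small made-up DataFrame named `toy` as a stand-in for the real CSV (its values are illustrative and not taken from the dataset). With the real `data` loaded, the same `value_counts` call on `data['PurchaseStatus']` should apply.

```python
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    "Age": [25, 41, 33, 58, 47],
    "PurchaseStatus": [0, 1, 1, 0, 1],
})

# Relative frequency of each class, as a fraction of all rows
class_share = toy["PurchaseStatus"].value_counts(normalize=True).sort_index()
print(class_share)
```

A roughly even split like 48% / 52% means that plain accuracy is a reasonable first metric; with strongly imbalanced classes it would be much less informative.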

## 3. Preparing the Dataset
In this step, we will define our features and prepare them to be used for the neural network. This includes scaling the data and splitting it into training data and test data.

In [48]:
# Step 3: Define Features (X) and Target (y)
X = data.drop('PurchaseStatus', axis=1)  # Use all features except our target variable PurchaseStatus
y = data['PurchaseStatus']

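In [49]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [50]:
# Standardize features: fit the scaler on the training data only, so that
# no statistics from the test set leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)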
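
To make the scaling step less of a black box, here is a minimal sketch of what `StandardScaler` computes: each column has its mean subtracted and is divided by its standard deviation. The array `X_toy` and its values are made up for illustration, with two columns on very different scales, similar to age and annual income.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix: two columns on very different scales (made-up values)
X_toy = np.array([[20.0, 30000.0],
                  [40.0, 60000.0],
                  [60.0, 90000.0]])

scaled = StandardScaler().fit_transform(X_toy)

# Same result by hand: subtract the column mean, divide by the column
# standard deviation (population form, ddof=0, which is NumPy's default)
manual = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

print(scaled)
print(np.allclose(scaled, manual))
```

After scaling, both columns have mean 0 and standard deviation 1, so neither dominates the network's weight updates just because of its unit.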

## 4. Training and Evaluating the Model
Next, we will train the model on our data using the __MLPClassifier__ class from scikit-learn and evaluate its performance on the test dataset.

In [None]:
# Step 4: Build and Train the Neural Network
mlp = MLPClassifier(hidden_layer_sizes=(128, 64), activation='relu', solver='adam', max_iter=200, random_state=42)
mlp.fit(X_train, y_train)
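
To see how `hidden_layer_sizes=(128, 64)` translates into an actual network, the fitted model can be inspected. The following sketch trains the same architecture on synthetic data from `make_classification`, used here only as a stand-in with 8 input features; the names `X_syn`, `y_syn`, and `net` are illustrative and not part of the demo.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in with 8 input features, like the demo's feature matrix
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=200, random_state=42)
net.fit(X_syn, y_syn)

# One weight matrix per layer transition: input->128, 128->64, 64->output
weight_shapes = [w.shape for w in net.coefs_]
print(weight_shapes)  # [(8, 128), (128, 64), (64, 1)]
print(f"Iterations run: {net.n_iter_}, final loss: {net.loss_:.4f}")
```

If `n_iter_` reaches `max_iter`, scikit-learn emits a `ConvergenceWarning`; raising `max_iter` is the usual first thing to try in that case.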

In [None]:
# Step 5: Evaluate the Model
y_pred = mlp.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy}")
print("Classification Report:\n", classification_report(y_test, y_pred))
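
Because the two classes are close to evenly split (48% vs. 52%), the test accuracy is easiest to interpret next to a simple baseline that always predicts the most frequent class. The sketch below shows this comparison together with a confusion matrix. It runs on synthetic data from `make_classification` and uses a smaller hidden layer to stay fast; both are assumptions made for this illustration, not the demo's setup.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data with 8 features and a binary target
X_syn, y_syn = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=42)

# Baseline that always predicts the most frequent training class
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=42).fit(X_tr, y_tr)

baseline_acc = accuracy_score(y_te, baseline.predict(X_te))
net_acc = accuracy_score(y_te, net.predict(X_te))

# Rows are the true classes, columns are the predicted classes
cm = confusion_matrix(y_te, net.predict(X_te))

print(f"Baseline accuracy: {baseline_acc:.3f}")
print(f"Network accuracy:  {net_acc:.3f}")
print(cm)
```

On the real dataset, a network that does not clearly beat the roughly 52% majority baseline has learned little from the features.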

## 5. Testing the Neural Network with Custom Data
Finally, we will create our own customer record, let the neural network predict its purchase status, and investigate how individual features affect the output of the neural network.

In [None]:
# Step 6: Create a Custom Row for Prediction
custom_row = {
    'Age': 1.5,  
    'Gender': 1,  
    'AnnualIncome': 42300.76,   
    'NumberOfPurchases': 20,  
    'ProductCategory': 4, 
    'TimeSpentOnWebsite': 32.45,  
    'LoyaltyProgram': 1, 
    'DiscountsAvailed': 4,  
}

# Convert custom row to DataFrame
custom_row_df = pd.DataFrame([custom_row])

# Scale custom row
custom_row_scaled = scaler.transform(custom_row_df)

# Predict with the model
custom_prediction = mlp.predict(custom_row_scaled)
print(f"Prediction for custom row: {custom_prediction}")
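
One way to investigate the impact of a single feature is to keep a row fixed, vary only that feature, and watch the predicted probability change. The sketch below does this with `predict_proba` on synthetic data; `X_syn`, `scaler_syn`, `net`, and the choice of feature 0 are illustrative assumptions. In the demo, the same idea would apply to `custom_row` by changing, for example, only `'DiscountsAvailed'`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data and a small network (not the demo's model)
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)
scaler_syn = StandardScaler().fit(X_syn)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=42)
net.fit(scaler_syn.transform(X_syn), y_syn)

# Start from one row and vary only feature 0 across its observed range
base_row = X_syn[0]
sweep = np.tile(base_row, (5, 1))
sweep[:, 0] = np.linspace(X_syn[:, 0].min(), X_syn[:, 0].max(), 5)

# Column 1 of predict_proba is the estimated probability of class 1
proba = net.predict_proba(scaler_syn.transform(sweep))[:, 1]
for value, p in zip(sweep[:, 0], proba):
    print(f"feature_0 = {value:7.2f} -> P(class 1) = {p:.3f}")
```

Probabilities show gradual changes that the hard 0/1 output of `predict` hides, which makes them more useful for this kind of what-if exploration.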