# Optimization of an Artificial Neural Network for Marketing Segmentation (Otomoto)

## Purpose
The purpose of this project is to optimize an existing Artificial Neural Network (ANN) used for customer segmentation at Otomoto. The project applies and compares several optimization algorithms to improve the model's performance and support more effective, data-driven marketing campaigns.

## Project Description
Otomoto has access to extensive customer demographic, subscription, and billing data but faces challenges in accurately segmenting customers for targeted marketing. This project recreates an ANN model and applies optimization algorithms to improve segmentation effectiveness, using customer churn as a proxy for actionable marketing segmentation.


In [1]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

## Step 1: Data Collection and Understanding

**Dataset:** Teleconnect customer dataset (`teleconnect.csv`)  
**Records:** 7,043 customers  
**Target Variable:** `Churn`

- `Yes` → customer churned (treated as likely to leave)  
- `No` → customer retained (treated as likely to stay)  

Customer churn is used as a marketing segmentation indicator, as churn-prone customers require targeted retention strategies.


In [2]:
df = pd.read_csv("teleconnect.csv")

print(df.head())
print(df.shape)
print(df.columns)

   customerID  gender  SeniorCitizen Partner Dependents  tenure PhoneService  \
0  7590-VHVEG  Female              0     Yes         No       1           No   
1  5575-GNVDE    Male              0      No         No      34          Yes   
2  3668-QPYBK    Male              0      No         No       2          Yes   
3  7795-CFOCW    Male              0      No         No      45           No   
4  9237-HQITU  Female              0      No         No       2          Yes   

      MultipleLines InternetService OnlineSecurity  ... DeviceProtection  \
0  No phone service             DSL             No  ...               No   
1                No             DSL            Yes  ...              Yes   
2                No             DSL            Yes  ...               No   
3  No phone service             DSL            Yes  ...              Yes   
4                No     Fiber optic             No  ...               No   

  TechSupport StreamingTV StreamingMovies        Contract Pape

## Step 2: Data Preprocessing

### Preprocessing Actions
- Removed non-informative identifiers (`customerID`)
- Converted churn labels to numeric format
- Handled missing values in `TotalCharges`
- One-hot encoded categorical features
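- Standardized all model inputs (numeric and one-hot encoded columns) to improve ANN training stability

These steps improve data quality and support reliable model optimization. Fitting the scaler on the training split only keeps test-set statistics out of training, which avoids one common source of data leakage.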
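The `TotalCharges` step relies on `pd.to_numeric(..., errors="coerce")` turning unparseable strings into `NaN` before rows are dropped. A minimal sketch on a made-up frame (the values are hypothetical, not taken from `teleconnect.csv`) shows how to count what such a step would remove:

```python
import pandas as pd

# Hypothetical toy data: TotalCharges stored as text, with one blank entry
toy = pd.DataFrame({
    "tenure": [1, 34, 0],
    "TotalCharges": ["29.85", "1889.5", " "],
})

# errors="coerce" converts values that cannot be parsed into NaN
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

n_missing = int(toy["TotalCharges"].isna().sum())
cleaned = toy.dropna()

print(f"rows with missing TotalCharges: {n_missing}")
print(f"rows kept after dropna: {len(cleaned)}")
```

Checking the missing count on the real data before calling `dropna()` makes the cost of dropping rows explicit.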

In [4]:
# Encode target variable
df["Churn"] = df["Churn"].map({"Yes": 1, "No": 0})

# Drop non-informative column
df = df.drop(columns=["customerID"])

# Convert TotalCharges to numeric and handle missing values
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
df = df.dropna()

# One-hot encode categorical variables
df = pd.get_dummies(df, drop_first=True)

X = df.drop(columns=["Churn"])
y = df["Churn"]

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Feature scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
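
The split-then-scale order above can be checked on synthetic data. The sketch below is illustrative only: the array sizes, the 26.5% positive rate, and the `demo_` names are assumptions, not values from the notebook. It shows that `stratify` keeps the class ratio equal across splits and that a scaler fitted on the training split centres the training features.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 1,000 rows, 3 features, 26.5% positive labels
X_demo = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
y_demo = np.array([1] * 265 + [0] * 735)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Fit on the training split only, then reuse those statistics for the test split
demo_scaler = StandardScaler()
X_tr_scaled = demo_scaler.fit_transform(X_tr)
X_te_scaled = demo_scaler.transform(X_te)

print("positive rate (train, test):", y_tr.mean(), y_te.mean())
print("train feature means after scaling:", X_tr_scaled.mean(axis=0).round(6))
print("test feature means after scaling:", X_te_scaled.mean(axis=0).round(3))
```

The test-split means are close to zero but not exactly zero, which is expected: the test data is transformed with training statistics rather than its own.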

## Step 3: Recreating the Existing ANN Model

The baseline ANN represents the original segmentation model used by Otomoto.  
It serves as a reference point to evaluate the impact of optimization algorithms.

In [6]:
baseline_model = Sequential([
    Dense(32, activation="relu", input_shape=(X_train_scaled.shape[1],)),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid")
])

baseline_model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

baseline_model.fit(
    X_train_scaled, y_train,
    epochs=30,
    batch_size=32,
    validation_split=0.2,
    verbose=0
)

<keras.src.callbacks.history.History at 0x29eb554e3a0>
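
For the baseline to act as a reference point, it needs to be scored on the same held-out data and metrics as the later models. One possible way to do that is a small scoring helper; `score_predictions` is a hypothetical name, and the toy labels and probabilities below exist only to exercise it.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def score_predictions(name, y_true, probs, threshold=0.5):
    """Return one results row from true labels and predicted probabilities."""
    # Flatten (n, 1) model output to 1-D before thresholding
    y_hat = (np.asarray(probs).ravel() > threshold).astype(int)
    return {
        "Model": name,
        "Accuracy": accuracy_score(y_true, y_hat),
        "Precision": precision_score(y_true, y_hat, zero_division=0),
        "Recall": recall_score(y_true, y_hat, zero_division=0),
        "F1-score": f1_score(y_true, y_hat, zero_division=0),
    }

# Hypothetical labels and probabilities, used only to exercise the helper
toy_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
toy_probs = np.array([[0.9], [0.6], [0.2], [0.7], [0.1], [0.3], [0.4], [0.2]])

row = score_predictions("toy", toy_true, toy_probs)
print(row)
```

In this notebook the call would look like `score_predictions("Baseline", y_test, baseline_model.predict(X_test_scaled))`.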

## Step 4: Selection of Optimization Algorithms

Three optimization algorithms were selected for comparison:

1. **SGD (Stochastic Gradient Descent):** Simple, well-understood baseline optimizer; used here with momentum (0.9).
2. **RMSprop:** Scales each parameter's step by a running average of squared gradients, which helps with noisy gradients.
3. **Adam:** Combines momentum with RMSprop-style adaptive learning rates and is a common default for dense networks on tabular data.
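
To make the differences between the three optimizers concrete, the sketch below applies textbook forms of each update rule to a one-dimensional toy loss, f(w) = 0.5 * w². These are simplified illustrations, not the Keras implementations; the `rho`, `beta1`, `beta2`, and `eps` values are assumed defaults, while the learning rates and momentum mirror the ones used later in this notebook.

```python
import numpy as np

def grad(w):
    """Gradient of the toy loss f(w) = 0.5 * w**2."""
    return w

def sgd_momentum(w, steps, lr=0.01, momentum=0.9):
    v = 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(w)      # velocity accumulates past gradients
        w = w + v
    return w

def rmsprop(w, steps, lr=0.001, rho=0.9, eps=1e-7):
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = rho * s + (1 - rho) * g**2       # running average of squared gradients
        w = w - lr * g / (np.sqrt(s) + eps)  # step scaled by gradient magnitude
    return w

def adam(w, steps, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g      # first moment (momentum term)
        v = beta2 * v + (1 - beta2) * g**2   # second moment (adaptive scale)
        m_hat = m / (1 - beta1**t)           # bias correction for early steps
        v_hat = v / (1 - beta2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = 2.0
for name, fn in [("SGD+momentum", sgd_momentum), ("RMSprop", rmsprop), ("Adam", adam)]:
    print(f"{name}: w after 1 step = {fn(w0, 1):.6f}, after 100 steps = {fn(w0, 100):.6f}")
```

On this toy problem the adaptive methods move by roughly the learning rate per step regardless of gradient size, while SGD's step is proportional to the gradient. That is why learning rates are not directly comparable across optimizers.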

In [9]:
from tensorflow.keras.layers import Input

def build_model(optimizer):
    model = Sequential([
        Input(shape=(X_train_scaled.shape[1],)),
        Dense(64, activation="relu"),
        Dropout(0.3),
        Dense(32, activation="relu"),
        Dropout(0.2),
        Dense(1, activation="sigmoid")
    ])
    
    model.compile(
        optimizer=optimizer,
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
    return model
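
Note that `build_model` is wider than the baseline (64-32-1 versus 32-16-1) and adds dropout, so it differs from the baseline in capacity as well as in optimizer. A rough way to quantify the capacity difference is to count trainable parameters. The helper below is a sketch, and the input width of 30 is a hypothetical placeholder for `X_train_scaled.shape[1]`.

```python
def dense_param_count(n_inputs, layer_widths):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for width in layer_widths:
        total += fan_in * width + width   # weight matrix plus bias vector
        fan_in = width
    return total

# 30 input features is a hypothetical value; in the notebook it would be
# X_train_scaled.shape[1]. Dropout layers add no trainable parameters.
n_features = 30
baseline_params = dense_param_count(n_features, [32, 16, 1])
tuned_params = dense_param_count(n_features, [64, 32, 1])

print("baseline (32-16-1):", baseline_params)
print("build_model (64-32-1):", tuned_params)
```

Calling `model.summary()` on the actual Keras models reports the same kind of count for the real input width.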

## Step 5: Applying Optimization Algorithms and Evaluation

Each optimizer is applied to the ANN, and performance is evaluated using:
- Accuracy
- Precision
- Recall
- F1-score

These metrics provide insight into segmentation quality and marketing impact.
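
As a reminder of how the four metrics relate, they can all be derived from the confusion-matrix counts. The sketch below uses hypothetical counts (not results from this notebook), with churners as the positive class.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # flagged customers who churn
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # churners who get flagged
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: 100 churners and 300 non-churners
example = metrics_from_counts(tp=60, fp=40, fn=40, tn=260)
print(example)
```

With imbalanced classes, accuracy can stay high while recall is low, so recall and F1-score say more about how many churn-prone customers a campaign would actually reach.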

In [11]:
optimizers = {
    "SGD": tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    "RMSprop": tf.keras.optimizers.RMSprop(learning_rate=0.001),
    "Adam": tf.keras.optimizers.Adam(learning_rate=0.001)
}

results = []

for name, opt in optimizers.items():
    model = build_model(opt)
    
    model.fit(
        X_train_scaled, y_train,
        epochs=40,
        batch_size=32,
        validation_split=0.2,
        callbacks=[EarlyStopping(patience=5, restore_best_weights=True)],
        verbose=0
    )
    
    # Threshold predicted probabilities at 0.5 and flatten (n, 1) output to 1-D labels
    y_pred = (model.predict(X_test_scaled) > 0.5).astype(int).ravel()
    
    results.append({
        "Optimizer": name,
        "Accuracy": accuracy_score(y_test, y_pred),
        "Precision": precision_score(y_test, y_pred),
        "Recall": recall_score(y_test, y_pred),
        "F1-score": f1_score(y_test, y_pred)
    })

results_df = pd.DataFrame(results)
results_df

44/44 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
44/44 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
44/44 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step


| Optimizer | Accuracy | Precision | Recall | F1-score |
|-----------|----------|-----------|--------|----------|
| SGD | 0.799574 | 0.660839 | 0.505348 | 0.572727 |
| RMSprop | 0.796731 | 0.653846 | 0.500000 | 0.566667 |
| Adam | 0.800995 | 0.635838 | 0.588235 | 0.611111 |


## Step 6: Findings and Marketing Impact

### Key Findings
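- The **Adam optimizer** achieved the best overall performance, with the highest accuracy (0.801), recall (0.588), and F1-score (0.611).
- **SGD** (with momentum) had the highest precision (0.661) but lower recall (0.505), giving an F1-score of 0.573.
- **RMSprop** scored slightly below SGD on all four metrics (F1-score 0.567).
- Accuracy differs by less than half a percentage point across the three optimizers; the main separation is in recall.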

### Marketing Impact
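With Adam, recall rises to 0.588 (from roughly 0.50 with SGD and RMSprop), so Otomoto can flag more churn-prone customers for targeted retention campaigns and allocate marketing resources accordingly. Precision is slightly lower (0.636 versus 0.654 to 0.661), meaning somewhat more retention spend goes to customers who would have stayed.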
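
Because the marketing value here hinges on recall, the 0.5 decision threshold used in the evaluation loop is itself a lever worth examining. The sketch below sweeps thresholds over hypothetical labels and probabilities (not model output from this notebook) to show the precision/recall trade-off; `sweep_thresholds` is an illustrative helper name.

```python
import numpy as np

def sweep_thresholds(y_true, probs, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    y_true = np.asarray(y_true)
    probs = np.asarray(probs).ravel()
    rows = []
    for t in thresholds:
        y_hat = probs > t
        tp = int(np.sum(y_hat & (y_true == 1)))
        fp = int(np.sum(y_hat & (y_true == 0)))
        fn = int(np.sum(~y_hat & (y_true == 1)))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        rows.append((t, precision, recall))
    return rows

# Hypothetical labels and predicted churn probabilities
toy_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
toy_probs = np.array([0.9, 0.7, 0.45, 0.32, 0.6, 0.4, 0.35, 0.2, 0.1, 0.05])

sweep = sweep_thresholds(toy_true, toy_probs, thresholds=[0.5, 0.3])
for t, p, r in sweep:
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Lowering the threshold flags more true churners at the cost of more false alarms. Any threshold choice should be made on validation data rather than the test set.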
