# Problem Statement: **AtliQ's Customer Churn Prediction**

### Welcome to AtliQ Electronics, a leading consumer electronics retailer! AtliQ has been facing customer churn issues and is building an AI-powered predictive model to identify customers likely to discontinue their services. However, the data science team observed that the model tends to overfit the training data, leading to poor generalization on new data. Your task is to implement and compare different regularization techniques to build a robust and accurate churn prediction model.


**References**

* BatchNorm1d (PyTorch): [Link](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html)


Imports and CUDA

In [None]:
import torch
import numpy as np
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Check if CUDA (GPU) is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

### Let's do some revision first and test your basic understanding of this module!

Problem1: **Effect of L1 Regularization on Sparsity in AtliQ's AI Models**

AtliQ is optimizing its AI models to reduce unnecessary complexity. You are given the following weights from one of AtliQ's neural network layers:



```
weights = torch.tensor([0.5, -0.3, 0.8, -1.5], requires_grad=True)
```

Write a PyTorch snippet to compute the L1 regularization term `|w1| + |w2| + ...` and calculate its gradient using `backward()`.


In [None]:
# weights
weights = # Code Here

# Compute L1 regularization term
l1_regularization = torch.sum(torch.abs(weights))

# Compute gradients
# Code Here

# Output gradients
print(f"L1 Regularization: {l1_regularization.item():.4f}")
print(f"Gradients: {weights.grad}")
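As a hedged illustration of what this cell is meant to produce, the sketch below computes the L1 term for the weights given above and backpropagates through it. The gradient of |w| is sign(w) wherever w is nonzero, so every gradient entry should come out as +1 or -1.

```python
import torch

# Weights from the problem statement
weights = torch.tensor([0.5, -0.3, 0.8, -1.5], requires_grad=True)

# L1 term: sum of absolute values of the weights
l1_regularization = torch.sum(torch.abs(weights))

# Backpropagate so that weights.grad is populated
l1_regularization.backward()

print(f"L1 Regularization: {l1_regularization.item():.4f}")
print(f"Gradients: {weights.grad}")
```

Because the gradient magnitude stays constant regardless of how small a weight is, L1 regularization keeps pushing small weights toward zero, which is where its sparsity effect comes from.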



---



Problem2: **Batch Normalization of AtliQ's Marketing Data**

AtliQ has collected a mini-batch of marketing data:

`inputs = torch.tensor([[10.0, 20.0], [15.0, 25.0], [12.0, 18.0]])`

Manually implement the forward pass of Batch Normalization. Normalize the data using:

$$\text{Normalized} = \frac{X - \mu}{\sigma}$$

Here, μ is the mean of each feature, and σ is the standard deviation.

Write code to compute the normalized data without using `nn.BatchNorm1d`.





In [None]:
# inputs
inputs = # Code Here

# compute mean and standard deviation along each column (feature)
mean = # Code Here
std = # Code Here

# normalize the input
normalized = # Code Here

print(f"Inputs:\n{inputs}")
print(f"Mean: {mean}")
print(f"Standard Deviation: {std}")
print(f"Normalized Outputs:\n{normalized}")
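One possible way to fill in this cell is sketched below. It assumes the population (biased) standard deviation, selected with `unbiased=False`, because that is the statistic batch normalization uses over a mini-batch; the default `torch.std` would instead apply Bessel's correction and give slightly different numbers.

```python
import torch

# Mini-batch from the problem statement: 3 samples, 2 features
inputs = torch.tensor([[10.0, 20.0], [15.0, 25.0], [12.0, 18.0]])

# Per-feature statistics: reduce over the batch dimension (dim=0)
mean = inputs.mean(dim=0)
std = inputs.std(dim=0, unbiased=False)  # assumption: population std, as in batch norm

# Normalize each feature column: (X - mean) / std
normalized = (inputs - mean) / std

print(f"Mean: {mean}")
print(f"Standard Deviation: {std}")
print(f"Normalized Outputs:\n{normalized}")
```

After normalization each column has mean 0 and (population) standard deviation 1. `nn.BatchNorm1d` additionally adds a small `eps` to the variance for numerical stability, which this sketch leaves out.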



---



### Task: **AtliQ's Customer Churn Prediction**



**Dataset Description**

You are provided with 5000 customer records and 6 features that describe customer behavior over the past 12 months. The provided dataset (**AtliQ_Churn_Prediction_Codebasics_DL.csv**) includes the following attributes:

* Purchase_History
* Support_Tickets
* Last_Purchase_Months
* Product_Categories
* Satisfaction_Score
* Discount_Usage_Rate

Target Variable:

* Churned: 1 if the customer has churned, 0 otherwise.

**Step1**: Load and Prepare the Dataset

In [None]:
# Load dataset
data = # Code Here

# Separate features and target
X = data.drop("Churned", axis=1).values
y = # Code Here

data.info()



---



**Step2**: Split the Dataset

Train : Test :: 70 : 30



---



In [None]:
# Split data
X_train, X_test, y_train, y_test = # Code Here
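A 70 : 30 split can be sketched with `train_test_split` as shown below. The arrays here are synthetic stand-ins with the same shape as the dataset (5000 rows, 6 features), not the real churn data, and the `random_state=42` and `stratify` arguments are assumptions added for reproducibility and class balance rather than requirements of the task.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (hypothetical values, real shape)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5000, 6))
y_demo = rng.integers(0, 2, size=5000)

# test_size=0.3 keeps 70% of the rows for training and 30% for testing
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)
```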




---



**Step3**: Normalize the features



In [None]:
scaler = StandardScaler()
X_train = # Code Here
X_test = # Code Here
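The usual pattern for this step is to fit the scaler on the training split only and then reuse those statistics on the test split, so that no information from the test set leaks into preprocessing. A minimal sketch on synthetic data (the `_demo` and `_scaled` names and the array values are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features with a non-zero mean and non-unit scale
rng = np.random.default_rng(0)
X_tr = rng.normal(loc=50.0, scale=10.0, size=(700, 6))
X_te = rng.normal(loc=50.0, scale=10.0, size=(300, 6))

scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)  # learn mean/std from the training split
X_te_scaled = scaler_demo.transform(X_te)      # apply the training statistics unchanged

print(X_tr_scaled.mean(axis=0).round(3), X_tr_scaled.std(axis=0).round(3))
```

The scaled test features will generally not have exactly zero mean or unit variance, which is expected because they are standardized with the training statistics.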



---



**Step4**: Convert to PyTorch Tensors

In [None]:
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32).unsqueeze(1)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32).unsqueeze(1)



---



**Step5**: Build a Base Model (without regularization)

Details:
* Input layer: 6 features
* Hidden layers: Two layers with 32 and 16 neurons (ReLU activation)
* Output layer: 1 neuron (Sigmoid activation for binary classification)
* Train for 30 epochs, batch size = 32, learning rate = 0.01.
* Use **BCELoss** as the loss function and **SGD** as the optimizer

In [None]:
class BaseModel(nn.Module):
    def __init__(self):
        super(BaseModel, self).__init__()
        self.network = nn.Sequential(
           # Code Here
        )

    def forward(self, x):
        return self.network(x)

model = BaseModel()
loss_function = # Code Here
optimizer = optim.SGD(model.parameters(), lr=0.01)
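For reference, one architecture that matches the details listed for this step is sketched below. The class and variable names (`BaseModelSketch`, `sketch_model`, and so on) are hypothetical and chosen so they do not clash with the notebook's own `BaseModel`; treat this as one reasonable reading of the specification rather than the expected solution.

```python
import torch
import torch.nn as nn
import torch.optim as optim

class BaseModelSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(6, 32),   # input layer: 6 features -> 32 hidden units
            nn.ReLU(),
            nn.Linear(32, 16),  # second hidden layer: 32 -> 16
            nn.ReLU(),
            nn.Linear(16, 1),   # single output neuron
            nn.Sigmoid(),       # squashes the output to a probability in (0, 1)
        )

    def forward(self, x):
        return self.network(x)

sketch_model = BaseModelSketch()
sketch_loss_fn = nn.BCELoss()  # expects probabilities, hence the Sigmoid above
sketch_optimizer = optim.SGD(sketch_model.parameters(), lr=0.01)

# Sanity check on a random batch of 32 rows with 6 features
probs = sketch_model(torch.randn(32, 6))
print(probs.shape)
```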




---



**Step6**: Train the Base Model

* epochs = 30
* batch size = 32

In [None]:
# Train the model
epochs = 30
batch_size = 32
num_batches = len(X_train_tensor) // batch_size

train_loss_history = []
val_loss_history = []

for epoch in range(epochs):
    model.train()
    epoch_loss = 0

    for i in range(num_batches):
        X_batch = X_train_tensor[i*batch_size:(i+1)*batch_size]
        y_batch = y_train_tensor[i*batch_size:(i+1)*batch_size]

        optimizer.zero_grad()
        y_pred = # Code here
        loss = # code here
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()

    train_loss_history.append(epoch_loss / num_batches)

    # Validation loss
    model.eval()
    with torch.no_grad():
        val_predictions = # Code Here
        val_loss = # Code Here
        val_loss_history.append(val_loss.item())

    print(f"Epoch {epoch+1}, Training Loss: {epoch_loss/num_batches:.4f}, Validation Loss: {val_loss.item():.4f}")

# Plot training vs validation loss
plt.plot(range(1, epochs+1), train_loss_history, label="Training Loss")
plt.plot(range(1, epochs+1), val_loss_history, label="Validation Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.title("Training vs Validation Loss")
plt.show()
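The loop above slices batches by hand, which silently drops the last partial batch and never shuffles the data. An alternative sketch using the `DataLoader` and `TensorDataset` classes imported at the top of the notebook is shown below; it runs on synthetic data, trains for only 5 epochs to stay fast, and all `_demo` names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 256 rows, 6 features, binary target of shape (256, 1)
X_demo = torch.randn(256, 6)
y_demo = (X_demo[:, :1] > 0).float()

demo_model = nn.Sequential(
    nn.Linear(6, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1), nn.Sigmoid(),
)
criterion = nn.BCELoss()
optimizer_demo = optim.SGD(demo_model.parameters(), lr=0.01)

# shuffle=True reorders the samples every epoch
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=32, shuffle=True)

loss_history = []
for epoch in range(5):
    demo_model.train()
    running_loss = 0.0
    for X_batch, y_batch in loader:
        optimizer_demo.zero_grad()                       # clear old gradients
        loss = criterion(demo_model(X_batch), y_batch)   # forward pass + loss
        loss.backward()                                  # backpropagate
        optimizer_demo.step()                            # update weights
        running_loss += loss.item()
    loss_history.append(running_loss / len(loader))      # mean loss per batch

print(loss_history)
```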



---



**Step7**: Implement Dropout Regularization

Modify the base model to include dropout layers after each dense layer.
* Experiment with dropout rate 0.4.

In [None]:
class DropoutModel(nn.Module):
    def __init__(self):
        super(DropoutModel, self).__init__()
        self.network = nn.Sequential(
            nn.Linear(6, 32),
            nn.ReLU(),
            # Add Dropout
            nn.Linear(32, 16),
            nn.ReLU(),
            # Add Dropout
            nn.Linear(16, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.network(x)

dropout_model = DropoutModel()
optimizer_dropout = optim.SGD(dropout_model.parameters(), lr=0.01)
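To see why the training loop switches between `train()` and `eval()`, the sketch below applies `nn.Dropout(p=0.4)` to a vector of ones in both modes. In training mode each element is zeroed with probability 0.4 and the survivors are scaled by 1 / (1 - 0.4); in evaluation mode the layer is the identity. The tensor size of 1000 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.4)
x = torch.ones(1000)

drop.train()
train_out = drop(x)  # entries are either 0 or 1 / (1 - 0.4)

drop.eval()
eval_out = drop(x)   # dropout is disabled: output equals input

zero_fraction = (train_out == 0).float().mean().item()
print(f"Fraction zeroed in train mode: {zero_fraction:.3f}")
print(f"Unchanged in eval mode: {torch.equal(eval_out, x)}")
```

The rescaling keeps the expected activation the same in both modes, so no extra correction is needed at inference time.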




---



**Step8**: Training Process for Dropout Regularization

In [None]:
# Train the model for Dropout Regularization (Similar training process as above)
epochs = 30
batch_size = 32
num_batches = len(X_train_tensor) // batch_size

train_loss_history = []
val_loss_history = []

for epoch in range(epochs):
    dropout_model.train()
    epoch_loss = 0

    for i in range(num_batches):
        X_batch = X_train_tensor[i*batch_size:(i+1)*batch_size]
        y_batch = y_train_tensor[i*batch_size:(i+1)*batch_size]

        optimizer_dropout.zero_grad()
        y_pred = # Code here
        loss = # code here
        loss.backward()
        optimizer_dropout.step()
        epoch_loss += loss.item()

    train_loss_history.append(epoch_loss / num_batches)

    # Validation loss
    dropout_model.eval()
    with torch.no_grad():
        val_predictions = # Code Here
        val_loss = # Code Here
        val_loss_history.append(val_loss.item())

    print(f"Epoch {epoch+1}, Training Loss: {epoch_loss/num_batches:.4f}, Validation Loss: {val_loss.item():.4f}")

# Plot training vs validation loss
plt.plot(range(1, epochs+1), train_loss_history, label="Training Loss")
plt.plot(range(1, epochs+1), val_loss_history, label="Validation Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.title("Training vs Validation Loss")
plt.show()



---



**Step9**: Apply L2 Regularization

* Use `torch.optim.SGD` with `weight_decay=0.1` for L2 regularization.
* Train the model and monitor validation loss and accuracy.

In [None]:
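# L2 regularization (weight decay)
# Note: `model` is the BaseModel instance already trained in Step 6; re-create it with
# `model = BaseModel()` first if L2 training should start from fresh weights.
optimizer_l2 = optim.SGD(model.parameters(), lr=0.01, weight_decay=0.1)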

# Train the model for L2 Regularization (Similar training process as above)
epochs = 30
batch_size = 32
num_batches = len(X_train_tensor) // batch_size

train_loss_history = []
val_loss_history = []

for epoch in range(epochs):
    model.train() # model will remain the same
    epoch_loss = 0

    for i in range(num_batches):
        X_batch = X_train_tensor[i*batch_size:(i+1)*batch_size]
        y_batch = y_train_tensor[i*batch_size:(i+1)*batch_size]

        optimizer_l2.zero_grad()
        y_pred = # Code here
        loss = # code here
        loss.backward()
        optimizer_l2.step()
        epoch_loss += loss.item()

    train_loss_history.append(epoch_loss / num_batches)

    # Validation loss
    model.eval()
    with torch.no_grad():
        val_predictions = # Code Here
        val_loss = # Code Here
        val_loss_history.append(val_loss.item())

    print(f"Epoch {epoch+1}, Training Loss: {epoch_loss/num_batches:.4f}, Validation Loss: {val_loss.item():.4f}")

# Plot training vs validation loss
plt.plot(range(1, epochs+1), train_loss_history, label="Training Loss")
plt.plot(range(1, epochs+1), val_loss_history, label="Validation Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.title("Training vs Validation Loss")
plt.show()
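For plain SGD, `weight_decay` adds `weight_decay * w` to each parameter's gradient before the update, which corresponds to an L2 penalty of `(weight_decay / 2) * ||w||^2` on the loss. The sketch below checks this on a single step with a toy linear loss; the `coeffs` tensor and the `_demo` names are hypothetical.

```python
import torch
import torch.optim as optim

lr, weight_decay = 0.01, 0.1
w = torch.tensor([0.5, -0.3, 0.8, -1.5], requires_grad=True)
w_before = w.detach().clone()

optimizer_demo = optim.SGD([w], lr=lr, weight_decay=weight_decay)

# Toy loss whose gradient with respect to w is simply `coeffs`
coeffs = torch.tensor([1.0, 2.0, 3.0, 4.0])
loss = (coeffs * w).sum()

optimizer_demo.zero_grad()
loss.backward()
data_grad = w.grad.detach().clone()  # gradient from the loss alone
optimizer_demo.step()

# Manual update: w <- w - lr * (data gradient + weight_decay * w)
expected = w_before - lr * (data_grad + weight_decay * w_before)
print(torch.allclose(w.detach(), expected))
```

With `weight_decay=0.1` the decay term is fairly strong relative to typical gradients, so the weights are pulled toward zero noticeably at every step.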




---



**Step10**: Add Batch Normalization

* Use `BatchNorm1d`
* Use the **SGD** optimizer with learning rate = 0.01

In [None]:
class BatchNormModel(nn.Module):
    def __init__(self):
        super(BatchNormModel, self).__init__()
        self.layers = nn.Sequential(
            nn.Linear(6, 32),
            # Batch normalization
            nn.ReLU(),
            nn.Linear(32, 16),
            # Batch normalization
            nn.ReLU(),
            nn.Linear(16, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.layers(x)

# Instantiate the BatchNormModel and its optimizer
batchnorm_model = BatchNormModel()
optimizer_bn = optim.SGD(batchnorm_model.parameters(), lr=0.01)
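As a quick check of what `nn.BatchNorm1d` does in training mode, the sketch below runs the marketing mini-batch from Problem 2 through a freshly created layer and compares the result with the manual formula. It relies on the layer's default initialization (scale 1, shift 0) and on its `eps` attribute, which is added to the biased batch variance before taking the square root.

```python
import torch
import torch.nn as nn

inputs = torch.tensor([[10.0, 20.0], [15.0, 25.0], [12.0, 18.0]])

bn = nn.BatchNorm1d(num_features=2)
bn.train()                      # use batch statistics rather than running averages
bn_out = bn(inputs).detach()

# Manual equivalent: biased variance over the batch, plus eps for stability
mean = inputs.mean(dim=0)
var = inputs.var(dim=0, unbiased=False)
manual_out = (inputs - mean) / torch.sqrt(var + bn.eps)

print(torch.allclose(bn_out, manual_out, atol=1e-5))
```

In `eval()` mode the layer switches to its running mean and variance, which is why the validation block calls `batchnorm_model.eval()` before computing predictions.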




---



**Step11**: Training Process for Batch Normalization

In [None]:
# Train the model for Batch Normalization (Similar training process as above)
epochs = 30
batch_size = 32
num_batches = len(X_train_tensor) // batch_size

train_loss_history = []
val_loss_history = []

for epoch in range(epochs):
    batchnorm_model.train()
    epoch_loss = 0

    for i in range(num_batches):
        X_batch = X_train_tensor[i*batch_size:(i+1)*batch_size]
        y_batch = y_train_tensor[i*batch_size:(i+1)*batch_size]

        optimizer_bn.zero_grad()
        y_pred = # Code here
        loss = # code here
        loss.backward()
        optimizer_bn.step()
        epoch_loss += loss.item()

    train_loss_history.append(epoch_loss / num_batches)

    # Validation loss
    batchnorm_model.eval()
    with torch.no_grad():
        val_predictions = # Code Here
        val_loss = # Code Here
        val_loss_history.append(val_loss.item())

    print(f"Epoch {epoch+1}, Training Loss: {epoch_loss/num_batches:.4f}, Validation Loss: {val_loss.item():.4f}")

# Plot training vs validation loss
plt.plot(range(1, epochs+1), train_loss_history, label="Training Loss")
plt.plot(range(1, epochs+1), val_loss_history, label="Validation Loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.title("Training vs Validation Loss")
plt.show()
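Step 9 also asks for accuracy, which the loops above do not compute. One way to get it from sigmoid outputs is to threshold at 0.5 and compare with the labels, as sketched below. The probabilities and labels are made-up values for illustration, and the 0.5 threshold is an assumption.

```python
import torch

# Hypothetical model outputs (probabilities) and true labels, shape (5, 1)
probs = torch.tensor([[0.9], [0.2], [0.6], [0.4], [0.7]])
labels = torch.tensor([[1.0], [0.0], [0.0], [1.0], [1.0]])

# Threshold probabilities into hard 0/1 predictions
preds = (probs >= 0.5).float()

# Accuracy: fraction of predictions that match the labels
accuracy = (preds == labels).float().mean().item()

# Confusion-matrix counts, with churn (1) as the positive class
tp = int(((preds == 1) & (labels == 1)).sum())
tn = int(((preds == 0) & (labels == 0)).sum())
fp = int(((preds == 1) & (labels == 0)).sum())
fn = int(((preds == 0) & (labels == 1)).sum())

print(f"Accuracy: {accuracy:.2f}")
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```

Applying the same computation to each trained model's predictions on `X_test_tensor` gives a like-for-like comparison of the base, dropout, L2, and batch-normalization variants.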




---

