Business scenario
Project name: Customer Churn Prediction

Business overview: You have been hired by a telecommunications company to develop a machine-learning model that predicts customer churn. The company wants to identify customers who are likely to cancel their service so they can take proactive steps to retain them. The model you develop will be integrated into the company’s customer relationship management (CRM) system and used by the marketing team to target at-risk customers with retention offers.

Project requirements
Predictive accuracy: The model must accurately predict whether a customer is likely to churn based on historical data.

Scalability: The model should be able to handle a large volume of data, as the company has millions of customers.

Integration: The model needs to be easily integrated into the company’s existing CRM system, which is built on a Python-based backend.

Efficiency: The model should be optimized for real-time or near-real-time predictions to allow timely interventions by the marketing team.

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

In [2]:
data = pd.read_csv('Churn-Dataset-Telecom.csv')

print(data.head())

print(data.info())



In [3]:
data = data.drop(columns=['CustomerID'])  # Drop the identifier column; it is not a feature
data = data.dropna()  # Simplest option: drop rows with missing values

Encode categorical variables
Convert categorical variables to numeric columns with one-hot encoding. Passing drop_first=True drops the first level of each variable, which avoids redundant indicator columns.

In [4]:
data = pd.get_dummies(data, drop_first=True)
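To make the effect of the encoding step concrete, here is a minimal sketch on a toy frame. The frame `toy` and its column names (`Contract`, `MonthlyCharges`) are hypothetical and are not taken from the churn dataset.

```python
import pandas as pd

# Hypothetical toy frame; column names are illustrative only
toy = pd.DataFrame({
    "Contract": ["Monthly", "Yearly", "Monthly"],
    "MonthlyCharges": [29.9, 56.1, 42.0],
})

# Numeric columns pass through unchanged; each categorical column becomes
# one indicator column per level, minus the first level (drop_first=True)
encoded = pd.get_dummies(toy, drop_first=True)
print(encoded.columns.tolist())
```

Note that the same renaming applies to every non-numeric column. If the target were stored as text (for example "Yes"/"No", which is an assumption about the data), it would come out as a column such as `Churn_Yes` rather than `Churn`, so it is worth checking `data.columns` after this step.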

In [13]:
X = data.drop('Churn', axis=1)
y = data['Churn']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = X_train.astype(np.float32)
y_train = y_train.astype(np.float32)
X_test = X_test.astype(np.float32)
y_test = y_test.astype(np.float32)

In [10]:
class ChurnModel(nn.Module):
    def __init__(self):
        super(ChurnModel, self).__init__()
        self.fc1 = nn.Linear(X_train.shape[1], 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = nn.functional.dropout(x, 0.5, training=self.training)
        x = torch.relu(self.fc2(x))
        x = torch.sigmoid(self.fc3(x))
        return x

model = ChurnModel()

Compile and train the model

In [11]:
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Ensure the model and data are on the same device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
X_train_tensor = torch.tensor(X_train.values).float().to(device)
y_train_tensor = torch.tensor(y_train.values).float().to(device)

# Training loop (simplified example)
for epoch in range(10):
    model.train()
    optimizer.zero_grad()
    outputs = model(X_train_tensor)
    loss = criterion(outputs.squeeze(), y_train_tensor)
    loss.backward()
    optimizer.step()
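The loop above pushes the entire training set through the model in a single step per epoch. With millions of customers, mini-batches are a common alternative. The following is a sketch only: it uses synthetic data and a stand-in network (`X_demo`, `y_demo`, `demo_model`, and the batch size of 128 are all assumptions) so that it runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for the training tensors (assumption: 20 features)
X_demo = torch.randn(1000, 20)
y_demo = (X_demo[:, 0] > 0).float()

# Stand-in network with the same layer sizes as ChurnModel
demo_model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)
demo_criterion = nn.BCELoss()
demo_optimizer = optim.Adam(demo_model.parameters(), lr=0.001)

# DataLoader shuffles the rows and yields them in batches of 128
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=128, shuffle=True)

epoch_losses = []
for epoch in range(5):
    demo_model.train()
    running = 0.0
    for xb, yb in loader:
        demo_optimizer.zero_grad()
        loss = demo_criterion(demo_model(xb).squeeze(1), yb)
        loss.backward()
        demo_optimizer.step()
        running += loss.item() * xb.size(0)  # weight by batch size
    epoch_losses.append(running / len(X_demo))
    print(f"epoch {epoch + 1}: mean loss {epoch_losses[-1]:.4f}")
```

To apply the same pattern to the notebook's data, the dataset would be built from `X_train_tensor` and `y_train_tensor`, with `model`, `criterion`, and `optimizer` in place of the demo objects.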

Step 5: Evaluate and optimize the model
Evaluate the model’s performance
After training, evaluate the model using the test data. Consider metrics such as accuracy, precision, recall, and F1 score to assess the model’s performance.

In [14]:
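model.eval()
# Disable gradient tracking for inference and keep the inputs on the model's device
with torch.no_grad():
    X_test_tensor = torch.tensor(X_test.values).float().to(device)
    outputs = model(X_test_tensor)
# Move the probabilities back to the CPU before converting to NumPy
predictions = (outputs.squeeze().cpu().numpy() > 0.5).astype(int)
accuracy = np.mean(predictions == y_test.values)
print(f'Test accuracy: {accuracy:.4f}')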
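Accuracy alone can be misleading when churners are a minority of customers. Precision, recall, and F1 score, mentioned above, can be computed with scikit-learn. This is a small sketch on hypothetical label arrays (`y_true_demo` and `y_pred_demo` are made up for illustration).

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical labels: 1 = churned, 0 = stayed
y_true_demo = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred_demo = np.array([1, 0, 0, 1, 0, 1, 1, 1])

precision = precision_score(y_true_demo, y_pred_demo)  # TP / (TP + FP)
recall = recall_score(y_true_demo, y_pred_demo)        # TP / (TP + FN)
f1 = f1_score(y_true_demo, y_pred_demo)                # harmonic mean of the two

print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1 score:  {f1:.3f}")
```

In the notebook, the same calls would take `y_test.values` and `predictions` as arguments.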

