# CUSTOMER CHURN PREDICTION

Develop a model to predict customer churn for a subscription-based
service or business. Use historical customer data, including features such as
usage behavior and customer demographics, and try algorithms such as
Logistic Regression, Random Forests, or Gradient Boosting to predict
churn.

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

In [3]:
# Load the dataset
dataset = pd.read_csv('Churn_Modelling.csv')

In [4]:
# Drop identifier columns (row number, customer ID, surname) that are not useful as predictors
columns_to_drop = ['RowNumber', 'CustomerId', 'Surname']
dataset.drop(columns=columns_to_drop, inplace=True)

In [5]:
# Encode the categorical columns: label-encode Gender, one-hot encode Geography
label_encoder = LabelEncoder()
dataset['Gender'] = label_encoder.fit_transform(dataset['Gender'])
dataset = pd.get_dummies(dataset, columns=['Geography'], drop_first=True)
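
To make the effect of the encoding step concrete, here is a minimal sketch on a made-up four-row frame. The rows are invented for illustration and are not taken from `Churn_Modelling.csv`; only the column names `Gender` and `Geography` come from the notebook.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up rows for illustration only (assumption: not real dataset records).
toy = pd.DataFrame({
    "Gender": ["Female", "Male", "Male", "Female"],
    "Geography": ["France", "Spain", "Germany", "France"],
})

# LabelEncoder sorts the class labels, so Female maps to 0 and Male to 1.
encoder = LabelEncoder()
toy["Gender"] = encoder.fit_transform(toy["Gender"])

# drop_first=True drops the first category in sorted order (France),
# leaving one indicator column each for Germany and Spain.
toy = pd.get_dummies(toy, columns=["Geography"], drop_first=True)

print(encoder.classes_.tolist())
print(list(toy.columns))
```

With `drop_first=True`, a row where both indicator columns are false or zero stands for the dropped category, which avoids a redundant column in a linear model.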

In [6]:
# Separate the features (X) from the target (y)
X = dataset.drop('Exited', axis=1)
y = dataset['Exited']

In [7]:
# Split the data into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [8]:
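# Standardize the numeric columns (the list also covers the 0/1 flags HasCrCard and IsActiveMember).
# The scaler is fitted on the training split only, so no test-set statistics leak into training.
numerical_features = ['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary']
scaler = StandardScaler()
# Work on explicit copies so the column assignments below cannot trigger pandas' SettingWithCopyWarning
X_train, X_test = X_train.copy(), X_test.copy()
X_train[numerical_features] = scaler.fit_transform(X_train[numerical_features])
X_test[numerical_features] = scaler.transform(X_test[numerical_features])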
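
One possible variation on the scaling step, sketched here under assumptions, is to scale only the continuous columns and pass binary flags through unchanged using scikit-learn's `ColumnTransformer`. The frame below is randomly generated stand-in data, and the split into `continuous` versus passthrough columns is an illustrative choice, not something the notebook does.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Randomly generated stand-in data (assumption: not the real dataset).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Age": rng.integers(18, 80, size=50),
    "Balance": rng.uniform(0, 200_000, size=50),
    "HasCrCard": rng.integers(0, 2, size=50),
})

# Hypothetical choice: treat Age and Balance as continuous, leave the flag alone.
continuous = ["Age", "Balance"]
preprocess = ColumnTransformer(
    transformers=[("scale", StandardScaler(), continuous)],
    remainder="passthrough",  # untouched columns are appended after the scaled ones
)

# Output column order: scaled Age, scaled Balance, then the original HasCrCard.
scaled = preprocess.fit_transform(toy)
print(scaled[:3])
```

In a real workflow the transformer would be fitted on the training split and then applied to the test split with `transform`, mirroring the `fit_transform` / `transform` pair used in the cell above.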


In [9]:
# Train Logistic Regression model
model_lr = LogisticRegression(random_state=42)
model_lr.fit(X_train, y_train)
predictions_lr = model_lr.predict(X_test)
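
The notebook imports `accuracy_score`, `confusion_matrix`, and `classification_report` but never calls them. A minimal sketch of how they might be used follows. The helper name `summarize_predictions` and the class names "Stayed" and "Churned" are hypothetical, and the demo labels are hand-made rather than model output.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix


def summarize_predictions(y_true, y_pred):
    """Print accuracy, confusion matrix and per-class report; return accuracy and matrix."""
    acc = accuracy_score(y_true, y_pred)
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    print(f"Accuracy: {acc:.3f}")
    print("Confusion matrix (rows = actual, columns = predicted):")
    print(cm)
    print(classification_report(
        y_true, y_pred,
        labels=[0, 1],
        target_names=["Stayed", "Churned"],  # hypothetical display names for 0 and 1
        zero_division=0,
    ))
    return acc, cm


# Hand-made labels for illustration only (assumption: not results from the model).
demo_true = [0, 0, 0, 0, 1, 1, 0, 1]
demo_pred = [0, 0, 0, 0, 0, 1, 0, 0]
demo_acc, demo_cm = summarize_predictions(demo_true, demo_pred)
```

In the notebook itself the call would be `summarize_predictions(y_test, predictions_lr)`. For churn data, where the positive class is usually the minority, the recall on class 1 is often more informative than overall accuracy.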

In [13]:
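# Display the test-set predictions as True (churn) or False (no churn);
# the index is the row position within the test set, not the original customer row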
churn_predictions = pd.DataFrame({'Churn Prediction': predictions_lr == 1})
print(churn_predictions.tail(25))


      Churn Prediction
1975             False
1976             False
1977             False
1978             False
1979             False
1980             False
1981             False
1982             False
1983             False
1984             False
1985             False
1986             False
1987             False
1988             False
1989             False
1990             False
1991             False
1992             False
1993             False
1994             False
1995             False
1996             False
1997             False
1998             False
1999             False
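
The task statement also names Random Forests and Gradient Boosting, which the cells above do not train. The sketch below shows one way the three model families could be compared under the same split. It runs on synthetic data from `make_classification` (an assumption standing in for the churn dataset), and the hyperparameters are library defaults or arbitrary choices, so the scores it prints say nothing about the real data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in with roughly 80% negatives and 20% positives (assumption).
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit each candidate on the same training split and score it on the same test split.
scores = {}
for name, clf in candidates.items():
    clf.fit(X_tr, y_tr)
    scores[name] = accuracy_score(y_te, clf.predict(X_te))
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

To apply this to the notebook's data, the same loop could be run with `X_train`, `X_test`, `y_train`, and `y_test` in place of the synthetic arrays.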
