<img src="https://www.retently.com/wp-content/uploads/2015/11/leading-causes-of-churn-1.png"/>

<span style="color:#43c175;font-weight:700;font-size:30px">
Customer Churn Prediction </span>
<br/>
<span style="color:#f16e64;font-weight:700;font-size:20px">
Project Overview: </span>
In this project, we predict customer churn: whether a customer is likely to leave a service, for example by switching phone carriers. We use logistic regression, a standard model for binary outcomes, to make these predictions. The dataset describes each customer with features such as how long they have been with the service (tenure) and their monthly charges. Our goal is a model that flags the customers most likely to leave, so that retention efforts can focus on them.

<span style="color:#f16e64;font-weight:700;font-size:20px">
Importing the Dependencies </span>

In [None]:
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
import seaborn as sns

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

<span style="color:#f16e64;font-weight:700;font-size:20px">
Data Preprocessing </span>

In [None]:
# Load the data
df = pd.read_csv('/kaggle/input/telco-customer-churn/WA_Fn-UseC_-Telco-Customer-Churn.csv')
df.head()

In [None]:
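df.drop('customerID', axis=1, inplace=True)
# TotalCharges is read as text because a few rows are blank; make it numeric so it is not one-hot encoded
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce')
df.dropna(subset=['TotalCharges'], inplace=True)
object_columns = df.select_dtypes(include=['object']).columns.to_list()
df = pd.get_dummies(df, columns=object_columns, drop_first=True)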
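
To make the encoding step easier to follow, here is a small sketch of what `pd.get_dummies` with `drop_first=True` does. The `toy` frame and its values are made up for illustration and are not taken from the Telco file; only the column names echo the dataset.

```python
import pandas as pd

# Hypothetical toy frame: two categorical columns and one numeric column
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "Churn": ["Yes", "No", "No", "Yes"],
    "tenure": [1, 24, 60, 12],
})

# Each categorical column becomes one indicator column per category,
# minus the first category (alphabetically), which acts as the baseline
encoded = pd.get_dummies(toy, columns=["Contract", "Churn"], drop_first=True)
print(encoded.columns.tolist())
```

Because `"No"` sorts before `"Yes"`, the `Churn` column collapses to a single `Churn_Yes` indicator, which is why the target is later selected under that name.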

In [None]:
# Split data
X = df.drop('Churn_Yes', axis=1)
y = df['Churn_Yes']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
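
Churn labels are usually imbalanced, so it can help to keep the class ratio the same in the train and test sets. The sketch below shows one way to do that with the `stratify` argument of `train_test_split`. The data is synthetic, and the 25% positive rate is an assumption chosen for illustration, not a figure from this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 1000 rows, 25% positives (assumed rate)
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(1000, 3))
y_toy = np.array([1] * 250 + [0] * 750)

# stratify=y_toy keeps the share of positives equal in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

print("positive share in train:", y_tr.mean())
print("positive share in test: ", y_te.mean())
```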

<span style="color:#f16e64;font-weight:700;font-size:20px">
Train and Evaluate Model </span>

In [None]:
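# Raise the iteration cap: with unscaled features the default of 100 may stop before the solver converges
log_model = LogisticRegression(max_iter=1000)
log_model.fit(X_train, y_train)
y_pred = log_model.predict(X_test)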

In [None]:
# Calculate and display accuracy
accuracy = log_model.score(X_test, y_test)
print("Accuracy:", accuracy)
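
Accuracy alone can look good on imbalanced data even when few churners are caught, so a confusion matrix and per-class precision and recall are a useful companion. Below is a minimal sketch on synthetic data from `make_classification`; the names `X_demo`, `y_demo`, and `demo_model`, the 25% positive rate, and the class labels are all assumptions for illustration. In the notebook itself, the same two metric calls could be applied to `y_test` and `y_pred`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the Telco file), roughly 25% positives by assumption
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.75, 0.25], random_state=42
)
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

demo_model = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
demo_pred = demo_model.predict(Xte)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(yte, demo_pred)
print(cm)

# Precision, recall, and F1 for each class
print(classification_report(yte, demo_pred, target_names=["stayed", "churned"]))
```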

<span style="color:#f16e64;font-weight:700;font-size:20px">
ROC Curve (AUC-ROC)  </span>

In [None]:
# Get predicted probabilities for positive class
y_prob = log_model.predict_proba(X_test)[:, 1]

# Calculate ROC curve and AUC
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)

# Plot ROC curve
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()
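
The ROC curve also carries the decision thresholds, which the cell above discards with `_`. As a sketch of one possible use, the code below picks the threshold that maximizes Youden's J statistic (`tpr - fpr`) and checks that `auc(fpr, tpr)` agrees with `roc_auc_score`. It runs on synthetic data; the names `X_demo`, `y_demo`, and `demo_model` are hypothetical, and Youden's J is just one of several reasonable criteria, not a method used elsewhere in this notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the Telco file)
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.75, 0.25], random_state=42
)
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

demo_model = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
demo_prob = demo_model.predict_proba(Xte)[:, 1]

# Keep the thresholds this time instead of discarding them
fpr, tpr, thresholds = roc_curve(yte, demo_prob)

# Youden's J: the point on the curve farthest above the diagonal
best_idx = int(np.argmax(tpr - fpr))
best_threshold = thresholds[best_idx]
print("Threshold maximizing TPR - FPR:", best_threshold)

# Area under the curve computed two equivalent ways
area_from_curve = auc(fpr, tpr)
area_direct = roc_auc_score(yte, demo_prob)
print("AUC:", area_from_curve, area_direct)
```

A lower threshold than the default 0.5 catches more churners at the cost of more false alarms, so the right choice depends on how expensive each kind of mistake is.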
