Business Understanding

The telecommunications industry is fiercely competitive, and customers can choose among many service providers. Customer churn, the rate at which subscribers discontinue service, is a significant challenge: annual churn in the industry is commonly cited at 15% to 25%, which makes customer retention a top priority. Acquiring a new customer is often estimated to cost 5 to 10 times as much as retaining an existing one, so keeping current subscribers satisfied pays off directly.

To effectively reduce churn, telecom companies need to proactively identify customers at high risk of leaving. Common causes of churn include poor customer service, inadequate products, and unfavorable pricing plans. Losing customers can have serious consequences, including increased costs for customer acquisition and product development, a decline in referrals, and a reduced Customer Lifetime Value (CLV).

In today's challenging economic climate, with rising interest rates, inflation, and an uncertain job market, addressing customer churn is even more critical for telecommunications companies.

Problem Statement

The Sales and Marketing department at Syriatel is struggling to retain customers, and the resulting high churn directly reduces revenue and hinders company growth. Inefficient resource allocation limits the team's ability to optimize marketing campaigns, which wastes budget and reduces the overall effectiveness of marketing efforts. Targeting the wrong customer segments further exacerbates churn and undermines profitability. A customer churn prediction system can address these challenges.

Relation to Syriatel

Implementing a customer churn prediction system offers significant benefits to the Syriatel Sales and Marketing team:

Reduced Churn: Proactive identification and retention of at-risk customers will lead to higher customer lifetime value and increased revenue streams.
Optimized Campaigns: Using customer insights to tailor campaigns will maximize return on investment (ROI) and improve campaign effectiveness.
Precision Marketing: Identifying and engaging high-value prospects will result in higher conversion rates and optimized resource allocation.

In [1]:
# Standard library
import warnings

# Data handling and plotting
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Resampling for class imbalance
from imblearn.over_sampling import SMOTE

# Modeling, preprocessing, and evaluation
from sklearn import preprocessing, tree
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import (GridSearchCV, KFold, RandomizedSearchCV,
                                     cross_val_score, train_test_split)
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Suppress library warnings to keep the notebook output readable
warnings.filterwarnings('ignore')

In [2]:
# Load the dataset from a local CSV file and preview the first five rows
data_path = r"C:\Users\user\Downloads\archive (4)\bigml_59c28831336c6604c800002a.csv"
data = pd.read_csv(data_path)
data.head()

Unnamed: 0,state,account length,area code,phone number,international plan,voice mail plan,number vmail messages,total day minutes,total day calls,total day charge,...,total eve calls,total eve charge,total night minutes,total night calls,total night charge,total intl minutes,total intl calls,total intl charge,customer service calls,churn
0,KS,128,415,382-4657,no,yes,25,265.1,110,45.07,...,99,16.78,244.7,91,11.01,10.0,3,2.7,1,False
1,OH,107,415,371-7191,no,yes,26,161.6,123,27.47,...,103,16.62,254.4,103,11.45,13.7,3,3.7,1,False
2,NJ,137,415,358-1921,no,no,0,243.4,114,41.38,...,110,10.3,162.6,104,7.32,12.2,5,3.29,0,False
3,OH,84,408,375-9999,yes,no,0,299.4,71,50.9,...,88,5.26,196.9,89,8.86,6.6,7,1.78,2,False
4,OK,75,415,330-6626,yes,no,0,166.7,113,28.34,...,122,12.61,186.9,121,8.41,10.1,3,2.73,3,False
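
Because the business framing treats churn as a rate, a natural first check after loading is the share of customers flagged in the `churn` column. The block below is only a sketch of how that check could look, not part of the original notebook: `churn_summary` is a hypothetical helper, and the small `sample` frame copies a few columns from the five rows displayed above so the example runs on its own.

```python
import pandas as pd

# Illustrative frame: a few columns copied from the five rows shown in the
# preview above (assumption: the full dataset would be used in practice).
sample = pd.DataFrame({
    "state": ["KS", "OH", "NJ", "OH", "OK"],
    "international plan": ["no", "no", "no", "yes", "yes"],
    "customer service calls": [1, 1, 0, 2, 3],
    "churn": [False, False, False, False, False],
})


def churn_summary(df, target="churn"):
    """Hypothetical helper: count customers and churners in a boolean target column."""
    return {
        "n_customers": int(len(df)),
        "n_churned": int(df[target].sum()),
        # The mean of a boolean column is the fraction of True values
        "churn_rate": float(df[target].mean()),
    }


summary = churn_summary(sample)
print(summary)
```

On the full dataset the same call would be `churn_summary(data)`. A churn rate far below 50% would indicate an imbalanced target, which is the situation resampling tools such as the imported `SMOTE` are meant for.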
