# Telecom Customer Churn Case Study

You have been provided with a dataset related to telecom customer churn. Each row in the dataset represents a unique customer, and the columns contain various attributes and information about these customers.

The dataset includes information about:
- Churn Column: Indicates whether the customer churned within the last month.
- Services Info: Subscribed services such as phone and internet.
- Account Details: Tenure, contract, billing, and charges.
- Demographics: Gender, senior-citizen status, and family status (partner, dependents).


## Load the dataset in a dataframe

In [30]:
#import necessary libraries

import pandas as pn
import numpy as np

In [31]:
# 1. Import the provided dataset into a dataframe (telecom_customer_churn.csv)
# 2. Change the settings to display all the columns
# 3. Check the number of rows and columns
# 4. Check the top 5 rows

data = pn.read_csv("telecom_customer_churn.csv")
print(data.shape)
data.head(5)


(7043, 21)


|  | customer_id | gender | senior_citizen | partner | dependents | tenure | phone_service | multiple_lines | internet_service | online_security | ... | device_protection | tech_support | streaming_tv | streaming_movies | contract | paperless_billing | payment_method | monthly_charges | total_charges | churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7590-VHVEG | Female | 0 | Yes | No | 1 | No | No phone service | DSL | No | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No |
| 1 | 5575-GNVDE | Male | 0 | No | No | 34 | Yes | No | DSL | Yes | ... | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No |
| 2 | 3668-QPYBK | Male | 0 | No | No | 2 | Yes | No | DSL | Yes | ... | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 3 | 7795-CFOCW | Male | 0 | No | No | 45 | No | No phone service | DSL | Yes | ... | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.3 | 1840.75 | No |
| 4 | 9237-HQITU | Female | 0 | No | No | 2 | Yes | No | Fiber optic | No | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 70.7 | 151.65 | Yes |
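
Step 2 in the cell comments (display all the columns) is what removes the `...` placeholder from a preview like the one above. A minimal sketch, assuming the standard pandas display option `display.max_columns`; the frame `wide` is a hypothetical stand-in for the churn data, not part of the original notebook.

```python
import pandas as pn

# Hypothetical wide frame standing in for the churn data (30 columns).
wide = pn.DataFrame([range(30)], columns=[f"col_{i}" for i in range(30)])

# None removes the column limit, so the preview shows every column
# instead of collapsing the middle ones into "...".
pn.set_option("display.max_columns", None)

preview = repr(wide.head())
print(preview)
```

Running `pn.set_option("display.max_columns", None)` once, before `data.head(5)`, should have the same effect on the real dataframe.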


In [32]:
#display all the column names

data.columns

Index(['customer_id', 'gender', 'senior_citizen', 'partner', 'dependents',
       'tenure', 'phone_service', 'multiple_lines', 'internet_service',
       'online_security', 'online_backup', 'device_protection', 'tech_support',
       'streaming_tv', 'streaming_movies', 'contract', 'paperless_billing',
       'payment_method', 'monthly_charges', 'total_charges', 'churn'],
      dtype='object')

In [59]:
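# Check if the dataset contains nulls
# (the execution count shows this cell ran after the numeric conversion in In [42],
# so values coerced to NaN in total_charges are counted here)
data.isna().sum()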

customer_id           0
gender                0
senior_citizen        0
partner               0
dependents            0
tenure                0
phone_service         0
multiple_lines        0
internet_service      0
online_security       0
online_backup         0
device_protection     0
tech_support          0
streaming_tv          0
streaming_movies      0
contract              0
paperless_billing     0
payment_method        0
monthly_charges       0
total_charges        11
churn                 0
dtype: int64

In [39]:
#check the datatype of all columns

data.dtypes

customer_id          object
gender               object
senior_citizen        int64
partner              object
dependents           object
tenure                int64
phone_service        object
multiple_lines       object
internet_service     object
online_security      object
online_backup        object
device_protection    object
tech_support         object
streaming_tv         object
streaming_movies     object
contract             object
paperless_billing    object
payment_method       object
monthly_charges       int64
total_charges        object
churn                object
dtype: object

In [42]:
# Fix the datatype
# Convert 'monthly_charges', 'total_charges', and 'tenure' to a numeric datatype (pn.to_numeric).
# errors="coerce" turns any value that cannot be parsed into NaN instead of raising an error.
data["monthly_charges"] = pn.to_numeric(data["monthly_charges"], errors="coerce")
data["total_charges"] = pn.to_numeric(data["total_charges"], errors="coerce")
data["tenure"] = pn.to_numeric(data["tenure"], errors="coerce")

# Check datatypes
print(data.dtypes)


customer_id           object
gender                object
senior_citizen         int64
partner               object
dependents            object
tenure                 int64
phone_service         object
multiple_lines        object
internet_service      object
online_security       object
online_backup         object
device_protection     object
tech_support          object
streaming_tv          object
streaming_movies      object
contract              object
paperless_billing     object
payment_method        object
monthly_charges        int64
total_charges        float64
churn                 object
dtype: object
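
The effect of `errors="coerce"` is easiest to see on a small example. The sketch below uses a hypothetical `raw` series (not taken from the dataset) containing one blank entry, which is the kind of value that would leave a column stored as `object`.

```python
import pandas as pn

# Hypothetical raw values: numeric strings plus one blank entry.
raw = pn.Series(["29.85", "1889.5", " ", "108.15"])

# errors="coerce" replaces anything unparseable with NaN instead of raising.
converted = pn.to_numeric(raw, errors="coerce")

print(converted.dtype)         # float64
print(converted.isna().sum())  # 1
```

This would explain how a column can report zero nulls while it is still text and a non-zero count once it has been converted.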


Q1 - Calculate the mean, median, and mode of the monthly_charges column

In [50]:
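# Print in the order the question asks for: mean, median, mode
print(data["monthly_charges"].mean())
print(data["monthly_charges"].median())
print(data["monthly_charges"].mode())  # mode() returns a Series, since ties are possible

64.29589663495669
70.0
0    19
Name: monthly_charges, dtype: int64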
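
`Series.mode()` prints an index and a `Name:` line because it returns a Series rather than a single number. A small sketch with hypothetical values shows why: when two values tie for most frequent, both are returned.

```python
import pandas as pn

# Hypothetical charges with two values tied for most frequent.
charges = pn.Series([19, 19, 20, 20, 70])

modes = charges.mode()      # Series holding every tied mode, sorted ascending
print(modes.tolist())       # [19, 20]
print(charges.mean())       # 29.6
print(charges.median())     # 20.0
```

If a single value is needed, `modes.iloc[0]` picks the smallest of the tied modes.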


Q2 - Calculate the 25th, 50th, and 75th percentiles of the total_charges column

In [64]:
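# Drop rows with missing values first: np.percentile returns nan when NaNs are present.
# Note that this reassigns `data`, so every later cell works on the filtered rows.
data = data.dropna()

print(np.percentile(data["total_charges"], 25))
print(np.percentile(data["total_charges"], 50))
print(np.percentile(data["total_charges"], 75))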

401.45
1397.475
3794.7375
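
As an alternative to dropping rows from the whole dataframe, the percentiles can be computed with NaN-aware functions. This is a sketch on hypothetical data, not the notebook's method; `Series.quantile` and `np.nanpercentile` both skip missing values.

```python
import numpy as np
import pandas as pn

# Hypothetical total charges with one missing value.
total = pn.Series([100.0, 200.0, np.nan, 300.0, 400.0])

# Series.quantile skips NaN, so the frame does not need dropna() first.
quartiles = total.quantile([0.25, 0.50, 0.75])
print(quartiles.tolist())   # [175.0, 250.0, 325.0]

# np.nanpercentile is the NaN-aware counterpart of np.percentile.
nan_aware = np.nanpercentile(total.to_numpy(), [25, 50, 75])
print(nan_aware)
```

Either call leaves the other columns of the dataframe untouched, which matters if later questions should still see every customer.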


Q3 - Calculate the range of the monthly_charges column

Hint - Range is the difference between max and min of monthly_charges.

In [66]:
data["monthly_charges"].max() - data["monthly_charges"].min()

np.int64(100)
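
The same range can be cross-checked with NumPy's `np.ptp` ("peak to peak") function. The values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np
import pandas as pn

# Hypothetical monthly charges.
charges = pn.Series([18, 45, 70, 118])

manual_range = charges.max() - charges.min()   # max minus min, as in the cell above
ptp_range = np.ptp(charges.to_numpy())         # NumPy helper computing the same difference

print(manual_range, ptp_range)                 # both are 100
```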

Q4 - What is the first quartile of the monthly_charges column for customers who have not churned?

In [78]:
# Filter customers who have not churned
not_churned = data[data["churn"] == "No"]

# Compute the 25th percentile (first quartile) for numeric columns only
q1 = not_churned.quantile(0.25, numeric_only=True)

print(q1["monthly_charges"])


25.0
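
When the same quartile is wanted for both churn groups, `groupby` avoids filtering twice. A sketch on a hypothetical `toy` frame that reuses the notebook's column names:

```python
import pandas as pn

# Hypothetical data reusing the notebook's column names.
toy = pn.DataFrame({
    "churn": ["No", "No", "No", "No", "Yes", "Yes", "Yes"],
    "monthly_charges": [20, 30, 40, 50, 70, 80, 90],
})

# One first quartile per churn group.
q1_by_churn = toy.groupby("churn")["monthly_charges"].quantile(0.25)

print(q1_by_churn["No"])   # 27.5
print(q1_by_churn["Yes"])  # 75.0
```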


Q5 - What is the third quartile of the total_charges column for customers who have churned?

In [88]:
# Filter customers who have churned
d = data[data["churn"] == "Yes"]

# Compute the 75th percentile (third quartile) for numeric columns only
q3 = d.quantile(0.75, numeric_only=True)

print(q3["total_charges"])

2331.3
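
Quartiles such as the one above usually fall between two observed values. The sketch below reproduces the arithmetic in plain Python, assuming the linear interpolation that pandas and NumPy use by default; `linear_quantile` is a hypothetical helper written for illustration, not a library function.

```python
def linear_quantile(values, q):
    """Quantile by linear interpolation between the two nearest sorted values."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)          # fractional index into the sorted data
    lower = int(position)                      # index of the value at or below the position
    upper = min(lower + 1, len(ordered) - 1)   # index of the next value, capped at the end
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


sample = [100.0, 400.0, 200.0, 300.0]   # hypothetical total charges
print(linear_quantile(sample, 0.75))    # 325.0
```

Here the position is 0.75 × 3 = 2.25, so the result sits a quarter of the way from 300 to 400.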


Q6 - What is the mode of the payment_method column for customers who have churned?

In [91]:
d["payment_method"].mode()

0    Electronic check
Name: payment_method, dtype: object
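
The mode only names the most frequent category. `value_counts()` shows how far ahead it is. A sketch with hypothetical payment records that reuse category labels seen in the preview:

```python
import pandas as pn

# Hypothetical payment methods for a handful of churned customers.
payments = pn.Series([
    "Electronic check", "Mailed check", "Electronic check",
    "Bank transfer (automatic)", "Electronic check", "Mailed check",
])

counts = payments.value_counts()   # frequencies, sorted from most to least common
print(counts)
print(counts.idxmax())             # Electronic check
```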

Q7 - What is the mean of the total charges column for customers who have churned and have a month-to-month contract?

In [102]:
# Filter the rows based on the churn status and contract type
d = data[(data["churn"] == "Yes") & (data["contract"] == "Month-to-month")]

# Calculate the mean of the total charges column and display the result
d["total_charges"].mean()

np.float64(1164.4605740181269)
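
The filter combines two conditions with `&`, and each comparison needs its own parentheses because `&` binds more tightly than `==`. A sketch on a hypothetical `toy` frame, using `.loc` to select the rows and the column in one step:

```python
import pandas as pn

# Hypothetical data reusing the notebook's column names.
toy = pn.DataFrame({
    "churn": ["Yes", "Yes", "No", "Yes"],
    "contract": ["Month-to-month", "One year", "Month-to-month", "Month-to-month"],
    "total_charges": [100.0, 2000.0, 500.0, 300.0],
})

# Both conditions must hold; parentheses keep each comparison intact.
mask = (toy["churn"] == "Yes") & (toy["contract"] == "Month-to-month")

mean_total = toy.loc[mask, "total_charges"].mean()
print(mean_total)  # 200.0
```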

Q8 - What is the median of the tenure column for customers who have not churned and have a two-year contract?

In [106]:
# Filter the rows based on the churn status and contract type
dd = data[(data["churn"] == "No") & (data["contract"] == "Two year")]

# Calculate the median of the tenure column and display the result
dd["tenure"].median()

np.float64(64.0)
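
To compare this median against the other churn and contract combinations, grouping by both columns gives every combination at once. A sketch on a hypothetical `toy` frame:

```python
import pandas as pn

# Hypothetical data reusing the notebook's column names.
toy = pn.DataFrame({
    "churn": ["No", "No", "No", "No", "No", "Yes"],
    "contract": ["Two year", "Two year", "Two year",
                 "Month-to-month", "Month-to-month", "Two year"],
    "tenure": [60, 64, 72, 5, 10, 30],
})

# Median tenure for each (churn, contract) pair.
medians = toy.groupby(["churn", "contract"])["tenure"].median()

print(medians.loc[("No", "Two year")])        # 64.0
print(medians.loc[("No", "Month-to-month")])  # 7.5
```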