# Telecom Customer Churn Case Study

You have been provided with a dataset related to telecom customer churn. Each row in the dataset represents a unique customer, and the columns contain various attributes and information about these customers.

The dataset includes information about:
- Churn Column: Indicates customer churn within the last month.
- Services Info: Subscribed services like phone, internet, etc.
- Account Details: Tenure, contract, billing, charges.
- Demographics: Gender, age, and family status.


## Load the dataset in a dataframe

In [None]:
#import necessary libraries

import pandas as pd
from matplotlib import pyplot as plt

In [None]:
#1. import the provided dataset to dataframe (telecom_customer_churn.csv)
df = pd.read_csv("telecom_customer_churn.csv")
df.contract.unique()

array(['Month-to-month', 'One year', 'Two year'], dtype=object)

In [None]:
#2. change the settings to display all the columns

pd.set_option('display.max_columns', None)

In [None]:
#3. check the number of rows and columns
df.shape

(7043, 21)

In [None]:
#4. check the top 5 rows

df.head()

In [None]:
#display all the column names
print(df.columns)

Index(['customer_id', 'gender', 'senior_citizen', 'partner', 'dependents',
       'tenure', 'phone_service', 'multiple_lines', 'internet_service',
       'online_security', 'online_backup', 'device_protection', 'tech_support',
       'streaming_tv', 'streaming_movies', 'contract', 'paperless_billing',
       'payment_method', 'monthly_charges', 'total_charges', 'churn'],
      dtype='object')


In [None]:
# Check if the dataset contains nulls

df.isnull().sum()

customer_id          0
gender               0
senior_citizen       0
partner              0
dependents           0
tenure               0
phone_service        0
multiple_lines       0
internet_service     0
online_security      0
online_backup        0
device_protection    0
tech_support         0
streaming_tv         0
streaming_movies     0
contract             0
paperless_billing    0
payment_method       0
monthly_charges      0
total_charges        0
churn                0
dtype: int64

In [None]:
#check the datatype of all columns

df.info()

In [None]:
# Fix the datatype
#convert the datatype of 'monthly_charges', 'total_charges', 'tenure' to numeric datatype (pd.to_numeric)


df.monthly_charges = pd.to_numeric(df['monthly_charges'])
df.total_charges = pd.to_numeric(df["total_charges"])
df.tenure = pd.to_numeric(df['tenure'])
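
`pd.to_numeric` raises a `ValueError` when a column holds strings it cannot parse, and `isnull()` does not count blank strings as missing. The sketch below uses a small made-up Series (not the real dataset) to show how `errors="coerce"` turns such entries into `NaN` so they can be counted. Whether `total_charges` in this file actually contains such entries is an assumption to check, not something the outputs above confirm.

```python
import pandas as pd

# Hypothetical sample values; the blank string stands in for a non-numeric entry
sample = pd.Series(["29.85", "1889.5", " ", "108.15"])

# A blank string is a valid (non-null) string, so isnull() does not flag it
n_null_before = sample.isnull().sum()

# errors="coerce" converts unparseable entries to NaN instead of raising
converted = pd.to_numeric(sample, errors="coerce")
n_null_after = converted.isnull().sum()

print(n_null_before, n_null_after)
```

If the count after conversion is greater than zero, those rows can then be inspected before deciding whether to drop or fill them.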


In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   customer_id        7043 non-null   object 
 1   gender             7043 non-null   object 
 2   senior_citizen     7043 non-null   int64  
 3   partner            7043 non-null   object 
 4   dependents         7043 non-null   object 
 5   tenure             7043 non-null   int64  
 6   phone_service      7043 non-null   object 
 7   multiple_lines     7043 non-null   object 
 8   internet_service   7043 non-null   object 
 9   online_security    7043 non-null   object 
 10  online_backup      7043 non-null   object 
 11  device_protection  7043 non-null   object 
 12  tech_support       7043 non-null   object 
 13  streaming_tv       7043 non-null   object 
 14  streaming_movies   7043 non-null   object 
 15  contract           7043 non-null   object 
 16  paperless_billing  7043 

Q1 - Calculate the mean, median, and mode of the monthly_charges column

In [None]:
df.monthly_charges.mean(),df.monthly_charges.median(),df.monthly_charges.mode()
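
Note that `mean()` and `median()` return a single number, while `mode()` returns a Series, because more than one value can be tied for most frequent. A minimal sketch on made-up charges (the values are illustrative, not taken from the dataset):

```python
import pandas as pd

# Hypothetical charges with two values tied for most frequent
charges = pd.Series([20.0, 20.0, 35.5, 35.5, 80.0])

mean_charge = charges.mean()      # arithmetic average
median_charge = charges.median()  # middle value of the sorted data
modes = charges.mode()            # Series holding every tied mode, sorted ascending

print(mean_charge, median_charge)
print(modes.tolist())
```

When a single value is needed, `modes.iloc[0]` picks the smallest of the tied modes.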

Q2 - Calculate the 25th, 50th, and 75th percentiles of the total_charges column

In [None]:
df.total_charges.quantile([0.25]),df.total_charges.quantile([0.50]),df.total_charges.quantile([0.75])
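
The three percentiles can also be requested in one call by passing a list to `quantile`, which returns a single Series indexed by the requested fractions. A small sketch on made-up totals (the data is hypothetical):

```python
import pandas as pd

# Hypothetical total charges
totals = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

# One call returns the 25th, 50th, and 75th percentiles together
quartiles = totals.quantile([0.25, 0.50, 0.75])
print(quartiles)
```

By default pandas interpolates linearly between data points when a percentile falls between two values.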

Q3 - Calculate the range of the monthly_charges column

Hint - Range is the difference between max and min of monthly_charges.

In [None]:
charge_range = df.monthly_charges.max() - df.monthly_charges.min()
charge_range

100.5

Q4 - What is the first quartile of the monthly_charges column for customers who have not churned?

In [None]:
Non_churn = df[df["churn"]=='No']
qtl1= Non_churn["monthly_charges"].quantile([0.25])
qtl1

0.25    25.1
Name: monthly_charges, dtype: float64

Q5 - What is the third quartile of the total_charges column for customers who have churned?

In [None]:
churn = df[df["churn"]=='Yes']
qtl3= churn["total_charges"].quantile([0.75])
qtl3
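
Q4 and Q5 both filter on `churn` and then take a quantile. As an alternative sketch, `groupby` computes the same statistic for every churn group at once. The mini-dataset below is made up and only reuses the column names from this case study:

```python
import pandas as pd

# Hypothetical mini-dataset with the same column names as the case study
toy = pd.DataFrame({
    "churn": ["No", "No", "No", "Yes", "Yes", "Yes"],
    "monthly_charges": [20.0, 40.0, 60.0, 70.0, 80.0, 90.0],
})

# First quartile of monthly_charges within each churn group
q1_by_churn = toy.groupby("churn")["monthly_charges"].quantile(0.25)
print(q1_by_churn)
```

Indexing the result, for example `q1_by_churn["No"]`, gives the value for a single group.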

Q6 - What is the mode of the payment_method column for customers who have churned?

In [None]:
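churn = df[df["churn"]=='Yes']

# unique() lists the distinct payment methods; it does not give the mode
churn["payment_method"].unique()

array(['Mailed check', 'Electronic check', 'Bank transfer (automatic)',
       'Credit card (automatic)'], dtype=object)

In [None]:
# mode() returns the most frequent payment method among churned customers
churn["payment_method"].mode()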

Q7 - What is the mean of the total_charges column for customers who have churned and have a month-to-month contract?

In [None]:
# Filter the rows based on the churn status and contract type
churn = df[(df["churn"]=='Yes') & (df["contract"]=="Month-to-month")]

# Calculate the mean of the total charges column

val = churn["total_charges"].mean()

# Print the result
val

Q8 - What is the median of the tenure column for customers who have not churned and have a two-year contract?

In [None]:
# Filter the rows based on the churn status and contract type
Non_churn = df[(df["churn"]=='No') & (df["contract"]=="Two year")]

# Calculate the median of the tenure column

M_N_tenure = Non_churn["tenure"].median()


# Print the result
M_N_tenure


64.0
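
Q7 and Q8 filter on two conditions and then aggregate. A possible alternative is to group by both `churn` and `contract` and compute the statistics for every combination in one table. The sketch below runs on made-up rows that only borrow the column names; `mean_total_charges` and `median_tenure` are illustrative output names, not columns of the dataset:

```python
import pandas as pd

# Hypothetical rows using the case study's column names
toy = pd.DataFrame({
    "churn": ["Yes", "Yes", "No", "No", "No"],
    "contract": ["Month-to-month", "Month-to-month",
                 "Two year", "Two year", "Two year"],
    "total_charges": [100.0, 300.0, 4000.0, 5000.0, 6000.0],
    "tenure": [2, 6, 60, 64, 72],
})

# Named aggregation: one output column per (source column, statistic) pair
summary = toy.groupby(["churn", "contract"]).agg(
    mean_total_charges=("total_charges", "mean"),
    median_tenure=("tenure", "median"),
)
print(summary)
```

A single combination can be read with a tuple key, for example `summary.loc[("No", "Two year"), "median_tenure"]`.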