# TITLE : CUSTOMER CHURN ANALYSIS ON VODAFONE TELCO

## BUSINESS UNDERSTANDING

##### In today's fiercely competitive business environment, maximizing profitability and revenue margins is a top priority for most companies. Central to this goal is building strong customer relationships and maintaining high retention rates. Because retention drives growth and profitability, businesses allocate significant resources to understanding and combating churn, the phenomenon of customers ending their relationship with a company. To tackle churn effectively, companies are increasingly turning to machine learning techniques, particularly classification models. These supervised learning models are trained on historical customer data with known outcomes and learn patterns that let them predict which category a new observation belongs to, for example whether a customer is likely to churn. In this project, the main goal is to build a classification model that predicts which class (churn or no churn) a new customer record falls into, and so helps manage customer churn.

## FORMULATION OF HYPOTHESIS 

#### NULL HYPOTHESIS: There is no significant difference in the effectiveness of the classification model in identifying customer churn compared to random guessing.

#### ALTERNATE HYPOTHESIS: The classification model significantly outperforms random guessing in identifying customer churn, demonstrating its efficacy in managing and reducing churn rates.
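One way to make these hypotheses testable is a one-sided exact binomial test on held-out predictions. Under the null hypothesis, each prediction on the binary churn label is correct with probability 0.5, as with a fair coin flip. The sketch below uses only the standard library. The helper name `binomial_p_value`, the significance level, and the counts are illustrative assumptions, not results from this project.

```python
from math import comb


def binomial_p_value(n_correct: int, n_total: int, p_null: float = 0.5) -> float:
    """One-sided exact binomial test.

    Returns P(X >= n_correct) for X ~ Binomial(n_total, p_null), i.e. the
    probability of doing at least this well by random guessing alone.
    """
    return sum(
        comb(n_total, k) * p_null**k * (1 - p_null) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


# Hypothetical placeholder counts, NOT results from this project
n_total = 200      # held-out customers scored by the model
n_correct = 150    # predictions that matched the true churn label

alpha = 0.05       # assumed significance level
p_value = binomial_p_value(n_correct, n_total)
reject_null = p_value < alpha

print(f"p-value = {p_value:.3g}, reject null hypothesis: {reject_null}")
```

Because churn data is often imbalanced, a majority-class baseline (always predicting "no churn") can be a stricter comparison than a coin flip. Passing that baseline's accuracy as `p_null` is one way to adapt the sketch.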




##### IMPORTING PACKAGES AND LOADING ENVIRONMENT 

In [2]:
import pandas as pd
import pyodbc 
from dotenv import dotenv_values
import os
import warnings 
import seaborn as sns 
import matplotlib.pyplot as plt
warnings.filterwarnings('ignore')


In [4]:
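# Load environment variables from the .env file into a dictionary
environment_variables = dotenv_values('.env')

# Get the credentials set in the .env file.
# dotenv_values() returns a dict and does not modify os.environ,
# so the values are read from the dictionary rather than with os.getenv().
server = environment_variables.get('server')
user = environment_variables.get('user')
password = environment_variables.get('password')
database = environment_variables.get('database')
name = environment_variables.get('name')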

# Connection string
conn_str = f'DRIVER={{ODBC Driver 17 for SQL Server}};SERVER={server};DATABASE={database};UID={user};PWD={password}'

# Establish connection
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()
print("Connection successful!")


Connection successful!
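If a key is missing from the `.env` file, the connection string silently contains `None`, and the failure only appears later as a driver error. A minimal sketch of a guard is shown below. The helper `build_conn_str` and the placeholder credentials are hypothetical, and the key names are assumed to match those read in the cell above.

```python
def build_conn_str(config: dict) -> str:
    """Build an ODBC connection string from a dict of credentials.

    `config` is expected to look like the dict returned by dotenv_values().
    Raises ValueError if a required key is missing or empty.
    """
    required = ("server", "database", "user", "password")
    missing = [key for key in required if not config.get(key)]
    if missing:
        raise ValueError(f"Missing keys in .env: {', '.join(missing)}")
    return (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={config['server']};"
        f"DATABASE={config['database']};"
        f"UID={config['user']};"
        f"PWD={config['password']}"
    )


# Hypothetical placeholder values, for illustration only
example = {
    "server": "example-host",
    "database": "example_db",
    "user": "example_user",
    "password": "example_pw",
}
conn_str_example = build_conn_str(example)
print(conn_str_example)
```

In the notebook, the result of `build_conn_str(environment_variables)` could then be passed to `pyodbc.connect`.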


##### SQL query to retrieve data from the table 'dbo.LP2_Telco_churn_first_3000' and view the first 5 rows of the table


In [5]:
first_datasets = "SELECT * FROM dbo.LP2_Telco_churn_first_3000"


In [6]:
table_1 = pd.read_sql(first_datasets,conn)
table_1.head()

|   | customerID | gender | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | ... | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7590-VHVEG | Female | False | True | False | 1 | False |  | DSL | False | ... | False | False | False | False | Month-to-month | True | Electronic check | 29.85 | 29.85 | False |
| 1 | 5575-GNVDE | Male | False | False | False | 34 | True | False | DSL | True | ... | True | False | False | False | One year | False | Mailed check | 56.950001 | 1889.5 | False |
| 2 | 3668-QPYBK | Male | False | False | False | 2 | True | False | DSL | True | ... | False | False | False | False | Month-to-month | True | Mailed check | 53.849998 | 108.150002 | True |
| 3 | 7795-CFOCW | Male | False | False | False | 45 | False |  | DSL | True | ... | True | True | False | False | One year | False | Bank transfer (automatic) | 42.299999 | 1840.75 | False |
| 4 | 9237-HQITU | Female | False | False | False | 2 | True | False | Fiber optic | False | ... | False | False | False | False | Month-to-month | True | Electronic check | 70.699997 | 151.649994 | True |


In [8]:
table_1.describe().T

|   | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| tenure | 3000.0 | 32.527333 | 24.637768 | 0.0 | 9.0 | 29.0 | 56.0 | 72.0 |
| MonthlyCharges | 3000.0 | 65.3474 | 30.137053 | 18.4 | 35.787499 | 70.900002 | 90.262501 | 118.650002 |
| TotalCharges | 2995.0 | 2301.278315 | 2274.987884 | 18.799999 | 415.25 | 1404.650024 | 3868.725098 | 8564.75 |
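The `count` for `TotalCharges` is 2995 against 3000 for the other two columns, which suggests five missing values in that column. A quick way to confirm this kind of gap is to count missing entries per column and inspect the affected rows. The sketch below runs on a small made-up frame (`toy`) standing in for `table_1`. Its values are illustrative only.

```python
import pandas as pd

# Hypothetical toy frame standing in for table_1; the values are made up
toy = pd.DataFrame(
    {
        "tenure": [1, 34, 2, 0],
        "MonthlyCharges": [29.85, 56.95, 53.85, 52.55],
        "TotalCharges": [29.85, 1889.5, 108.15, None],
    }
)

# Number of missing entries in each column
missing_per_column = toy.isna().sum()

# Rows that contain at least one missing value
rows_with_missing = toy[toy.isna().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```

Applied to the real data, the same two expressions on `table_1` would show which columns and rows account for the gap.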
