# Vodafone Customer Attrition Predictor

### `Business Understanding`


#### **Problem Statement:** Vodafone is facing a growing challenge with customer attrition. This rising churn rate poses a threat to the company's operational efficiency and future growth. 

#### **Project Goal:** The aim of the project is to estimate the likelihood of a customer leaving the organization, identify the key indicators of churn, and propose retention strategies that can be implemented to address the problem.

#### **Stakeholders:** 
 - Vodafone
 - Business Team
 - Marketing Team


#### **Key Metrics and Success Criteria**
 - The model should achieve an accuracy score of at least 85% (on balanced data)
 - Good models are expected to have an F1 score above 0.80 (80%)
 - There should be at least 4 baseline models
 - Hyperparameter tuning should only be applied to baseline models that exceed the F1 threshold
 - An area under the ROC curve (ROC AUC) of at least 0.80 (80%) is ideal, as it suggests the model generalizes well
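
The criteria above can be checked programmatically once a model produces predictions. The following is a minimal sketch, assuming scikit-learn is installed. The labels and scores are made-up toy values, and the `THRESHOLDS` dictionary and `evaluate` helper are hypothetical names introduced for illustration, not part of the project.

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Thresholds taken from the success criteria above.
THRESHOLDS = {"accuracy": 0.85, "f1": 0.80, "roc_auc": 0.80}


def evaluate(y_true, y_pred, y_score):
    """Return each metric's value and whether it meets its threshold."""
    values = {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_score),
    }
    return {name: (value, value >= THRESHOLDS[name]) for name, value in values.items()}


# Toy data (illustrative only): 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]  # predicted churn probabilities
y_pred = [int(score >= 0.5) for score in y_score]   # 0.5 cut-off is an assumption

report = evaluate(y_true, y_pred, y_score)
for name, (value, ok) in report.items():
    print(f"{name}: {value:.2f} ({'meets' if ok else 'below'} target)")
```

Note that accuracy and F1 depend on the chosen probability cut-off, while ROC AUC is computed from the raw scores and does not.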



#### **Features**
- Gender -- Whether the customer is male or female

- SeniorCitizen -- Whether a customer is a senior citizen or not

- Partner -- Whether the customer has a partner or not (Yes, No)

- Dependents -- Whether the customer has dependents or not (Yes, No)

- Tenure -- Number of months the customer has stayed with the company

- PhoneService -- Whether the customer has a phone service or not (Yes, No)

- MultipleLines -- Whether the customer has multiple lines or not

- InternetService -- Customer's internet service provider (DSL, Fiber Optic, No)

- OnlineSecurity -- Whether the customer has online security or not (Yes, No, No internet service)

- OnlineBackup -- Whether the customer has online backup or not (Yes, No, No internet service)

- DeviceProtection -- Whether the customer has device protection or not (Yes, No, No internet service)

- TechSupport -- Whether the customer has tech support or not (Yes, No, No internet service)

- StreamingTV -- Whether the customer has streaming TV or not (Yes, No, No internet service)

- StreamingMovies -- Whether the customer has streaming movies or not (Yes, No, No internet service)

- Contract -- The contract term of the customer (Month-to-Month, One year, Two year)

- PaperlessBilling -- Whether the customer has paperless billing or not (Yes, No)

- PaymentMethod -- The customer's payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

- MonthlyCharges -- The amount charged to the customer monthly

- TotalCharges -- The total amount charged to the customer

- Churn -- Whether the customer churned or not (Yes or No)
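
A data dictionary like the one above can double as a validation check after loading the data. The sketch below assumes pandas is installed. The `EXPECTED` mapping covers only three columns for brevity, and `unexpected_values` is a hypothetical helper, not a function from the project. The toy frame stands in for the real dataset.

```python
import pandas as pd

# Expected categories for a few columns, copied from the feature list above.
EXPECTED = {
    "Partner": {"Yes", "No"},
    "Contract": {"Month-to-Month", "One year", "Two year"},
    "Churn": {"Yes", "No"},
}


def unexpected_values(df, expected=EXPECTED):
    """Map each checked column to the set of values not listed in `expected`."""
    problems = {}
    for column, allowed in expected.items():
        # Missing values are ignored here; they need a separate check.
        found = set(df[column].dropna().unique()) - allowed
        if found:
            problems[column] = found
    return problems


# Toy data (illustrative only) with one inconsistent label: "yes".
toy = pd.DataFrame({
    "Partner": ["Yes", "No", "yes"],
    "Contract": ["Month-to-Month", "Two year", "One year"],
    "Churn": ["No", "Yes", None],
})

print(unexpected_values(toy))
```

A check of this kind tends to surface inconsistent casing or alternative encodings before they reach the encoder.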




##### **Null Hypothesis:** There is no statistically significant relationship between customer demographics, service usage patterns, contract details, and the likelihood of customer churn at Vodafone.


##### **Alternate Hypothesis:** There is a statistically significant relationship between customer demographics, service usage patterns, contract details, and the likelihood of customer churn at Vodafone.
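
One common way to test a hypothesis of this kind for a single categorical feature is a chi-square test of independence against `Churn`. The sketch below is an assumption about how such a test could look, not the method used in this notebook. It assumes pandas and SciPy are installed, uses made-up toy data, and picks a significance level of 0.05 for illustration.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data (illustrative only); the real analysis would use the loaded dataset.
toy = pd.DataFrame({
    "Contract": ["Month-to-Month"] * 10 + ["Two year"] * 10,
    "Churn": ["Yes"] * 8 + ["No"] * 2 + ["Yes"] * 1 + ["No"] * 9,
})

# Contingency table of contract type against churn outcome.
table = pd.crosstab(toy["Contract"], toy["Churn"])

# Chi-square test of independence between the two categorical variables.
chi2, p_value, dof, expected = chi2_contingency(table)

alpha = 0.05  # assumed significance level
decision = "reject" if p_value < alpha else "fail to reject"
print(f"chi2={chi2:.2f}, p={p_value:.4f}, dof={dof}: {decision} the null hypothesis")
```

Because the hypotheses span several feature groups, a test like this would need to be repeated per feature, ideally with a correction for multiple comparisons.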



##### **Analytical Questions:**
1. Which customer demographics are most strongly associated with churn, and how do factors like gender, senior citizen status, and having a partner or dependents influence churn rates?

2. How does the duration of customer tenure influence the likelihood of churn, and is there a specific tenure period during which customers are more likely to leave?

3. What is the impact of different service types (e.g., DSL, Fiber Optic, no internet service) on customer churn, and which services are most closely associated with higher churn rates?

4. How do contract terms (e.g., month-to-month, one-year, two-year) and billing preferences (e.g., paperless billing) affect customer churn rates?

5. What role do additional services (e.g., online security, tech support, streaming TV, streaming movies) play in influencing customer churn, and which of these services are most effective in retaining customers?
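
Most of the questions above reduce to comparing churn rates across groups. The following is a minimal sketch of that pattern for questions 2 and 4, assuming pandas is installed. The data is a made-up toy sample, the `Churned` column is a hypothetical helper column, and the tenure band edges are an assumption chosen for illustration.

```python
import pandas as pd

# Toy data (illustrative only); column names follow the feature list above.
toy = pd.DataFrame({
    "Tenure": [1, 3, 5, 10, 14, 20, 30, 40, 55, 70],
    "Contract": ["Month-to-Month"] * 5 + ["One year"] * 2 + ["Two year"] * 3,
    "Churn": ["Yes", "Yes", "No", "Yes", "No", "No", "Yes", "No", "No", "No"],
})

# Encode the target as 0/1 so that a group mean is a churn rate.
toy["Churned"] = (toy["Churn"] == "Yes").astype(int)

# Question 2: churn rate per tenure band (band edges are an assumption).
bands = pd.cut(toy["Tenure"], bins=[0, 12, 24, 48, 72],
               labels=["0-12", "13-24", "25-48", "49-72"])
churn_by_tenure = toy.groupby(bands, observed=True)["Churned"].mean()

# Question 4: churn rate per contract type.
churn_by_contract = toy.groupby("Contract")["Churned"].mean()

print(churn_by_tenure)
print(churn_by_contract)
```

The same `groupby(...).mean()` pattern applies to the service and add-on questions by swapping in the relevant column.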






### `Data Understanding`

### **Imports**

In [1]:
# Data manipulation and visualization packages
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Database connection and environment packages
import pyodbc
from dotenv import dotenv_values    # reads key-value pairs from a .env file

# Modeling and evaluation packages
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Suppress warnings to keep the notebook output readable
import warnings

warnings.filterwarnings('ignore')

### **Imports for the Server Connection**

In [2]:
import pyodbc     
from dotenv import dotenv_values    #import the dotenv_values function from the dotenv package
import warnings 

warnings.filterwarnings('ignore')

### **Load Dataset**

In [3]:
# Load the dataset from the local CSV file
# (forward slashes work on Windows, macOS, and Linux)
df = pd.read_csv('../data/LP2_Telco-churn-second-2000.csv')

# Load environment variables from the .env file into a dictionary
environment_variables = dotenv_values('.env')

# Get the values for the credentials you set in the '.env' file
server = environment_variables.get("server")
database = environment_variables.get("database")
username = environment_variables.get("username")
password = environment_variables.get("password")

# Preview the first rows (placed last so that the notebook displays the output)
df.head()
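
`dotenv_values` returns a dictionary, so `.get()` silently yields `None` for any key that is missing from the `.env` file. A small check before connecting can make that failure easier to spot. The sketch below uses only the standard library. `REQUIRED_KEYS` and `missing_credentials` are hypothetical names, and `example_env` stands in for the mapping that `dotenv_values('.env')` would return.

```python
# Keys the notebook reads from the .env file.
REQUIRED_KEYS = ("server", "database", "username", "password")


def missing_credentials(env):
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Stand-in for dotenv_values('.env'); the values are placeholders.
example_env = {"server": "db.example.com", "database": "telco", "username": "analyst"}

missing = missing_credentials(example_env)
if missing:
    print(f"Missing credentials in .env: {', '.join(missing)}")
```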

In [4]:
# Create a connection string from the credentials loaded above.
# "SQL Server" is the driver used here; "ODBC Driver 17 for SQL Server"
# is an alternative if that driver is installed instead.
connection_string = (
    f"DRIVER={{SQL Server}};"
    f"SERVER={server};"
    f"DATABASE={database};"
    f"UID={username};"
    f"PWD={password};"
    f"MARS_Connection=yes;"
    f"MinProtocolVersion=TLSv1.2;"
)
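
Since the connection string embeds the password, it is worth avoiding printing it as-is while debugging. The following is a minimal sketch using only the standard library. `build_connection_string` and `mask_password` are hypothetical helpers, not part of pyodbc, and the credentials are placeholders.

```python
def build_connection_string(server, database, username, password, driver="SQL Server"):
    """Assemble an ODBC connection string from its parts."""
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": server,
        "DATABASE": database,
        "UID": username,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def mask_password(connection_string):
    """Replace the PWD value so the string is safe to print or log.

    Naive split on ';': this assumes the password itself contains no semicolon.
    """
    fields = [field for field in connection_string.split(";") if field]
    masked = ["PWD=***" if field.upper().startswith("PWD=") else field for field in fields]
    return ";".join(masked) + ";"


# Placeholder credentials (illustrative only).
conn_str = build_connection_string("db.example.com", "telco", "analyst", "s3cret")
print(mask_password(conn_str))
```

The unmasked `conn_str` is what would be passed to `pyodbc.connect`; only the masked form should appear in output cells or logs.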

In [5]:
# Establish the connection
connection = pyodbc.connect(connection_string)
    

#### Key Insights
