# SyriaTel Customer Churn Project

Name: Bedan Kibunja Chege

## Business Understanding

### Project overview
This project builds a predictive model to identify customers who are likely to churn, that is, stop doing business with SyriaTel, a telecommunications company. Churn prediction matters to the company because retaining existing customers is often more cost-effective than acquiring new ones. By predicting churn, the company can take proactive steps to retain customers, reducing revenue loss and improving customer satisfaction.

### Problem Statement
The core problem is to develop a binary classification model that accurately predicts whether a customer will churn. Given that customer churn can significantly impact the company's profitability, identifying patterns and factors that contribute to churn is essential. The challenge lies in analyzing the available customer data to uncover insights that can help in formulating strategies to minimize churn.

### Project Objectives
1. Analyze customer data to identify key features and patterns that contribute to churn.

2. Develop and validate a binary classification model that predicts customer churn with high accuracy.

3. Provide actionable insights and recommendations for SyriaTel to implement targeted retention strategies based on the model's findings.

4. Evaluate the business impact of the predictive model by estimating potential revenue saved through early identification and intervention for at-risk customers.
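
The business-impact objective can be made concrete with a small back-of-the-envelope calculation. The sketch below is only one possible way to frame it: the function name and every input (how many flagged customers are true churners, the share of them an offer retains, the revenue per retained customer, the cost per offer) are hypothetical placeholders, not values from the SyriaTel data.

```python
def estimated_net_savings(true_positives, false_positives, retention_rate,
                          revenue_per_customer, offer_cost):
    """Rough net revenue retained by contacting every customer the model flags.

    All inputs are hypothetical placeholders; none come from the SyriaTel data.
    """
    if not 0.0 <= retention_rate <= 1.0:
        raise ValueError("retention_rate must be between 0 and 1")
    # Every flagged customer receives an offer, whether or not they would churn.
    contacted = true_positives + false_positives
    # Only correctly flagged churners who accept the offer keep generating revenue.
    revenue_kept = true_positives * retention_rate * revenue_per_customer
    campaign_cost = contacted * offer_cost
    return revenue_kept - campaign_cost


# Illustrative numbers only, chosen to make the arithmetic easy to follow.
savings = estimated_net_savings(
    true_positives=80,
    false_positives=40,
    retention_rate=0.5,
    revenue_per_customer=600.0,
    offer_cost=50.0,
)
print(savings)  # 80 * 0.5 * 600 - 120 * 50 = 18000.0
```

Framing the estimate this way shows why missed churners and false alarms carry different costs, which is one reason recall is tracked alongside accuracy.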

## Importing Necessary Libraries and Data

In [1]:
# Data Exploration
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Preprocessing and Metric Evaluation
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.metrics import recall_score, accuracy_score
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from imblearn.over_sampling import SMOTE

# Models
from sklearn.linear_model import LogisticRegression
from sklearn.dummy import DummyClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier
import xgboost as xb

In [6]:
# Load the dataset
df = pd.read_csv(r"C:\Users\bedan\moringa\SyriaTel-customer-project\Dataset\bigml_59c28831336c6604c800002a.csv")
df.head()

| | state | account length | area code | phone number | international plan | voice mail plan | number vmail messages | total day minutes | total day calls | total day charge | ... | total eve calls | total eve charge | total night minutes | total night calls | total night charge | total intl minutes | total intl calls | total intl charge | customer service calls | churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | KS | 128 | 415 | 382-4657 | no | yes | 25 | 265.1 | 110 | 45.07 | ... | 99 | 16.78 | 244.7 | 91 | 11.01 | 10.0 | 3 | 2.7 | 1 | False |
| 1 | OH | 107 | 415 | 371-7191 | no | yes | 26 | 161.6 | 123 | 27.47 | ... | 103 | 16.62 | 254.4 | 103 | 11.45 | 13.7 | 3 | 3.7 | 1 | False |
| 2 | NJ | 137 | 415 | 358-1921 | no | no | 0 | 243.4 | 114 | 41.38 | ... | 110 | 10.3 | 162.6 | 104 | 7.32 | 12.2 | 5 | 3.29 | 0 | False |
| 3 | OH | 84 | 408 | 375-9999 | yes | no | 0 | 299.4 | 71 | 50.9 | ... | 88 | 5.26 | 196.9 | 89 | 8.86 | 6.6 | 7 | 1.78 | 2 | False |
| 4 | OK | 75 | 415 | 330-6626 | yes | no | 0 | 166.7 | 113 | 28.34 | ... | 122 | 12.61 | 186.9 | 121 | 8.41 | 10.1 | 3 | 2.73 | 3 | False |
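
The absolute Windows path in the loading cell only resolves on the author's machine. A minimal sketch of a more portable loader follows, assuming the CSV sits in a `Dataset` folder relative to the working directory (the folder and file names are taken from the path above; the relative location and the helper name `load_churn_data` are assumptions).

```python
from pathlib import Path

import pandas as pd

# Assumption: the CSV lives in a "Dataset" folder relative to the working directory.
DATA_PATH = Path("Dataset") / "bigml_59c28831336c6604c800002a.csv"


def load_churn_data(path=DATA_PATH):
    """Read the churn CSV, failing early with a clear message if it is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found at {path.resolve()}")
    return pd.read_csv(path)
```

With the file in place, `df = load_churn_data()` followed by `df.head()` would stand in for the cell above.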


## Data Understanding 
The dataset contains 3333 records and 21 columns (20 features plus the Churn target). An overview of the columns follows:

1. State: The location of the customer.

2. Account Length: The number of days the account was held by the customer.

3. Area Code: The area code of the customer.

4. Phone Number: Phone number assigned to the user.

5. International Plan: Indicator of whether the customer has an international plan.

6. Voice Mail Plan: Indicator of whether the customer has a voicemail plan.

7. Number Vmail Messages: Number of voicemail messages.

8. Total Day Minutes: Number of minutes the customer has been in calls during the day.

9. Total Day Calls: Total calls made during the day.

10. Total Day Charge: Billed charge to the customer for all day calls.

11. Total Eve Minutes: Number of minutes the customer has been in calls during the evening.

12. Total Eve Calls: Total calls made during the evening.

13. Total Eve Charge: Billed charge to the customer for all evening calls.

14. Total Night Minutes: Number of minutes the customer has been in calls during the night.

15. Total Night Calls: Total calls made during the night.

16. Total Night Charge: Billed charge to the customer for all night calls.

17. Total Intl Minutes: Total number of minutes on international calls.

18. Total Intl Calls: Total international calls made.

19. Total Intl Charge: Billed charge to the customer for all international calls.

20. Customer Service Calls: Number of calls made to customer service.

21. Churn: Indication of whether the customer terminated their contract.
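
The column overview above can be checked against the data itself. The sketch below is one way to do that, assuming a small helper (the name `summarize_columns` is hypothetical) that reports each column's dtype, missing-value count, and number of distinct values. To keep it self-contained, the demo uses a few columns from the five rows shown in the `df.head()` output rather than the full file.

```python
import pandas as pd


def summarize_columns(df):
    """Return one row per column: dtype, missing-value count, distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),
    })


# Demo on a few columns of the first five rows shown in the df.head() output.
sample = pd.DataFrame({
    "state": ["KS", "OH", "NJ", "OH", "OK"],
    "account length": [128, 107, 137, 84, 75],
    "international plan": ["no", "no", "no", "yes", "yes"],
    "total day minutes": [265.1, 161.6, 243.4, 299.4, 166.7],
    "customer service calls": [1, 1, 0, 2, 3],
    "churn": [False, False, False, False, False],
})

print(sample.shape)
print(summarize_columns(sample))
```

Run on the full DataFrame, `df.shape` should match the 3333 records and 21 columns stated above, and `summarize_columns(df)` shows which columns are categorical (such as the two plan indicators) and which are numeric.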