Using historical customer data, we predict whether a particular customer will switch to another telecom provider.
- General Info
- Steps
- Model Building
- Model Evaluation
- Conclusions
- Concepts Used
- Technologies Used
- Acknowledgements
- Contact
A telecom firm has collected data on all its customers. The main types of attributes are:
- Demographics (age, gender etc.)
- Services availed (internet packs purchased, special offers taken etc.)
- Expenses (amount of recharge done per month etc.)
Based on this historical information, we built a model that predicts whether a particular customer will churn, i.e. switch to a different service provider. The target variable is ‘Churn’, a binary variable: 1 means the customer has churned and 0 means the customer has not.
The datasets that are being used can be found here.
- Data Preparation
- Data Cleaning
- Exploratory Data Analysis
- Random Forest Model for getting important variables
- Logistic Regression model for interpretability
- Business Recommendations
- Decision Tree for interpretation
- Principal Component Analysis for dimensionality reduction
- Ensemble models for improved accuracy
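The first two modelling steps (a random forest to rank variables, then a logistic regression on the top-ranked ones) could be sketched roughly as below. This is only an illustration under assumptions: the data is synthetic (generated with `make_classification` as a stand-in for the churn dataset), and `top_k` and all hyperparameters are hypothetical values, not the ones used in the project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption: the real dataset is not used here).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5,
    weights=[0.75, 0.25], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Step 1: fit a random forest and rank variables by importance.
forest = RandomForestClassifier(n_estimators=200, random_state=42)
forest.fit(X_train, y_train)
top_k = 8  # hypothetical number of variables to keep
top_features = np.argsort(forest.feature_importances_)[::-1][:top_k]

# Step 2: fit a logistic regression on the selected variables for interpretability.
logreg = LogisticRegression(max_iter=1000)
logreg.fit(X_train[:, top_features], y_train)

# Predicted probability of churn (class 1) for each test customer.
churn_proba = logreg.predict_proba(X_test[:, top_features])[:, 1]

# The sign of each coefficient indicates the direction of its effect on churn.
for idx, coef in zip(top_features, logreg.coef_[0]):
    print(f"feature_{idx}: coefficient = {coef:+.3f}")
```

Keeping only a handful of variables makes the logistic regression coefficients easier to read, at the possible cost of some accuracy compared with the ensemble models.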
- After building a model that predicts the probability of churn, the cutoff probability was chosen using the ROC curve.
- The sensitivity-specificity trade-off was considered while evaluating the model at different cutoffs until a suitable cutoff was found.
- The model was also evaluated on precision and recall for comparison.
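The cutoff search described above can be illustrated with a small sweep. This is a sketch under assumptions, not the project's actual procedure: the labels and probabilities are randomly generated, the helper `evaluate_cutoff` is hypothetical, and picking the cutoff where sensitivity and specificity are closest is just one possible selection rule.

```python
import numpy as np
from sklearn.metrics import (
    confusion_matrix, precision_score, recall_score, roc_auc_score
)

rng = np.random.default_rng(0)
# Hypothetical true labels and predicted churn probabilities.
y_true = rng.integers(0, 2, size=1000)
y_prob = np.clip(0.35 * y_true + rng.normal(0.35, 0.2, size=1000), 0, 1)

def evaluate_cutoff(y_true, y_prob, cutoff):
    """Compute evaluation metrics when predicting churn if probability >= cutoff."""
    y_pred = (y_prob >= cutoff).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "cutoff": cutoff,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred),  # identical to sensitivity
    }

cutoffs = np.round(np.arange(0.1, 1.0, 0.1), 1)
results = [evaluate_cutoff(y_true, y_prob, c) for c in cutoffs]

# Assumed selection rule: the cutoff where sensitivity and specificity are most balanced.
best = min(results, key=lambda r: abs(r["sensitivity"] - r["specificity"]))

print(f"ROC AUC: {roc_auc_score(y_true, y_prob):.3f}")
print(f"Chosen cutoff: {best['cutoff']}")
```

Raising the cutoff lowers sensitivity and raises specificity, so the choice depends on whether missing a churner or wrongly flagging a loyal customer is more costly to the business.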
- Exploratory Data Analysis
- Tree-Based Models, e.g. Decision Trees
- Logistic Regression
- Cross Validation
- Principal Component Analysis
- Ensemble Techniques, e.g. Random Forest, XGBoost
- python
- numpy
- pandas
- matplotlib
- seaborn
- statsmodels
- sklearn
- xgboost
This project was based on a case study at IIITB.
Created by [nitishpandey04]
Email: nitishpandey2117@gmail.com