GitHub - yacine-ammi/Mobile-CCP: This repository shares the steps taken in my graduation thesis for my Applied Statistics diploma. The project builds a machine learning model to predict churn in a telecom company. The repository includes the dataset used in the project, the relevant code, and the thesis in PDF.

Project Overview

Objective

The objective of this project is to analyze a public dataset of customers from a telecom company and predict whether a customer will switch to another company, so that likely churners can be identified early and profitability increased.

Dataset Description

This project utilizes a public dataset of 66,469 customers from an anonymous telecommunications company.
The goal of the project is to predict customer churn and increase profitability for the company.
Data preprocessing and cleaning techniques were used on 66 features before moving to the modelling phase.

Methodology

This project used both univariate and multivariate analysis to extract insights from visualizations and correlation matrices. The data was then preprocessed to prepare it for the modelling phase. Several classification models were compared to determine which one performed best at identifying customers who may churn. The XGBoost model was chosen, evaluated, and interpreted using the SHAP package.
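As a rough illustration of the correlation-matrix step mentioned above, the sketch below computes pairwise Pearson correlations in plain Python. The column names and values are hypothetical toy data, not taken from the project's dataset, and the notebooks may well use a library routine instead.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict {a: {b: r_ab}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy columns (assumption: illustrative values only).
toy = {
    "monthly_spend": [10.0, 20.0, 30.0, 40.0, 50.0],
    "calls_to_support": [5.0, 4.0, 3.0, 2.0, 1.0],
    "churned": [1.0, 1.0, 0.0, 0.0, 0.0],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Features that correlate strongly with each other are candidates for removal during preprocessing, while features that correlate with the target hint at predictive value.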

Modelling

Univariate and multivariate analysis were used to extract insights from the dataset.
Several classification models were tested, including ensemble learning methods like XGBoost.
The XGBoost model outperformed the other models with an accuracy of 90%.
Hyperparameter optimization was used to improve recall and meet business needs.
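To see why recall is tracked alongside accuracy, the sketch below scores the same hypothetical predictions at two decision thresholds. The labels and scores are invented for illustration, and the threshold is used here only as a simple knob; the project itself reports using hyperparameter optimization, whose details live in the notebooks.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 means churn."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Share of all customers classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def recall(y_true, y_pred):
    """Share of actual churners that the model caught."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def predict_at(scores, threshold):
    """Turn churn scores into 0/1 predictions at the given threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical data: 10 customers, 2 of whom churned.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.45, 0.40, 0.80]

for threshold in (0.5, 0.3):
    y_pred = predict_at(scores, threshold)
    print(threshold, accuracy(y_true, y_pred), recall(y_true, y_pred))
```

In this toy case both thresholds give the same accuracy, but the lower one catches every churner at the cost of one false alarm, which is the kind of trade-off a recall-oriented business requirement targets.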

Results

The XGBClassifier model is capable of effectively handling both churners and non-churners.
User spending (the user_spendings feature) was found to be the most relevant feature in predicting customer churn.
SHAP values were used to interpret the model and examine feature influence.
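For readers unfamiliar with the idea behind SHAP, the sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. Everything in it is an assumption made for illustration: `toy_churn_score`, its coefficients, and the customer and baseline values are hypothetical, and the SHAP package uses far more efficient algorithms on the real XGBoost model.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_churn_score(features):
    """Hypothetical churn score; not the project's model."""
    spend, tenure, complaints = features
    return (0.3 - 0.004 * spend - 0.005 * tenure + 0.05 * complaints
            + 0.002 * complaints * max(0.0, 12.0 - tenure))


customer = [20.0, 6.0, 3.0]    # hypothetical customer: spend, tenure, complaints
baseline = [50.0, 24.0, 1.0]   # hypothetical "average" customer

phi = shapley_values(toy_churn_score, customer, baseline)
for name, contribution in zip(["spend", "tenure", "complaints"], phi):
    print(name, round(contribution, 4))
```

The contributions add up to the difference between this customer's score and the baseline score, which is the property that lets SHAP plots split a single prediction into per-feature effects.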

Project Structure

data: contains the raw dataset.
data dictionary: a table in PDF.
notebook: contains Jupyter notebooks for data preprocessing, EDA, and modelling.

Conclusion

This study concluded that ensemble learning algorithms, such as XGBoost, performed best in predicting customer churn, with an accuracy of 90%. The XGBClassifier model handles both classes effectively, identifying churners as well as non-churners. The most relevant feature in predicting customer churn is user_spendings. SHAP values were used to interpret the model, providing a better understanding of its behavior: the influence of each feature on churn prediction was examined both globally on the whole dataset and individually on two random customers.

Folders and files

Mobile_Churn-Data.xlsx
Mobile_Churn_Data_Dictionary.pdf
Mobile_Churn_ML_Model.ipynb
README.md
mobile-churn-ml-model.ipynb
