Portfolio Purpose
This project analyzes customer churn using machine learning techniques. The goal is to identify key factors contributing to customer churn and build a predictive model to help businesses retain customers.
- The dataset contains customer information, contract details, and payment methods.
- Key categorical and numerical features were analyzed.
- The target variable is Churn (0: No, 1: Yes).
- Data cleaning and preprocessing (handling missing values, encoding categorical features).
- Correlation analysis using heatmaps.
- Chi-square tests to check dependency of categorical variables on churn.
- Statistical analysis to identify patterns in numerical data.
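The chi-square dependency check listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy data, the `Contract` column name, and the `chi_square_vs_churn` helper are assumptions, and `scipy` is used for the test itself.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data for illustration only; column names are assumed, not taken from the dataset.
df = pd.DataFrame({
    "Contract": ["Month-to-month"] * 6 + ["Two year"] * 6,
    "Churn":    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1],
})

def chi_square_vs_churn(frame, feature, target="Churn"):
    """Return (chi2, p_value, dof) for a categorical feature against the target."""
    # Contingency table: one row per category, one column per churn value.
    table = pd.crosstab(frame[feature], frame[target])
    chi2, p_value, dof, _expected = chi2_contingency(table)
    return chi2, p_value, dof

chi2, p_value, dof = chi_square_vs_churn(df, "Contract")
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
```

A small p-value suggests the feature and churn are not independent. With a sample this small the result is only illustrative.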
Higher churn among:
- Customers without partners or dependents.
- Month-to-month contract holders.
- Customers using electronic checks.
- Senior citizens.
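Group-level findings like these can be checked by comparing the mean of the churn flag across categories. The sketch below uses made-up rows and assumed column names (`Contract`, `SeniorCitizen`), and `churn_rate_by` is a hypothetical helper, so the numbers it prints say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration; the real dataset has many more columns and customers.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "Month-to-month",
                 "One year", "One year", "Two year"],
    "SeniorCitizen": [1, 0, 1, 0, 0, 0],
    "Churn": [1, 1, 0, 0, 1, 0],
})

def churn_rate_by(frame, feature, target="Churn"):
    """Share of churned customers in each category, highest first."""
    # The mean of a 0/1 column is the fraction of ones in each group.
    return frame.groupby(feature)[target].mean().sort_values(ascending=False)

rates = churn_rate_by(df, "Contract")
print(rates)
```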
- Initial model trained to understand feature importance.
- Key predictors: Contract type, payment method, senior citizen status, monthly charges.
- Improved model performance.
- Feature importance analysis showed Contract Type as the most significant predictor.
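One common way to obtain such a ranking is the impurity-based `feature_importances_` attribute of a fitted scikit-learn Random Forest. The sketch below assumes that approach and runs on synthetic data. The feature names and the rule that generates the target are invented for illustration and do not reproduce the project's results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 200

# Synthetic features; names are assumptions, not the dataset's actual columns.
X = pd.DataFrame({
    "contract_month_to_month": rng.integers(0, 2, n),
    "monthly_charges": rng.uniform(20, 120, n),
})
# Synthetic target: only month-to-month customers churn, and only some of them.
y = ((X["contract_month_to_month"] == 1) & (rng.random(n) < 0.8)).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances are normalized to sum to 1 across features.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favor continuous or high-cardinality features, so it may be worth cross-checking a ranking with `sklearn.inspection.permutation_importance` on held-out data.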
- Accuracy: 80%
- Precision & Recall:
- Churned customers (1): 68% precision, 48% recall.
- Non-churned customers (0): 83% precision, 92% recall.
- The model identifies non-churned customers well but recovers only 48% of actual churners, so recall on the churn class is its main weakness.
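To show how accuracy can look healthy while churn recall stays low, here is a small worked example using scikit-learn's metric functions. The labels are hand-made toy values, not the project's predictions.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Toy labels: 4 churners (1) and 6 non-churners (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# The toy predictions catch 2 of the 4 churners and raise 1 false alarm.
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

accuracy = accuracy_score(y_true, y_pred)            # (TP + TN) / total = 7 / 10
churn_precision = precision_score(y_true, y_pred)    # TP / (TP + FP)   = 2 / 3
churn_recall = recall_score(y_true, y_pred)          # TP / (TP + FN)   = 2 / 4

print(f"accuracy={accuracy:.2f}, precision={churn_precision:.2f}, recall={churn_recall:.2f}")
```

Because non-churners outnumber churners, correct predictions on the majority class keep accuracy up even when half the churners are missed.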
- Tune hyperparameters for better recall.
- Try advanced models like XGBoost.
- Deploy as a web app for real-time predictions.
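Besides hyperparameter tuning, one simple lever for recall is lowering the decision threshold applied to predicted probabilities. The sketch below illustrates that idea on synthetic data with Logistic Regression. The data, the 0.3 threshold, and the `scores_at` helper are assumptions chosen for illustration, not settings from this project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000

# Synthetic, imbalanced binary problem standing in for churn data.
X = rng.normal(size=(n, 3))
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1] - 1.2
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
clf = LogisticRegression().fit(X_train, y_train)
proba = clf.predict_proba(X_test)[:, 1]  # probability of the positive (churn) class

def scores_at(threshold):
    """Precision and recall when predicting churn for proba >= threshold."""
    pred = (proba >= threshold).astype(int)
    return (precision_score(y_test, pred, zero_division=0),
            recall_score(y_test, pred))

precision_default, recall_default = scores_at(0.5)
precision_lowered, recall_lowered = scores_at(0.3)
print(f"threshold 0.5: precision={precision_default:.2f}, recall={recall_default:.2f}")
print(f"threshold 0.3: precision={precision_lowered:.2f}, recall={recall_lowered:.2f}")
```

Lowering the threshold can only keep recall the same or raise it, usually at the cost of precision, so the right value depends on how expensive a missed churner is compared with a false alarm.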
Churn Prediction
├── Churn Prediction.ipynb   # Jupyter Notebook with full analysis
├── README.md                # Project documentation
├── data/                    # Dataset
├── models/                  # Trained model files
└── plots/                   # Visualizations
- Python (pandas, numpy, scikit-learn, seaborn, matplotlib)
- Machine Learning: Logistic Regression, Random Forest
- Sabheen Gull
Have suggestions? Feel free to open an issue or contribute!
Star this repository if you found it helpful!