This is an end-to-end portfolio project analyzing customer churn for a telecom company. The goal was not just to build a report, but to build a complete system that analyzes why customers leave and predicts who will leave next, turning a reactive problem into a proactive business process.
A telecom company was facing a high, undefined customer churn rate, leading to significant revenue loss. The business had customer data but no process to analyze it, creating a critical information gap:
- No Root Cause: They didn't know the primary drivers of churn (price, service, contract terms?).
- No Clear Profile: They couldn't identify who was leaving.
- Reactive vs. Proactive: All retention efforts were reactive, trying to win back customers after they had already left.
- Analyze historical data to find the root cause of churn.
- Monitor current churn KPIs on a live Power BI dashboard.
- Predict which new customers are at high risk of churning in the future.
- Database: SQL Server
- Data Transformation: SQL, Power BI (Power Query)
- Data Visualization: Power BI
- Predictive Modeling: Python (Pandas, Scikit-learn, Random Forest)
- IDE: Jupyter Notebook
[Raw Data (CSV)] ➔ [ETL in SQL Server] ➔ [Analysis in Power BI (Dashboard 1)] ➔ [Export Data] ➔ [Predictive Model in Python (Jupyter)] ➔ [Import Predictions] ➔ [Actionable Dashboard in Power BI (Dashboard 2)]
This dashboard serves as the main analysis tool for monitoring the health of the customer base.
- Key Insight: The "Month-to-Month" contract is the single biggest driver of churn. These customers have a 42.7% churn rate, while "Two Year" contract customers only have a 2.8% rate.
- Other Insights:
- Customers using "Fiber Optic" internet churn more (41.9%) than those on DSL (19.0%).
- Customers who don't have "Online Security" or "Tech Support" are significantly more likely to leave.
This dashboard is the final solution. It uses the Python model's predictions to show the exact list of new customers who are at high risk of churning, allowing the marketing team to target them with proactive retention offers (e.g., "Upgrade to a 1-Year contract for a 10% discount").
A Random Forest Classifier was trained in Python to predict Customer_Status (Churned vs. Stayed).
- Result: The model achieved 84% accuracy on the test set.
- Top 5 Predictors (Feature Importance):
Contract(Month-to-Month)Total RevenueTotal ChargesMonthly_ChargeTotal_Long_Distance_Charges