Predicting churn is a common and challenging problem that data scientists and analysts encounter in any customer-facing business. Detecting early which customers are likely to cancel their subscription gives the business a chance to act and reduce customer defection.
This project walks through that prediction process, using Spark to engineer the relevant features and then build machine learning models for the prediction. It is an end-to-end solution to a real-world problem, based on the Sparkify dataset provided by Udacity.
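As a rough illustration of the feature-engineering step, the sketch below collapses an event log into one row of features per user, plus a churn label. It is written in plain Python so it runs without a Spark cluster; in the notebook the same shape of computation would typically be expressed with PySpark's `groupBy` and `agg`. The field names (`userId`, `page`), the page values, the feature names, and the rule that a "Cancellation Confirmation" event marks churn are all assumptions made for illustration, not details confirmed by this README.

```python
from collections import defaultdict

# Hypothetical event-log rows. Field names and page values are assumptions
# chosen for illustration; the real dataset may differ.
events = [
    {"userId": "u1", "page": "NextSong"},
    {"userId": "u1", "page": "NextSong"},
    {"userId": "u1", "page": "Cancellation Confirmation"},
    {"userId": "u2", "page": "NextSong"},
    {"userId": "u2", "page": "Thumbs Up"},
]

# Assumed churn marker: a user who reaches this page is labeled as churned.
CHURN_PAGE = "Cancellation Confirmation"


def build_user_features(rows):
    """Aggregate raw events into one feature dict per user."""
    per_user = defaultdict(lambda: {"n_events": 0, "n_songs": 0, "churned": 0})
    for row in rows:
        feats = per_user[row["userId"]]
        feats["n_events"] += 1              # total activity
        if row["page"] == "NextSong":
            feats["n_songs"] += 1           # songs played
        if row["page"] == CHURN_PAGE:
            feats["churned"] = 1            # label used as the prediction target
    return dict(per_user)


features = build_user_features(events)
```

The resulting per-user rows (features plus the `churned` label) are the kind of table a classifier would then be trained on.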
- Pandas
- PySpark
- Matplotlib
- NumPy
- Sparkify.ipynb - The notebook used for the analysis
- README.md
- Running the notebook from top to bottom reproduces the results
- An explanation of the results of this analysis can be found here