Udacity_Sparkify

Index

Libraries
Project Description
File Description
Analysis
Results
Licensing, Authors, and Acknowledgements

Libraries

You will need the standard data science libraries included in the Anaconda distribution of Python. In particular, the following packages are required:

NumPy
Pandas
Matplotlib

Additionally, PySpark must be installed, which can be done with pip install pyspark.

Project Description

Predicting churn is a common, important, and challenging task for companies whose business model relies on subscriptions. Churn prediction reduces customer defection by identifying which customers are likely to cancel a subscription, so that the company can act to retain them. In this capstone project, Udacity provided a 12 GB dataset of fictitious user interactions with a music streaming company called Sparkify. Whether a user listens to a song, adds it to a playlist, or presses the thumbs-up button, every activity is logged and can be used for churn prediction.

The development process for Sparkify churn prediction is divided into three steps:

1. Exploratory data analysis. Load the data, explore it, and understand its structure.
2. Feature engineering. Create and transform features. New features include thumbs up/down counts, song count, and length.
3. Modelling and evaluation. Three models were tested through a pipeline, with hyperparameter tuning over a parameter grid:

3.1 Random Forest Classifier.

3.2 Logistic Regression.

3.3 Gradient-boosted Tree Classifier.
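
The feature engineering step above can be sketched in plain Python to show the kind of per-user aggregation involved. This is only an illustration, not the notebook's PySpark code: the field names (userId, page, length), the page values, and the rule that a "Cancellation Confirmation" event marks a churned user are all assumptions about the event log, and build_user_features is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical event records. Field names and page values are assumptions
# about the log format, chosen only for this illustration.
events = [
    {"userId": "1", "page": "NextSong", "length": 200.0},
    {"userId": "1", "page": "Thumbs Up", "length": None},
    {"userId": "1", "page": "NextSong", "length": 100.5},
    {"userId": "2", "page": "NextSong", "length": 180.0},
    {"userId": "2", "page": "Thumbs Down", "length": None},
    {"userId": "2", "page": "Cancellation Confirmation", "length": None},
]


def build_user_features(events):
    """Aggregate raw events into one feature row per user."""
    features = defaultdict(
        lambda: {
            "songs": 0,
            "total_length": 0.0,
            "thumbs_up": 0,
            "thumbs_down": 0,
            "churn": 0,
        }
    )
    for event in events:
        row = features[event["userId"]]
        page = event["page"]
        if page == "NextSong":
            # Count the song and add its playing time (missing length counts as 0).
            row["songs"] += 1
            row["total_length"] += event["length"] or 0.0
        elif page == "Thumbs Up":
            row["thumbs_up"] += 1
        elif page == "Thumbs Down":
            row["thumbs_down"] += 1
        elif page == "Cancellation Confirmation":
            # Assumed churn definition: the user confirmed a cancellation.
            row["churn"] = 1
    return dict(features)


user_features = build_user_features(events)
for user_id, row in sorted(user_features.items()):
    print(user_id, row)
```

In PySpark the same idea would typically be expressed as a group-by on the user identifier with conditional aggregations, which scales to the full dataset.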

Based on the F1 metric, Random Forest is the best of the three models for predicting churn.
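
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall, which makes it more informative than accuracy when one class is much rarer than the other. The sketch below computes the binary F1 for a positive class in plain Python. The labels are made up for illustration and are not results from this project, and an evaluator in a given library may average F1 across classes rather than report it for a single class.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 score for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (1 = churned, 0 = stayed).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```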

Files:

Sparkify (1).ipynb: Notebook containing the code.
mini_sparkify_event_data.json: The small data subset used in this case. The file is too big for GitHub, so it is not included in the repository.

Medium blog: https://medium.com/@newcastilian/churn-prediction-with-pyspark-6085efce6f7d

Git: https://github.com/AMTORRES82/Udacity_Sparkify

Licensing, Authors, and Acknowledgements

Thanks to Udacity.
