This project hones the following skills:
- Loading large datasets into Spark and manipulating them using Spark SQL and Spark DataFrames
- Using the machine learning APIs within Spark ML to build and tune models
- Integrating the skills I've learned in the Spark course and the Data Scientist Nanodegree program
Our primary task is to predict churned users from the event logs of a music streaming app. The original dataset is 12 GB; due to the limited compute available on the free tier of IBM Cloud, a medium-sized subset of the data is used.
- Pyspark SQL and Pyspark ML
- Data Preprocessing
- Exploratory Data Analysis
- Feature Engineering
- Modeling
- Evaluation
LogisticRegression was implemented to predict customer churn.
Prediction on the test set (after hyperparameter tuning): area under ROC 0.9333, accuracy 83.87%.
sparkify.ipynb
- Analysis in Jupyter Notebook
- Dataset by Udacity
- Jupyter Notebook instructions by Udacity
Copyright (c) 2019 Rohit Swami
This project is licensed under the MIT License - see the LICENSE file for details