gustavo-leandro/sparkify-project

Table of Contents

  1. Project Motivation
  2. Installation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Project Motivation

This project predicts which users will churn from the Sparkify platform.

Starting from a log file of user activity, I explored the data, cleaned and transformed it, and tested several ML models to make this prediction.

Installation

This project uses Python 3 and the following libraries:

datetime (standard library), pandas, and pyspark (the pyspark.sql and pyspark.ml modules)
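
As a quick sanity check before opening the notebook, the sketch below reports which of the libraries listed above cannot be found in the current environment. The helper name `missing_packages` and the `REQUIRED` list are illustrative and not part of the project; the check covers top-level packages only, on the assumption that `pyspark.sql` and `pyspark.ml` are both provided by the `pyspark` package.

```python
import importlib.util

# Top-level packages behind the modules this project uses.
# Assumption: pyspark.sql and pyspark.ml both come from the pyspark package.
REQUIRED = ["datetime", "pandas", "pyspark"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially imported modules
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_packages(REQUIRED)
    print("Missing packages:", absent if absent else "none")
```

Anything reported as missing can usually be installed with pip before running the notebook.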

File Descriptions

There are two files:

  - Sparkify.ipynb: notebook with all the code and analysis.
  - mini_sparkify_event_data.json: JSON file with the Sparkify usage logs.
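
For a first look at the log file outside the notebook, a minimal sketch using only the standard library might look like the following. It rests on two assumptions that are not stated in this README: that the file stores one JSON object per line, and that each record carries a user identifier in a field named `userId`. The function names are hypothetical, and the notebook itself may load the data differently (for example, through pyspark).

```python
import json
from collections import Counter


def load_events(path):
    """Parse a log file with one JSON object per line into a list of dicts.

    Assumption: the file is newline-delimited JSON.
    """
    events = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                events.append(json.loads(line))
    return events


def events_per_user(events, user_field="userId"):
    """Count events per user.

    Assumption: `user_field` names the user identifier. Records where the
    field is missing, None, or an empty string are ignored.
    """
    return Counter(
        event[user_field]
        for event in events
        if event.get(user_field) not in (None, "")
    )
```

Calling `events_per_user(load_events("mini_sparkify_event_data.json"))` would then give a rough picture of how activity is spread across users, provided both assumptions hold for the file.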

Results

The Random Forest model achieved the best result among the models tested. More details are in the post available here.

Licensing, Authors, and Acknowledgements

The data belongs to Udacity. Feel free to use the code here as you would like!

== End ==

About

A spark project from Udacity Data Science Nanodegree
