A churn prediction project for a music streaming service
- Anaconda Python 2.7
- Google Cloud SDK
$ make env
$ source activate churnr
$ make reqs
$ make data
$ make submit
$ make download
├── LICENSE
├── Makefile <- Makefile with commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── README.md <- Data description and source paths
│
├── models <- Model predictions
│
├── notebooks <- Jupyter notebooks.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── requirements.txt <- The requirements file for reproducing the analysis environment
│
├── churnr <- Source code for use in this project.
│ ├── __init__.py <- Makes churnr a Python module
│ │
│ ├── app.py <- Application entry point for the experiment dispatcher
│ │
│ ├── submitter.py <- Script for submitting training jobs to CloudML
│ │
│ ├── sample.py <- Script for sampling user ids for the experiments
│ │
│ ├── scala/parse.sh <- Dispatches a play context parser job on Dataflow for the data sampled by 'sample.py'
│ │
│ ├── extract.py <- Engineers features and aggregates the data parsed by 'parse.sh' into timesteps
│ │
│ └── process.py <- Normalizes data from 'extract.py' and exports it to files in GCS
└── tox.ini <- tox file with settings for running tox; see tox.testrun.org
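
The tree above describes `extract.py` as aggregating parsed data into timesteps and `process.py` as normalizing it. As a rough illustration of those two steps only, here is a minimal sketch. It is not the project's actual code: the function names, the `(user_id, timestamp, seconds_played)` event layout, and the choice of min-max scaling are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_timesteps(events, start, step_seconds, n_steps):
    """Sum seconds played per user into fixed-width timesteps.

    `events` is an iterable of (user_id, timestamp, seconds_played)
    tuples; this layout is hypothetical, not the project's schema.
    """
    totals = defaultdict(lambda: [0.0] * n_steps)
    for user_id, timestamp, seconds_played in events:
        index = int((timestamp - start) // step_seconds)
        if 0 <= index < n_steps:  # drop events outside the observation window
            totals[user_id][index] += seconds_played
    return dict(totals)


def min_max_normalize(features):
    """Scale every value to [0, 1] using the global min and max.

    Min-max scaling is an assumption; the README only says "normalize".
    """
    values = [v for series in features.values() for v in series]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero when all values are equal
    return {
        user: [(v - low) / span for v in series]
        for user, series in features.items()
    }


if __name__ == "__main__":
    day = 86400
    events = [("u1", 0, 30.0), ("u1", day, 90.0), ("u2", 10, 60.0)]
    per_user = aggregate_timesteps(events, start=0, step_seconds=day, n_steps=2)
    print(min_max_normalize(per_user))
```

Each user ends up with one value per timestep, which is the shape a sequence model for churn would typically consume; the real feature set and export format live in `extract.py` and `process.py`.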