pchrabka/PySpark-PyData

pySpark-pyData

This is an example PySpark application created for a PyData Warsaw 2019 talk.

This application uses the MovieLens data set as its source data. The data can be downloaded from https://grouplens.org/datasets/movielens/ or by running the get_data.py script included in this repository.
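The contents of get_data.py are not shown here, so the following is only a minimal sketch of how such a download step might look, using the Python standard library. The archive URL (the small MovieLens variant), the function names `download_bytes`, `extract_zip`, and `fetch_movielens`, and the `data` target directory are all assumptions made for illustration; check the GroupLens page for the current download links.

```python
import io
import urllib.request
import zipfile
from pathlib import Path
from typing import List

# Assumed URL of the small MovieLens archive; not taken from this repository.
MOVIELENS_URL = "https://files.grouplens.org/datasets/movielens/ml-latest-small.zip"


def download_bytes(url: str) -> bytes:
    """Fetch the raw bytes of a remote file."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_zip(archive: bytes, target_dir: Path) -> List[str]:
    """Unpack an in-memory zip archive and return the sorted member names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        zf.extractall(target_dir)
        return sorted(zf.namelist())


def fetch_movielens(target_dir: Path, url: str = MOVIELENS_URL) -> List[str]:
    """Hypothetical helper: download the archive and unpack it into target_dir."""
    return extract_zip(download_bytes(url), target_dir)
```

Calling `fetch_movielens(Path("data"))` would download the archive and unpack it under `data/`. Downloading and extracting are kept in separate functions so the extraction step can be exercised without network access.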

App development steps:
v1.0 - Initial version
v2.0 - Added config file
v3.0 - Added main.py
v4.0 - Added Makefile
v5.0 - Added UDFs
v6.0 - Added third party dependency
v7.0 - Added tests
