Code for aggregating NBA player location data using distributed computing technologies. Completed as part of a university assignment for MS Analytics.
Runs a relatively simple query over a large dataset of player x, y and z coordinates with timestamps to calculate distance moved. Implemented in PostgreSQL, Hive and Spark. See Report.pdf for findings.
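To illustrate the kind of aggregation involved, here is a minimal single-machine sketch in plain Python. It is not the repository's PostgreSQL, Hive or Spark code. The row layout `(player_id, timestamp, x, y, z)` and the function name `distance_moved` are assumptions made for this example, as is the choice to include z in the distance. The idea is to order each player's samples by timestamp and sum the straight-line distance between consecutive positions.

```python
import math
from collections import defaultdict


def distance_moved(rows):
    """Return total distance moved per player.

    rows: iterable of (player_id, timestamp, x, y, z) tuples.
    This layout is hypothetical; the real dataset's schema may differ.
    """
    # Group samples by player.
    by_player = defaultdict(list)
    for player_id, ts, x, y, z in rows:
        by_player[player_id].append((ts, x, y, z))

    totals = {}
    for player_id, samples in by_player.items():
        # Order each player's samples in time.
        samples.sort(key=lambda s: s[0])
        total = 0.0
        # Sum Euclidean distance between consecutive positions.
        for (_, x0, y0, z0), (_, x1, y1, z1) in zip(samples, samples[1:]):
            total += math.dist((x0, y0, z0), (x1, y1, z1))
        totals[player_id] = total
    return totals


if __name__ == "__main__":
    sample_rows = [
        (1, 2, 3.0, 4.0, 12.0),  # deliberately out of time order
        (1, 0, 0.0, 0.0, 0.0),
        (1, 1, 3.0, 4.0, 0.0),
        (2, 0, 10.0, 10.0, 0.0),  # a single sample yields zero distance
    ]
    print(distance_moved(sample_rows))
```

In SQL-based engines, pairing each row with the previous one is commonly done with a window function such as `LAG` partitioned by player and ordered by timestamp. Whether the queries in this repository take that approach is not stated here.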