Collection of PySpark scripts
This repository is where I share the scripts I wrote while learning Spark in the Hortonworks Sandbox.
USCongress_age.py
This script explores the data compiled by the website FiveThirtyEight on the members of the US Congress from 1947 to 2014. I used PySpark SQL. The source link.

football_stats.py
With the International football results dataset, compiled by Mart Jürisoo, I explored some metrics of the Scottish National Team. For the moment I am not taking the shootouts data into consideration, although that file is also included in the source files folder. The source link.