Apache Spark Projects

This repository contains Apache Spark projects written in Python or Scala. The intent is for each project directory to contain both implementations. A full explanation of each project and its specifications is in that project's directory.

training_project_2

This project uses Spark's Streaming API to gather and process Twitter data. It analyzes both live-stream and historical data to answer questions such as which hashtag is currently most common, which users a specified user mentions most often, and which hashtags a specified user uses most often.
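
The hashtag question comes down to extracting tags and counting them. The sketch below shows that logic in plain Python so it runs without a cluster; it is not the project's Spark code. The sample tweets, the `HASHTAG` pattern, and the `top_hashtags` helper are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample tweets; the real project reads these from Twitter.
tweets = [
    "Learning #Spark streaming today #bigdata",
    "#spark makes batch and streaming feel alike",
    "Coffee first, then #BigData and #Spark",
]

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def top_hashtags(texts, n=1):
    """Return the n most common hashtags, compared case-insensitively."""
    counts = Counter(
        tag.lower()
        for text in texts
        for tag in HASHTAG.findall(text)
    )
    return counts.most_common(n)


print(top_hashtags(tweets, n=2))
```

In a Spark job the same steps would typically be a flat-map over tweet text followed by a grouped count, applied per batch or window for the live stream.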

training_project_3

This project uses the Our World in Data and IMF World Economies datasets to probe questions about the pandemic, its effects on global economies, and how countries responded to it.

training_project_4

This project uses data from the MeetUp.com API to decipher and chart trends such as which state has the most MeetUp venues, whether longer or shorter events are more popular, and which payment method is most common.
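
As one illustration, the longer-versus-shorter question can be framed as a grouped average. The sketch below is plain Python rather than the project's Spark code, and it makes several assumptions: popularity is measured by RSVP count, events are split at a 120-minute cutoff, and the records are invented for the example.

```python
from collections import defaultdict

# Hypothetical records: (duration in minutes, RSVP count).
events = [(60, 40), (90, 25), (240, 12), (180, 30), (45, 55)]


def mean_rsvps_by_length(records, cutoff_minutes=120):
    """Average RSVP count for events shorter than the cutoff vs. the rest."""
    totals = defaultdict(lambda: [0, 0])  # label -> [rsvp sum, event count]
    for duration, rsvps in records:
        label = "short" if duration < cutoff_minutes else "long"
        totals[label][0] += rsvps
        totals[label][1] += 1
    return {label: rsvp_sum / count for label, (rsvp_sum, count) in totals.items()}


print(mean_rsvps_by_length(events))
```

A different cutoff or popularity measure may change the answer, so both are worth treating as parameters of the analysis.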

training_project_5

This project uses historical cryptocurrency exchange data from CryptoDataDownload.com to find trends in price fluctuations, as well as symbiotic movements between coin prices, that is, coins whose prices tend to move together.
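
One possible reading of paired price movement is correlation between daily returns. The sketch below computes a Pearson correlation in plain Python under that assumption. The price series and the `returns` and `pearson` helpers are hypothetical and not taken from the project.

```python
from math import sqrt

# Hypothetical daily closing prices; not taken from the project's data.
btc = [100.0, 110.0, 99.0, 108.9]
eth = [10.0, 11.0, 9.9, 10.89]


def returns(prices):
    """Day-over-day fractional price changes."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Values near 1 mean the two coins' returns rise and fall together.
print(round(pearson(returns(btc), returns(eth)), 6))
```

Correlating returns rather than raw prices avoids overstating the link between two series that merely share a long-run trend.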