SeanHorner/spark

Apache Spark Projects

This repository contains Apache Spark projects written in Python or Scala. The intent is for each directory to contain both implementations. A comprehensive explanation of each project and its specifications is in that project's directory.

training_project_2

This project uses Spark's Streaming API to gather and process Twitter data. It analyzes both live-stream and historical data to answer questions such as which hashtag is currently used most often, which users a specified user mentions most often, and which hashtags a specified user uses most often.
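As a rough illustration of the "most common hashtag" question, the counting logic might look like the plain-Python sketch below. This is not the project's code: the names `extract_hashtags`, `top_hashtags`, and `sample_tweets` are hypothetical, and the sample data is made up. In a Spark job, a function like `extract_hashtags` could be passed to `RDD.flatMap`, with the counting done by the cluster instead of a local `Counter`.

```python
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the hashtags in one tweet's text, lowercased and without '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def top_hashtags(tweets, n=3):
    """Count hashtags across an iterable of tweet texts; return the n most common."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return counts.most_common(n)


# Made-up sample data, for illustration only.
sample_tweets = [
    "Learning #Spark with #Scala",
    "More #spark streaming today",
    "No tags here",
]

print(top_hashtags(sample_tweets, n=2))  # [('spark', 2), ('scala', 1)]
```

Lowercasing the tags is a design choice made here so that `#Spark` and `#spark` count as the same hashtag; the project itself may treat case differently.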

training_project_3

This project uses data from the Our World in Data and IMF World Economies datasets to probe questions about the pandemic, its effects on global economies, and how countries responded to it.

training_project_4

This project uses data from the MeetUp.com API to decipher and chart trends in the data, such as which state has the most MeetUp venues, whether longer or shorter events are more popular, and which payment method is most common.

training_project_5

This project uses CryptoDataDownload.com's historical cryptocurrency exchange data to find trends in price fluctuations and symbiotic movements between coin prices.
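One possible reading of "symbiotic movements" is that two coins' returns are correlated. The plain-Python sketch below computes the Pearson correlation of period-over-period returns under that assumption. It is not taken from the project: the function names and the price series are hypothetical. On a Spark DataFrame, a comparable figure could be obtained with `DataFrame.stat.corr` on two return columns.

```python
import math


def period_returns(prices):
    """Fractional change between each consecutive pair of prices."""
    return [(later - earlier) / earlier for earlier, later in zip(prices, prices[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up closing prices; coin_b is coin_a scaled by 0.1, so their returns match.
coin_a = [100.0, 110.0, 99.0, 108.9]
coin_b = [10.0, 11.0, 9.9, 10.89]

correlation = pearson(period_returns(coin_a), period_returns(coin_b))
print(round(correlation, 6))
```

Correlating returns instead of raw prices is a deliberate choice in this sketch: two coins that both trend upward over time can show a high price correlation even when their day-to-day moves are unrelated.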

About

A repository of Apache Spark projects, training projects, and tutorials, in both Scala and Python.
