This project uses Twitter's API to listen to tweets. A Spark Streaming application then processes the tweets, counts the hashtags, and provides real-time updates on trending user topics.
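The hashtag-counting step can be illustrated without Spark. The sketch below is plain Python and is not taken from the project's scripts: the names `extract_hashtags` and `update_counts` are hypothetical, and it assumes hashtags are matched case-insensitively and that tweets arrive in small batches, roughly as a streaming job would see them.

```python
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(text):
    """Return the lowercase hashtags (without '#') found in a tweet's text."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


def update_counts(running, batch):
    """Add the hashtag counts from one batch of tweets to the running totals."""
    for text in batch:
        running.update(extract_hashtags(text))
    return running


if __name__ == "__main__":
    counts = Counter()
    # Each call stands in for one micro-batch of tweet texts.
    update_counts(counts, ["Loving #Spark and #Python", "More #spark streaming"])
    update_counts(counts, ["#Python tips"])
    print(counts.most_common(2))
```

In a Spark Streaming job the same idea would typically be expressed as a split, filter, and grouped count over the incoming stream; the running `Counter` here only plays the role of that aggregated state.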
Library and Package Requirements
- Python 3.7
- Tweepy 4.12
- PySpark 3.3.1
Spark can be downloaded from: https://spark.apache.org/downloads.html
To stream tweet data, run tweet_stream_producer.py.
In another terminal, run tweet_stream_consumer.py to start the Spark Streaming application.
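This README does not say how the two scripts exchange data. One common arrangement for a producer and consumer running in separate terminals is a local TCP socket: the producer writes one tweet per line, and the consumer reads lines as they arrive. The sketch below shows only that pattern, using the Python standard library; it is an assumption rather than a description of the actual scripts, and `serve_lines` and `read_lines` are hypothetical names.

```python
import socket
import threading


def serve_lines(server_sock, lines):
    """Producer side: accept one client and send each line, newline-terminated."""
    conn, _ = server_sock.accept()
    with conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))


def read_lines(host, port):
    """Consumer side: connect and collect lines until the producer closes."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("r", encoding="utf-8") as stream:
            return [line.rstrip("\n") for line in stream]


if __name__ == "__main__":
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    # A thread stands in for the producer running in the first terminal.
    producer = threading.Thread(
        target=serve_lines, args=(server, ["#spark rocks", "hello #python"])
    )
    producer.start()
    print(read_lines("127.0.0.1", port))
    producer.join()
    server.close()
```

Under this assumption the producer has to be listening before the consumer connects, which would explain starting tweet_stream_producer.py first.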