youheekil/Twitter-Streaming-with-Apache-Kafka-Docker-and-Python

Twitter Streaming with Apache Kafka, Docker, and Python

  • Set up a Twitter developer account
  • Decide which Twitter topic to analyze (here: Starbucks)
  • Set up Docker (Spark Streaming may need it later)
  • Kafka: try out a producer and a consumer
  • Kafka and Twitter: data ingestion

Instruction

1. Set up a new virtual environment and install the packages in requirements.txt

# create a virtual environment named venv (pick another name if you prefer)
python -m venv venv

# activate the virtual environment
source venv/bin/activate

# install the packages listed in requirements.txt
pip install -r requirements.txt

2. Run Kafka with Docker

The Docker setup for Kafka is explained in detail HERE.
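Before moving on, it can help to confirm that the Kafka container is actually accepting connections. The snippet below is a minimal sketch using only the Python standard library. The address localhost:9092 is an assumption (a common default for a Kafka broker), and `broker_is_reachable` is a hypothetical helper name, so adjust both to your Docker setup.

```python
import socket


def broker_is_reachable(host: str = "localhost", port: int = 9092,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # localhost:9092 is an assumed address; use the port your container exposes.
    print("Kafka broker reachable:", broker_is_reachable())
```

A successful TCP connection only shows that something is listening on the port. It does not prove the broker is fully initialized, so treat it as a quick sanity check.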

3. Get Twitter API Credentials

3-1. See the Twitter API developer documentation to obtain credentials

3-2. Create a credential.json file containing your Twitter API credentials

See src/credential.json for reference.
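One possible way to read the credential file from Python is sketched below. The key names in `REQUIRED_KEYS` and the `load_credentials` helper are hypothetical and not taken from this repository, so rename them to match the fields in your own src/credential.json.

```python
import json
from pathlib import Path

# Hypothetical key names; match them to whatever src/credential.json actually uses.
REQUIRED_KEYS = ("api_key", "api_secret", "access_token", "access_token_secret")


def load_credentials(path: str = "src/credential.json") -> dict:
    """Read the JSON credential file and fail early if a key is missing or empty."""
    creds = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise KeyError(f"credential file is missing: {', '.join(missing)}")
    return creds
```

Failing early on a missing key gives a clearer error than an authentication failure later in the stream. Keep the real credential file out of version control.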

4. Create the Python files producer.py and consumer.py under the src folder for Kafka

See src/producer.py and src/consumer.py. Make sure to change the Kafka topic_name and the terms to track on Twitter.
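To illustrate the two settings mentioned above, here is a rough sketch of how a topic name and tracked terms might feed into a producer and consumer. It is not the code in src/producer.py or src/consumer.py. It assumes the kafka-python package, and the topic name, broker address, and helper names are all hypothetical.

```python
import json

TOPIC_NAME = "twitter-starbucks"      # hypothetical topic name, change to yours
TRACK_TERMS = ["starbucks"]           # terms to track on Twitter
BOOTSTRAP_SERVERS = "localhost:9092"  # assumed broker address


def matches_track(text: str, terms=TRACK_TERMS) -> bool:
    """Case-insensitive check that the text mentions at least one tracked term."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def encode_message(tweet: dict) -> bytes:
    """Serialize a tweet dict to UTF-8 JSON bytes for Kafka."""
    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


def decode_message(raw: bytes) -> dict:
    """Inverse of encode_message, used on the consumer side."""
    return json.loads(raw.decode("utf-8"))


def make_producer():
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer
    return KafkaProducer(bootstrap_servers=BOOTSTRAP_SERVERS,
                         value_serializer=encode_message)


def make_consumer():
    from kafka import KafkaConsumer
    return KafkaConsumer(TOPIC_NAME,
                         bootstrap_servers=BOOTSTRAP_SERVERS,
                         auto_offset_reset="earliest",
                         value_deserializer=decode_message)
```

With a broker running, a producer would call `make_producer().send(TOPIC_NAME, tweet)` followed by `flush()`, and a consumer would loop with `for record in make_consumer(): print(record.value)`. Fetching the tweets themselves (for example with a Twitter client library) is left out of this sketch.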

5. Run producer.py and consumer.py for data streaming

Open two separate terminals, then run python src/consumer.py in one and python src/producer.py in the other.

terminal 1.

python src/consumer.py

terminal 2.

python src/producer.py
