An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apache Kafka and stored in a local Cassandra database.
Updated Jun 7, 2023 - Python
This project implements a real-time data pipeline using Apache Kafka, Python's psutil library for metric collection, and SQL Server for data storage. The pipeline collects metrics data from the local computer, processes it through Kafka brokers, and loads it into a SQL Server database. Additionally, a real-time dashboard is created using Power BI.
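The collection and publishing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `collect_metrics`, `encode_metrics`, and `publish`, the chosen metric fields, and the topic name are all assumptions. The producer is passed in as a plain callable so the sketch runs without a broker.

```python
import json
import time
from typing import Callable, Dict


def collect_metrics() -> Dict[str, float]:
    """Take one snapshot of local machine metrics (hypothetical field selection)."""
    # psutil is third-party; it is imported here so the rest of the sketch
    # can be used without it.
    import psutil

    return {
        "timestamp": time.time(),
        "cpu_percent": psutil.cpu_percent(interval=None),
        "memory_percent": psutil.virtual_memory().percent,
    }


def encode_metrics(metrics: Dict[str, float]) -> bytes:
    """Serialize a snapshot as UTF-8 JSON, a common payload format for Kafka messages."""
    return json.dumps(metrics, sort_keys=True).encode("utf-8")


def publish(send: Callable[[str, bytes], object], topic: str,
            metrics: Dict[str, float]) -> bytes:
    """Encode one snapshot and hand it to `send(topic, payload)`; return the payload."""
    payload = encode_metrics(metrics)
    send(topic, payload)
    return payload
```

With the kafka-python package, `KafkaProducer(bootstrap_servers="localhost:9092").send` could be passed as `send`, assuming a broker at that address. A separate consumer would then read the topic and insert rows into SQL Server.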
This project demonstrates real-time data streaming and processing architecture using Kafka, Spark Streaming, and Debezium for capturing CDC (Change Data Capture) events. The pipeline collects transaction data, processes it in real time, and updates a dashboard to display real-time analytics for smartphone data.
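To make the CDC step concrete, here is a small sketch of how Debezium-style change events might be folded into a current-state view. It uses plain Python rather than Spark Streaming, and it is not taken from the project. The helper names `unwrap` and `apply_change` and the primary-key column `id` are assumptions. The `op`, `before`, and `after` fields follow Debezium's change event envelope.

```python
import json
from typing import Any, Dict


def unwrap(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return the change envelope, with or without the JSON converter's schema wrapper."""
    return event.get("payload", event)


def apply_change(table: Dict[Any, Dict[str, Any]], raw: str, key: str = "id") -> None:
    """Apply one change event (a JSON string) to an in-memory copy of the table."""
    envelope = unwrap(json.loads(raw))
    op = envelope["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, update: the new row state is in "after"
        row = envelope["after"]
        table[row[key]] = row
    elif op == "d":
        # delete: only "before" carries the row, so use it to find the key
        row = envelope["before"]
        table.pop(row[key], None)
```

A dashboard query would then read from `table` (or from whatever store replaces it) instead of replaying the event log.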
Collecting highlights from the Quix community and social media in the form of interesting questions, comments, challenges, solutions and insights
🔥👨‍💻 Build big data pipelines with Apache Beam in any language and run them via Spark, Flink, or GCP (Google Cloud Dataflow).