This project requires building a big data streaming dashboard using Kafka and Streamlit, featuring separate real-time and historical data views.
A dual-pipeline architecture separates live streaming from long-term storage and analysis.
| Pipeline | Flow | Output |
|---|---|---|
| Real-time | Kafka → live consumer | Real-time Streaming View |
| Historical | Kafka → HDFS/MongoDB → query | Historical Data View |
- Kafka Producer/Consumer.
- HDFS or MongoDB integration.
- Two-page Streamlit dashboard with charts.
- Robust error handling.
Create a Kafka Producer that fetches real data from an existing API (e.g., a public weather or stock market API); a minimal sketch follows the schema fields below.
Required Data Schema Fields:
- timestamp (ISO format)
- value (numeric)
- metric_type (string)
- sensor_id (string)
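As a starting point, here is a minimal producer sketch, assuming the kafka-python and requests packages, a broker on localhost:9092, a topic named sensor-data, and a placeholder API URL and response field; swap these for your chosen API and configuration.

```python
import json
import time
from datetime import datetime, timezone

import requests
from kafka import KafkaProducer

API_URL = "https://example.com/weather"  # placeholder: substitute your chosen public API

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    try:
        response = requests.get(API_URL, timeout=10)
        response.raise_for_status()
        payload = response.json()
        # Map the API response onto the required schema fields.
        message = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "value": float(payload.get("temperature", 0.0)),  # field name is an assumption
            "metric_type": "temperature",
            "sensor_id": "sensor-001",
        }
        producer.send("sensor-data", value=message)  # assumed topic name
        producer.flush()
    except (requests.RequestException, ValueError) as exc:
        # Keep the producer alive on transient API or serialization errors.
        print(f"Fetch/produce failed: {exc}")
    time.sleep(5)
```

The try/except block and request timeout address the robust error handling deliverable on the producer side: the loop keeps running through transient API failures instead of crashing.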
Implement the Streamlit logic:
- consume_kafka_data(): real-time processing.
- query_historical_data(): data retrieval from storage.
- Create interactive widgets (filters, time-range selector) for the Historical View.
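One possible shape for these helpers, assuming kafka-python, pymongo, and pandas, the same sensor-data topic and localhost services as above, and a MongoDB collection bigdata.readings (all names and metric values are placeholders):

```python
import json
from datetime import date, timedelta

import pandas as pd
import streamlit as st
from kafka import KafkaConsumer
from pymongo import MongoClient


def consume_kafka_data(max_messages=100):
    """Poll a batch of recent messages from Kafka for the Real-time View."""
    consumer = KafkaConsumer(
        "sensor-data",
        bootstrap_servers="localhost:9092",
        auto_offset_reset="latest",
        consumer_timeout_ms=2000,  # stop iterating when no new messages arrive
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    )
    records = []
    for message in consumer:
        records.append(message.value)
        if len(records) >= max_messages:
            break
    consumer.close()
    return pd.DataFrame(records)


def query_historical_data(metric_type, start, end):
    """Fetch stored readings from MongoDB for the Historical View."""
    collection = MongoClient("mongodb://localhost:27017")["bigdata"]["readings"]
    # Timestamps are stored as ISO strings, so lexicographic range filters work.
    cursor = collection.find({
        "metric_type": metric_type,
        "timestamp": {"$gte": start.isoformat(), "$lte": end.isoformat()},
    })
    return pd.DataFrame(list(cursor))


# Historical View widgets: a metric filter and a time-range selector.
metric = st.selectbox("Metric", ["temperature", "humidity"])  # placeholder metric names
date_range = st.date_input("Time range", (date.today() - timedelta(days=7), date.today()))
if len(date_range) == 2:
    df = query_historical_data(metric, *date_range)
    if not df.empty:
        st.line_chart(df.set_index("timestamp")["value"])
```

Setting consumer_timeout_ms keeps consume_kafka_data() from blocking the Streamlit script indefinitely when no new messages arrive.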
Implement data writing and querying for ONE of the following: HDFS or MongoDB.
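If you pick the MongoDB option, the write path can be a small standalone consumer that persists every message; this sketch assumes kafka-python and pymongo, a local MongoDB instance, and the placeholder bigdata.readings collection used above.

```python
import json

from kafka import KafkaConsumer
from pymongo import MongoClient

consumer = KafkaConsumer(
    "sensor-data",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
collection = MongoClient("mongodb://localhost:27017")["bigdata"]["readings"]

for message in consumer:
    try:
        collection.insert_one(message.value)  # persist each reading for the Historical View
    except Exception as exc:
        # Skip bad records rather than stopping the ingestion loop.
        print(f"Write failed, skipping message: {exc}")
```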
Python 3.8+, Apache Kafka, HDFS OR MongoDB.
- Set up the environment:
- Download Miniconda.
- Create your Python environment:
conda create -n bigdata python=3.10.13
- Clone Repo & Install:
git clone [REPO_URL]
conda activate bigdata
pip install -r requirements.txt
- Configure: Set up Kafka and your chosen Storage System.
- Optional environment file (.env): use for connection details.
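If you do use a .env file, one way to read it is with the python-dotenv package (an assumption, not a requirement of the assignment); the variable names below are placeholders.

```python
import os

from dotenv import load_dotenv

load_dotenv()  # no-op if the file is absent
KAFKA_BOOTSTRAP = os.getenv("KAFKA_BOOTSTRAP", "localhost:9092")
MONGO_URI = os.getenv("MONGO_URI", "mongodb://localhost:27017")
```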
- Start Kafka Broker (and Controller).
- Start Producer:
python producer.py
- Launch Dashboard:
streamlit run app.py
Submit the following files:
- app.py
- producer.py
- requirements.txt
- README.md