Skip to content

A simulation to automatically collect weather data and visualize it on maps. Tech stack: Kafka, Spark Streaming, Cassandra, Tableau

Notifications You must be signed in to change notification settings

HelloSongi/Spark-Structured-streaming-IoT-Weather-Sensors

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

95 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project Description

1

Project Description

A CDC architecture in the context of weather IoT sensors collectING real-time data on temp, humidity, air pressure, wind speed, and precipitation. The data will be collected continuously and stored in Cassandra, a highly scalable NoSQL database.

To process the data in real-time, we will be using Apache Kafka and Spark Streaming. Kafka will act as a messaging system to stream the data from the IoT sensors to Spark Streaming. Spark Streaming will process the data in real-time, perform analytics on the data, and output the results to Cassandra.

To visualize and analyze the data, we will be using Tableau. Tableau will connect to Cassandra and generate interactive visualizations to help us gain insights into the weather patterns and trends.

Technology Stack:

  1. Weather IoT sensors
  2. Apache Kafka
  3. Spark Streaming
  4. Cassandra
  5. Tableau

Project Goals:

  1. Collect real-time weather data using IoT sensors
  2. Process the data in real-time using Kafka and Spark Streaming
  3. Store the processed data in Cassandra for easy retrieval and analysis
  4. Generate interactive visualizations using Power Bi to gain insights into the weather patterns and trends

Expected Outcomes:,

  1. Real-time weather data collection and processing
  2. Highly scalable and fault-tolerant data storage in Cassandra
  3. Interactive visualizations of weather patterns and trends in Tableau
  4. Improved decision-making for weather-related activities and operations

Quick Start

  1. Start a Kafka server

    • create a topic called weather
  2. Start a Cassandra database *create a keyspace called weatherSensors (SimpleStrategy, replication=1)

CREATE KEYSPACE weatherSensors
WITH replication = {'class': 'SimpleStrategy, 'replication_factor' : 1};
  1. create a table called weather with the following schema
CREATE TABLE weatherSensors (
 date text,
 locality_name text,
 temperature_max int,
 temperature_min int,
 humidity int,
 wind int,
 wind_direction text,
 PRIMARY KEY (date, locality_name)
);

  1. package up everything in a scala the file using Java/Scala the build tool:
sbt package
  1. Run you spark, Kafka and cassandra using:
spark-submit --class StreamHandler \
--master local[*] \
--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5,\
	com.datastax.cassandra:cassandra-driver-core:4.0.0,\
	com.datastax.spark:spark-cassandra-connector_2.11:2.4.3 \
target/scala-2.11/stream-handler_2.11-1.0.jar

Then start running your "IoT devices" script:

./IoT_Weather_sens0rs.py
  1. To view data in cassandra DB, run CQL shell ./bin/cqlsh and select * from weather to see if the data is being processed saved correctly!

Dashboarding

maxresdefault

About

A simulation to automatically collect weather data and visualize it on maps. Tech stack: Kafka, Spark Streaming, Cassandra, Tableau

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published