This project sketches a CDC architecture for weather IoT sensors that collect real-time data on temperature, humidity, air pressure, wind speed, and precipitation. The data is collected continuously and stored in Cassandra, a highly scalable NoSQL database.
To process the data in real time, we use Apache Kafka and Spark Streaming. Kafka acts as the messaging system that streams data from the IoT sensors to Spark Streaming, which processes and analyzes the data as it arrives and writes the results to Cassandra.
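The processing step can be pictured with a small, framework-free sketch. It is not the Spark job itself. It only assumes that each sensor message carries a date, a locality, and a temperature, and it shows the kind of per-key running aggregate (maximum and minimum temperature per date and locality) that a streaming job could maintain before writing to Cassandra. The function name `update_aggregates` and the message fields are illustrative assumptions, not part of this project's code.

```python
from typing import Dict, Iterable, Tuple

# Key: (date, locality_name). Value: (temperature_max, temperature_min).
Aggregates = Dict[Tuple[str, str], Tuple[int, int]]


def update_aggregates(state: Aggregates, messages: Iterable[dict]) -> Aggregates:
    """Fold one micro-batch of sensor messages into the running per-key max/min."""
    for msg in messages:
        key = (msg["date"], msg["locality_name"])
        temp = msg["temperature"]
        if key in state:
            cur_max, cur_min = state[key]
            state[key] = (max(cur_max, temp), min(cur_min, temp))
        else:
            # First reading for this key is both the max and the min so far.
            state[key] = (temp, temp)
    return state


# Two hypothetical micro-batches arriving one after the other.
state: Aggregates = {}
update_aggregates(state, [
    {"date": "2023-01-01", "locality_name": "Porto", "temperature": 12},
    {"date": "2023-01-01", "locality_name": "Porto", "temperature": 15},
])
update_aggregates(state, [
    {"date": "2023-01-01", "locality_name": "Porto", "temperature": 9},
    {"date": "2023-01-01", "locality_name": "Faro", "temperature": 18},
])
print(state[("2023-01-01", "Porto")])  # (15, 9)
```

Keeping the state keyed by date and locality means each update touches exactly one entry, which is the same shape a stateful streaming aggregation would have.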
To visualize and analyze the data, we will be using Tableau. Tableau will connect to Cassandra and generate interactive visualizations to help us gain insights into the weather patterns and trends.
- Weather IoT sensors
- Apache Kafka
- Spark Streaming
- Cassandra
- Tableau
- Collect real-time weather data using IoT sensors
- Process the data in real-time using Kafka and Spark Streaming
- Store the processed data in Cassandra for easy retrieval and analysis
- Generate interactive visualizations using Tableau to gain insights into the weather patterns and trends
- Real-time weather data collection and processing
- Highly scalable and fault-tolerant data storage in Cassandra
- Interactive visualizations of weather patterns and trends in Tableau
- Improved decision-making for weather-related activities and operations
- Start a Kafka server
  - create a topic called weather
- Start a Cassandra database
  - create a keyspace called weatherSensors (SimpleStrategy, replication factor 1)
CREATE KEYSPACE weatherSensors
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
- create a table called weather in that keyspace with the following schema
CREATE TABLE weatherSensors.weather (
date text,
locality_name text,
temperature_max int,
temperature_min int,
humidity int,
wind int,
wind_direction text,
PRIMARY KEY (date, locality_name)
);
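As a rough illustration of how a decoded sensor message lines up with these columns, the sketch below mirrors the schema in a Python dataclass and builds a parameterized insert statement from it. `WeatherRow`, `row_from_message`, and the sample message are hypothetical names and data introduced for this example. The `?` placeholders follow the style used by prepared statements, and no database connection is opened here.

```python
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class WeatherRow:
    # Field names and order mirror the CREATE TABLE statement.
    date: str
    locality_name: str
    temperature_max: int
    temperature_min: int
    humidity: int
    wind: int
    wind_direction: str


def row_from_message(msg: dict) -> WeatherRow:
    """Coerce a decoded message into the column types of the table."""
    return WeatherRow(
        date=str(msg["date"]),
        locality_name=str(msg["locality_name"]),
        temperature_max=int(msg["temperature_max"]),
        temperature_min=int(msg["temperature_min"]),
        humidity=int(msg["humidity"]),
        wind=int(msg["wind"]),
        wind_direction=str(msg["wind_direction"]),
    )


# Build the column list from the dataclass so it cannot drift from the fields.
COLUMNS = [f.name for f in fields(WeatherRow)]
INSERT_CQL = "INSERT INTO weather ({}) VALUES ({})".format(
    ", ".join(COLUMNS), ", ".join("?" for _ in COLUMNS)
)

# Hypothetical message whose numeric values arrive as strings.
row = row_from_message({
    "date": "2023-01-01", "locality_name": "Porto",
    "temperature_max": "21", "temperature_min": "12",
    "humidity": "60", "wind": "14", "wind_direction": "NW",
})
print(INSERT_CQL)
print(astuple(row))  # values in the same order as the placeholders
```

Because the primary key is (date, locality_name), writing a second row with the same date and locality overwrites the first rather than adding a new row.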
- Package everything into a JAR using sbt, the Scala/Java build tool:
sbt package
- Submit the Spark job, pulling in the Kafka and Cassandra connector packages, using:
spark-submit --class StreamHandler \
--master local[*] \
--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5,\
com.datastax.cassandra:cassandra-driver-core:4.0.0,\
com.datastax.spark:spark-cassandra-connector_2.11:2.4.3 \
target/scala-2.11/stream-handler_2.11-1.0.jar
Then start running your "IoT devices" script:
./IoT_Weather_sens0rs.py
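The device script itself is not reproduced here. As a minimal sketch of what a simulated sensor could emit, the code below generates made-up readings with the same field names as the weather table and serializes them as UTF-8 JSON. The value ranges, the localities, and the functions `fake_reading` and `encode` are all assumptions for illustration, and the step that hands each payload to a Kafka producer is left as a comment.

```python
import json
import random
from datetime import date

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def fake_reading(rng: random.Random, locality: str, day: date) -> dict:
    """Build one made-up reading; the value ranges are arbitrary placeholders."""
    t_min = rng.randint(-5, 20)
    return {
        "date": day.isoformat(),
        "locality_name": locality,
        "temperature_min": t_min,
        # Max is derived from min so that max >= min always holds.
        "temperature_max": t_min + rng.randint(0, 15),
        "humidity": rng.randint(0, 100),
        "wind": rng.randint(0, 120),
        "wind_direction": rng.choice(DIRECTIONS),
    }


def encode(reading: dict) -> bytes:
    """Serialize a reading as UTF-8 JSON bytes."""
    return json.dumps(reading).encode("utf-8")


rng = random.Random(42)  # fixed seed so repeated runs produce the same readings
for locality in ["Porto", "Faro"]:
    payload = encode(fake_reading(rng, locality, date(2023, 1, 1)))
    # A real device script would send `payload` to the weather topic here.
    print(payload.decode("utf-8"))
```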
- To view the data in Cassandra, start the CQL shell:
./bin/cqlsh
and run
USE weatherSensors;
SELECT * FROM weather;
to check that the data is being processed and saved correctly.