The aim of this big data project is to design and implement a big data system that can provide real-time context-aware recommendations to drivers on the level of possible danger.

khandakerrahin/BDT2022-Group12

BDT2022-Group12

  • Shaker Mahmud Khandaker
  • Ciro Beneduce

Big Data Technologies Project 2022 - Group 12

The project implements a big data system that gives drivers real-time, context-aware recommendations based on the level of possible danger. The pipeline starts with data collection, continues through real-time data ingestion, and ends with a working demo that plots simulated accidents in real time on a map of Trento, taking their severity into account.
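The first stage of the pipeline, generating simulated accidents and pushing them to Kafka, could look roughly like the sketch below. It is not the project's actual producer: the field names, the 1 to 3 severity scale, the bounding box around Trento, the topic name `accidents`, and the broker address are all assumptions made for illustration. Only `Producer`, `produce`, and `flush` come from the `confluent_kafka` package.

```python
import json
import random
from datetime import datetime, timezone

# Assumed bounding box roughly covering Trento (not taken from the project).
LAT_RANGE = (46.03, 46.11)
LON_RANGE = (11.08, 11.16)
# Assumed severity scale: 1 = minor, 3 = severe.
SEVERITY_LEVELS = (1, 2, 3)


def make_fake_accident(rng: random.Random) -> dict:
    """Build one simulated accident record with a random position and severity."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "lat": round(rng.uniform(*LAT_RANGE), 6),
        "lon": round(rng.uniform(*LON_RANGE), 6),
        "severity": rng.choice(SEVERITY_LEVELS),
    }


def encode_event(event: dict) -> bytes:
    """Serialize a record as UTF-8 JSON, a common choice for Kafka message values."""
    return json.dumps(event).encode("utf-8")


def send_events(events, topic="accidents", servers="localhost:9092"):
    """Send records to Kafka. Topic name and broker address are hypothetical."""
    # Imported here so the pure helpers above work without a Kafka client installed.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": servers})
    for event in events:
        producer.produce(topic, value=encode_event(event))
    # Block until all queued messages have been delivered.
    producer.flush()
```

With a broker running, `send_events([make_fake_accident(random.Random()) for _ in range(10)])` would publish ten simulated accidents. Passing the random generator explicitly keeps the record generation reproducible when a seed is supplied.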

Live Demo: Click here

Technologies used:

  • Kafka
  • Apache Spark
  • Cassandra
  • Flask
  • Folium
  • Leaflet
  • MySQL

Web Hosts:

Prerequisites:

  • Python >= 3.7
  • Spark >= 3.0.3
  • Cassandra >= 3.0.19
  • Python packages: confluent_kafka, pyspark, folium, flask, mysql.connector (the scripts also use the standard-library modules datetime, json, random, time, and csv, which need no installation)

Dataset source: Open Data Trentino

How to run the Project

  • Run the Kafka producer, kafka_accident_producer.py (preferably scheduled with crontab):
    • python3 kafka_accident_producer.py
  • Run the Spark stream handler:
    • spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2,org.apache.spark:spark-token-provider-kafka-0-10_2.12:3.1.2,com.datastax.spark:spark-cassandra-connector_2.12:3.2.0 /opt/apps/spark_stream_handler.py
  • Run the Flask web app:
    • python3 webapp.py
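
For the last step, the sketch below shows one way a web app might turn accident records into severity-styled markers on a Folium map. It is an illustration under stated assumptions, not the code in webapp.py: the record fields (`lat`, `lon`, `severity`), the color mapping, the radius formula, and the approximate map center for Trento are all hypothetical. `folium.Map`, `folium.CircleMarker`, `add_to`, and `save` are existing Folium calls.

```python
# Assumed mapping from severity level to marker color (not from the project).
SEVERITY_COLORS = {1: "green", 2: "orange", 3: "red"}


def marker_style(severity: int) -> dict:
    """Return marker options for a severity level; unknown levels fall back to gray."""
    color = SEVERITY_COLORS.get(severity, "gray")
    # Larger circles for more severe accidents (illustrative formula).
    return {"color": color, "radius": 4 + 3 * severity, "fill": True}


def build_map(accidents, center=(46.07, 11.12), zoom=13):
    """Plot accident records on a Folium map centered approximately on Trento."""
    # Imported here so marker_style can be used and tested without Folium installed.
    import folium

    fmap = folium.Map(location=list(center), zoom_start=zoom)
    for acc in accidents:
        folium.CircleMarker(
            location=[acc["lat"], acc["lon"]],
            popup=f"severity {acc['severity']}",
            **marker_style(acc["severity"]),
        ).add_to(fmap)
    return fmap
```

Calling `build_map(records).save("map.html")` writes a standalone HTML page that a Flask route could then serve. Keeping the styling in a pure function separates the danger-level logic from the map rendering.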
