dnguyenngoc/real-time-analytic

Real Time Analytic With Druid - Airflow - Kafka - Superset

Kafka · Airflow · Druid · Redis · PostgreSQL · Superset

This repo is an introduction to setting up streaming analytics with open-source technologies. We'll use Apache Kafka, Apache Druid, Apache Airflow, and Apache Superset to build a system that gives you a deeper understanding of your customers' behaviour.

Screenshots & GIFs

(View of the running system; images not reproduced here.)
Example

1. Install Docker and Docker Compose

https://www.docker.com/

2. Clone the git repo

git clone https://github.com/apot-group/real-time-analytic.git

3. Start the services

cd real-time-analytic && docker-compose up

| Service | URL | User/Password |
| --- | --- | --- |
| Druid Unified Console | http://localhost:8888/ | None |
| Druid Legacy Console | http://localhost:8081/ | None |
| Superset | http://localhost:8088/ | Create an admin account with `docker exec -it superset bash superset-init` |
| Airflow | http://localhost:3000/ | Auto-generated password in `a-airflow/app/standalone_admin_password.txt` |

4. Create a sample dashboard from the Druid stream

- The Airflow DAG at `a-airflow/app/dags/demo.py` sends a message to the Kafka `demo` topic every minute, with data for one of the coins `['BTC', 'ETH', 'BTT', 'DOT']`. Each message has the structure below:

  ```json
  {
      "data_id": 454,
      "name": "BTC",
      "timestamp": "2021-02-05T10:10:01"
  }
  ```
- In Druid, load data from Kafka at `kafka:9092`, choose the `demo` topic, and configure the resulting datasource table.

- In Superset, add Druid as a database with the SQLAlchemy URI `druid://broker:8082/druid/v2/sql/`. More detail at Superset-Database-Connect.
- Create charts and a dashboard in Superset from the `demo` table.
- Enjoy! 🔥 🔥 🔥
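To make the first step above concrete, here is a minimal sketch of the producer side: building a message in the structure shown and publishing it to the `demo` topic. It assumes the `kafka-python` client and the in-network broker address `kafka:9092`; the actual DAG in `demo.py` may use a different client, and in Airflow this send would be wrapped in a task scheduled every minute.

```python
import json
import random
from datetime import datetime, timezone

COINS = ["BTC", "ETH", "BTT", "DOT"]

def build_message(data_id: int) -> dict:
    """Build one message in the structure shown above."""
    return {
        "data_id": data_id,
        "name": random.choice(COINS),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
    }

def send_demo_message(data_id: int, bootstrap: str = "kafka:9092") -> None:
    """Publish one JSON-encoded message to the 'demo' topic."""
    # kafka-python is an assumption; the repo's DAG may use another client.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    producer.send("demo", build_message(data_id))
    producer.flush()
```

Druid's Kafka ingestion reads these JSON messages directly, so the field names here become the columns of the `demo` datasource.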
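The Superset URI above points at Druid's SQL endpoint on the broker (`/druid/v2/sql/`, port 8082). You can hit the same endpoint directly to sanity-check the `demo` table before building charts. A stdlib-only sketch, assuming docker-compose publishes the broker port on localhost (if it doesn't, run the query from inside the compose network against `broker:8082`):

```python
import json
from urllib import request

# Broker SQL endpoint; 8082 matches the Superset URI above. Reaching it from
# the host assumes the broker port is published by docker-compose.
DRUID_SQL_URL = "http://localhost:8082/druid/v2/sql/"

def build_query(table: str = "demo") -> bytes:
    """Encode a Druid SQL request body: message counts per coin name."""
    sql = f'SELECT "name", COUNT(*) AS messages FROM "{table}" GROUP BY "name"'
    return json.dumps({"query": sql}).encode("utf-8")

def query_druid(table: str = "demo") -> list:
    """POST the query to the Druid broker and decode the JSON result rows."""
    req = request.Request(
        DRUID_SQL_URL,
        data=build_query(table),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

If this returns a row per coin, the Kafka → Druid pipeline is working and the Superset charts will have data.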

Contact Us