A demo of a real-time dashboard, built on big data technologies and popular real-time web communication technologies.
- scrawler.py ---> kafka
- kafka ---> wordCounter.py
- wordCounter.py ---> kafka
- kafka ---> app.py
- app.py ---> browser
Note: the diagram was drawn with asciiflow.
+--------------+                    +------------------+                     +----------------+
|              |         1          |                  |          4          |                |
|   Scrawler   +------------------> |      Kafka       +-------------------> |     Flask      |
|              |                    |                  |                     |                |
|              |                    |                  |                     |                |
+--------------+                    +----+---------+---+                     +--------+-------+
                                         |         ^                                  |
                                         |         |                                  |
                                       2 |         |3                                 |5
                                         |         |                                  |
                                         |         |                                  |
                                         |         |                                  v
                                     +---v---------+----+                    +--------+---------+
                                     |                  |                    |                  |
                                     |   Spark Stream   |                    |     Browser      |
                                     |                  |                    |                  |
                                     |                  |                    |                  |
                                     +------------------+                    +------------------+
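The five numbered hops in the diagram can be traced with a minimal in-memory sketch. It is not the project's code: two `queue.Queue` objects stand in for Kafka topics, and the names `raw_topic`, `result_topic`, and the three functions are hypothetical, chosen only to mirror the boxes above.

```python
import json
import queue
from collections import Counter

# Hypothetical stand-ins for two Kafka topics (names are assumptions).
raw_topic = queue.Queue()      # scrawler -> word counter (steps 1 and 2)
result_topic = queue.Queue()   # word counter -> web app  (steps 3 and 4)


def scrawler(text):
    """Step 1: publish raw page text."""
    raw_topic.put(text)


def word_counter():
    """Steps 2 and 3: consume raw text, count words, publish counts as JSON."""
    text = raw_topic.get()
    counts = Counter(text.lower().split())
    result_topic.put(json.dumps(counts))


def web_app():
    """Steps 4 and 5: consume the counts and return the payload for the browser."""
    return json.loads(result_topic.get())


scrawler("kafka feeds spark and spark feeds kafka")
word_counter()
payload = web_app()
print(payload)
```

In the real setup each function runs as its own process and Kafka carries the messages between them, so no component calls another directly.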
- Kafka -- transfers all data between components
- Spark Streaming -- computes statistics on the data
- scrawler -- fetches raw data from a URL
- Flask -- Python web framework
- Socket.IO -- data exchange channel between frontend and backend
- Vue -- popular frontend JS framework
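To make the "statistics" role concrete, here is a rough plain-Python sketch of a running word count over successive batches of text, which is the kind of stateful aggregation a streaming job performs. It does not use Spark. The tokenizing regex, the function names, and the `{"name", "value"}` output shape are all assumptions for illustration, not the project's actual format.

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase runs of letters and apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def update_counts(totals, batch):
    """Fold one batch of text lines into the running totals."""
    for line in batch:
        totals.update(WORD_RE.findall(line.lower()))
    return totals


def top_words(totals, n):
    """Return the n most frequent words as dicts a word-cloud widget could consume."""
    return [{"name": word, "value": count} for word, count in totals.most_common(n)]


totals = Counter()
update_counts(totals, ["Spark streams data", "Kafka moves data"])
update_counts(totals, ["data data everywhere"])
cloud = top_words(totals, 1)
print(cloud)
```

Keeping `totals` across calls is what makes the counts cumulative; resetting it per batch would instead show only the words in the most recent page.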
- pyenv install reference
- install all other dependencies
./bin/install_deps.sh
- install Kafka
- install Spark (2.1.0)
- frontend build
./bin/build_ui.sh
./bin/start.sh
- run kafka
./bin/start_kafka.sh
- run wordCounter
./bin/start_word_counter.sh
- run flask server
./bin/start_flask.sh
Then open http://127.0.0.1:5000/#/ in a browser.
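If the page does not load, a quick check of whether the services are listening can help. The sketch below is a hypothetical helper, not part of the repository: port 5000 comes from the URL above, while 9092 is Kafka's default port and only an assumption about this setup.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 5000 is taken from the URL above; 9092 is Kafka's default port (an assumption here).
SERVICES = {"flask": 5000, "kafka": 9092}


def check_services(host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'down'}")
```

An open port only shows that something is listening; it does not confirm that the service behind it is healthy.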
To fetch new page content and feed it to wordCounter, run:
./bin/start_scrawler.sh
Then return to the browser to watch the word cloud change.
This project is inspired by a Chinese big data course: http://dblab.xmu.edu.cn/post/8274/
This README is written in Markdown; for a syntax cheatsheet, see Markdown Syntax.