-
Start the following containers: gateway, zookeeper, kafka, jobmanager, taskmanager, visualise
make up
-
Ensure all containers are running
docker container ls
Note: if you change any code in the gateway or visualise folder, rebuild that Docker image before creating the containers.
cd <folder_name>
where <folder_name> is either gateway or visualise
docker build .
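A quick way to confirm that none of the six containers is missing is to compare the names Docker reports against the expected list. This is a minimal sketch: `missing_containers` is a hypothetical helper, and it assumes the container names match the service names listed in the first step exactly (Compose setups sometimes add a project prefix, in which case the names need adjusting).

```shell
# Expected names come from the first step; exact name matching is an assumption.
expected="gateway zookeeper kafka jobmanager taskmanager visualise"

# Read running container names from stdin (one per line) and print
# every expected name that is not among them.
missing_containers() {
  running=$(cat)
  for name in $expected; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$running" | grep -qxF "$name" || printf '%s\n' "$name"
  done
}

# Usage (requires a running Docker daemon):
#   docker container ls --format '{{.Names}}' | missing_containers
```

Empty output means every expected container is running; any name printed is one to investigate.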
-
Ensure the gateway is running at http://localhost:8080. Read more about the gateway.
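Because the gateway may take a few seconds to start accepting connections, it can help to poll the URL rather than check it once. The sketch below assumes `curl` is installed; `wait_for_url` is a hypothetical helper, and it treats any HTTP response (even an error status) as "up", since the routes the gateway serves are not described here.

```shell
# Poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 2)
wait_for_url() {
  url=$1
  attempts=${2:-10}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress output; -o /dev/null discards the response body.
    if curl -s -o /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not reachable: $url" >&2
  return 1
}

# Usage:
#   wait_for_url http://localhost:8080
```

The same function can be pointed at the other local URLs in these steps, such as http://localhost:8081 or http://localhost:8088.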
-
Run the max pipeline using either the direct runner or the Flink runner
- Direct runner
make direct-run
- Flink runner
make flink-compile
make flink-run
The dashboard for the running Flink job is accessible at http://localhost:8081
To access logs in the Flink container, open a shell in the container using
docker exec -it jobmanager /bin/bash
-
Continuously stream live data from the wiki API to the gateway via a Python script
python wiki_stream.py
-
Check that the running max pipeline processed the data and wrote the result to the output topic
make read_topic TOPIC=output
-
Visualise the data in the output topic via the visualise container, which runs a Jupyter notebook at http://localhost:8088. Read more about visualisation.
-
Stop all containers
make down