Infinispan Flink demo
This folder contains a sample job and scripts to demonstrate usage of
in scenarios other than Hadoop MapReduce, by running an Apache Flink job against data stored in the cache.
- Linux or MacOS X
- Docker should be installed and running. Check it with
- Samples built: run
mvn clean install in the
Note for MacOS users
Add a route so that containers can be reached via their IPs:
sudo route -n add 172.17.0.0/16 `docker-machine ip default`
Preparing the Infinispan cluster
Run the script
./run-clusters.sh to launch a two-node Infinispan cluster and a two-node Flink cluster.
The Flink admin console can be found at:
Populating the cache
A simple file with 1k random phrases can be generated using:
docker exec -it master /usr/local/sample/target/scripts/generate.sh 1000
Inspect it using:
docker exec -it master more /file.txt
Then populate the cache from the command line:
docker exec -it master sh -c "java -cp /usr/local/sample/target/app.jar org.infinispan.hadoop.sample.util.ControllerCache --host ispn-1 --cachename phrases --populate --file /file.txt"
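To experiment with a comparable input file outside the container, the sketch below writes a given number of random phrases to a file, one phrase per line. It is not the sample's generate.sh, whose contents are not shown here; the function name generate_phrases, the word list, and the limit of 8 words per phrase are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for generate.sh; the real script may work differently.
# Usage: generate_phrases <count> <output file>
generate_phrases() {
    awk -v n="$1" 'BEGIN {
        srand()
        # Assumed vocabulary of 10 words, chosen only for this sketch.
        split("cache grid stream node job data key value cluster flink", w, " ")
        for (i = 1; i <= n; i++) {
            len = 1 + int(rand() * 8)      # 1 to 8 words per phrase
            line = ""
            for (j = 1; j <= len; j++) {
                word = w[1 + int(rand() * 10)]
                line = (j == 1) ? word : line " " word
            }
            print line
        }
    }' > "$2"
}
```

For example, generate_phrases 1000 phrases.txt writes 1000 lines to phrases.txt in the current directory.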
Executing the job
To execute the job
org.infinispan.hadoop.flink.sample.WordFrequency, which reads data from the
phrases cache and prints a histogram of the number of words per phrase, run:
docker exec -it master sh -c "/usr/local/flink/bin/flink run /usr/local/sample/target/app.jar ispn-1"
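To sanity-check the job's output, the same kind of histogram can be computed directly from the input file. The sketch below is not the WordFrequency implementation; it assumes that words are separated by whitespace and that empty lines are skipped, which may differ from how the job tokenizes phrases. The function name words_per_phrase_histogram is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for local cross-checking; not part of the sample.
# Prints "<words in phrase> <number of phrases>" per line, sorted by word count.
# Usage: words_per_phrase_histogram <input file>
words_per_phrase_histogram() {
    # NF is the number of whitespace-separated fields (words) on the line.
    awk 'NF > 0 { counts[NF]++ }
         END { for (n in counts) print n, counts[n] }' "$1" | sort -n
}
```

For instance, after copying the file out of the container with docker exec master cat /file.txt > phrases.txt, calling words_per_phrase_histogram phrases.txt prints one line per distinct word count.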
Changing the job
The master Docker container automatically maps the current folder to
/usr/local/sample/ inside the container. To change the job, rebuild the uber jar and re-run the job to pick up the changes.
To remove all the docker containers created in this sample:
docker-compose stop
docker network rm sample