Dockerized Hadoop HDFS with YARN, Spark and Zeppelin for use with Docker swarm mode.
The images provide Hadoop 2.9.0, Spark 2.2.1 and Zeppelin 0.7.3.
Please have a look at the provided docker-compose.yml.
The Hadoop home is located at /usr/local/hadoop, so the configuration files are located at /usr/local/hadoop/etc/hadoop.
The default filesystem name is configured as hdfs://namenode.
It is therefore important to set the hostname of the container running the hadoop-hdfs-namenode image to namenode, or to change the property in core-site.xml.
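As a rough illustration of the hostname requirement, a stack file fragment might look like the sketch below. The image reference example/hadoop-hdfs-namenode is a placeholder (only the image name hadoop-hdfs-namenode comes from this document), and the compose file version is an assumption; the provided docker-compose.yml remains the authoritative example.

```yaml
# Minimal sketch, not the project's actual stack file.
version: "3.3"

services:
  # Naming the service "namenode" should make it resolvable under that
  # name on the stack's network.
  namenode:
    # Placeholder image reference; replace it with the real one.
    image: example/hadoop-hdfs-namenode
    # Matches the configured default filesystem hdfs://namenode.
    hostname: namenode
```

If a different hostname is preferred, the alternative is to adjust the default filesystem property in /usr/local/hadoop/etc/hadoop/core-site.xml accordingly.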
The Spark configuration is located at /usr/local/spark/conf.
The Zeppelin configuration is located at /usr/local/zeppelin/conf.
To keep the HDFS data storage persistent, provide a volume for the container path /var/hadoop.
To have persistent notebooks, provide a volume for the container path /usr/local/zeppelin/notebook.
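The two volume mounts could be declared roughly as in the following sketch. The service names, the volume names hdfs-data and zeppelin-notebooks, and both example/... image references are hypothetical placeholders; only the container paths are taken from this document.

```yaml
# Minimal sketch, assuming named volumes; adapt it to the provided docker-compose.yml.
version: "3.3"

services:
  namenode:
    image: example/hadoop-hdfs-namenode   # placeholder image reference
    hostname: namenode
    volumes:
      # Keeps HDFS data across container restarts.
      - hdfs-data:/var/hadoop

  zeppelin:
    image: example/zeppelin               # placeholder image reference
    volumes:
      # Keeps Zeppelin notebooks across container restarts.
      - zeppelin-notebooks:/usr/local/zeppelin/notebook

volumes:
  hdfs-data:
  zeppelin-notebooks:
```

Note that with the default local volume driver, a named volume lives on the node where the task runs, so in swarm mode the service may need to be pinned to a node (or a shared volume driver used) for the data to follow it.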