clusterdock + zeppelin


A script to link a dockerized Zeppelin to a cluster running Cloudera's dockerized CDH (called clusterdock).

Since a standalone Zeppelin ships with its own Apache Spark interpreter, we need to download Spark binaries that match the version running in the cluster. That version is 1.6: the CDH 5.8 release provided in clusterdock ships Spark 1.6.0.
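If you ever need to fetch the matching Spark binaries by hand, the following is a minimal sketch. It assumes the Apache release archive as the download source and the `hadoop2.6` prebuilt package; neither is confirmed by this repository, and the helper name `spark_archive_url` is purely illustrative.

```shell
# Build the download URL for a prebuilt Spark release on the Apache archive.
# Assumption: the hadoop2.6 package is the right build for your cluster.
spark_archive_url() {
  version="$1"
  hadoop_profile="$2"
  printf 'https://archive.apache.org/dist/spark/spark-%s/spark-%s-bin-%s.tgz\n' \
    "$version" "$version" "$hadoop_profile"
}

# Print the URL for Spark 1.6.0.
spark_archive_url 1.6.0 hadoop2.6

# To download and unpack (requires network access):
#   curl -fLO "$(spark_archive_url 1.6.0 hadoop2.6)"
#   tar -xzf spark-1.6.0-bin-hadoop2.6.tgz
```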


We assume that you already have a dockerized cluster running CDH (started via clusterdock's standard scripts).

  • clone this repository on the same host that is running the (two or more) clusterdock containers.
  • make executable (i.e. cd clusterdock-with-zeppelin; chmod +x
  • run ./

If everything goes well, there should be a newly started Zeppelin container (called zepl, unless you edited the script).
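One way to confirm that the container came up is to look for its name in the output of `docker ps`. The sketch below assumes the default container name zepl; the helper `container_listed` is a hypothetical name introduced here, not part of this repository.

```shell
# Succeeds if the exact container name given as $1 appears on stdin,
# one name per line (the format produced by `docker ps --format '{{.Names}}'`).
container_listed() {
  grep -Fxq -- "$1"
}

# Only query Docker when the CLI is available on this host.
if command -v docker >/dev/null 2>&1; then
  if docker ps --format '{{.Names}}' | container_listed zepl; then
    echo "zepl is running"
  else
    echo "zepl is not running"
  fi
fi
```

Matching the whole line (`-x`) avoids false positives from other containers whose names merely contain "zepl".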

The only remaining step is to open the Zeppelin interface (go to your-docker-host:8080), click the top-right menu (where it says "anonymous"), click Interpreter, and scroll all the way down to the paragraph that says:

spark %spark , %spark.sql , %spark.dep , %spark.pyspark , %spark.r

Click Edit in this paragraph, enter yarn-client in the box next to 'master', and add the following artifact in the Dependencies section: org.apache.spark:spark-streaming-kafka_2.10:1.6.0
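The dependency string is a Maven coordinate of the form group:artifact_scalaVersion:version, where the last field should match the cluster's Spark version (1.6.0) and the Scala suffix should match the Scala version of the Spark build (2.10 for the default Spark 1.6 builds). As a small sketch, with `spark_module_coordinate` being an illustrative helper name, the coordinate can be composed like this if you later need a different module or version:

```shell
# Compose a Maven coordinate for a Scala-versioned Spark module.
# Arguments: module name, Scala binary version, Spark version.
spark_module_coordinate() {
  module="$1"
  scala_version="$2"
  spark_version="$3"
  printf 'org.apache.spark:%s_%s:%s\n' "$module" "$scala_version" "$spark_version"
}

spark_module_coordinate spark-streaming-kafka 2.10 1.6.0
# prints org.apache.spark:spark-streaming-kafka_2.10:1.6.0
```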