About Apache Zeppelin

Juan Rodriguez Hortala edited this page May 6, 2015 · 3 revisions

Install Zeppelin

These steps follow the official installation instructions at https://zeppelin.incubator.apache.org/docs/install/install.html

git clone https://juanrh@github.com/apache/incubator-zeppelin.git
cd incubator-zeppelin/

Edit zeppelin-server/pom.xml, adding the following dependency to the project zeppelin-server:

<dependency>
    <groupId>asm</groupId>
    <artifactId>asm</artifactId>
    <version>3.3.1</version>
</dependency>

Compile

# To build against specific Spark and Hadoop versions, replace them as needed:
# mvn clean install -DskipTests -Dspark.version=1.3.1 -Dhadoop.version=2.4.0
# For local mode:
mvn clean install -DskipTests

The expected result is a successful Maven build, with the file zeppelin-server/target/lib/asm-3.3.1.jar present at that path.
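
A quick way to confirm the jar landed where expected is a small shell check. This is only a sketch: the function name check_asm_jar is made up for illustration, and it assumes the path given above relative to the root of the incubator-zeppelin checkout.

```shell
# Hypothetical helper (not part of Zeppelin): report whether the ASM jar
# was copied into the server's lib directory by the Maven build.
check_asm_jar() {
    # $1 is the root of the incubator-zeppelin checkout; defaults to the current directory.
    jar="${1:-.}/zeppelin-server/target/lib/asm-3.3.1.jar"
    if [ -f "$jar" ]; then
        echo "found: $jar"
    else
        echo "missing: $jar"
        return 1
    fi
}
```

From the checkout root, calling check_asm_jar with no arguments prints either a "found" or a "missing" line and returns a non-zero status in the second case.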

Now we use a minimal configuration (see the official installation instructions for more details). With it we can start and stop Zeppelin, which should be running at http://localhost:8080 and uses port 8081 for internal communication.

cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml
bin/zeppelin-daemon.sh start
bin/zeppelin-daemon.sh stop
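
While the daemon is running (that is, between the start and stop commands), one way to check that the web UI answers is to probe it with curl. The sketch below assumes curl is installed and that the UI is served at the URL mentioned above. The function name zeppelin_status is hypothetical.

```shell
# Hypothetical helper (not part of Zeppelin): print "up" if the given URL
# answers with a successful HTTP status within 5 seconds, "down" otherwise.
zeppelin_status() {
    # $1 is the URL to probe; defaults to the address used in this page.
    url="${1:-http://localhost:8080}"
    if curl -s -f -o /dev/null --max-time 5 "$url"; then
        echo "up"
    else
        echo "down"
    fi
}
```

Calling zeppelin_status right after the start command may still print "down" for a few seconds, since the server needs some time to come up.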

See a first minimal notebook below

Running Zeppelin in EMR

TODO: On the mailing list it was suggested that I use mvn clean install -DskipTests -Pspark-1.3 -Dspark.version=1.3.1 -Phadoop-2.4 -Dhadoop.version=2.4.0. It seems recompilation is required anyway.

A first sample notebook

First paragraph

%spark 
val xs = 2 + 3
xs

Second paragraph

%md this is **text**

Third paragraph

%spark 
val xss = sc.parallelize(1 to 1000)
xss.map(x => (x % 7, x)).reduceByKey(_ + _).collect()
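
To sanity-check the output of the third paragraph without Spark, the same aggregation (sum the integers 1 to 1000, grouped by their remainder modulo 7) can be reproduced in the shell. This is only a cross-check sketch using seq and awk. The function name sums_by_mod7 is made up for illustration.

```shell
# Hypothetical cross-check (not part of the notebook): sum the integers 1..1000
# grouped by remainder modulo 7, mirroring map(x => (x % 7, x)) followed by
# reduceByKey(_ + _). Prints one "key sum" line per key, in key order.
sums_by_mod7() {
    seq 1 1000 | awk '{ sum[$1 % 7] += $1 } END { for (k = 0; k < 7; k++) print k, sum[k] }'
}
```

The first line printed should be "0 71071" and the last "6 71929". Spark may return the same pairs from collect in a different order, since the ordering of keys after reduceByKey is not guaranteed.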