[ZEPPELIN-1332] Remove spark-dependencies & suggest new way #1339

Closed · wants to merge 43 commits
d377cc6
Fix typo comment in interpreter.sh
AhyoungRyu Aug 16, 2016
4f3edfd
Remove spark-dependencies
AhyoungRyu Aug 17, 2016
99ef019
Add spark-2.*-bin-hadoop* to .gitignore
AhyoungRyu Aug 17, 2016
4e8d5ff
Add download-spark.sh file
AhyoungRyu Aug 17, 2016
6784015
Remove useless comment line in common.sh
AhyoungRyu Aug 17, 2016
c866f0b
Remove zeppelin-spark-dependencies from r/pom.xml
AhyoungRyu Aug 18, 2016
3fe19bf
Change SPARK_HOME with proper message
AhyoungRyu Aug 21, 2016
9954523
Check interpreter/spark/ instead of SPARK_HOME
AhyoungRyu Sep 6, 2016
e6973b3
Refactor download-spark.sh
AhyoungRyu Sep 6, 2016
552185a
Revert: remove spark-dependencies
AhyoungRyu Sep 7, 2016
ffe64d9
Remove useless ZEPPELIN_HOME
AhyoungRyu Sep 7, 2016
5ed3311
Change dir of Spark bin to 'local-spark'
AhyoungRyu Sep 8, 2016
1419f0b
Set timeout for travis test
AhyoungRyu Sep 8, 2016
a813d92
Add license header to download-spark.cmd
AhyoungRyu Sep 8, 2016
368c15a
Fix wrong check condition in common.sh
AhyoungRyu Sep 8, 2016
e58075d
Add travis condition to download-spark.sh
AhyoungRyu Sep 8, 2016
89be91b
Remove bin/download-spark.cmd again
AhyoungRyu Sep 12, 2016
b22364d
Remove spark-dependency profiles & reorganize some titles in README.md
AhyoungRyu Sep 12, 2016
24dc95f
Update spark.md to add a guide for local-spark mode
AhyoungRyu Sep 12, 2016
2537fa1
Remove '-Ppyspark' build options
AhyoungRyu Sep 12, 2016
ca534e5
Remove useless creating .bak file process
AhyoungRyu Sep 13, 2016
edd525d
Update install.md & spark.md
AhyoungRyu Sep 13, 2016
a9b110a
Resolve 'sed' command issue between OSX & Linux
AhyoungRyu Sep 14, 2016
f383d3a
Trap ctrl+c during downloading Spark
AhyoungRyu Sep 14, 2016
527ef5b
Remove useless condition
AhyoungRyu Sep 14, 2016
555372a
Make local spark mode with zero-configuration as @moon suggested
AhyoungRyu Sep 20, 2016
de87cb2
Modify SparkRInterpreter.java to enable SparkR without SPARK_HOME
AhyoungRyu Sep 22, 2016
1dd51d8
Remove duplicated variable declaration
AhyoungRyu Sep 22, 2016
f068bef
Update related docs again
AhyoungRyu Sep 22, 2016
437f206
Fix typo in SparkRInterpreter.java
AhyoungRyu Sep 23, 2016
6caef52
Fix rebasing mistake
AhyoungRyu Oct 17, 2016
d8e3aba
Add get-spark option instead of getting user's answer
AhyoungRyu Oct 24, 2016
e97d6bc
Check the existence of local-spark
AhyoungRyu Oct 24, 2016
4240e75
Update Spark version to 2.0.1
AhyoungRyu Oct 31, 2016
3f4bea8
Fix travis CI failure as @astroshim suggested
AhyoungRyu Nov 8, 2016
8df7e24
Update related docs pages
AhyoungRyu Nov 8, 2016
5680fed
Fix typos
AhyoungRyu Nov 8, 2016
3d96bf8
Address @tae-jun's feedback
AhyoungRyu Nov 18, 2016
80b8b99
Update newly added build.md page accordingly
AhyoungRyu Nov 18, 2016
2747d9e
Print notice msg when Zeppelin server start
AhyoungRyu Nov 18, 2016
eb6fa6a
Update vagrant/ related files accordingly
AhyoungRyu Nov 20, 2016
2c1fe15
Address @bzz feedback: update migration notice w/ stronger msg
AhyoungRyu Nov 24, 2016
a651f48
Remove unused variables in download-spark.sh
AhyoungRyu Nov 24, 2016
1 change: 1 addition & 0 deletions .gitignore
@@ -14,6 +14,7 @@
spark/derby.log
spark/metastore_db
spark-1.*-bin-hadoop*
spark-2.*-bin-hadoop*
.spark-dist
zeppelin-server/derby.log

14 changes: 7 additions & 7 deletions .travis.yml
@@ -40,27 +40,27 @@ matrix:

# Test all modules with spark 2.0.0 and scala 2.11
- jdk: "oraclejdk7"
env: SCALA_VER="2.11" SPARK_VER="2.0.0" HADOOP_VER="2.3" PROFILE="-Pspark-2.0 -Phadoop-2.3 -Ppyspark -Psparkr -Pscalding -Pexamples -Pscala-2.11" BUILD_FLAG="package -Pbuild-distr -DskipRat" TEST_FLAG="verify -Pusing-packaged-distr -DskipRat" TEST_PROJECTS=""
env: SCALA_VER="2.11" SPARK_VER="2.0.0" HADOOP_VER="2.3" PROFILE="-Pspark-2.0 -Phadoop-2.3 -Psparkr -Pscalding -Pexamples -Pscala-2.11" BUILD_FLAG="package -Pbuild-distr -DskipRat" TEST_FLAG="verify -Pusing-packaged-distr -DskipRat" TEST_PROJECTS=""

# Test all modules with scala 2.10
- jdk: "oraclejdk7"
env: SCALA_VER="2.10" SPARK_VER="1.6.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Pr -Phadoop-2.3 -Ppyspark -Psparkr -Pscalding -Pbeam -Pexamples -Pscala-2.10" BUILD_FLAG="package -Pbuild-distr -DskipRat" TEST_FLAG="verify -Pusing-packaged-distr -DskipRat" TEST_PROJECTS=""
env: SCALA_VER="2.10" SPARK_VER="1.6.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Pr -Phadoop-2.3 -Psparkr -Pscalding -Pbeam -Pexamples -Pscala-2.10" BUILD_FLAG="package -Pbuild-distr -DskipRat" TEST_FLAG="verify -Pusing-packaged-distr -DskipRat" TEST_PROJECTS=""

# Test all modules with scala 2.11
- jdk: "oraclejdk7"
env: SCALA_VER="2.11" SPARK_VER="1.6.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Pr -Phadoop-2.3 -Ppyspark -Psparkr -Pscalding -Pexamples -Pscala-2.11" BUILD_FLAG="package -Pbuild-distr -DskipRat" TEST_FLAG="verify -Pusing-packaged-distr -DskipRat" TEST_PROJECTS=""
env: SCALA_VER="2.11" SPARK_VER="1.6.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Pr -Phadoop-2.3 -Psparkr -Pscalding -Pexamples -Pscala-2.11" BUILD_FLAG="package -Pbuild-distr -DskipRat" TEST_FLAG="verify -Pusing-packaged-distr -DskipRat" TEST_PROJECTS=""

# Test spark module for 1.5.2
- jdk: "oraclejdk7"
env: SCALA_VER="2.10" SPARK_VER="1.5.2" HADOOP_VER="2.3" PROFILE="-Pspark-1.5 -Pr -Phadoop-2.3 -Ppyspark -Psparkr" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark,r -Dtest=org.apache.zeppelin.rest.*Test,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
env: SCALA_VER="2.10" SPARK_VER="1.5.2" HADOOP_VER="2.3" PROFILE="-Pspark-1.5 -Pr -Phadoop-2.3 -Psparkr" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark,r -Dtest=org.apache.zeppelin.rest.*Test,org.apache.zeppelin.spark.* -DfailIfNoTests=false"

# Test spark module for 1.4.1
- jdk: "oraclejdk7"
env: SCALA_VER="2.10" SPARK_VER="1.4.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.4 -Pr -Phadoop-2.3 -Ppyspark -Psparkr" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark,r -Dtest=org.apache.zeppelin.rest.*Test,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
env: SCALA_VER="2.10" SPARK_VER="1.4.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.4 -Pr -Phadoop-2.3 -Psparkr" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark,r -Dtest=org.apache.zeppelin.rest.*Test,org.apache.zeppelin.spark.* -DfailIfNoTests=false"

# Test selenium with spark module for 1.6.1
- jdk: "oraclejdk7"
env: TEST_SELENIUM="true" SCALA_VER="2.10" SPARK_VER="1.6.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Phadoop-2.3 -Ppyspark -Pexamples" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark -Dtest=org.apache.zeppelin.AbstractFunctionalSuite -DfailIfNoTests=false"
env: TEST_SELENIUM="true" SCALA_VER="2.10" SPARK_VER="1.6.1" HADOOP_VER="2.3" PROFILE="-Pspark-1.6 -Phadoop-2.3 -Pexamples" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark -Dtest=org.apache.zeppelin.AbstractFunctionalSuite -DfailIfNoTests=false"

before_install:
- "ls -la .spark-dist ${HOME}/.m2/repository/.cache/maven-download-plugin"
@@ -77,6 +77,7 @@ install:

before_script:
- travis_retry ./testing/downloadSpark.sh $SPARK_VER $HADOOP_VER
- export SPARK_HOME=`pwd`/spark-$SPARK_VER-bin-hadoop$HADOOP_VER
- echo "export SPARK_HOME=`pwd`/spark-$SPARK_VER-bin-hadoop$HADOOP_VER" > conf/zeppelin-env.sh
- tail conf/zeppelin-env.sh

@@ -95,4 +96,3 @@ after_failure:
- cat zeppelin-distribution/target/zeppelin-*-SNAPSHOT/zeppelin-*-SNAPSHOT/logs/zeppelin*.out
- cat zeppelin-web/npm-debug.log
- cat spark-*/logs/*

6 changes: 2 additions & 4 deletions README.md
@@ -21,9 +21,7 @@ To know more about Zeppelin, visit our web site [http://zeppelin.apache.org](htt
## Getting Started

### Install binary package
Please go to [install](http://zeppelin.apache.org/docs/snapshot/install/install.html) to install Apache Zeppelin from binary package.
Please refer to [Zeppelin installation guide](http://zeppelin.apache.org/docs/snapshot/install/install.html) to install Apache Zeppelin from binary package.

### Build from source
Please check [Build from source](http://zeppelin.apache.org/docs/snapshot/install/build.html) to build Zeppelin from source.


Please check [How to build Zeppelin from source](http://zeppelin.apache.org/docs/snapshot/install/build.html) to build Zeppelin.
40 changes: 27 additions & 13 deletions bin/common.sh
@@ -58,31 +58,31 @@ fi

ZEPPELIN_CLASSPATH+=":${ZEPPELIN_CONF_DIR}"

function addEachJarInDir(){
function addEachJarInDir() {
if [[ -d "${1}" ]]; then
for jar in $(find -L "${1}" -maxdepth 1 -name '*jar'); do
ZEPPELIN_CLASSPATH="$jar:$ZEPPELIN_CLASSPATH"
done
fi
}

function addEachJarInDirRecursive(){
function addEachJarInDirRecursive() {
if [[ -d "${1}" ]]; then
for jar in $(find -L "${1}" -type f -name '*jar'); do
ZEPPELIN_CLASSPATH="$jar:$ZEPPELIN_CLASSPATH"
done
fi
}

function addEachJarInDirRecursiveForIntp(){
function addEachJarInDirRecursiveForIntp() {
if [[ -d "${1}" ]]; then
for jar in $(find -L "${1}" -type f -name '*jar'); do
ZEPPELIN_INTP_CLASSPATH="$jar:$ZEPPELIN_INTP_CLASSPATH"
done
fi
}

function addJarInDir(){
function addJarInDir() {
if [[ -d "${1}" ]]; then
ZEPPELIN_CLASSPATH="${1}/*:${ZEPPELIN_CLASSPATH}"
fi
@@ -96,17 +96,31 @@ function addJarInDirForIntp() {

ZEPPELIN_COMMANDLINE_MAIN=org.apache.zeppelin.utils.CommandLineUtils

function getZeppelinVersion(){
if [[ -d "${ZEPPELIN_HOME}/zeppelin-server/target/classes" ]]; then
ZEPPELIN_CLASSPATH+=":${ZEPPELIN_HOME}/zeppelin-server/target/classes"
fi
addJarInDir "${ZEPPELIN_HOME}/zeppelin-server/target/lib"
CLASSPATH+=":${ZEPPELIN_CLASSPATH}"
$ZEPPELIN_RUNNER -cp $CLASSPATH $ZEPPELIN_COMMANDLINE_MAIN -v
exit 0
function getZeppelinVersion() {
if [[ -d "${ZEPPELIN_HOME}/zeppelin-server/target/classes" ]]; then
ZEPPELIN_CLASSPATH+=":${ZEPPELIN_HOME}/zeppelin-server/target/classes"
fi
addJarInDir "${ZEPPELIN_HOME}/zeppelin-server/target/lib"
CLASSPATH+=":${ZEPPELIN_CLASSPATH}"
$ZEPPELIN_RUNNER -cp $CLASSPATH $ZEPPELIN_COMMANDLINE_MAIN -v
exit 0
}


SPARK_VERSION="2.0.1"
HADOOP_VERSION="2.7"
SPARK_CACHE="local-spark"
SPARK_ARCHIVE="spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}"

function downloadSparkBinary() {
if [[ ! -d "${SPARK_CACHE}/${SPARK_ARCHIVE}" ]]; then
. "${ZEPPELIN_HOME}/bin/download-spark.sh"
else
echo -e "${SPARK_ARCHIVE} already exists under local-spark."
fi
}

# Text encoding for
# read/write job into files,
# receiving/displaying query/result.
if [[ -z "${ZEPPELIN_ENCODING}" ]]; then
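
Because downloadSparkBinary only acts when the versioned cache directory is missing, the download runs at most once per Spark version. A sketch of the expected flow with the get-spark command added below (paths as defined above, output abbreviated):

    # First run: local-spark/spark-2.0.1-bin-hadoop2.7 does not exist yet,
    # so bin/download-spark.sh is sourced and the archive is fetched.
    ./bin/zeppelin-daemon.sh get-spark

    # Second run: the directory already exists, so only the notice is printed.
    ./bin/zeppelin-daemon.sh get-spark
    # "spark-2.0.1-bin-hadoop2.7 already exists under local-spark."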
76 changes: 76 additions & 0 deletions bin/download-spark.sh
@@ -0,0 +1,76 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

bin=$(dirname "${BASH_SOURCE-$0}")
bin=$(cd "${bin}">/dev/null; pwd)

. "${bin}/common.sh"

# Download Spark binary package from the given URL.
# Tries 3 times with a 1s delay
# Arguments: url - source URL
function download_with_retry() {
local url="$1"
curl -O --retry 3 --retry-delay 1 "${url}"

if [[ "$?" -ne 0 ]]; then
echo -e "\nStop downloading with unexpected error."
fi
}

function unzip_spark_bin() {
if ! tar zxf "${SPARK_ARCHIVE}.tgz" ; then
echo "Unable to extract ${SPARK_ARCHIVE}.tgz" >&2
rm -rf "${SPARK_ARCHIVE}"
fi

rm -f "${SPARK_ARCHIVE}.tgz"
echo -e "\n${SPARK_ARCHIVE} is successfully downloaded and saved under ${ZEPPELIN_HOME}/${SPARK_CACHE}\n"
}

function create_local_spark_dir() {
if [[ ! -d "${SPARK_CACHE}" ]]; then
mkdir -p "${SPARK_CACHE}"
fi
}

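# Removes a partially downloaded local-spark directory; invoked from the
# interrupt trap registered in save_local_spark below.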
function check_local_spark_dir() {
if [[ -d "${ZEPPELIN_HOME}/${SPARK_CACHE}" ]]; then
rm -r "${ZEPPELIN_HOME}/${SPARK_CACHE}"
fi
}

function save_local_spark() {
# echo -e "For using Spark interpreter in local mode(without external Spark installation), Spark binary needs to be downloaded."
trap "echo -e '\n\nForced termination by user.'; check_local_spark_dir; exit 1" SIGTERM SIGINT SIGQUIT
create_local_spark_dir
cd "${SPARK_CACHE}"

printf "Download ${SPARK_ARCHIVE}.tgz from mirror ...\n\n"

MIRROR_INFO=$(curl -s "http://www.apache.org/dyn/closer.cgi/spark/spark-${SPARK_VERSION}/${SPARK_ARCHIVE}.tgz?asjson=1")
PREFERRED=$(echo "${MIRROR_INFO}" | grep preferred | sed 's/[^"]*.preferred.: .\([^"]*\).*/\1/g')
PATHINFO=$(echo "${MIRROR_INFO}" | grep path_info | sed 's/[^"]*.path_info.: .\([^"]*\).*/\1/g')

download_with_retry "${PREFERRED}${PATHINFO}"
unzip_spark_bin
}

save_local_spark

set +xe
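
The mirror lookup above pulls two fields out of Apache's closer.cgi JSON answer with grep and sed. An illustrative response (abbreviated; the mirror host is hypothetical) and the URL the script then hands to download_with_retry:

    # curl -s "http://www.apache.org/dyn/closer.cgi/spark/spark-2.0.1/spark-2.0.1-bin-hadoop2.7.tgz?asjson=1"
    # returns JSON along the lines of:
    #   "preferred": "http://mirror.example.org/apache/",
    #   "path_info": "spark/spark-2.0.1/spark-2.0.1-bin-hadoop2.7.tgz"
    #
    # download_with_retry receives the concatenation of the two fields:
    #   http://mirror.example.org/apache/spark/spark-2.0.1/spark-2.0.1-bin-hadoop2.7.tgz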
32 changes: 16 additions & 16 deletions bin/interpreter.sh
@@ -20,7 +20,7 @@ bin=$(dirname "${BASH_SOURCE-$0}")
bin=$(cd "${bin}">/dev/null; pwd)

function usage() {
echo "usage) $0 -p <port> -d <interpreter dir to load> -l <local interpreter repo dir to load>"
echo "usage) $0 -p <port> -d <interpreter dir to load> -l <local interpreter repo dir to load>"
}

while getopts "hp:d:l:v:u:" o; do
@@ -55,8 +55,8 @@ done


if [ -z "${PORT}" ] || [ -z "${INTERPRETER_DIR}" ]; then
usage
exit 1
fi

. "${bin}/common.sh"
@@ -98,17 +98,18 @@ fi

# set spark related env variables
if [[ "${INTERPRETER_ID}" == "spark" ]]; then
SPARK_APP_JAR="$(ls ${ZEPPELIN_HOME}/interpreter/spark/zeppelin-spark*.jar)"
# This eventually passes SPARK_APP_JAR to the classpath of SparkIMain
ZEPPELIN_INTP_CLASSPATH+=":${SPARK_APP_JAR}"

if [[ -n "${SPARK_HOME}" ]]; then
export SPARK_SUBMIT="${SPARK_HOME}/bin/spark-submit"
SPARK_APP_JAR="$(ls ${ZEPPELIN_HOME}/interpreter/spark/zeppelin-spark*.jar)"
# This eventually passes SPARK_APP_JAR to the classpath of SparkIMain
ZEPPELIN_INTP_CLASSPATH+=":${SPARK_APP_JAR}"

pattern="$SPARK_HOME/python/lib/py4j-*-src.zip"
pattern="${SPARK_HOME}/python/lib/py4j-*-src.zip"
py4j=($pattern)
# pick the first match py4j zip - there should only be one
export PYTHONPATH="$SPARK_HOME/python/:$PYTHONPATH"
export PYTHONPATH="${py4j[0]}:$PYTHONPATH"
export PYTHONPATH="${SPARK_HOME}/python/:${PYTHONPATH}"
export PYTHONPATH="${py4j[0]}:${PYTHONPATH}"
else
# add Hadoop jars into classpath
if [[ -n "${HADOOP_HOME}" ]]; then
@@ -120,12 +121,13 @@ if [[ "${INTERPRETER_ID}" == "spark" ]]; then
addJarInDirForIntp "${HADOOP_HOME}/lib"
fi

addJarInDirForIntp "${INTERPRETER_DIR}/dep"
# If SPARK_HOME is not set on the system, Zeppelin uses the local Spark binary for the Spark interpreter
export SPARK_SUBMIT="${ZEPPELIN_HOME}/${SPARK_CACHE}/${SPARK_ARCHIVE}/bin/spark-submit"

pattern="${ZEPPELIN_HOME}/interpreter/spark/pyspark/py4j-*-src.zip"
pattern="${ZEPPELIN_HOME}/${SPARK_CACHE}/${SPARK_ARCHIVE}/python/lib/py4j-*-src.zip"
py4j=($pattern)
# pick the first match py4j zip - there should only be one
PYSPARKPATH="${ZEPPELIN_HOME}/interpreter/spark/pyspark/pyspark.zip:${py4j[0]}"
PYSPARKPATH="${ZEPPELIN_HOME}/${SPARK_CACHE}/${SPARK_ARCHIVE}/python/lib/pyspark.zip:${py4j[0]}"

if [[ -z "${PYTHONPATH}" ]]; then
export PYTHONPATH="${PYSPARKPATH}"
@@ -146,8 +148,6 @@ if [[ "${INTERPRETER_ID}" == "spark" ]]; then
if [[ -n "${HADOOP_CONF_DIR}" ]] && [[ -d "${HADOOP_CONF_DIR}" ]]; then
ZEPPELIN_INTP_CLASSPATH+=":${HADOOP_CONF_DIR}"
fi

export SPARK_CLASSPATH+=":${ZEPPELIN_INTP_CLASSPATH}"
fi
elif [[ "${INTERPRETER_ID}" == "hbase" ]]; then
if [[ -n "${HBASE_CONF_DIR}" ]]; then
@@ -186,9 +186,9 @@ addJarInDirForIntp "${LOCAL_INTERPRETER_REPO}"
CLASSPATH+=":${ZEPPELIN_INTP_CLASSPATH}"

if [[ -n "${SPARK_SUBMIT}" ]]; then
${ZEPPELIN_IMPERSONATE_RUN_CMD} `${SPARK_SUBMIT} --class ${ZEPPELIN_SERVER} --driver-class-path "${ZEPPELIN_INTP_CLASSPATH_OVERRIDES}:${CLASSPATH}" --driver-java-options "${JAVA_INTP_OPTS}" ${SPARK_SUBMIT_OPTIONS} ${SPARK_APP_JAR} ${PORT} &`
else
${ZEPPELIN_IMPERSONATE_RUN_CMD} ${ZEPPELIN_RUNNER} ${JAVA_INTP_OPTS} ${ZEPPELIN_INTP_MEM} -cp ${ZEPPELIN_INTP_CLASSPATH_OVERRIDES}:${CLASSPATH} ${ZEPPELIN_SERVER} ${PORT} &
fi

pid=$!
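
The net effect of these interpreter.sh changes is a two-way fallback for SPARK_SUBMIT. A condensed sketch of the resolution order (the real script additionally sets PYTHONPATH and Hadoop jars per branch):

    if [[ -n "${SPARK_HOME}" ]]; then
      # a user-provided Spark installation always wins
      export SPARK_SUBMIT="${SPARK_HOME}/bin/spark-submit"
    else
      # otherwise fall back to the embedded binary fetched by get-spark
      export SPARK_SUBMIT="${ZEPPELIN_HOME}/${SPARK_CACHE}/${SPARK_ARCHIVE}/bin/spark-submit"
    fi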
9 changes: 8 additions & 1 deletion bin/zeppelin-daemon.sh
@@ -20,7 +20,7 @@
#

USAGE="-e Usage: zeppelin-daemon.sh\n\t
[--config <conf-dir>] {start|stop|upstart|restart|reload|status}\n\t
[--config <conf-dir>] {start|stop|upstart|restart|reload|status|get-spark}\n\t
[--version | -v]"

if [[ "$1" == "--config" ]]; then
@@ -179,6 +179,10 @@ function start() {

initialize_default_directories

if [[ ! -d "${SPARK_CACHE}/${SPARK_ARCHIVE}" && -z "${SPARK_HOME}" ]]; then
echo -e "\nYou do not have neither local-spark, nor external SPARK_HOME set up.\nIf you want to use Spark interpreter, you need to run get-spark at least one time or set SPARK_HOME.\n"
fi

echo "ZEPPELIN_CLASSPATH: ${ZEPPELIN_CLASSPATH_OVERRIDES}:${CLASSPATH}" >> "${ZEPPELIN_OUTFILE}"

nohup nice -n $ZEPPELIN_NICENESS $ZEPPELIN_RUNNER $JAVA_OPTS -cp $ZEPPELIN_CLASSPATH_OVERRIDES:$CLASSPATH $ZEPPELIN_MAIN >> "${ZEPPELIN_OUTFILE}" 2>&1 < /dev/null &
@@ -265,6 +269,9 @@ case "${1}" in
status)
find_zeppelin_process
;;
get-spark)
downloadSparkBinary
;;
-v | --version)
getZeppelinVersion
;;
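
With the start-time check and the new case branch in place, a first start without any Spark configured now warns instead of failing silently. A sketch of the expected interaction (messages abbreviated):

    ./bin/zeppelin-daemon.sh start
    # -> "You have neither local-spark nor an external SPARK_HOME set up. ..."

    # one-time fix for the embedded mode; alternatively set SPARK_HOME
    ./bin/zeppelin-daemon.sh get-spark
    ./bin/zeppelin-daemon.sh restart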
8 changes: 8 additions & 0 deletions bin/zeppelin.sh
@@ -43,6 +43,10 @@ if [ "$1" == "--version" ] || [ "$1" == "-v" ]; then
getZeppelinVersion
fi

if [ "$1" == "get-spark" ]; then
downloadSparkBinary
fi

HOSTNAME=$(hostname)
ZEPPELIN_LOGFILE="${ZEPPELIN_LOG_DIR}/zeppelin-${ZEPPELIN_IDENT_STRING}-${HOSTNAME}.log"
LOG="${ZEPPELIN_LOG_DIR}/zeppelin-cli-${ZEPPELIN_IDENT_STRING}-${HOSTNAME}.out"
@@ -87,4 +91,8 @@ if [[ ! -d "${ZEPPELIN_NOTEBOOK_DIR}" ]]; then
$(mkdir -p "${ZEPPELIN_NOTEBOOK_DIR}")
fi

if [[ ! -d "${SPARK_CACHE}/${SPARK_ARCHIVE}" && -z "${SPARK_HOME}" ]]; then
echo -e "\nYou do not have neither local-spark, nor external SPARK_HOME set up.\nIf you want to use Spark interpreter, you need to run get-spark at least one time or set SPARK_HOME.\n"
fi

exec $ZEPPELIN_RUNNER $JAVA_OPTS -cp $ZEPPELIN_CLASSPATH_OVERRIDES:$CLASSPATH $ZEPPELIN_SERVER "$@"
11 changes: 6 additions & 5 deletions conf/zeppelin-env.sh.template
@@ -41,16 +41,17 @@

#### Spark interpreter configuration ####

## Use provided spark installation ##
## defining SPARK_HOME makes Zeppelin run spark interpreter process using spark-submit
## Use provided Spark installation ##
## defining SPARK_HOME makes Zeppelin run the Spark interpreter process using spark-submit
##
# export SPARK_HOME # (required) When it is defined, load it instead of Zeppelin embedded Spark libraries
# export SPARK_SUBMIT_OPTIONS # (optional) extra options to pass to spark submit. eg) "--driver-memory 512M --executor-memory 1G".
# export SPARK_APP_NAME # (optional) The name of spark application.

## Use embedded spark binaries ##
## without SPARK_HOME defined, Zeppelin is still able to run the spark interpreter process using embedded spark binaries.
## however, it is not encouraged when you can define SPARK_HOME
## Use embedded Spark binaries ##
## You can simply get the embedded Spark binaries by running "ZEPPELIN_HOME/bin/zeppelin-daemon.sh get-spark" or "ZEPPELIN_HOME/bin/zeppelin.sh get-spark".
## Zeppelin can run the Spark interpreter process using this embedded Spark without any configuration.
## If you define SPARK_HOME instead, you don't need to download the embedded Spark.
##
# Options read in YARN client mode
# export HADOOP_CONF_DIR # yarn-site.xml is located in configuration directory in HADOOP_CONF_DIR.
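
In the provided-installation mode the template reduces to a couple of exports in conf/zeppelin-env.sh. A minimal example (the install path is illustrative; the SPARK_SUBMIT_OPTIONS value is the one suggested in the template itself):

    # conf/zeppelin-env.sh
    export SPARK_HOME="/opt/spark-2.0.1-bin-hadoop2.7"
    # optional tuning passed through to spark-submit
    export SPARK_SUBMIT_OPTIONS="--driver-memory 512M --executor-memory 1G"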
4 changes: 2 additions & 2 deletions dev/create_release.sh
@@ -103,8 +103,8 @@ function make_binary_release() {

git_clone
make_source_package
make_binary_release all "-Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11"
make_binary_release netinst "-Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11 -pl !alluxio,!angular,!cassandra,!elasticsearch,!file,!flink,!hbase,!ignite,!jdbc,!kylin,!lens,!livy,!markdown,!postgresql,!python,!shell,!bigquery"
make_binary_release all "-Pspark-2.0 -Phadoop-2.4 -Pyarn -Psparkr -Pr -Pscala-2.11"
make_binary_release netinst "-Pspark-2.0 -Phadoop-2.4 -Pyarn -Psparkr -Pr -Pscala-2.11 -pl !alluxio,!angular,!cassandra,!elasticsearch,!file,!flink,!hbase,!ignite,!jdbc,!kylin,!lens,!livy,!markdown,!postgresql,!python,!shell,!bigquery"

# remove non release files and dirs
rm -rf "${WORKING_DIR}/zeppelin"
2 changes: 1 addition & 1 deletion dev/publish_release.sh
@@ -44,7 +44,7 @@ NC='\033[0m' # No Color
RELEASE_VERSION="$1"
GIT_TAG="$2"

PUBLISH_PROFILES="-Ppublish-distr -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr"
PUBLISH_PROFILES="-Ppublish-distr -Pspark-2.0 -Phadoop-2.4 -Pyarn -Psparkr -Pr"
PROJECT_OPTIONS="-pl !zeppelin-distribution"
NEXUS_STAGING="https://repository.apache.org/service/local/staging"
NEXUS_PROFILE="153446d1ac37c4"
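
Since -Ppyspark is gone from both release scripts, an equivalent local build needs no pyspark profile either. A sketch of a matching source-build command, assuming Maven and the same profile set used by create_release.sh above:

    mvn clean package -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pyarn -Psparkr -Pr -Pscala-2.11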