[ZEPPELIN-4154] Build docker image for each interpreter #3769
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
FROM maven:3.5-jdk-8 as builder | ||
ADD . /workspace/zeppelin | ||
WORKDIR /workspace/zeppelin | ||
# Allow npm and bower to run with root privileges | ||
RUN echo "unsafe-perm=true" > ~/.npmrc && \ | ||
echo '{ "allow_root": true }' > ~/.bowerrc && \ | ||
mvn -B package -DskipTests -Pbuild-distr -Pspark-3.0 -Pinclude-hadoop -Phadoop3 -Pspark-scala-2.12 -Pweb-angular && \ | ||
# Example that doesn't compile all interpreters | ||
# mvn -B package -DskipTests -Pweb-angular -Pscala-2.11 -Pbuild-distr -pl '!groovy,!submarine,!livy,!hbase,!pig,!file,!flink,!ignite,!kylin,!lens' && \ | ||
mv /workspace/zeppelin/zeppelin-distribution/target/zeppelin-*/zeppelin-* /opt/zeppelin/ && \ | ||
# Removing stuff saves time, because docker creates a temporary layer | ||
rm -rf ~/.m2 && \ | ||
rm -rf /workspace/zeppelin/* | ||
|
||
FROM ubuntu:18.04 | ||
COPY --from=builder /opt/zeppelin /opt/zeppelin |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,7 +29,7 @@ data: | |
# If you have your ingress controller configured to connect to `zeppelin-server` service and have a domain name for it (with wildcard subdomain point the same address), you can replace serviceDomain field with your own domain. | ||
SERVICE_DOMAIN: local.zeppelin-project.org:8080 | ||
ZEPPELIN_K8S_SPARK_CONTAINER_IMAGE: spark:2.4.5 | ||
-ZEPPELIN_K8S_CONTAINER_IMAGE: apache/zeppelin:0.9.0-SNAPSHOT | ||
+ZEPPELIN_K8S_CONTAINER_IMAGE: apache/zeppelin-interpreter:0.9.0-SNAPSHOT | ||
ZEPPELIN_HOME: /zeppelin | ||
ZEPPELIN_SERVER_RPC_PORTRANGE: 12320:12320 | ||
# default value of 'master' property for spark interpreter. | ||
|
@@ -115,7 +115,7 @@ spec: | |
path: nginx.conf | ||
containers: | ||
- name: zeppelin-server | ||
-image: apache/zeppelin:0.9.0-SNAPSHOT | ||
+image: apache/zeppelin-server:0.9.0-SNAPSHOT | ||
Currently, the official docker images are built from Do you think we can release images based on the new Dockerfiles in this pull request and remove While
I think we can delete It makes sense to push these new images. How should we handle different compilation versions?
At least for the Spark interpreter, there is binary-level compatibility with different Spark (and Hadoop) versions. Once built, it works with different versions without being rebuilt.
How will all of this work outside K8s? For example, when Docker is used just to avoid installing anything on the machine, a single image is handier to work with.
You are right, one image, or at least a small set of images, is more practical to work with. In fact, I currently have only three images (a distribution image, a server image, and one large interpreter image) in my K8s setup. The title of ZEPPELIN-4154 and the first PR #3380 imply an image for each interpreter, and this PR tries to solve that task. In my opinion we should at least provide different images for the Zeppelin server and the Zeppelin interpreter. A distribution image is useful to build Zeppelin only once and copy the same version into the Zeppelin server and the Zeppelin interpreter images. My main goal for different images is to reduce the size and start time of images in a container cluster. If we create an image for each interpreter, each image is smaller. All interpreter images should use the same base image so that they can reuse a layer that may already be available.
Docker can also set up a local network; in most cases this is done via a bridge network. The Zeppelin server then needs access to the Docker daemon's TCP interface, either to create and modify the network or at least to be informed when new containers are created. |
||
command: ["sh", "-c", "$(ZEPPELIN_HOME)/bin/zeppelin.sh"] | ||
lifecycle: | ||
preStop: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
FROM zeppelin-distribution:latest AS zeppelin-distribution | ||
|
||
FROM zeppelin-interpreter-base:latest | ||
# Must declare it after FROM, because it would be reset if it is declared before FROM | ||
ARG interpreter_name | ||
# Copy interpreter | ||
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/${interpreter_name} ${Z_HOME}/interpreter/${interpreter_name} |
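A minimal sketch of how the generic interpreter Dockerfile above might be invoked for a single interpreter. It only prints the `docker build` command instead of running it. The Dockerfile directory follows the path mentioned in the review discussion and is an assumption here, as are the function name `interpreter_build_cmd` and the tag pattern `zeppelin-interpreter-<name>:latest`; none of them are defined by this PR.

```shell
#!/bin/sh
# Dry run: print the build command for one interpreter image.
# ASSUMPTION: the Dockerfile directory and the tag pattern are illustrative.
DOCKERFILE_DIR="scripts/docker/zeppelin-interpreter"

interpreter_build_cmd() {
  # $1 is passed to the Dockerfile as the interpreter_name build argument,
  # which selects /opt/zeppelin/interpreter/<name> from the distribution image.
  printf 'docker build --build-arg interpreter_name=%s -t zeppelin-interpreter-%s:latest %s\n' \
    "$1" "$1" "$DOCKERFILE_DIR"
}

interpreter_build_cmd groovy
```

Running the printed command would require `zeppelin-distribution:latest` and `zeppelin-interpreter-base:latest` to exist locally already, because the Dockerfile refers to both in its FROM lines.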
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
FROM zeppelin-distribution:latest AS zeppelin-distribution | ||
|
||
FROM ubuntu:18.04 | ||
|
||
LABEL maintainer="Apache Software Foundation <dev@zeppelin.apache.org>" | ||
|
||
ARG version="0.9.0-SNAPSHOT" | ||
|
||
ENV VERSION="${version}" \ | ||
Z_HOME="/opt/zeppelin" | ||
|
||
RUN set -ex && \ | ||
apt-get -y update && \ | ||
DEBIAN_FRONTEND=noninteractive apt-get install -y openjdk-8-jre-headless wget && \ | ||
# Cleanup | ||
rm -rf /var/lib/apt/lists/* && \ | ||
apt-get autoclean && \ | ||
apt-get clean | ||
|
||
COPY --from=zeppelin-distribution /opt/zeppelin/bin ${Z_HOME}/bin | ||
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/zeppelin-interpreter-shaded-${VERSION}.jar ${Z_HOME}/interpreter/zeppelin-interpreter-shaded-${VERSION}.jar | ||
COPY log4j.properties ${Z_HOME}/conf/ | ||
COPY log4j_yarn_cluster.properties ${Z_HOME}/conf/ | ||
|
||
WORKDIR ${Z_HOME} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
log4j.rootLogger = INFO, stdout | ||
|
||
log4j.appender.stdout = org.apache.log4j.ConsoleAppender | ||
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout | ||
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
log4j.rootLogger = INFO, stdout | ||
|
||
log4j.appender.stdout = org.apache.log4j.ConsoleAppender | ||
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout | ||
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
FROM zeppelin-distribution:latest AS zeppelin-distribution | ||
|
||
FROM zeppelin-interpreter-base:latest | ||
|
||
ARG miniconda_version="py37_4.8.2" | ||
ARG miniconda_sha256="957d2f0f0701c3d1335e3b39f235d197837ad69a944fa6f5d8ad2c686b69df3b" | ||
|
||
ENV MINICONDA_VERSION=${miniconda_version} | ||
|
||
# Install additional_conda_packages | ||
COPY python_conda_packages.txt /python_conda_packages.txt | ||
# Some packages are not available via conda | ||
COPY pip_packages.txt /pip_packages.txt | ||
# Install Miniconda3 | ||
RUN set -ex && \ | ||
wget -nv https://repo.anaconda.com/miniconda/Miniconda3-${MINICONDA_VERSION}-Linux-x86_64.sh -O miniconda.sh && \ | ||
echo "${miniconda_sha256} miniconda.sh" > anaconda.sha256 && \ | ||
sha256sum --strict -c anaconda.sha256 && \ | ||
bash miniconda.sh -b -p /opt/conda && \ | ||
export PATH=/opt/conda/bin:$PATH && \ | ||
conda config --set always_yes yes --set changeps1 no && \ | ||
conda info -a && \ | ||
conda config --add channels conda-forge && \ | ||
conda install -y --quiet --file /python_conda_packages.txt && \ | ||
pip install -q -r /pip_packages.txt && \ | ||
# Cleanup | ||
rm -v miniconda.sh anaconda.sha256 && \ | ||
# Cleanup based on https://github.com/ContinuumIO/docker-images/commit/cac3352bf21a26fa0b97925b578fb24a0fe8c383 | ||
find /opt/conda/ -follow -type f -name '*.a' -delete && \ | ||
find /opt/conda/ -follow -type f -name '*.js.map' -delete && \ | ||
conda clean -ay | ||
|
||
ENV PATH /opt/conda/bin:$PATH | ||
|
||
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/python ${Z_HOME}/interpreter/python |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
bkzep==0.6.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
pycodestyle | ||
numpy | ||
pandas | ||
scipy | ||
grpcio | ||
hvplot | ||
protobuf | ||
pandasql | ||
ipython | ||
matplotlib | ||
ipykernel | ||
jupyter_client | ||
bokeh | ||
apache-beam |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
FROM zeppelin-distribution:latest AS zeppelin-distribution | ||
|
||
FROM zeppelin-interpreter-python:latest | ||
|
||
# Install additional_conda_packages | ||
COPY r_conda_packages.txt /r_conda_packages.txt | ||
# Install necessary r packages | ||
RUN set -ex && \ | ||
conda install -y --quiet --file /r_conda_packages.txt && \ | ||
# Cleanup based on https://github.com/ContinuumIO/docker-images/commit/cac3352bf21a26fa0b97925b578fb24a0fe8c383 | ||
find /opt/conda/ -follow -type f -name '*.a' -delete && \ | ||
find /opt/conda/ -follow -type f -name '*.js.map' -delete && \ | ||
conda clean -ay | ||
|
||
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/r ${Z_HOME}/interpreter/r |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
r-evaluate | ||
r-base64enc | ||
r-knitr | ||
r-ggplot2 | ||
r-shiny | ||
r-googlevis |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one or more | ||
# contributor license agreements. See the NOTICE file distributed with | ||
# this work for additional information regarding copyright ownership. | ||
# The ASF licenses this file to You under the Apache License, Version 2.0 | ||
# (the "License"); you may not use this file except in compliance with | ||
# the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
FROM zeppelin-distribution:latest AS zeppelin-distribution | ||
|
||
FROM zeppelin-interpreter-r:latest | ||
|
||
COPY --from=zeppelin-distribution /opt/zeppelin/interpreter/spark ${Z_HOME}/interpreter/spark |
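The Dockerfiles in this PR form a chain: the python image starts FROM the base image, the r image FROM the python image, and the spark image FROM the r image, and all of them copy files from `zeppelin-distribution:latest`. The following is a hedged sketch of a build order that respects this chain; it prints the commands rather than running them. The directory layout (`DOCKER_DIR` with one subdirectory per image), the distribution Dockerfile path `DIST_DOCKERFILE`, and the final tag `zeppelin-interpreter-spark` are assumptions for illustration.

```shell
#!/bin/sh
# Dry run: print docker build commands in dependency order.
# ASSUMPTIONS: DIST_DOCKERFILE, DOCKER_DIR and the per-image subdirectories
# are hypothetical; only the FROM relationships come from the Dockerfiles.
DIST_DOCKERFILE="Dockerfile"
DOCKER_DIR="scripts/docker/zeppelin-interpreter"

print_build_chain() {
  # The distribution Dockerfile does `ADD . /workspace/zeppelin`,
  # so its build context is the repository root (".").
  printf 'docker build -t zeppelin-distribution:latest -f %s .\n' "$DIST_DOCKERFILE"
  # base <- python <- r <- spark: each image is the FROM of the next.
  for name in base python r spark; do
    printf 'docker build -t zeppelin-interpreter-%s:latest %s/%s\n' \
      "$name" "$DOCKER_DIR" "$name"
  done
}

print_build_chain
```

Because python, r and spark are stacked on the same lower layers, a node that has already pulled one of these images should only need to fetch the additional layers of the others.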
In Kubernetes, would it be possible to use a customized interpreter image (scripts/docker/zeppelin-interpreter/<interpreter_name>) for particular interpreters and fall back to the default interpreter image (scripts/docker/zeppelin-interpreter/Dockerfile) for all other interpreters?
At least the interpreter image build process should use scripts/docker/zeppelin-interpreter/Dockerfile. For K8s, all interpreter images should be available in the Docker registry. Fallback logic in the Zeppelin server should be avoided.