diff --git a/tez-dist/pom.xml b/tez-dist/pom.xml
index 5940a996ac..6c4e7e5d36 100644
--- a/tez-dist/pom.xml
+++ b/tez-dist/pom.xml
@@ -135,7 +135,7 @@
               /bin/bash
-              ${project.basedir}/src/docker/tez-am/build-am-docker.sh
+              ${project.basedir}/src/docker/build.sh
               -tez
               ${project.version}
               -repo
               apache
diff --git a/tez-dist/src/docker/tez-am/Dockerfile.am b/tez-dist/src/docker/Dockerfile
similarity index 88%
rename from tez-dist/src/docker/tez-am/Dockerfile.am
rename to tez-dist/src/docker/Dockerfile
index 01647f336c..9d16ff4332 100644
--- a/tez-dist/src/docker/tez-am/Dockerfile.am
+++ b/tez-dist/src/docker/Dockerfile
@@ -47,9 +47,8 @@ RUN set -ex; \
 
 # Set necessary environment variables
 ENV TEZ_HOME=/opt/tez \
-    TEZ_CONF_DIR=/opt/tez/conf
-
-ENV TEZ_CLIENT_VERSION=$TEZ_VERSION
+    TEZ_CONF_DIR=/opt/tez/conf \
+    TEZ_CLIENT_VERSION=$TEZ_VERSION
 
 ENV PATH=$TEZ_HOME/bin:$PATH
 
@@ -57,15 +56,15 @@ COPY --from=env --chown=tez /opt/tez $TEZ_HOME
 
 RUN mkdir -p $TEZ_CONF_DIR && chown tez:tez $TEZ_CONF_DIR
 
-COPY --chown=tez am-entrypoint.sh /
+COPY --chown=tez entrypoint.sh am-entrypoint.sh child-entrypoint.sh /
 COPY --chown=tez conf $TEZ_CONF_DIR
 
 # Create Extension Point Directory
 RUN mkdir -p /opt/tez/plugins && chown tez:tez /opt/tez/plugins && chmod 755 /opt/tez/plugins
 
-RUN chmod +x /am-entrypoint.sh
+RUN chmod +x /entrypoint.sh /am-entrypoint.sh /child-entrypoint.sh
 
 USER tez
 WORKDIR $TEZ_HOME
-ENTRYPOINT ["/am-entrypoint.sh"]
+ENTRYPOINT ["/entrypoint.sh"]
diff --git a/tez-dist/src/docker/README.md b/tez-dist/src/docker/README.md
new file mode 100644
index 0000000000..f815e225d8
--- /dev/null
+++ b/tez-dist/src/docker/README.md
@@ -0,0 +1,196 @@
+
+
+# Apache Tez Docker
+
+This directory contains a unified Docker implementation for running the TezAM
+and TezChild processes from a single container image. The entrypoint is
+selected dynamically based on the `TEZ_COMPONENT` environment variable.
+
+1. Building the docker image:
+
+   ```bash
+   mvn clean install -DskipTests -Pdocker
+   ```
+
+   Alternatively, you can build it explicitly via the provided script:
+
+   ```bash
+   ./tez-dist/src/docker/build.sh -tez -repo apache
+   ```
+
+2. Local Zookeeper Setup (Standalone):
+
+   If you are running the AM container, use the official Zookeeper Docker
+   image (refer to `docker-compose.yml`):
+
+   ```bash
+   docker pull zookeeper:3.8.4
+
+   docker run -d \
+     --name zookeeper-server \
+     -p 2181:2181 \
+     -p 8080:8080 \
+     -e ZOO_MY_ID=1 \
+     zookeeper:3.8.4
+   ```
+
+3. Running the Tez containers explicitly:
+
+   **Running the Tez AM:**
+
+   ```bash
+   export TEZ_VERSION=1.0.0-SNAPSHOT
+
+   docker run --rm \
+     -p 10001:10001 \
+     --env-file tez-dist/src/docker/am.env \
+     --name tez-am \
+     --hostname localhost \
+     apache/tez:$TEZ_VERSION
+   ```
+
+   * `TEZ_VERSION` corresponds to the Maven `${project.version}`.
+     Set this environment variable in your shell before running the commands.
+
+   * Expose ports using the `-p` flag based on the
+     `tez.am.client.am.port-range` property in `tez-site.xml`.
+
+   * The `--hostname` flag configures the container's hostname, allowing
+     services on the host (e.g., macOS) to connect to it.
+
+   * Ensure the `--env-file` flag is included, or at a minimum, pass
+     `-e TEZ_FRAMEWORK_MODE=STANDALONE_ZOOKEEPER` and `-e TEZ_COMPONENT=AM`
+     to the `docker run` command.
+
+   **Running the Tez Child:**
+
+   The child container requires specific arguments (`
+   `) to connect back to the
+   Application Master.
+
+   Assuming your AM is running on `localhost` port `10001`, and the AM
+   assigned the container ID `container_1703023223000_0001_01_000001`:
+
+   ```bash
+   docker run --rm \
+     --network host \
+     --env-file tez-dist/src/docker/child.env \
+     --name tez-child \
+     --hostname localhost \
+     apache/tez:1.0.0-SNAPSHOT \
+     localhost 10001 container_1703023223000_0001_01_000001 dummy_token_abc 1
+   ```
+
+4. Debugging the Tez containers:
+   Uncomment the `JAVA_TOOL_OPTIONS` in `am.env` (or `child.env` for
+   port 5006) and expose the debug port using the `-p` flag:
+
+   ```bash
+   docker run --rm \
+     -p 10001:10001 -p 5005:5005 \
+     --env-file tez-dist/src/docker/am.env \
+     --name tez-am \
+     --hostname localhost \
+     apache/tez:$TEZ_VERSION
+   ```
+
+5. To override the `tez-site.xml` in the Docker image:
+
+   * Set the `TEZ_CUSTOM_CONF_DIR` environment variable in `am.env` /
+     `child.env` or via the `docker run` command (e.g.,
+     `/opt/tez/custom-conf`).
+
+   ```bash
+   export TEZ_SITE_PATH=$(pwd)/tez-dist/src/docker/conf/tez-site.xml
+
+   docker run --rm \
+     -p 10001:10001 \
+     --env-file tez-dist/src/docker/am.env \
+     -v "$TEZ_SITE_PATH:/opt/tez/custom-conf/tez-site.xml" \
+     --name tez-am \
+     --hostname localhost \
+     apache/tez:$TEZ_VERSION
+   ```
+
+6. To add plugin jars to the Docker image:
+
+   * The plugin directory path inside the Docker container is fixed at
+     `/opt/tez/plugins`.
+
+   ```bash
+   docker run --rm \
+     -p 10001:10001 \
+     --env-file tez-dist/src/docker/am.env \
+     -v "/path/to/your/local/plugins:/opt/tez/plugins" \
+     --name tez-am \
+     --hostname localhost \
+     apache/tez:$TEZ_VERSION
+   ```
+
+7. Using Docker Compose (Local Testing Cluster):
+
+   The provided `docker-compose.yml` offers a complete, minimal Hadoop
+   ecosystem to test Tez in a distributed manner locally without setting
+   up a real cluster.
+
+   **Services Included:**
+
+   * **namenode & datanode:** A minimal Apache Hadoop HDFS cluster
+     (lean image)
+
+   * **zookeeper:** Required by the Tez AM for standalone session
+     discovery
+
+   * **tez-am:** It automatically waits for Zookeeper and HDFS to
+     be healthy before starting up.
+
+   * **tez-child:** TBD
+
+   **To start the full cluster:**
+
+   ```bash
+   docker-compose -f tez-dist/src/docker/docker-compose.yml up -d
+   ```
+
+   **To monitor the Application Master logs:**
+
+   ```bash
+   docker-compose -f tez-dist/src/docker/docker-compose.yml logs -f tez-am
+   ```
+
+   **To shut down the cluster and clean up volumes (HDFS/Zookeeper data):**
+
+   ```bash
+   docker-compose -f tez-dist/src/docker/docker-compose.yml down -v
+   ```
+
+8. To mount custom plugins or JARs required by Tez AM (e.g., for split
+   generation — typically the hive-exec jar, but in general, any UDFs or
+   dependencies previously managed via YARN localization):
+
+   * Create a directory `tez-plugins` and add all required jars.
+
+   * Uncomment the following lines in docker compose under the `tez-am`
+     and `tez-child` services to mount this directory as a volume to
+     `/opt/tez/plugins` in the docker container.
+
+   ```yaml
+   volumes:
+     - ./tez-plugins:/opt/tez/plugins
+   ```
diff --git a/tez-dist/src/docker/tez-am/am-entrypoint.sh b/tez-dist/src/docker/am-entrypoint.sh
similarity index 91%
rename from tez-dist/src/docker/tez-am/am-entrypoint.sh
rename to tez-dist/src/docker/am-entrypoint.sh
index a6128419ce..53f52821a7 100644
--- a/tez-dist/src/docker/tez-am/am-entrypoint.sh
+++ b/tez-dist/src/docker/am-entrypoint.sh
@@ -18,9 +18,9 @@
 
 set -xeou pipefail
 
-################################################
-# 1. Mocking DAGAppMaster#main() env variables #
-################################################
+#############################################
+# Mocking DAGAppMaster#main() env variables #
+#############################################
 
 : "${USER:="tez"}"
 : "${LOCAL_DIRS:="/tmp"}"
@@ -70,12 +70,6 @@ CLASSPATH="${CLASSPATH}:${TEZ_HOME}/*:${TEZ_HOME}/lib/*"
 #############
 # Execution #
 #############
-TEZ_DAG_JAR=$(find "$TEZ_HOME" -maxdepth 1 -name "tez-dag-*.jar" ! -name "*-tests.jar" | head -n 1)
-
-if [ -z "$TEZ_DAG_JAR" ]; then
-  echo "Error: Could not find tez-dag-*.jar in $TEZ_HOME"
-  exit 1
-fi
 
 echo "--> Starting DAGAppMaster..."
 
diff --git a/tez-dist/src/docker/tez-am/am.env b/tez-dist/src/docker/am.env
similarity index 98%
rename from tez-dist/src/docker/tez-am/am.env
rename to tez-dist/src/docker/am.env
index 93cabeea32..2feb3548d2 100644
--- a/tez-dist/src/docker/tez-am/am.env
+++ b/tez-dist/src/docker/am.env
@@ -19,6 +19,7 @@
 
 USER=tez
 LOG_DIRS=/opt/tez/logs
+TEZ_COMPONENT=AM
 TEZ_FRAMEWORK_MODE=STANDALONE_ZOOKEEPER
 TEZ_CUSTOM_CONF_DIR=/opt/tez/custom-conf
 # TEZ_AM_HEAP_OPTS configures the maximum heap size (Xmx) for the Tez AM.
diff --git a/tez-dist/src/docker/tez-am/build-am-docker.sh b/tez-dist/src/docker/build.sh
similarity index 84%
rename from tez-dist/src/docker/tez-am/build-am-docker.sh
rename to tez-dist/src/docker/build.sh
index 66bf7fc738..17cc85e376 100755
--- a/tez-dist/src/docker/tez-am/build-am-docker.sh
+++ b/tez-dist/src/docker/build.sh
@@ -25,7 +25,7 @@ REPO=
 usage() {
   cat <&2
 Usage: $0 [-h] [-tez ] [-repo ]
-Build the Apache Tez AM Docker image
+Build the Apache Tez (AM and CHILD) Docker image
   -help Display help
   -tez Build image with the specified Tez version
   -repo Docker repository
@@ -59,8 +59,8 @@ SCRIPT_DIR=$(
   pwd
 )
 
-DIST_DIR=${DIST_DIR:-"$SCRIPT_DIR/../../.."}
-PROJECT_ROOT=${PROJECT_ROOT:-"$SCRIPT_DIR/../../../.."}
+DIST_DIR=${DIST_DIR:-"$SCRIPT_DIR/../../"}
+PROJECT_ROOT=${PROJECT_ROOT:-"$SCRIPT_DIR/../../../"}
 REPO=${REPO:-apache}
 WORK_DIR="$(mktemp -d)"
 
@@ -86,17 +86,19 @@ fi
 # -------------------------------------------------------------------------
 # BUILD CONTEXT PREPARATION
 # -------------------------------------------------------------------------
-cp -R "$SCRIPT_DIR/conf" "$WORK_DIR/" 2>/dev/null || mkdir -p "$WORK_DIR/conf"
+cp -R "$SCRIPT_DIR/conf" "$WORK_DIR/"
+cp "$SCRIPT_DIR/entrypoint.sh" "$WORK_DIR/"
 cp "$SCRIPT_DIR/am-entrypoint.sh" "$WORK_DIR/"
-cp "$SCRIPT_DIR/Dockerfile.am" "$WORK_DIR/"
+cp "$SCRIPT_DIR/child-entrypoint.sh" "$WORK_DIR/"
+cp "$SCRIPT_DIR/Dockerfile" "$WORK_DIR/"
 
 echo "Building Docker image..."
 docker build \
   "$WORK_DIR" \
-  -f "$WORK_DIR/Dockerfile.am" \
-  -t "$REPO/tez-am:$TEZ_VERSION" \
+  -f "$WORK_DIR/Dockerfile" \
+  -t "$REPO/tez:$TEZ_VERSION" \
   --build-arg "BUILD_ENV=unarchive" \
   --build-arg "TEZ_VERSION=$TEZ_VERSION"
 
 rm -r "${WORK_DIR}"
-echo "Docker image $REPO/tez-am:$TEZ_VERSION built successfully."
+echo "Docker image $REPO/tez:$TEZ_VERSION built successfully."
diff --git a/tez-dist/src/docker/child-entrypoint.sh b/tez-dist/src/docker/child-entrypoint.sh
new file mode 100644
index 0000000000..b8fb0dd576
--- /dev/null
+++ b/tez-dist/src/docker/child-entrypoint.sh
@@ -0,0 +1,104 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+set -xeou pipefail
+
+#########################################
+# Mocking TezChild#main() env variables #
+#########################################
+
+: "${USER:="tez"}"
+: "${LOCAL_DIRS:="/tmp"}"
+: "${LOG_DIRS:="/opt/tez/logs"}"
+
+export USER LOCAL_DIRS LOG_DIRS
+
+mkdir -p "$LOG_DIRS"
+
+###########################
+# Custom Config directory #
+###########################
+if [[ -n "${TEZ_CUSTOM_CONF_DIR:-}" ]] && [[ -d "$TEZ_CUSTOM_CONF_DIR" ]]; then
+  echo "--> Using custom configuration directory: $TEZ_CUSTOM_CONF_DIR"
+  find "${TEZ_CUSTOM_CONF_DIR}" -type f -exec \
+    ln -sf {} "${TEZ_CONF_DIR}"/ \;
+
+  # Render tez-site.xml from the template (via envsubst) if a template exists
+  if [[ -f "$TEZ_CONF_DIR/tez-site.xml.template" ]]; then
+    envsubst < "$TEZ_CONF_DIR/tez-site.xml.template" > "$TEZ_CONF_DIR/tez-site.xml"
+  fi
+fi
+
+#############
+# CLASSPATH #
+#############
+
+# Order is: conf -> plugins -> tez jars
+CLASSPATH="${TEZ_CONF_DIR}"
+
+# Custom Plugins
+# This allows mounting a volume at /opt/tez/plugins containing aux jars
+PLUGIN_DIR="/opt/tez/plugins"
+if [[ -d "$PLUGIN_DIR" ]]; then
+  count=$(find "$PLUGIN_DIR" -maxdepth 1 -name "*.jar" 2>/dev/null | wc -l)
+  if [ "$count" != "0" ]; then
+    echo "--> Found $count plugin jars. Prepending to classpath."
+    CLASSPATH="${CLASSPATH}:${PLUGIN_DIR}/*"
+  fi
+fi
+
+# Tez Jars
+CLASSPATH="${CLASSPATH}:${TEZ_HOME}/*:${TEZ_HOME}/lib/*"
+
+#############
+# Execution #
+#############
+
+echo "--> Starting TezChild..."
+
+: "${TEZ_CHILD_HEAP_OPTS:="-Xmx1024m"}"
+
+JAVA_ADD_OPENS=(
+  "--add-opens=java.base/java.lang=ALL-UNNAMED"
+  "--add-opens=java.base/java.util=ALL-UNNAMED"
+  "--add-opens=java.base/java.io=ALL-UNNAMED"
+  "--add-opens=java.base/java.net=ALL-UNNAMED"
+  "--add-opens=java.base/java.nio=ALL-UNNAMED"
+  "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED"
+  "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED"
+  "--add-opens=java.base/java.util.regex=ALL-UNNAMED"
+  "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED"
+  "--add-opens=java.sql/java.sql=ALL-UNNAMED"
+  "--add-opens=java.base/java.text=ALL-UNNAMED"
+  "-Dnet.bytebuddy.experimental=true"
+)
+
+read -r -a JAVA_OPTS_ARR <<< "${JAVA_OPTS:-}"
+read -r -a HEAP_OPTS_ARR <<< "${TEZ_CHILD_HEAP_OPTS}"
+
+exec java "${HEAP_OPTS_ARR[@]}" "${JAVA_OPTS_ARR[@]}" "${JAVA_ADD_OPENS[@]}" \
+  -Djava.net.preferIPv4Stack=true \
+  -Djava.io.tmpdir="$PWD/tmp" \
+  -Dtez.root.logger=INFO,CLA,console \
+  -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator \
+  -Dlog4j.configuration=tez-container-log4j.properties \
+  -Dyarn.app.container.log.dir="$LOG_DIRS" \
+  -Dtez.conf.dir="$TEZ_CONF_DIR" \
+  -cp "$CLASSPATH" \
+  org.apache.tez.runtime.task.TezChild \
+  "$@"
diff --git a/tez-dist/src/docker/child.env b/tez-dist/src/docker/child.env
new file mode 100644
index 0000000000..6a499121e9
--- /dev/null
+++ b/tez-dist/src/docker/child.env
@@ -0,0 +1,26 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Tez Child Container Environment Configuration
+
+USER=tez
+LOG_DIRS=/opt/tez/logs
+TEZ_COMPONENT=CHILD
+TEZ_FRAMEWORK_MODE=STANDALONE_ZOOKEEPER
+TEZ_CUSTOM_CONF_DIR=/opt/tez/custom-conf
+TEZ_CHILD_HEAP_OPTS=-Xmx1024m
+# JAVA_TOOL_OPTIONS='-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=*:5006'
diff --git a/tez-dist/src/docker/tez-am/conf/core-site.xml b/tez-dist/src/docker/conf/core-site.xml
similarity index 100%
rename from tez-dist/src/docker/tez-am/conf/core-site.xml
rename to tez-dist/src/docker/conf/core-site.xml
diff --git a/tez-dist/src/docker/tez-am/conf/hdfs-site.xml b/tez-dist/src/docker/conf/hdfs-site.xml
similarity index 100%
rename from tez-dist/src/docker/tez-am/conf/hdfs-site.xml
rename to tez-dist/src/docker/conf/hdfs-site.xml
diff --git a/tez-dist/src/docker/tez-am/conf/tez-site.xml b/tez-dist/src/docker/conf/tez-site.xml
similarity index 100%
rename from tez-dist/src/docker/tez-am/conf/tez-site.xml
rename to tez-dist/src/docker/conf/tez-site.xml
diff --git a/tez-dist/src/docker/tez-am/docker-compose.yml b/tez-dist/src/docker/docker-compose.yml
similarity index 89%
rename from tez-dist/src/docker/tez-am/docker-compose.yml
rename to tez-dist/src/docker/docker-compose.yml
index 3740fabe8a..3adb60788f 100644
--- a/tez-dist/src/docker/tez-am/docker-compose.yml
+++ b/tez-dist/src/docker/docker-compose.yml
@@ -96,7 +96,7 @@ services:
       "
 
   tez-am:
-    image: apache/tez-am:${TEZ_VERSION:-1.0.0-SNAPSHOT}
+    image: apache/tez:${TEZ_VERSION:-1.0.0-SNAPSHOT}
     container_name: tez-am
     hostname: tez-am
     networks:
@@ -126,6 +126,22 @@ services:
       datanode:
         condition: service_started
 
+  tez-child:
+    image: apache/tez:${TEZ_VERSION:-1.0.0-SNAPSHOT}
+    container_name: tez-child
+    hostname: tez-child
+    networks:
+      - hadoop-network
+    # ports:
+    #   - "5006:5006"
+    env_file:
+      - ./child.env
+    depends_on:
+      tez-am:
+        condition: service_started
+    # Example command for a manual launch (requires valid AM connection info)
+    command: ["tez-am", "10001", "container_12345", "token_abc", "1"]
+
 networks:
   hadoop-network:
     name: hadoop-network
diff --git a/tez-dist/src/docker/entrypoint.sh b/tez-dist/src/docker/entrypoint.sh
new file mode 100644
index 0000000000..b80dbef4f8
--- /dev/null
+++ b/tez-dist/src/docker/entrypoint.sh
@@ -0,0 +1,34 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+set -xeou pipefail
+
+: "${TEZ_COMPONENT:="AM"}"
+
+echo "--> Starting Tez Component: $TEZ_COMPONENT"
+
+if [[ "$TEZ_COMPONENT" == "AM" ]]; then
+  echo "--> Routing to Tez AM Entrypoint"
+  exec /am-entrypoint.sh "$@"
+elif [[ "$TEZ_COMPONENT" == "CHILD" ]]; then
+  echo "--> Routing to Tez Child Entrypoint"
+  exec /child-entrypoint.sh "$@"
+else
+  echo "Error: Unknown TEZ_COMPONENT '$TEZ_COMPONENT'. Must be 'AM' or 'CHILD'."
+  exit 1
+fi
diff --git a/tez-dist/src/docker/tez-am/README.md b/tez-dist/src/docker/tez-am/README.md
deleted file mode 100644
index 987f381853..0000000000
--- a/tez-dist/src/docker/tez-am/README.md
+++ /dev/null
@@ -1,136 +0,0 @@
-
-
-# Tez AM Docker
-
-1. Building the docker image:
-
-    ```bash
-    mvn clean install -DskipTests -Pdocker
-    ```
-
-2. Install zookeeper in mac:
-
-    a. Via brew: set the `tez.am.zookeeper.quorum` value as
-    `host.docker.internal:2181` in `tez-site.xml`
-
-    ```bash
-    brew install zookeeper
-    zkServer start
-    ```
-
-    b. Use Zookeeper docker image (Refer to docker compose yml):
-
-    ```bash
-    docker pull zookeeper:3.8.4
-
-    docker run -d \
-        --name zookeeper-server \
-        -p 2181:2181 \
-        -p 8080:8080 \
-        -e ZOO_MY_ID=1 \
-        zookeeper:3.8.4
-    ```
-
-3. Running the Tez AM container explicitly:
-
-    ```bash
-    export TEZ_VERSION=1.0.0-SNAPSHOT
-
-    docker run --rm \
-        -p 10001:10001 \
-        --env-file tez-dist/src/docker/tez-am/am.env \
-        --name tez-am \
-        --hostname localhost \
-        apache/tez-am:$TEZ_VERSION
-    ```
-
-    * `TEZ_VERSION` corresponds to the Maven `${project.version}`.
-    Set this environment variable in your shell before running the commands.
-    * Expose ports using the `-p` flag based on the
-    `tez.am.client.am.port-range` property in `tez-site.xml`.
-    * The `--hostname` flag configures the container's hostname, allowing
-    services on the host (e.g., macOS) to connect to it.
-    * Ensure the `--env-file` flag is included, or at a minimum, pass
-    `-e TEZ_FRAMEWORK_MODE=STANDALONE_ZOOKEEPER` to the `docker run` command.
-
-4. Debugging the Tez AM container:
-Uncomment the `JAVA_TOOL_OPTIONS` in `am.env` and expose 5005 port
-using `-p` flag
-
-    ```bash
-    docker run --rm \
-        -p 10001:10001 -p 5005:5005 \
-        --env-file tez-dist/src/docker/tez-am/am.env \
-        --name tez-am \
-        --hostname localhost \
-        apache/tez-am:$TEZ_VERSION
-    ```
-
-5. To override the tez-site.xml in docker image use:
-    * Set the `TEZ_CUSTOM_CONF_DIR` environment variable in `am.env`
-    or via the `docker run` command (e.g., `/opt/tez/custom-conf`).
-
-    ```bash
-    export TEZ_SITE_PATH=$(pwd)/tez-dist/src/docker/conf/tez-site.xml
-
-    docker run --rm \
-        -p 10001:10001 \
-        --env-file tez-dist/src/docker/tez-am/am.env \
-        -v "$TEZ_SITE_PATH:/opt/tez/custom-conf/tez-site.xml" \
-        --name tez-am \
-        --hostname localhost \
-        apache/tez-am:$TEZ_VERSION
-    ```
-
-6. To add plugin jars in docker image use:
-    * The plugin directory path inside the Docker container is fixed at `/opt/tez/plugins`.
-
-    ```bash
-    docker run --rm \
-        -p 10001:10001 \
-        --env-file tez-dist/src/docker/tez-am/am.env \
-        -v "/path/to/your/local/plugins:/opt/tez/plugins" \
-        --name tez-am \
-        --hostname localhost \
-        apache/tez-am:$TEZ_VERSION
-    ```
-
-7. Using Docker Compose:
-    * Refer to the `docker-compose.yml` file in this directory for
-    an example of how to run both the Tez AM and Zookeeper containers
-    together using Docker Compose.
-
-    ```bash
-    docker-compose -f tez-dist/src/docker/tez-am/docker-compose.yml up -d --build
-    ```
-
-    * This command will start both the Tez AM, Zookeeper, Minimal
-    Hadoop containers as defined in the `docker-compose.yml` file.
-
-8. To mount custom plugins or JARs required by Tez AM (e.g., for split generation
-— typically the hive-exec jar, but in general, any UDFs or dependencies
-previously managed via YARN localization:
-    * Create a directory tez-plugins and add all required jars.
-    * Uncomment the following lines in docker compose under the tez-am service
-    to mount this directory as a volume to `/opt/tez/plugins` in the docker container.
-
-    ```yml
-    volumes:
-      - ./tez-plugins:/opt/tez/plugins
-    ```