[feat] update build-env-ldb-toolchain docker images after prebuilt release #389
Conversation
After each successful thirdparty prebuilt release, automatically rebuild and push the `apache/doris:build-env-ldb-toolchain-latest` and `apache/doris:build-env-ldb-toolchain-no-avx2-latest` Docker images. The upstream `docker/compilation/Dockerfile` is fetched at CI time and patched in-place, so any toolchain/dependency updates from apache/doris are automatically inherited without maintaining a local Dockerfile copy.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pull request overview
This PR extends the existing thirdparty prebuilt release workflow to also rebuild and push the build-env-ldb-toolchain Docker images after a successful prebuilt release, by fetching and patching the upstream Doris compilation Dockerfile at CI time.
Changes:
- Add `workflow_dispatch` support to allow manual runs.
- Introduce an `update-docker` job that fetches and patches the upstream Doris Dockerfile, then builds/pushes two Docker images (normal + no-avx2) to Docker Hub.
- Wire `update-docker` into the workflow's `success`/`failure` jobs via updated `needs`.
```diff
 success:
   name: Success
-  needs: [prerelease, build]
+  needs: [prerelease, build, update-docker]
   if: needs.prerelease.outputs.should_release == 'true'
```
Adding update-docker to the success job’s needs means the release note will no longer be updated to SUCCESS if the Docker build/push fails (or is skipped), even when the prebuilt artifacts were successfully built and uploaded. If that coupling isn’t intended, consider decoupling the release-status update from the Docker publish (e.g., keep success depending only on prerelease/build, and have a separate status line/section for the image publish).
```yaml
if: |
  always() && (
    (needs.prerelease.outputs.should_release == 'true' && needs.build.result == 'success') ||
    (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
```
The update-docker job runs on workflow_dispatch when needs.build.result != 'failure', which also includes cancelled/skipped. That can trigger a Docker push even when the build was cancelled or never produced a fresh prebuilt artifact, leading to confusing failures (missing release asset) or publishing an unintended image. Consider tightening the condition to only allow success (or explicitly success || skipped if intentionally reusing an existing release asset), and/or add an explicit check that the expected release asset exists before building/pushing.
Suggested change:

```diff
-    (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
+    (github.event_name == 'workflow_dispatch' && needs.build.result == 'success')
```
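The asset-existence check mentioned above could be sketched in Python. The repository slug and helper names here are illustrative (hypothetical), though the tag `automation` and asset filename match the prebuilt download URL used in the patched Dockerfile:

```python
import urllib.error
import urllib.request


def asset_url(repo, tag, asset):
    # Build the GitHub release download URL for a given asset.
    return f"https://github.com/{repo}/releases/download/{tag}/{asset}"


def release_asset_exists(repo, tag, asset, timeout=10):
    # HEAD-request the asset URL; GitHub redirects to the real download
    # for existing assets and returns 404 otherwise. Any HTTP or network
    # error is treated as "asset missing".
    req = urllib.request.Request(asset_url(repo, tag, asset), method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False


# No network access needed just to inspect the URL being probed:
print(asset_url("owner/doris-thirdparty", "automation",
                "doris-thirdparty-prebuilt-linux-x86_64.tar.xz"))
```

Gating the `update-docker` job on such a check (or an equivalent `curl -fsI` step) would fail fast with a clear message instead of a confusing mid-build `wget` error.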
```yaml
run: |
  # Pull the upstream Dockerfile from apache/doris to stay in sync with
  # any toolchain / dependency changes they make.
  curl -fsSL https://raw.githubusercontent.com/apache/doris/master/docker/compilation/Dockerfile \
    -o Dockerfile.upstream
```
curl pulls the upstream Dockerfile from the moving master branch. That makes the produced images non-reproducible and introduces a supply-chain risk (the build content can change without any change in this repo). Consider pinning to a specific commit SHA (e.g., derived from the Doris version you already compute in prerelease) and/or verifying the fetched content before building.
Suggested change (note that `sha256sum -c` expects two spaces between the digest and the filename):

```diff
-run: |
-  # Pull the upstream Dockerfile from apache/doris to stay in sync with
-  # any toolchain / dependency changes they make.
-  curl -fsSL https://raw.githubusercontent.com/apache/doris/master/docker/compilation/Dockerfile \
-    -o Dockerfile.upstream
+env:
+  DORIS_DOCKERFILE_COMMIT: 0123456789abcdef0123456789abcdef01234567
+  DORIS_DOCKERFILE_SHA256: REPLACE_WITH_VERIFIED_SHA256
+run: |
+  # Pull the upstream Dockerfile from a pinned apache/doris commit so the
+  # build input is reproducible and cannot drift with changes on master.
+  curl -fsSL "https://raw.githubusercontent.com/apache/doris/${DORIS_DOCKERFILE_COMMIT}/docker/compilation/Dockerfile" \
+    -o Dockerfile.upstream
+  printf '%s  %s\n' "${DORIS_DOCKERFILE_SHA256}" Dockerfile.upstream | sha256sum -c -
```
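The checksum step above could equally be done from the Python patching script. A minimal sketch of that verification (the expected digest would be the same placeholder value as in the suggestion; function names are illustrative):

```python
import hashlib


def sha256_of(path):
    # Stream the file in chunks so large downloads need not fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path, expected_hex):
    # Mirror `sha256sum -c`: fail loudly when the fetched file drifts
    # from the pinned, reviewed content.
    actual = sha256_of(path)
    if actual != expected_hex:
        raise RuntimeError(f"checksum mismatch for {path}: got {actual}")
```

Either approach gives the same guarantee: the build input cannot change without a corresponding change in this repo.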
```python
# Patch 1: fix epel metalink.
# The upstream RUN line installs epel-release then immediately runs yum
# install/clean/makecache, but epel.repo's metalink may be unreliable.
# Insert a sed fix between the epel-release install and the next yum install.
old_epel = 'yum install epel-release -y && yum install https://packages.endpointdev.com'
new_epel = ('yum install epel-release -y \\\n'
            ' && sed -i \\\n'
            ' -e \'s/^metalink=/#metalink=/\' \\\n'
            ' -e \'s|^#baseurl=http://download.fedoraproject.org/pub/epel/7|baseurl=https://mirrors.aliyun.com/epel/7|\' \\\n'
            ' /etc/yum.repos.d/epel*.repo \\\n'
            ' && yum install https://packages.endpointdev.com')
assert old_epel in content, "Patch 1 failed: target string not found in upstream Dockerfile"
patched = content.replace(old_epel, new_epel, 1)
assert patched != content, "Patch 1 was a no-op: upstream Dockerfile may have changed"
```
```python
# Patch 2: replace the "clone & build thirdparty" block with downloading
# our prebuilt artifact. The block starts with the "# clone lastest source
# code" comment and ends with "rm -rf ${DEFAULT_DIR}/doris".
prebuilt_block = (
    '# Download prebuilt thirdparty from GitHub Release (built by doris-thirdparty automation)\n'
    'ARG GITHUB_REPOSITORY\n'
    'RUN mkdir -p /var/local/thirdparty \\\n'
    ' && wget -q "https://github.com/${GITHUB_REPOSITORY}/releases/download/automation/doris-thirdparty-prebuilt-linux-x86_64.tar.xz" \\\n'
    ' -O /tmp/prebuilt.tar.xz \\\n'
    ' && tar -xf /tmp/prebuilt.tar.xz -C /var/local/thirdparty \\\n'
    ' && rm /tmp/prebuilt.tar.xz\n'
)
patched_2 = re.sub(
    r'# clone lastest source code.*?rm -rf \$\{DEFAULT_DIR\}/doris\n',
    prebuilt_block,
    patched, flags=re.DOTALL
)
assert patched_2 != patched, "Patch 2 was a no-op: 'clone lastest source code' block not found in upstream Dockerfile"
patched = patched_2
```
The Python patching logic is very brittle: Patch 2 depends on the exact comment text # clone lastest source code and Patch 1 depends on an exact RUN substring. If upstream reformats or fixes typos, this workflow will start failing. Consider making the regex/string matches more resilient (e.g., tolerate latest/lastest, anchor around more stable commands, or use a small patch file applied with patch), so upstream drift causes fewer spurious failures.
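One way to make Patch 2 survive the most likely upstream change (fixing the "lastest" typo) is a pattern that accepts both spellings, combined with `re.subn` so a zero-match result is detected directly instead of via a before/after comparison. A minimal sketch; the helper name is illustrative, and `replacement` stands in for the `prebuilt_block` string from the workflow script:

```python
import re

# Accept both the current "lastest" typo and a corrected "latest".
CLONE_BLOCK = re.compile(
    r'# clone la(?:s)?test source code.*?rm -rf \$\{DEFAULT_DIR\}/doris\n',
    re.DOTALL,
)


def replace_clone_block(content, replacement):
    # subn reports how many substitutions were made, so a missing block
    # fails with a clear message rather than silently passing through.
    patched, n = CLONE_BLOCK.subn(replacement, content, count=1)
    if n != 1:
        raise RuntimeError("Patch 2 failed: clone-source block not found")
    return patched
```

Anchoring on the stable `rm -rf ${DEFAULT_DIR}/doris` end marker is kept as-is; only the fragile comment text is loosened.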
```python
print("=== normal: check ===")
for line in open('Dockerfile.patched'):
    if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
        print(line, end='')
print("=== noavx2: check ===")
for line in open('Dockerfile.patched-noavx2'):
    if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
        print(line, end='')
```
The script only prints diagnostic lines if COPY doris / build-thirdparty are still present, but it doesn’t fail the job. Since this workflow never checks out a build context, leaving any COPY instructions in the patched Dockerfile will make the Docker build fail later with a less actionable error. Consider adding assertions that these patterns are absent in Dockerfile.patched/Dockerfile.patched-noavx2 after patching.
Suggested change:

```diff
-print("=== normal: check ===")
-for line in open('Dockerfile.patched'):
-    if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
-        print(line, end='')
-print("=== noavx2: check ===")
-for line in open('Dockerfile.patched-noavx2'):
-    if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
-        print(line, end='')
+diagnostic_patterns = ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']
+forbidden_patterns = ['COPY doris', 'build-thirdparty']
+
+def check_patched_dockerfile(path, label):
+    print(f"=== {label}: check ===")
+    with open(path) as f:
+        lines = f.readlines()
+    for line in lines:
+        if any(k in line for k in diagnostic_patterns):
+            print(line, end='')
+    for pattern in forbidden_patterns:
+        assert not any(pattern in line for line in lines), (
+            f"Patch validation failed for {path}: found forbidden pattern '{pattern}'; "
+            "patched Dockerfile still depends on unavailable build context"
+        )
+
+check_patched_dockerfile('Dockerfile.patched', 'normal')
+check_patched_dockerfile('Dockerfile.patched-noavx2', 'noavx2')
```
Summary

- Rebuild and push the `apache/doris:build-env-ldb-toolchain-latest` and `apache/doris:build-env-ldb-toolchain-no-avx2-latest` Docker images
- The upstream `docker/compilation/Dockerfile` is fetched at CI time and patched in-place, so any toolchain/dependency updates from apache/doris are automatically inherited without maintaining a local Dockerfile copy
- `USE_AVX2=0` is set in the builder ENV block for the no-avx2 variant

Test plan

- Verified the `sed`/`python3` patch output locally: `COPY doris` and `build-thirdparty` removed, `ARG GITHUB_REPOSITORY` and the prebuilt download block correctly injected
- `docker buildx build` (no push) passed on tc-server-selectdb for both the normal and no-avx2 Dockerfiles

Notes

Requires two secrets to be configured in the repository:

- `DOCKERHUB_USERNAME`
- `DOCKERHUB_TOKEN` (with push access to `apache/doris` on Docker Hub)

🤖 Generated with Claude Code