-
Notifications
You must be signed in to change notification settings - Fork 629
build: OPS-597: restructure sglang to follow container strategy structure #2803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
1a77f14
to
0bf7756
Compare
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
WalkthroughRewrites container/Dockerfile.sglang into a multi-stage build adding dynamo_base, framework, runtime, and dev stages with new build args and environment setup. Integrates SGLang source installation in framework and copies it into runtime. Updates container/build.sh gating so both VLLM and SGLANG trigger base and framework image builds. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Builder as Docker Build
participant Base as dynamo_base (FROM DYNAMO_BASE_IMAGE)
participant Framework as framework
participant Runtime as runtime
participant Dev as dev
Builder->>Base: FROM ${DYNAMO_BASE_IMAGE} AS dynamo_base
Builder->>Framework: FROM CUDA image\n+ install uv, create venv (PYTHON_VERSION)\n+ git clone SGLang (SGLANG_VERSION)\n+ uv pip install -e . (optional sccache)
Framework-->>Runtime: COPY /opt/sglang -> /opt/sglang
Base-->>Runtime: COPY nats-server, etcd
Framework-->>Runtime: COPY CUDA/UCX artifacts as needed
Builder->>Runtime: Install deps (incl. Prometheus PROM_VERSION)\n+ set ENV (DYNAMO_HOME, VIRTUAL_ENV, PATH)
Runtime-->>Dev: FROM runtime AS dev\n+ install dev tools (nvtop, tmux, vim)\n+ set PYTHONPATH
sequenceDiagram
autonumber
participant User as build.sh caller
participant Script as container/build.sh
participant Docker as Docker
User->>Script: ./build.sh --target <name>
alt target in {VLLM, SGLANG}
Script->>Docker: Build base image
Script->>Docker: Build framework image
else other targets
Script-->>User: Skip base/framework steps
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Pre-merge checks (2 passed, 1 warning)❌ Failed Checks (1 warning)
✅ Passed Checks (2 passed)
Poem
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (7)
container/Dockerfile.sglang (5)
59-61
: Comment contradicts FROM — clarify base usageThe comment says “Use dynamo base image” but this stage uses ${BASE_IMAGE}:${BASE_IMAGE_TAG}. The actual dynamo base is the earlier “dynamo_base” stage. Update the comment to avoid confusion and match vLLM Dockerfile phrasing.
-# Use dynamo base image (see /container/Dockerfile for more details) +# Framework stage builds on CUDA devel base; artifacts are copied from the 'dynamo_base' stage as needed
165-183
: Drop redundant g++ (build-essential already includes it)Saves a few MB and keeps parity with other Dockerfiles.
ninja-build \ - g++ \ # prometheus dependencies
Also applies to: 178-180
200-213
: Add checksum verification for Prometheus tarballRemote binary install without integrity check. Verify SHA256 to harden supply-chain.
ARG PROM_VERSION=3.4.1 RUN ARCH=$(dpkg --print-architecture) && \ case "$ARCH" in \ amd64) PLATFORM=linux-amd64 ;; \ arm64) PLATFORM=linux-arm64 ;; \ *) echo "Unsupported architecture: $ARCH" && exit 1 ;; \ esac && \ - curl -fsSL "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" \ - | tar -xz -C /tmp && \ + curl -fsSL -o /tmp/prom.tgz "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" && \ + curl -fsSL -o /tmp/sha256sums.txt "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/sha256sums.txt" && \ + (cd /tmp && grep "prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" sha256sums.txt | sha256sum -c -) && \ + tar -xzf /tmp/prom.tgz -C /tmp && \ mv "/tmp/prometheus-${PROM_VERSION}.${PLATFORM}/prometheus" /usr/local/bin/ && \ chmod +x /usr/local/bin/prometheus && \ - rm -rf "/tmp/prometheus-${PROM_VERSION}.${PLATFORM}" + rm -rf "/tmp/prometheus-${PROM_VERSION}.${PLATFORM}" /tmp/prom.tgz /tmp/sha256sums.txt
214-219
: Comment mismatch: copying from dynamo_base, not frameworkThe comments say “Copy UCX from framework image” but the COPY uses dynamo_base. Adjust to avoid confusion.
-# Copy UCX from framework image as plugin for NIXL -# Copy NIXL source from framework image -# Copy dynamo wheels for gitlab artifacts +# Copy UCX and NIXL artifacts from 'dynamo_base'
321-321
: Redundant layer: copying /usr/local/bin from runtime into devThe dev stage is FROM runtime; /usr/local/bin is already present. This COPY adds an unnecessary layer.
-COPY --from=runtime /usr/local/bin /usr/local/bin
container/build.sh (2)
596-601
: Propagate sccache flags — consider surfacing ARCH-derived S3 key prefixYou validate bucket/region but don’t pass a prefix; Dockerfile derives SCCACHE_S3_KEY_PREFIX from ARCH. Optionally surface this in logs for parity with VLLM.
if [ "$USE_SCCACHE" = true ]; then BUILD_ARGS+=" --build-arg USE_SCCACHE=true" BUILD_ARGS+=" --build-arg SCCACHE_BUCKET=${SCCACHE_BUCKET}" BUILD_ARGS+=" --build-arg SCCACHE_REGION=${SCCACHE_REGION}" + echo " sccache S3 Key Prefix: '${ARCH}'" fi
406-433
: CLI UX: add an explicit --sglang-version passthroughOptional: add --sglang-version to avoid requiring manual --build-arg usage; improves discoverability and mirrors other knobs.
Happy to draft the flag plumbing if you want it in this PR.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
container/Dockerfile.sglang
(6 hunks)container/build.sh
(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
Applied to files:
container/Dockerfile.sglang
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
container/Dockerfile.sglang
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
container/build.sh (1)
614-629
: Two-step build extended to SGLANG — good alignment with VLLM flowClean split: build base once, pass as DYNAMO_BASE_IMAGE, then build framework. Matches the container strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that almost all of the runtime is almost identical between vllm and sglang (except for the later stages). And it's really great to see that dev is completely identical! This is great and will be really helpful for future maintenance. As long as you've tested this, LGTM +1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any idea how much time it will take to build container in github CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
container/Dockerfile.sglang (3)
109-115
: ENV uses unsupported Bash-style expansions — breaks Docker parsingDocker doesn’t support ${VAR:+...} in ENV. Export plain values and gate sccache in RUN steps.
Apply:
-ENV SCCACHE_BUCKET=${USE_SCCACHE:+${SCCACHE_BUCKET}} \ - SCCACHE_REGION=${USE_SCCACHE:+${SCCACHE_REGION}} \ - SCCACHE_S3_KEY_PREFIX=${USE_SCCACHE:+${ARCH}} \ - CMAKE_C_COMPILER_LAUNCHER=${USE_SCCACHE:+sccache} \ - CMAKE_CXX_COMPILER_LAUNCHER=${USE_SCCACHE:+sccache} \ - CMAKE_CUDA_COMPILER_LAUNCHER=${USE_SCCACHE:+sccache} +ENV SCCACHE_BUCKET=${SCCACHE_BUCKET} \ + SCCACHE_REGION=${SCCACHE_REGION} \ + SCCACHE_S3_KEY_PREFIX=${ARCH} +# Set CMake launchers only when USE_SCCACHE=true in build RUN steps.
121-129
: Gate sccache usage and stats; export CMake launchers only when enabledUnconditionally calling show-stats will fail if sccache isn’t installed; also ensure CMake launchers are exported when enabled.
Apply:
RUN --mount=type=cache,target=/root/.cache/uv \ cd /opt && \ git clone https://github.com/sgl-project/sglang.git && \ cd sglang && \ git checkout v${SGLANG_VERSION} && \ - # Install in editable mode for development - uv pip install -e "python[all]" && \ - /tmp/use-sccache.sh show-stats "SGLang"; + if [ "${USE_SCCACHE}" = "true" ]; then \ + export CMAKE_C_COMPILER_LAUNCHER=sccache CMAKE_CXX_COMPILER_LAUNCHER=sccache CMAKE_CUDA_COMPILER_LAUNCHER=sccache; \ + fi && \ + uv pip install -e "python[all]" && \ + if [ "${USE_SCCACHE}" = "true" ]; then /tmp/use-sccache.sh show-stats "SGLang"; fi
239-244
: [cp312 hardcoded] Make wheel selection Python-version agnosticHardcoding cp312 breaks builds when PYTHON_VERSION ≠ 3.12. Compute ABI at build time.
Apply:
-RUN uv pip install \ - /opt/dynamo/wheelhouse/ai_dynamo_runtime*cp312*.whl \ +RUN PY_ABI="$(python - <<'PY'\nimport sys;print(f'cp{sys.version_info.major}{sys.version_info.minor}')\nPY\n)" && \ + uv pip install \ + /opt/dynamo/wheelhouse/ai_dynamo_runtime*${PY_ABI}*.whl \ /opt/dynamo/wheelhouse/ai_dynamo*any.whl \ /opt/dynamo/wheelhouse/nixl/nixl*.whl \ /opt/dynamo/benchmarks && \ rm -rf /opt/dynamo/benchmarks
🧹 Nitpick comments (7)
container/build.sh (1)
621-637
: Two-step build gating for SGLANG looks good; consider tagging base as latest for cache reuseGating mirrors VLLM and wires DYNAMO_BASE_IMAGE correctly. As a small improvement, also tag the base image with a stable alias (e.g., dynamo-base:latest) to improve cache hits across commits and CI runs.
Apply:
- $RUN_PREFIX docker build -f "${SOURCE_DIR}/Dockerfile" --target dev $PLATFORM $BUILD_ARGS $CACHE_FROM $CACHE_TO --tag $DYNAMO_BASE_IMAGE $BUILD_CONTEXT_ARG $BUILD_CONTEXT $NO_CACHE + $RUN_PREFIX docker build -f "${SOURCE_DIR}/Dockerfile" --target dev $PLATFORM $BUILD_ARGS $CACHE_FROM $CACHE_TO --tag $DYNAMO_BASE_IMAGE --tag dynamo-base:latest $BUILD_CONTEXT_ARG $BUILD_CONTEXT $NO_CACHEcontainer/Dockerfile.sglang (6)
83-88
: Pin uv image to a version or digestUsing ghcr.io/astral-sh/uv:latest makes builds non-reproducible. Pin to a tag or digest.
Apply:
-COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/ +ARG UV_TAG=0.4.20 +COPY --from=ghcr.io/astral-sh/uv:${UV_TAG} /uv /uvx /bin/
155-158
: Preserve venv precedence in PATHThe later PATH override inserts directories before the venv, which can shadow venv tools. Keep VIRTUAL_ENV first.
Apply:
-ENV PATH="${VIRTUAL_ENV}/bin:${PATH}" +ENV PATH="${VIRTUAL_ENV}/bin:${PATH}" @@ -ENV PATH=/usr/local/bin/etcd/:/usr/local/cuda/nvvm/bin:$PATH +ENV PATH=/usr/local/bin/etcd/:/usr/local/cuda/nvvm/bin:${VIRTUAL_ENV}/bin:$PATHAlso applies to: 198-199
200-213
: Verify Prometheus download integrityAdd a checksum verification to guard against tampering.
Example:
ARG PROM_VERSION=3.4.1 RUN ARCH=$(dpkg --print-architecture) && \ @@ - curl -fsSL "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" \ - | tar -xz -C /tmp && \ + curl -fsSLO "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" && \ + curl -fsSLO "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/sha256sums.txt" && \ + grep "prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" sha256sums.txt | sha256sum -c - && \ + tar -xz -C /tmp -f "prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz" && \ mv "/tmp/prometheus-${PROM_VERSION}.${PLATFORM}/prometheus" /usr/local/bin/ && \ chmod +x /usr/local/bin/prometheus && \ - rm -rf "/tmp/prometheus-${PROM_VERSION}.${PLATFORM}" + rm -rf "/tmp/prometheus-${PROM_VERSION}.${PLATFORM}" prometheus-${PROM_VERSION}.${PLATFORM}.tar.gz sha256sums.txt
233-235
: Avoid env var in COPY source to prevent cross-stage mismatchUse the literal path to reduce ambiguity between stages.
Apply:
-COPY --from=framework ${VIRTUAL_ENV} ${VIRTUAL_ENV} +COPY --from=framework /opt/dynamo/venv /opt/dynamo/venv
14-16
: SGLANG tag robustnessgit checkout v${SGLANG_VERSION} assumes a “v” prefix. Consider trying both or accepting raw tags.
Apply:
- git checkout v${SGLANG_VERSION} && \ + (git checkout "v${SGLANG_VERSION}" || git checkout "${SGLANG_VERSION}") && \Please confirm SGLang’s tagging scheme for rc tags you intend to use.
Also applies to: 125-126
321-336
: Redundant copy from runtime in the same pathCOPY --from=runtime /usr/local/bin /usr/local/bin is a no-op since dev derives FROM runtime.
Apply:
-COPY --from=runtime /usr/local/bin /usr/local/bin +# Inherits /usr/local/bin from runtime; no copy needed.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
container/Dockerfile.sglang
(6 hunks)container/build.sh
(1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
📚 Learning: 2025-05-28T22:54:46.875Z
Learnt from: grahamking
PR: ai-dynamo/dynamo#1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.
Applied to files:
container/Dockerfile.sglang
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.
Applied to files:
container/Dockerfile.sglang
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
PR: ai-dynamo/dynamo#2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.
Applied to files:
container/Dockerfile.sglang
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Mirror Repository to GitLab
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (1)
container/Dockerfile.sglang (1)
42-58
: Overall: structure aligns with project’s container strategyMulti-stage split (dynamo_base, framework, runtime, dev) and consistency with other framework Dockerfiles look good and follow the team’s uniformity preference.
Also applies to: 133-151, 274-286
…build from source Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
@nv-anants It should actually take less than the vllm container (since we don't need to build from source for sglang). I would estimate around ~15 minutes. |
…r parallelization Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
…ture (#2803) Signed-off-by: Tushar Sharma <tusharma@nvidia.com> Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
…ture (#2803) Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
…ture (#2803) Signed-off-by: Tushar Sharma <tusharma@nvidia.com> Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
Overview:
This PR applies the recent refactor covered in this PR: 82bae24
to the sglang so that all containers get the benefit of leveraging Dynamo base container as the base Dockerfile, sccache improvements, and other optimizations to reduce build times. This also enables us to add these build jobs to Github CI for public validation as well.
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit