Fix release workflow: runner labels, Docker username, setuptools_scm #2702

Merged
gyohuangxin merged 1 commit into ROCm:main from sunway513:fix/release-workflow-bugs
Apr 13, 2026

@sunway513
Collaborator

Summary

Three independent bugs preventing release builds from completing:

  • Runner labels don't exist: aiter-mi300-1gpu and aiter-mi325-1gpu were never registered as runner labels, so builds stayed queued indefinitely (7+ hours). Changed the default to aiter-1gpu-runner (confirmed working in aiter-test.yaml) and fixed the linux-aiter-mi355-1 typo to linux-aiter-mi35x-1.
  • Docker username typo: rocmshard -> rocmshared (missing 'e'), which caused Docker login to fail on the build-only-aiter runner.
  • setuptools_scm 10.x breaks the build: setuptools_scm 10.0.5 moved its core into the vcs_versioning package, causing ModuleNotFoundError: No module named 'vcs_versioning'. Pinned to <10 until pyproject.toml is updated.

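Also exempts tag-based builds (refs starting with refs/tags/v) from cancel-in-progress, so a release run is not cancelled by a newer run in the same concurrency group.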

Test plan

  • Confirmed aiter-1gpu-runner label exists and picks up jobs (run 24287110305 was picked up immediately)
  • Confirmed rocmshared is the correct Docker username (used in aiter-test.yaml, triton-test.yaml)
  • Confirmed setuptools_scm<10 (9.2.2) fixes the build (tested in 6 local containers)
  • Trigger release build after merge to verify end-to-end

Three independent bugs preventing release builds:

1. Runner labels don't exist: aiter-mi300-1gpu, aiter-mi325-1gpu were
   never registered. Use aiter-1gpu-runner (MI325 1-GPU, confirmed in
   aiter-test.yaml). Fix linux-aiter-mi355-1 typo to linux-aiter-mi35x-1.

2. Docker username typo: rocmshard -> rocmshared (missing 'e'),
   causing Docker login failure on build-only-aiter runner.

3. setuptools_scm 10.x breaks build: moved core to vcs_versioning
   package, causing ModuleNotFoundError. Pin to <10 until pyproject.toml
   is updated.

Also protect tag-based builds from cancel-in-progress.
sunway513 requested review from a team and Copilot on April 11, 2026, 18:37
@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

Label           Tests
ci:triton-355   Run Triton tests on MI355 in addition to MI325
ci:sglang       SGLang integration tests
ci:atom         ATOM benchmark (DeepSeek-R1 + GPT-OSS)
ci:vllm         vLLM benchmark
ci:all          All of the above

Add labels via the sidebar or gh pr edit 2702 --add-label <label>

Contributor

Copilot AI left a comment

Pull request overview

Fixes multiple GitHub Actions release-workflow blockers so manual release builds can start promptly, authenticate to Docker, and complete Python wheel builds reliably.

Changes:

  • Update workflow_dispatch defaults/options for runner selection and Docker username.
  • Prevent tag-based release runs from being cancelled by concurrency settings.
  • Add an explicit setuptools_scm pin step inside the build container to avoid breaking versions.


- linux-aiter-mi355-1
- aiter-1gpu-runner
- build-only-aiter
- linux-aiter-mi35x-1
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
-  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' && !startsWith(github.ref, 'refs/tags/v') }}
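The effect of the updated cancel-in-progress expression can be checked by hand for a few refs. The following is a minimal shell sketch and not part of the workflow: cancel_in_progress is a hypothetical helper that mirrors the expression, and the tag name v0.1.0 is only an illustrative value. GitHub's expression comparisons and startsWith ignore case, while this sketch matches case-sensitively as a simplification.

```shell
# Hypothetical helper (not in the workflow): mirrors the updated
# cancel-in-progress expression for a given ref.
cancel_in_progress() {
  case "$1" in
    refs/heads/main) echo false ;;  # runs on main are never cancelled
    refs/tags/v*)    echo false ;;  # release tags are now protected too
    *)               echo true ;;   # everything else (PRs, feature branches)
  esac
}

cancel_in_progress refs/heads/main       # prints: false
cancel_in_progress refs/tags/v0.1.0      # prints: false
cancel_in_progress refs/pull/2702/merge  # prints: true
```

Before this change, the second call would have evaluated to true, which is what allowed a newer run to cancel an in-flight tag build.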
docker exec \
  -w /workspace \
  aiter_build_${{ matrix.python_version }} \
  pip install --timeout=60 --retries=10 "setuptools_scm<10"
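A quick sanity check after the pin step might compare the major version of the installed package against the bound. This is a minimal sketch under stated assumptions: major_below is a hypothetical helper, it only inspects the leading numeric component (so it ignores epochs and other PEP 440 details), and the two version strings are the ones mentioned in this PR.

```shell
# Hypothetical helper: succeeds if the major component of version $1
# is strictly below the integer bound $2.
major_below() {
  major="${1%%.*}"   # drop everything from the first dot onward
  [ "$major" -lt "$2" ]
}

# Versions mentioned in this PR
major_below 9.2.2 10 && echo "9.2.2 satisfies <10"
major_below 10.0.5 10 || echo "10.0.5 violates <10"
```

Inside the build container, the version string could be taken from the `Version:` line of `pip show setuptools_scm` and passed to the helper in the same way.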
gyohuangxin merged commit b522c4b into ROCm:main on Apr 13, 2026
27 of 29 checks passed