ci: fix macOS release integration-test 20min timeout #1575
Merged
Two unit tests asserted st_mode & 0o111 == 0o111, which fails on Windows because NTFS does not honor POSIX execute bits. Guard both with pytest.mark.skipif(sys.platform == 'win32'), matching the existing convention used elsewhere in the suite.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
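For reference, a minimal sketch of the guard this commit describes. The test name, the `has_all_exec_bits` helper, and the copy-based scenario are hypothetical stand-ins, not the suite's actual tests; only the `skipif` condition and the `0o111` mask come from the commit message. Note that pytest requires a `reason=` when the `skipif` condition is a boolean.

```python
import shutil
import stat
import sys
import tempfile
from pathlib import Path

import pytest

# Owner, group and other execute bits combined: 0o111.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def has_all_exec_bits(path: Path) -> bool:
    """Return True when every POSIX execute bit is set on `path` (hypothetical helper)."""
    return path.stat().st_mode & EXEC_BITS == EXEC_BITS


@pytest.mark.skipif(
    sys.platform == "win32",
    reason="NTFS does not honor POSIX execute bits",
)
def test_copy_preserves_exec_bits(tmp_path: Path) -> None:
    # Hypothetical scenario: copy an executable script and check its mode survives.
    src = tmp_path / "tool.sh"
    src.write_text("#!/bin/sh\necho ok\n")
    src.chmod(0o755)

    dst = tmp_path / "copy.sh"
    shutil.copy2(src, dst)  # copy2 also copies permission bits

    assert has_all_exec_bits(dst)
```

On Windows the test is reported as skipped rather than failed; elsewhere it runs unchanged.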
scripts/test-integration.sh runs under `set -euo pipefail`. On macOS
the default /bin/bash is 3.2, where expanding an empty array with a
bare "${arr[@]}" raises an unbound-variable error. Local integration
runs (PYTEST_EXTRA_ARGS unset) aborted before pytest with
'extra_args[@]: unbound variable'. Use the ${arr[@]+"${arr[@]}"}
guard so the empty-array expansion is safe; CI behaviour (with
PYTEST_EXTRA_ARGS set) is unchanged.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Roll the [Unreleased] changelog into a dated 0.16.1 block and bump pyproject.toml + uv.lock. Adds the previously-missing user-facing entries for #1539 (apm doctor), #1566, #1569, #1567, #1553, #1552, and #1538 surfaced by enumerating merged PRs since v0.16.0.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The v0.16.1 release pipeline failed on the macOS x86_64 (Intel) build: the consolidated job's "Run integration tests" step hit its 20-minute timeout at ~61% progress. The tests were passing -- the slower Intel runner simply could not finish the full suite serially in time, and the arm64 runner was also near the edge. Unlike ci-integration.yml, which shards the suite across four runners, the release workflow runs the whole integration suite on a single scarce macOS runner. Parallelise it in-process with xdist (-n 2, matching ci-integration's per-shard width to bound shared-PAT API concurrency) using --dist loadgroup so the home_env xdist_group markers keep HOME-mutating tests serialized on one worker. Also raise the step timeout to 30 minutes for headroom on the slow Intel runner.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
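The timing argument can be sanity-checked with a little arithmetic. This is a rough sketch under two labeled assumptions that are not measurements: test progress is linear in wall time, and two xdist workers give an ideal 2x speedup. The function name is hypothetical.

```python
def extrapolate_serial_minutes(elapsed_min: float, progress: float) -> float:
    """Estimate total serial runtime from a partial run, assuming linear progress."""
    if not 0 < progress <= 1:
        raise ValueError("progress must be in (0, 1]")
    return elapsed_min / progress


OLD_TIMEOUT_MIN = 20   # step timeout the Intel runner hit
NEW_TIMEOUT_MIN = 30   # raised step timeout
WORKERS = 2            # xdist width (-n 2)

# The step was cut off at the 20-minute ceiling with ~61% of the suite done.
serial_min = extrapolate_serial_minutes(OLD_TIMEOUT_MIN, 0.61)

# Assumption: ideal speedup. Real runs lose some time to worker startup and
# to tests serialized by xdist_group, so this is a lower bound.
ideal_parallel_min = serial_min / WORKERS

print(f"serial estimate:   {serial_min:.1f} min")
print(f"ideal with -n 2:   {ideal_parallel_min:.1f} min")
print(f"fits new timeout:  {ideal_parallel_min < NEW_TIMEOUT_MIN}")
```

The serial estimate comes out just under 33 minutes, which lines up with the ~33min figure in the PR description. The ideal two-worker figure is a floor rather than a prediction, hence the extra headroom from the 30-minute timeout.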
Pull request overview
Updates the release pipeline to avoid macOS integration tests timing out during tagged/scheduled/dispatch release runs, by enabling limited pytest-xdist parallelism and increasing the step timeout. It also rolls forward the project version/changelog for 0.16.1 and adjusts unit tests that assert POSIX executable bits to be skipped on Windows.
Changes:
- Release workflow (macOS Intel + ARM) now sets `PYTEST_EXTRA_ARGS="-n 2 --dist loadgroup"` and increases the integration-test step timeout to 30 minutes.
- Integration test runner script now uses a bash-safe empty-array expansion guard when passing through `PYTEST_EXTRA_ARGS`.
- Version/changelog updates for `0.16.1`, plus Windows skips for unit tests that rely on POSIX execute-bit preservation.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Bumps apm-cli version to 0.16.1. |
| pyproject.toml | Bumps project version to 0.16.1. |
| CHANGELOG.md | Creates 0.16.1 section and moves entries out of Unreleased. |
| .github/workflows/build-release.yml | Adds xdist args + increases macOS integration-test timeout to avoid release failures. |
| scripts/test-integration.sh | Makes PYTEST_EXTRA_ARGS pass-through safe under bash 3.2 + set -u. |
| tests/unit/test_file_ops.py | Skips execute-bit preservation assertion on Windows. |
| tests/unit/test_download_strategies.py | Skips execute-bit preservation assertion on Windows. |
Copilot's findings
- Files reviewed: 6/7 changed files
- Comments generated: 2
Comment on lines +208 to +217
    # macOS runners are scarce, so this consolidated job runs the
    # whole integration suite on one runner instead of sharding it
    # across four like ci-integration.yml. Run it serially and the
    # slower Intel runner overruns the step timeout, so parallelise
    # in-process with xdist. --dist loadgroup is required: it is the
    # only scheduler that honors pytest.mark.xdist_group, which keeps
    # the HOME-mutating tests serialized on a single worker. Kept at
    # -n 2 (matching ci-integration's per-shard width) to bound the
    # shared-PAT API concurrency these E2E tests generate.
    PYTEST_EXTRA_ARGS: "-n 2 --dist loadgroup"
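On the test side, the grouping that `--dist loadgroup` relies on looks roughly like the sketch below. Only the marker name and `name="home_env"` come from this PR; both test functions and their bodies are hypothetical, and the `HOME` check assumes a POSIX platform, where `os.path.expanduser` reads `HOME`.

```python
import os

import pytest


@pytest.mark.xdist_group(name="home_env")
def test_config_dir_follows_home(tmp_path, monkeypatch):
    # Hypothetical HOME-mutating test: point HOME at a scratch directory.
    monkeypatch.setenv("HOME", str(tmp_path))
    if os.name == "posix":
        assert os.path.expanduser("~") == str(tmp_path)


@pytest.mark.xdist_group(name="home_env")
def test_cache_dir_follows_home(tmp_path, monkeypatch):
    # Same group name, so under --dist loadgroup this runs on the same
    # worker as the test above, never concurrently with it.
    monkeypatch.setenv("HOME", str(tmp_path))
    if os.name == "posix":
        assert os.path.expanduser("~/.cache") == str(tmp_path / ".cache")
```

Run with `pytest -n 2 --dist loadgroup` (pytest-xdist installed), tests sharing a group name are sent to a single worker, while unmarked tests are still balanced across both workers. Under the default `--dist load` the marker has no scheduling effect.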
Comment on lines +361 to +363
    # The ${arr[@]+"${arr[@]}"} guard keeps an empty array expansion safe
    # under `set -u` on bash 3.2 (the default /bin/bash on macOS), where a
    # bare "${arr[@]}" on an empty array raises an unbound-variable error.
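The guard's behaviour can be checked from Python by driving a small bash snippet. This is a sketch, not the contents of `scripts/test-integration.sh`: the `count_args` shell function, the `__ITEMS__` placeholder, and the Python wrapper are all hypothetical, and it assumes a `bash` executable on `PATH`. Only the guarded expansion runs, because it behaves the same on bash 3.2 and newer releases, whereas the bare form fails only on older bash.

```python
import shlex
import subprocess

# Bash snippet: fill an array, then forward it with the guarded expansion and
# print how many arguments arrived. __ITEMS__ is replaced before running.
SCRIPT = """
set -euo pipefail
extra_args=(__ITEMS__)
count_args() { echo "$#"; }
count_args ${extra_args[@]+"${extra_args[@]}"}
"""


def count_forwarded_args(items: list[str]) -> int:
    """Run the snippet with `items` as array elements; return the forwarded argc."""
    quoted = " ".join(shlex.quote(item) for item in items)
    script = SCRIPT.replace("__ITEMS__", quoted)
    result = subprocess.run(
        ["bash", "-c", script],
        capture_output=True,
        text=True,
        check=True,  # raises if bash exits non-zero, e.g. on an unbound-variable error
    )
    return int(result.stdout.strip())


empty_count = count_forwarded_args([])
ci_count = count_forwarded_args(["-n", "2", "--dist", "loadgroup"])
spaced_count = count_forwarded_args(["-k", "name with spaces"])
print(empty_count, ci_count, spaced_count)
```

The empty case forwards zero arguments instead of aborting, and the inner double quotes keep an element that contains spaces as a single argument.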
Problem
The `v0.16.1` release pipeline (run 26764269738) failed on Build & Validate (macOS x86_64). The `Run integration tests` step hit its `timeout-minutes: 20` ceiling at ~61% progress. The tests were passing -- the slower `macos-15-intel` runner simply could not finish the full suite serially within 20 minutes. The arm64 runner passed but was also near the edge (~22min job). Because the macOS build failed, all downstream publish jobs (GitHub Release, PyPI, Homebrew, Scoop) were skipped -- so v0.16.1 never actually shipped.
Root cause
Unlike `ci-integration.yml` (which shards the integration suite across 4 Linux runners with `-n 2 --dist loadgroup`), the release workflow runs the entire integration suite on a single scarce macOS runner, serially (no `PYTEST_EXTRA_ARGS`). Serial wall-time on the Intel runner extrapolates to ~33min -- well over the 20min step budget.
Fix
For both consolidated macOS integration steps (Intel + arm64):
- Set `PYTEST_EXTRA_ARGS: "-n 2 --dist loadgroup"` to parallelise in-process. `-n 2` matches `ci-integration.yml`'s proven per-shard width, bounding the shared-PAT API concurrency these E2E tests generate. `--dist loadgroup` is required to honor the `pytest.mark.xdist_group(name="home_env")` markers, which keep HOME-mutating E2E tests serialized on a single worker (race-safe).
- Raise `timeout-minutes` from 20 to 30 for headroom on the slow Intel runner.
`-n 2` gives ~2x speedup (~33min -> ~17-20min), comfortably under the new 30min ceiling. The Linux/Windows integration job is left untouched -- it passed on the failed run.
Validation
- Workflow YAML parses cleanly (`yaml.safe_load`).
- Changes are limited to the `Run integration tests` steps; release-validation steps and the Linux/Windows job are unchanged.
- Consistent with `cicd.instructions.md`: this preserves the consolidated-macOS-job architecture (scarce runners, no extra sharding).
Recovery plan after merge
The `v0.16.1` tag points at a commit without this fix, so `gh run rerun` would not include it. After this merges, the `v0.16.1` tag will be re-created on the new `main` HEAD and pushed to re-trigger the full release pipeline (nothing shipped yet, so reusing the version is safe).