Add matrix operations example#6
Merged
Merged
Conversation
deanq
approved these changes
Apr 15, 2025
deanq
approved these changes
Apr 16, 2025
deanq
added a commit
that referenced
this pull request
Jun 1, 2026
…release-please bot Addresses @runpod-Henrik (PASS WITH NITS) and Copilot review on PR #345. Henrik #3 / Copilot inline ci.yml:50 — skip cascade - Previously `pre-check.if` skipped on release-please bot PRs, and `quality-gates needs: [pre-check]` cascaded to skip too, leaving any release-please PR that touched non-metadata files with zero CI signal. - Drop the workflow-level `paths-ignore` (it interacted badly with branch protection's required-check semantics anyway) and instead put explicit `if: !startsWith(github.head_ref, 'release-please--')` on pre-check, quality-gates, packaging-guard, and build. Symmetric, auditable. Henrik #1 + #6 — branch-protection compatibility (release-please bot PRs) - Add `release-please-sentinel` job with matching matrix + name "Quality Gates" so it produces the same "Quality Gates (3.10)"…"(3.13)" check names that branch protection requires, but auto-passes. Branch protection on main can now require those checks without admins having to either loosen the rule or hand-craft an external sentinel workflow. Henrik #2 — PR-time wheel build is dropped - Reintroduce build on PRs but gated by `packaging-guard` job that inspects `git diff` against the PR base. Build runs when: - pyproject.toml / Makefile / MANIFEST.in / scripts/validate-wheel.sh changed, OR - any new non-Python file was added under src/ (data files need explicit package-data inclusion or they're silently dropped). - Push to main always builds. Henrik #4 — `make dev` entry-point sanity check - Verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version` prints "Runpod Flash CLI v1.16.0". `uv sync` registers the entry point for the workspace project; the dropped `uv pip install -e .` was redundant. No code change needed. Henrik #5 / Copilot Makefile:108 — misleading "plain output" comment - Rewrite the comment block above `quality-check`. The aliases do NOT produce plain output -- local invocations see GitHub Actions annotation markers (::group::, --output-format=github) which are cosmetically noisy but functionally inert. Comment now reflects this. Henrik misc nit — e2e.yml `unit-tests` job removal - Add a leading comment on the `e2e` job explaining why the duplicate unit-tests job was dropped, so a future maintainer doesn't re-add it thinking it was lost by accident. https://linear.app/runpod/issue/AE-3161
9 tasks
deanq
added a commit
that referenced
this pull request
Jun 2, 2026
…release-please bot Addresses @runpod-Henrik (PASS WITH NITS) and Copilot review on PR #345. Henrik #3 / Copilot inline ci.yml:50 — skip cascade - Previously `pre-check.if` skipped on release-please bot PRs, and `quality-gates needs: [pre-check]` cascaded to skip too, leaving any release-please PR that touched non-metadata files with zero CI signal. - Drop the workflow-level `paths-ignore` (it interacted badly with branch protection's required-check semantics anyway) and instead put explicit `if: !startsWith(github.head_ref, 'release-please--')` on pre-check, quality-gates, packaging-guard, and build. Symmetric, auditable. Henrik #1 + #6 — branch-protection compatibility (release-please bot PRs) - Add `release-please-sentinel` job with matching matrix + name "Quality Gates" so it produces the same "Quality Gates (3.10)"…"(3.13)" check names that branch protection requires, but auto-passes. Branch protection on main can now require those checks without admins having to either loosen the rule or hand-craft an external sentinel workflow. Henrik #2 — PR-time wheel build is dropped - Reintroduce build on PRs but gated by `packaging-guard` job that inspects `git diff` against the PR base. Build runs when: - pyproject.toml / Makefile / MANIFEST.in / scripts/validate-wheel.sh changed, OR - any new non-Python file was added under src/ (data files need explicit package-data inclusion or they're silently dropped). - Push to main always builds. Henrik #4 — `make dev` entry-point sanity check - Verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version` prints "Runpod Flash CLI v1.16.0". `uv sync` registers the entry point for the workspace project; the dropped `uv pip install -e .` was redundant. No code change needed. Henrik #5 / Copilot Makefile:108 — misleading "plain output" comment - Rewrite the comment block above `quality-check`. The aliases do NOT produce plain output -- local invocations see GitHub Actions annotation markers (::group::, --output-format=github) which are cosmetically noisy but functionally inert. Comment now reflects this. Henrik misc nit — e2e.yml `unit-tests` job removal - Add a leading comment on the `e2e` job explaining why the duplicate unit-tests job was dropped, so a future maintainer doesn't re-add it thinking it was lost by accident. https://linear.app/runpod/issue/AE-3161
deanq
added a commit
that referenced
this pull request
Jun 4, 2026
…sync; catch modified data files Henrik's delta review on PR #345. Henrik #6 — ALLOWED was hand-synced with release-please-config.json - Previous regex hardcoded the four files release-please touches today. Asymmetric failure mode is safe (out-of-sync -> bypass doesn't fire -> full CI runs) but the next person to add an `extra-files` entry to release-please-config.json would silently break the bypass for bot PRs. - Derive the file list dynamically via jq from release-please-config.json's `.packages["."]["extra-files"]`, append the three implicit files (CHANGELOG.md, manifest, pyproject.toml from release-type=python). Use `grep -Fxvf <(allowed_files)` for fixed-string exact-line matching -- removes the regex-escaping concern entirely. Henrik #7 — `uv run` undid the dev-only sync - `uv run` defaults to syncing the environment before each invocation, so the steps after `uv sync --only-group dev --frozen` would pull in the full project on each ruff call -- defeating the "no project install" premise of pre-check. - Add `--no-sync` to both ruff invocations. ruff runs from the existing dev-only venv directly; pre-check stays fast. Henrik #8 — `--diff-filter=A` missed modified non-Python files - Edits to existing packaged data files (e.g. future tweaks to runpod_flash/rules/AGENTS.md) wouldn't trigger should_build, so validate-wheel.sh wouldn't smoke-test the modified content. Packaging metadata wouldn't change, but the wheel content would. - Switch to `--diff-filter=AM` (Added or Modified). Cheap and correct. Henrik #9 (fetch-depth: 0 over-broad) intentionally not addressed -- Henrik flagged it as "not a real cost issue." Henrik #4 (make dev entry-point) re-verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version` -> Runpod Flash CLI v1.16.0. https://linear.app/runpod/issue/AE-3161
deanq
added a commit
that referenced
this pull request
Jun 4, 2026
* ci: dedupe quality gates and add fast-fail pre-check (AE-3161) Eliminates redundant CI/CD work that was costing ~24 min of compute per merge and ~28 min per release-please bot PR push. Workflow changes: - ci.yml: add concurrency group (cancel in-flight PR runs), paths-ignore for release-please bot PRs, new pre-check job (uvx ruff format+check, ~10s no-install fast-fail), bump setup-uv v2 -> v5, build job only on push:main since release-please.yml builds + publishes the same SHA. - release-please.yml: remove duplicate quality-gates job. Branch protection on main already enforces ci.yml green before code lands, so the second matrix validated nothing. Saves a full 4-leg matrix per merge. - e2e.yml: drop unit-tests job (duplicated ci.yml quality-gates), fold the summary step into the e2e job to remove the third workflow definition of the same install setup. Makefile changes: - dev: drop redundant 'uv pip install -e .' (uv sync handles editable install in workspace mode). - quality-check: alias ci-quality-github so there is one canonical CI quality gate, no drift between local and CI. - ci-quality-github / test-coverage: fix the latent junit overwrite bug where both pytest invocations wrote pytest-results.xml -- now writes pytest-results-parallel.xml and pytest-results-serial.xml. The double pytest pass is retained because it provides process isolation between parallel and serial tests; collapsing to a single loadgroup run surfaced state pollution in test_load_balancer_sls_stub. Note: removing quality-gates from release-please.yml + paths-ignore for the bot's files means the release-please PR will have no required CI check. Branch protection on main will need to allow missing checks for the release-please--* branch pattern, or add a no-op sentinel job named to match the protection rule. https://linear.app/runpod/issue/AE-3161 * ci: address QA — fix skip cascade, add packaging guard, sentinel for release-please bot Addresses @runpod-Henrik (PASS WITH NITS) and Copilot review on PR #345. Henrik #3 / Copilot inline ci.yml:50 — skip cascade - Previously `pre-check.if` skipped on release-please bot PRs, and `quality-gates needs: [pre-check]` cascaded to skip too, leaving any release-please PR that touched non-metadata files with zero CI signal. - Drop the workflow-level `paths-ignore` (it interacted badly with branch protection's required-check semantics anyway) and instead put explicit `if: !startsWith(github.head_ref, 'release-please--')` on pre-check, quality-gates, packaging-guard, and build. Symmetric, auditable. Henrik #1 + #6 — branch-protection compatibility (release-please bot PRs) - Add `release-please-sentinel` job with matching matrix + name "Quality Gates" so it produces the same "Quality Gates (3.10)"…"(3.13)" check names that branch protection requires, but auto-passes. Branch protection on main can now require those checks without admins having to either loosen the rule or hand-craft an external sentinel workflow. Henrik #2 — PR-time wheel build is dropped - Reintroduce build on PRs but gated by `packaging-guard` job that inspects `git diff` against the PR base. Build runs when: - pyproject.toml / Makefile / MANIFEST.in / scripts/validate-wheel.sh changed, OR - any new non-Python file was added under src/ (data files need explicit package-data inclusion or they're silently dropped). - Push to main always builds. Henrik #4 — `make dev` entry-point sanity check - Verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version` prints "Runpod Flash CLI v1.16.0". `uv sync` registers the entry point for the workspace project; the dropped `uv pip install -e .` was redundant. No code change needed. Henrik #5 / Copilot Makefile:108 — misleading "plain output" comment - Rewrite the comment block above `quality-check`. The aliases do NOT produce plain output -- local invocations see GitHub Actions annotation markers (::group::, --output-format=github) which are cosmetically noisy but functionally inert. Comment now reflects this. Henrik misc nit — e2e.yml `unit-tests` job removal - Add a leading comment on the `e2e` job explaining why the duplicate unit-tests job was dropped, so a future maintainer doesn't re-add it thinking it was lost by accident. https://linear.app/runpod/issue/AE-3161 * ci: close sentinel-bypass hole and pin pre-check ruff to lockfile Addresses Copilot's second review on PR #345. Copilot ci.yml:42 — sentinel-bypass vulnerability - Previous sentinel gated only on `startsWith(head_ref, 'release-please--')`. A contributor could open a PR from any branch named with that prefix and inherit the sentinel's auto-passes for "Quality Gates (3.10..3.13)" while pre-check / quality-gates / build all skipped via the same `if:` guard. Net: CI fully bypassed, branch protection satisfied, code unreviewed. - Replace the three independent `if:` gates with a single `guard` job that inspects the actual PR diff and classifies the run as either `release-please-bypass` (branch prefix AND diff is exactly CHANGELOG.md + .release-please-manifest.json) or `full`. `release-please-sentinel` only fires when classification is bypass; pre-check / quality-gates / build fire only when classification is full. A contributor naming a branch release-please-- but changing code falls through to full CI. - Also folds the previous packaging-guard logic into the same job; build reads `should_build` from the unified guard output. Copilot ci.yml:59 — ruff version drift in pre-check - `uvx ruff …` installed the latest ruff at runtime, which can drift from the locked version (uv.lock pins ruff 0.14.14) used by `make ci-quality-github`. That produces flaky "passes locally / fails CI" (or vice versa) verdicts when ruff cuts a release between runs. - Pre-check now runs `uv sync --only-group dev --frozen` then `uv run ruff …`. Pulls the exact pinned version; warm-cache install is ~1s, cold is a few seconds. Fast-fail premise preserved; determinism gained. https://linear.app/runpod/issue/AE-3161 * ci: replace release-please sentinel with single Validation aggregator Borrowed from runpod-python (ci.yml:105) -- one required check that inspects upstream `needs.*.result` and treats "success" or "skipped" as green. Much cleaner than the previous sentinel-with-matching-matrix-names hack: - Drops `release-please-sentinel` job entirely. No more duplicate `Quality Gates` display name. No more "skipped Quality Gates" cosmetic row alongside the real four matrix legs on every regular PR. - For release-please bot bypass runs: guard classifies bypass -> pre-check / quality-gates / build all skip -> `validation` aggregates ("skipped" treated as ok) -> green. - For regular PRs: everything runs -> `validation` aggregates -> green. - For any genuinely failed or cancelled job: `validation` exits 1 -> branch protection blocks merge. Future matrix changes (add 3.14, drop 3.10) or new jobs (security scans, etc.) no longer require touching branch protection -- only the `needs:` list and the results array in this aggregator. Requires branch protection on `main` to be reconfigured before merge: - Remove required checks: `CI / Quality Gates (3.10)`, `(3.11)`, `(3.12)`, `(3.13)` - Add required check: `CI / Validation` https://linear.app/runpod/issue/AE-3161 * ci: classify metadata-only push:main commits as release-please-bypass Closes the last remaining redundancy in the lifecycle. When release-please's release PR is merged, the resulting commit on main only changes CHANGELOG.md and .release-please-manifest.json -- the same code was already validated when the underlying feature PRs merged. Previously ci.yml ran the full 4-version matrix + build on those merge commits (~7 min compute per release). Now `guard` classifies push:main by diff: - Metadata-only diff -> release-please-bypass -> pre-check / quality-gates / build all skip, validation aggregates "skipped" as success. - Any non-metadata change (feature merges, admin direct pushes) -> full. release-please.yml's pypi-publish builds its own wheel before publishing, so skipping ci.yml's build on release commits doesn't blind us to wheel regressions. https://linear.app/runpod/issue/AE-3161 * ci: align bypass classifier with release-please-config; use uv.lock for cache Addresses Copilot's third review on PR #345. Copilot ci.yml:66 + ci.yml:88 — bypass missing release-please-managed files - release-please-config.json declares release-type=python (auto-bumps pyproject.toml) and extra-files=src/runpod_flash/__init__.py. Real bot PRs touch 4 files, not 2. Previous ALLOWED regex missed pyproject.toml and __init__.py, so every real release-please bot PR was misclassified as `full`. Entire bypass optimization was broken for the actual case it was designed for. - Extend ALLOWED to all four files. Add a comment pointing back to release-please-config.json so the regex stays in sync. - Add defense-in-depth: PR classifier now also requires github.event.pull_request.user.login == 'runpod-release-please-bot[bot]' (the GitHub App account). Branch prefix + bot author + file content all required. A contributor spoofing the branch name fails the author check; a contributor with write access to the bot would already be trusted. - Push:main classifier keeps content-only check (admin can squash-merge the bot's PR by hand, so actor isn't a reliable signal on push). Copilot ci.yml:164 + ci.yml:193 + release-please.yml:61 + e2e.yml:39 — setup-uv cache keyed on pyproject.toml - uv sync is lockfile-driven; pyproject.toml-only cache key means uv.lock changes (most common dependency update) don't invalidate the cache, causing stale-deps cache hits. - Switch cache-dependency-glob to uv.lock everywhere. pre-check already used uv.lock; this aligns the rest. Copilot Makefile:107 — misleading "aliases" comment - quality-check-strict is NOT an alias for ci-quality-github -- it runs a different set (adds typecheck, plain output, no junit). Comment block rewritten to enumerate each target individually so maintainers don't conflate them. https://linear.app/runpod/issue/AE-3161 * ci: derive bypass file list from release-please-config; pre-check no-sync; catch modified data files Henrik's delta review on PR #345. Henrik #6 — ALLOWED was hand-synced with release-please-config.json - Previous regex hardcoded the four files release-please touches today. Asymmetric failure mode is safe (out-of-sync -> bypass doesn't fire -> full CI runs) but the next person to add an `extra-files` entry to release-please-config.json would silently break the bypass for bot PRs. - Derive the file list dynamically via jq from release-please-config.json's `.packages["."]["extra-files"]`, append the three implicit files (CHANGELOG.md, manifest, pyproject.toml from release-type=python). Use `grep -Fxvf <(allowed_files)` for fixed-string exact-line matching -- removes the regex-escaping concern entirely. Henrik #7 — `uv run` undid the dev-only sync - `uv run` defaults to syncing the environment before each invocation, so the steps after `uv sync --only-group dev --frozen` would pull in the full project on each ruff call -- defeating the "no project install" premise of pre-check. - Add `--no-sync` to both ruff invocations. ruff runs from the existing dev-only venv directly; pre-check stays fast. Henrik #8 — `--diff-filter=A` missed modified non-Python files - Edits to existing packaged data files (e.g. future tweaks to runpod_flash/rules/AGENTS.md) wouldn't trigger should_build, so validate-wheel.sh wouldn't smoke-test the modified content. Packaging metadata wouldn't change, but the wheel content would. - Switch to `--diff-filter=AM` (Added or Modified). Cheap and correct. Henrik #9 (fetch-depth: 0 over-broad) intentionally not addressed -- Henrik flagged it as "not a real cost issue." Henrik #4 (make dev entry-point) re-verified locally: `rm -rf .venv && make dev && .venv/bin/flash --version` -> Runpod Flash CLI v1.16.0. https://linear.app/runpod/issue/AE-3161 * ci: content-shape bypass check, conditional build on push, e2e summary table fix Copilot fourth review on PR #345. Copilot ci.yml:98 — pyproject.toml in ALLOWED could let non-version edits slip through the bypass - A push:main touching only the metadata filenames could smuggle non-version edits to pyproject.toml (deps, ruff config, scripts) since the previous check was filename-only. Push:main can't rely on actor identity (admins can squash-merge by hand), so the file-list check was the only gate. - Add `verify_version_shape` helper: enforces that any pyproject.toml change is a single `version = "..."` line, and any src/runpod_flash/__init__.py change is a single `__version__ = "..."` line. Applies to both push:main and PR bypass paths. Tested locally against simulated bot and malicious diffs. Copilot ci.yml:103 — `should_build=true` unconditional on push:main - Previous code only computed should_build content-based on PR full-mode; push:main non-bypass always built. Contradicted the documented intent and cost ~1 min per feature merge. - Factor the content heuristic into `compute_should_build`; use it for both push:main full and PR full paths. Same allow-list (pyproject.toml, Makefile, MANIFEST.in, validate-wheel.sh) plus AM-filter on non-Python files under src/. Copilot e2e.yml:129 — malformed markdown table when JUnit file missing - Python ternary `... if duration is not None else "- |"` had the wrong scope: when duration was None, the *entire* print expression became the string `"- |"`, breaking the table. - Extract `_num()` and `_dur()` helpers that placehold "-" per field; build the row in one f-string. Table stays valid when the suite didn't run. Copilot Makefile:116 (NOT changed) — claim was that aliasing quality-check to ci-quality-github drops `--cov-fail-under=65` enforcement. Verified locally: the serial pass in ci-quality-github inherits --cov-fail-under=65 from pyproject.toml's addopts (only the parallel pass overrides to 0). Local `make quality-check` still hits the 65% gate at the end of the serial pass; observed "Required test coverage of 65% reached. Total coverage: 85.96%" on this branch. Will reply on the thread, not change code. https://linear.app/runpod/issue/AE-3161
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is the code example that will be built by users in the Tetra quickstart.