Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
69 changes: 37 additions & 32 deletions .github/instructions/cicd.instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,33 +5,37 @@ description: "CI/CD Pipeline configuration for PyInstaller binary packaging and

# CI/CD Pipeline Instructions

## Workflow Architecture (Fork-safe)
Four workflows split by trigger and secret requirements:
## Workflow Architecture (Tiered + Merge Queue)
Four workflows split by trigger and tier. PRs get fast feedback; the heavy
integration suite runs only at merge time via GitHub Merge Queue
(microsoft/apm#770).

1. **`ci.yml`** — `pull_request` trigger (all PRs, including forks)
1. **`ci.yml`** - Tier 1, runs on `pull_request` AND `merge_group`
- **Linux-only** (ubuntu-24.04). Combined `build-and-test` job: unit tests + binary build in a single runner. No secrets needed.
- Uploads Linux x86_64 binary artifact for downstream integration testing.
2. **`ci-integration.yml`** — `workflow_run` trigger (after CI completes, environment-gated)
- **Linux-only**. Smoke tests, integration tests, release validation. Requires `integration-tests` environment approval.
- Security: uses `workflow_run` (not `pull_request_target`) — PR code is NEVER checked out.
- Downloads Linux binary artifact from ci.yml, runs test scripts from default branch (main).
- Reports results back to PR via commit status API.
- Detects CI circular dependency (upstream failure → reports `pending` instead of blocking).
- Annotates originating PR URL for traceability.
3. **`build-release.yml`** — `push` to main, tags, schedule, `workflow_dispatch`
- Runs in both PR context (fast feedback for contributors) and merge_group
context (against the tentative merge commit before queue auto-merges).
2. **`ci-integration.yml`** - Tier 2, `merge_group` trigger only
- **Linux-only**. Builds binary inline, then runs smoke + integration +
release-validation against the tentative merge commit.
- Trust boundary is the write-access grant (only users with write can
enqueue a PR). No environment approval gate.
- Inlines the binary build instead of fetching from `ci.yml` to avoid
cross-workflow artifact plumbing across triggers.
3. **`build-release.yml`** - `push` to main, tags, schedule, `workflow_dispatch`
- **Linux + Windows** run combined `build-and-test` (unit tests + binary build in one job).
- **macOS Intel** uses `build-and-validate-macos-intel` (root node, runs own unit tests no dependency on `build-and-test`). Builds the binary on every push for early regression feedback; integration + release-validation phases conditional on tag/schedule/dispatch.
- **macOS ARM** uses `build-and-validate-macos-arm` (root node, tag/schedule/dispatch only ARM runners are extremely scarce with 2-4h+ queue waits). Only requested when the binary is actually needed for a release.
- **macOS Intel** uses `build-and-validate-macos-intel` (root node, runs own unit tests - no dependency on `build-and-test`). Builds the binary on every push for early regression feedback; integration + release-validation phases conditional on tag/schedule/dispatch.
- **macOS ARM** uses `build-and-validate-macos-arm` (root node, tag/schedule/dispatch only - ARM runners are extremely scarce with 2-4h+ queue waits). Only requested when the binary is actually needed for a release.
- Secrets always available. Full 5-platform binary output (linux x86_64/arm64, darwin x86_64/arm64, windows x86_64).
4. **`ci-runtime.yml`** nightly schedule, manual dispatch, path-filtered push
4. **`ci-runtime.yml`** - nightly schedule, manual dispatch, path-filtered push
- **Linux x86_64 only**. Live inference smoke tests (`apm run`) isolated from release pipeline.
- Uses `GH_MODELS_PAT` for GitHub Models API access.
- Failures do not block releases annotated as warnings.
- Failures do not block releases - annotated as warnings.

## Platform Testing Strategy
- **PR time**: Linux-only combined build-and-test in `ci.yml`. Catches logic bugs and dependency issues before merge. Windows + macOS are tested post-merge (platform-specific issues are rare and the full matrix runs on every push to main).
- **Post-merge**: Full 5-platform matrix (linux x86_64/arm64, darwin x86_64/arm64, windows x86_64) catches remaining platform-specific issues on main.
- **Rationale**: ci.yml has always been Linux-only Windows and macOS are covered by `build-release.yml` on every push to main. This keeps PR feedback fast while still catching platform issues before release.
- **Rationale**: ci.yml has always been Linux-only - Windows and macOS are covered by `build-release.yml` on every push to main. This keeps PR feedback fast while still catching platform issues before release.

## PyInstaller Binary Packaging
- **CRITICAL**: Uses `--onedir` mode (NOT `--onefile`) for faster CLI startup performance
Expand All @@ -50,26 +54,26 @@ Four workflows split by trigger and secret requirements:
3. **Path Resolution**: Use symlinks and PATH manipulation for isolated binary testing

## Inference Testing (Decoupled)
- Live inference tests (`apm run`) are **isolated** in `ci-runtime.yml` they do NOT gate releases
- Live inference tests (`apm run`) are **isolated** in `ci-runtime.yml` - they do NOT gate releases
- `APM_RUN_INFERENCE_TESTS=1` env var enables inference in test scripts; absent = skipped
- `GH_MODELS_PAT` is only used in `ci-runtime.yml` and smoke-test jobs — NOT in integration-tests or release-validation
- Rationale: 8 inference executions × 2% failure rate = 14.9% false-negative per release; APM core UVPs require zero live inference
- `GH_MODELS_PAT` is only used in `ci-runtime.yml` and Tier 2 smoke-test job - NOT in integration-tests or release-validation
- Rationale: 8 inference executions x 2% failure rate = 14.9% false-negative per release; APM core UVPs require zero live inference

## Release Flow Dependencies
- **PR workflow**: ci.yml (build-and-test, Linux-only) then ci-integration.yml via workflow_run (approve → smoke-test → integration-tests → release-validation → report-status, all Linux-only)
- **Push/Release workflow (Linux + Windows)**: build-and-test → integration-tests → release-validation → create-release → publish-pypi → update-homebrew (gh-aw-compat runs in parallel, informational)
- **Push/Release workflow (macOS Intel)**: build-and-validate-macos-intel (root node: unit tests + build always + conditional integration/release-validation) → create-release
- **Push/Release workflow (macOS ARM)**: build-and-validate-macos-arm (root node, tag/schedule/dispatch only; all phases run) → create-release
- **PR workflow**: Tier 1 only - ci.yml (build-and-test, Linux-only) provides fast feedback. Tier 2 does not run until enqueued.
- **Merge queue workflow**: ci.yml (Tier 1 against tentative merge ref) + ci-integration.yml (Tier 2: build -> smoke-test -> integration-tests -> release-validation). Queue auto-merges on success; ejects on failure.
- **Push/Release workflow (Linux + Windows)**: build-and-test -> integration-tests -> release-validation -> create-release -> publish-pypi -> update-homebrew (gh-aw-compat runs in parallel, informational)
- **Push/Release workflow (macOS Intel)**: build-and-validate-macos-intel (root node: unit tests + build always + conditional integration/release-validation) -> create-release
- **Push/Release workflow (macOS ARM)**: build-and-validate-macos-arm (root node, tag/schedule/dispatch only; all phases run) -> create-release
- **Tag Triggers**: Only `v*.*.*` tags trigger full release pipeline
- **Artifact Retention**: 30 days for debugging failed releases
- **Cross-workflow artifacts**: ci-integration.yml downloads artifacts from ci.yml using `run-id` and `github-token`
- **Cross-workflow artifacts**: ci-integration.yml builds the binary inline (no cross-workflow artifact transfer); build-release.yml jobs share artifacts within the same workflow run.

## Fork PR Security Model
- Fork PRs get unit tests + build via `ci.yml` (no secrets, runs PR code safely)
- `ci-integration.yml` triggers via `workflow_run` after CI completes — NEVER checks out PR code
- Binary artifacts from ci.yml are tested using test scripts from the default branch (main)
- Environment approval gate (`integration-tests`) ensures maintainer reviews PR before integration tests run
- Commit status is reported back to the PR SHA so results appear on the PR
## Trust Model
- **PR push (any contributor, including forks)**: Runs Tier 1 only. No CI secrets exposed. PR code is checked out and tested in an unprivileged context.
- **merge_group (write access required)**: Runs Tier 1 + Tier 2. Tier 2 sees secrets. The `gh-readonly-queue/main/*` ref is created by GitHub from the PR merged into main; only users with write access can trigger this by enqueueing a PR.
- **Trust boundary = write-access grant**, managed in repo Settings -> Collaborators. Write access is granted only to vetted contributors.
- **No environment approval gate** is required because the act of enqueueing IS the trust assertion. This replaces the previous `integration-tests` environment approval flow.

## Key Environment Variables
- `PYTHON_VERSION: '3.12'` - Standardized across all jobs
Expand All @@ -78,11 +82,12 @@ Four workflows split by trigger and secret requirements:

## Performance Considerations
- **Combined build-and-test**: Eliminates ~1.5m runner re-provisioning overhead by running unit tests and binary build in the same job.
- **macOS as root nodes**: macOS consolidated jobs run their own unit tests and start immediately no dependency on Linux/Windows test completion.
- **macOS as root nodes**: macOS consolidated jobs run their own unit tests and start immediately - no dependency on Linux/Windows test completion.
- **Native uv caching**: `setup-uv` action with `enable-cache: true` replaces manual `actions/cache@v3` blocks.
- **Targeted setup-node usage**: Node.js is only installed in `ci-runtime.yml`, macOS consolidated jobs, and integration-tests/release-validation phases (for `apm runtime setup copilot` npm install).
- **Targeted setup-node usage**: Node.js is only installed in `ci-runtime.yml`, macOS consolidated jobs, and integration-tests/release-validation phases (for `apm runtime setup copilot` -> npm install).
- **macOS runner consolidation**: Each macOS arch has a single consolidated job (build + integration + release-validation). Intel (`build-and-validate-macos-intel`) runs on every push since Intel runners are plentiful. ARM (`build-and-validate-macos-arm`) is gated to tag/schedule/dispatch only since ARM runners are extremely scarce (2-4h+ queue waits). This avoids serial re-queuing of runners across multiple jobs.
- **Unit tests skip macOS**: Python unit tests are platform-agnostic; Linux + Windows coverage is sufficient. macOS-specific validation (binary build, integration tests, release validation) still runs via the consolidated job.
- **Tier 2 runs once per merged PR**, not per WIP push, since it triggers on `merge_group` only. Saves the bulk of integration minutes that the previous per-push flow burned.
- UPX compression when available (reduces binary size ~50%)
- Python optimization level 2 in PyInstaller
- Aggressive module exclusions (tkinter, matplotlib, etc.)
Loading
Loading