refactor(tests): retire script enumeration; pytest discovers tests/integration/#1247
Conversation
…tegration/ PR2 of #1166. Per acceptance criteria: - scripts/test-integration.sh no longer enumerates individual pytest files; it invokes pytest tests/integration/ once and lets the marker registry (PR1, #1167) handle per-test gating. Script shrinks 778 -> 402 lines. - New test files dropped into tests/integration/ are picked up automatically; the only contract is to add the right requires_* marker (or hermetic = no marker). - Adds requires_apm_binary / requires_github_token / requires_runtime_copilot pytestmarks to 11 files that needed them now that pytest discovers the full directory (test-coverage-expert audit). - Adds 'live' to the addopts deselect list so test_skill_bundle_live.py preserves its current 'opt-in only' behaviour. - Adds APM_RUN_INTEGRATION_TESTS=1 to ci-integration.yml so network-integration tests (transport selection) actually run in CI. - CI integration-tests job timeout 20 -> 30 min to absorb the newly-discovered tests. - Docs (integration-testing.md) reframe the script as the CI orchestrator, not legacy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Refactors the integration-test harness to rely on pytest discovery over tests/integration/ (instead of a hand-maintained per-file enumeration), with marker-based gating used to preserve conditional execution in CI/local runs.
Changes:
- Simplifies
scripts/test-integration.shto invokepytest tests/integration/once and rely on marker-driven skips. - Adds/adjusts
requires_*markers across integration tests to match required preconditions (binary, tokens, runtimes). - Updates CI + pytest config to keep
livetests opt-in and to ensure network-integration tests still run in CI.
Show a summary per file
| File | Description |
|---|---|
scripts/test-integration.sh |
Removes per-file pytest invocations; runs the full tests/integration/ directory once and prints an env probe. |
.github/workflows/ci-integration.yml |
Sets APM_RUN_INTEGRATION_TESTS=1 for the integration job and increases timeout. |
pyproject.toml |
Deselects live tests by default via addopts. |
tests/integration/test_uninstall_multi_e2e.py |
Adds requires_apm_binary marker alongside GitHub token gating. |
tests/integration/test_uninstall_dry_run_e2e.py |
Adds requires_apm_binary marker alongside GitHub token gating. |
tests/integration/test_skill_integration.py |
Adds requires_apm_binary marker alongside GitHub token gating. |
tests/integration/test_mcp_env_var_copilot_e2e.py |
Adds markers for binary + Copilot runtime preconditions. |
tests/integration/test_marketplace_e2e.py |
Adds requires_apm_binary marker. |
tests/integration/test_local_install.py |
Adds requires_apm_binary marker. |
tests/integration/test_local_content_audit.py |
Adds requires_apm_binary marker. |
tests/integration/test_link_rewrite_e2e.py |
Adds requires_apm_binary marker. |
tests/integration/test_intra_package_cleanup.py |
Adds requires_apm_binary marker. |
tests/integration/test_install_verbose_redaction_e2e.py |
Adds requires_apm_binary marker. |
tests/integration/test_install_subdir_dedup_e2e.py |
Adds requires_github_token marker for a network+token-dependent test. |
tests/integration/test_global_scope_e2e.py |
Adds requires_apm_binary marker. |
tests/integration/test_apm_dependencies.py |
Adds requires_github_token marker for real-repo integration coverage. |
docs/src/content/docs/contributing/integration-testing.md |
Updates contributor docs to describe the orchestrator script as CI’s entry point. |
CHANGELOG.md |
Adds an Unreleased entry describing the integration-test harness refactor. |
Copilot's findings
- Files reviewed: 18/18 changed files
- Comments generated: 14
| pytestmark = [ | ||
| pytest.mark.requires_github_token, | ||
| pytest.mark.requires_apm_binary, | ||
| ] | ||
|
|
| pytestmark = [ | ||
| pytest.mark.requires_github_token, | ||
| pytest.mark.requires_apm_binary, | ||
| ] | ||
|
|
| # Skip all tests if GITHUB_APM_PAT is not set or apm binary missing | ||
| pytestmark = [ | ||
| pytest.mark.requires_github_token, | ||
| pytest.mark.requires_apm_binary, | ||
| ] |
|
|
||
|
|
||
| @pytest.fixture | ||
| def apm_command(): |
| pytestmark = pytest.mark.requires_apm_binary | ||
|
|
||
| SAMPLE_MARKETPLACE_NAME = "test-mkt" | ||
|
|
||
|
|
| pytestmark = pytest.mark.requires_apm_binary | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def apm_command(): |
| pytestmark = pytest.mark.requires_apm_binary | ||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # Fixtures | ||
| # --------------------------------------------------------------------------- |
| - Registry proxy now warns when `PROXY_REGISTRY_TOKEN` is set and `PROXY_REGISTRY_URL` uses `http://`, since the bearer token would be transmitted in plaintext; set `PROXY_REGISTRY_ALLOW_HTTP=1` to silence the warning for trusted internal proxies. (#1149) | ||
| - Integration tests now use marker-driven discovery: 21 `pytestmark = pytest.mark.skipif(...)` chains across `tests/integration/` are replaced with declarative `requires_*` markers, with precondition logic centralized in `tests/integration/conftest.py` and auto-skipping at collection time. PR1 of #1166. (#1167) | ||
| - Integration test apm-binary resolution now prefers the local build (`./dist/apm-<os>-<arch>/apm`) over a system-wide `apm` on `PATH`, so contributors validating the binary under test are not silently shadowed by a global install; the bearer-token marker (`requires_ado_bearer`) discards the captured JWT immediately and persists only the boolean outcome. (#1167) | ||
| - `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#1247) |
| log_success "All integration tests completed successfully!" | ||
| echo "" | ||
| if [[ "$USE_EXISTING_BINARY" == "true" ]]; then | ||
| echo "✅ CI mode: Used pre-built artifacts and validated integration workflow" | ||
| echo "CI mode: Used pre-built artifacts and validated integration workflow" | ||
| else | ||
| echo "✅ Local mode: Built binary and validated full integration process" | ||
| echo "Local mode: Built binary and validated full integration process" |
| ### CI orchestrator: `scripts/test-integration.sh` | ||
|
|
||
| `scripts/test-integration.sh` is the thin orchestrator the CI | ||
| integration job invokes. Its sole responsibilities are: resolve | ||
| GitHub / ADO tokens, detect platform, locate or build the apm |
- test_registry_client_integration.py: add requires_network_integration marker (was relying on runtime self-skip; live HTTP to api.mcp.github.com would fire on any dev pytest invocation). Sec panel finding. - test_runtime_smoke.py: add requires_e2e_mode marker (was unmarked; downloads real codex/llm binaries). Test-coverage panel finding. - docs/integration-testing.md: drop pre-fix/post-fix phrasing per docs-current-behaviour-only convention; document the live marker in the registry table. Doc-writer + devx-ux findings. - scripts/test-integration.sh: export APM_RUN_INTEGRATION_TESTS for symmetry with APM_E2E_TESTS; update banner to reflect thin-orchestrator identity; drop vestigial 'Integration Testing Coverage!' comment. Py-arch + cli-log nits. - ci-integration.yml: add inline comment on the 20->30 timeout bump. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel — advisory recommendation for PR #1247Note Eight panelists ran in parallel. Ship recommendation: ship-with-nits-addressed. All recommended findings have been addressed in commit Activation
Findings addressed in
|
| # | Severity | Persona | Summary | Fix |
|---|---|---|---|---|
| 1 | recommended | supply-chain-security | test_registry_client_integration.py makes live HTTP to api.mcp.github.com with no marker; widened discovery would fire it on every dev pytest tests/integration/ |
Added pytestmark = pytest.mark.requires_network_integration |
| 2 | recommended | test-coverage-expert | test_runtime_smoke.py not in old script, no marker; downloads real codex/llm binaries via setup-codex.sh/setup-llm.sh |
Added pytestmark = pytest.mark.requires_e2e_mode |
| 3 | recommended | doc-writer | integration-testing.md used pre-fix/post-fix phrasing ("the script no longer enumerates"); violates docs-current-behaviour-only convention |
Reframed positively; no history reference |
| 4 | recommended | devx-ux-expert | live marker silently deselected by addopts but absent from the documented marker registry table |
Added a row to the marker table with the opt-in command (pytest -m live tests/integration -v) |
| 5 | nit | cli-logging-expert | Stale banner line "Tests comprehensive runtime scenarios..." conflicts with thin-orchestrator identity | Replaced with one-line description matching the header comment |
| 6 | nit | cli-logging-expert | Vestigial inline comment # Integration Testing Coverage! on setup_runtimes call |
Removed |
| 7 | nit | python-architect | APM_RUN_INTEGRATION_TESTS echoed but not exported, asymmetric with APM_E2E_TESTS |
Added export for symmetry |
| 8 | nit | python-architect | Timeout bump from 20→30 min not explained in YAML | Added inline comment citing the discovery widening |
Findings carried forward as follow-up suggestions (not addressed in this PR)
| # | Severity | Persona | Summary | Reason for deferral |
|---|---|---|---|---|
| A | nit | devx-ux-expert | Pre-existing emoji in log_info/log_success/log_error (lines 48, 52, 56) violates encoding.instructions.md |
Pre-existing, out of scope for PR2; would be churn touching every script that sources these helpers. Right venue: a dedicated cleanup PR. |
| B | recommended | supply-chain-security | Broader audit suggested for ~30 unmarked test files | Audited and verified clean: ran grep -nE "requests\.|httpx\.|urlopen|setup-codex|setup-llm|setup-copilot" tests/integration/test_*.py filtered to files without pytestmark. Only 4 files match: the 2 fixed above (#1, #2), and 2 hermetic files that mock the network (test_credential_fill_disambiguation.py, test_install_local_bundle_e2e.py). No further gaps. |
| C | nit | oss-growth-hacker | Release-note beat: "adding integration test is now a single file drop" | Surfaced for the next release-note draft, not a PR change. |
Diagrams (from python-architect)
Legend: registry-pattern view of where each touched file sits in the marker-driven test discovery system.
classDiagram
direction LR
class conftest_py {
Module -- tests/integration/conftest.py
+_MARKER_CHECKS dict
+pytest_collection_modifyitems
+apm_binary_path fixture
}
class pyproject_toml {
Config -- pyproject.toml
+addopts str
+markers list
}
class test_integration_sh {
Script -- scripts/test-integration.sh
+run_e2e_tests
+setup_runtimes
}
class ci_integration_yml {
Workflow -- ci-integration.yml
+env APM_E2E_TESTS
+env APM_RUN_INTEGRATION_TESTS
}
class TestModule {
Pure
+pytestmark list
}
ci_integration_yml ..> test_integration_sh : invokes
test_integration_sh ..> conftest_py : pytest discovers
conftest_py ..> TestModule : skips via markers
pyproject_toml ..> conftest_py : marker declarations
Validation re-run after fixes
$ uv run --extra dev ruff check src/ tests/
All checks passed!
$ uv run --extra dev ruff format --check src/ tests/
733 files already formatted
$ uv run pytest tests/integration --collect-only -q | tail -1
664/680 tests collected (16 deselected) in 2.90s
Collection count unchanged — the new module-level markers replace runtime self-skips with declarative gating, no behavioural drift.
cc @sergio-sisternes-epam — the panel is advisory; merge gating remains with you.
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com
…ions primitive (#1249) * refactor(tests): retire script enumeration; pytest discovers tests/integration/ PR2 of #1166. Per acceptance criteria: - scripts/test-integration.sh no longer enumerates individual pytest files; it invokes pytest tests/integration/ once and lets the marker registry (PR1, #1167) handle per-test gating. Script shrinks 778 -> 402 lines. - New test files dropped into tests/integration/ are picked up automatically; the only contract is to add the right requires_* marker (or hermetic = no marker). - Adds requires_apm_binary / requires_github_token / requires_runtime_copilot pytestmarks to 11 files that needed them now that pytest discovers the full directory (test-coverage-expert audit). - Adds 'live' to the addopts deselect list so test_skill_bundle_live.py preserves its current 'opt-in only' behaviour. - Adds APM_RUN_INTEGRATION_TESTS=1 to ci-integration.yml so network-integration tests (transport selection) actually run in CI. - CI integration-tests job timeout 20 -> 30 min to absorb the newly-discovered tests. - Docs (integration-testing.md) reframe the script as the CI orchestrator, not legacy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(changelog): wire real PR number for PR2 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(tests): address apm-review-panel findings on PR2 - test_registry_client_integration.py: add requires_network_integration marker (was relying on runtime self-skip; live HTTP to api.mcp.github.com would fire on any dev pytest invocation). Sec panel finding. - test_runtime_smoke.py: add requires_e2e_mode marker (was unmarked; downloads real codex/llm binaries). Test-coverage panel finding. - docs/integration-testing.md: drop pre-fix/post-fix phrasing per docs-current-behaviour-only convention; document the live marker in the registry table. Doc-writer + devx-ux findings. - scripts/test-integration.sh: export APM_RUN_INTEGRATION_TESTS for symmetry with APM_E2E_TESTS; update banner to reflect thin-orchestrator identity; drop vestigial 'Integration Testing Coverage!' comment. Py-arch + cli-log nits. - ci-integration.yml: add inline comment on the 20->30 timeout bump. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(tests): codify integration-test marker procedure as APM instructions primitive Issue #1166 / follow-up to PR #1247. PR #1247 retired the manual per-file enumeration in scripts/test-integration.sh and shifted gating to declarative pytestmarkers, but the procedure ("where to drop a new integration test, which marker to use, what NOT to do") lived only in PR comments and the test-coverage-expert agent's tribal knowledge. This PR extracts that procedure into a versioned APM primitive that applies to both human contributors (via .github/instructions/) and the test-coverage-expert persona (via a cross-reference at step 5 of its review procedure), AND adds a hermetic regression-trap test that guarantees the rule's pointers do not silently drift. Changes: - .apm/instructions/tests.instructions.md: new H2 "Integration tests: placement and markers" with procedure, the canonical marker quick-map (7 rows), and anti-patterns. Mirrored to .github/. - .apm/agents/test-coverage-expert.agent.md: step 5 now links to the instructions file and explicitly instructs the persona to flag ungated live-network/runtime-binary calls. Mirrored to .github/. - tests/integration/test_marker_registry_sync.py: 7 hermetic assertions verifying that pyproject.toml markers, tests/integration/conftest.py::_MARKER_CHECKS, the docs registry table, and the instructions rule all stay in sync; plus a static lint forbidding new os.getenv-based runtime self-skips on gate env vars in tests/integration/. - CHANGELOG.md: one line under [Unreleased] Changed. - .github/copilot-instructions.md: auto-refreshed by apm compile. Lint (uv run --extra dev ruff check + format --check): silent. Tests (uv run pytest tests/integration/test_marker_registry_sync.py tests/integration/test_drift_check.py): 40 passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: regenerate compiled mirror with correct relative paths apm install rewrites cross-primitive links from .apm/ source paths to the equivalent paths from the .github/ deployment root. The manual mirror in the previous commit shipped the unrewritten path, which caused APM Self-Check drift. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Daniel Meppiel <copilot-rework@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(install,tests): repair 63 integration tests in merge queue (#1247) The merge-queue Integration Tests job for #1238 surfaced 63 failures that had accumulated in the suite since several integration tests were finally wired into CI. Fixes group into seven clusters: 1. Copilot harness marker (post-#1154): bare `.github/` no longer counts; fixtures must drop `.github/copilot-instructions.md`. 2. `apm marketplace build` was removed in favor of `apm pack`; tests now assert the deprecation message + exit 2. 3. `tests/fixtures/azure-skills/.claude-plugin/marketplace.json` regenerated and SHA constant refreshed. 4. `apm.yml` invalid-deps assertion: tolerate Rich line-wrapping in `dependencies` literal echo. 5. Schema fix: stop using removed top-level `repo:` key; use `git:`. 6. ADO virtual collections (`collections/<name>`) now use the SUBDIRECTORY layout per #1094; orphan-detection and dependency- declaration-order tests updated to slash form. 7. `test_auto_bootstrap_creates_user_manifest` exposed a real regression introduced by #1149: `user_scope_rejection_reason()` rejected ALL local paths at user scope, contradicting the post-#937 contract that only relative local paths are ambiguous. Restored absolute-vs-relative gating. The two unit failures in `test_mcp_registry_module` and `test_registry_client` reproduce on `origin/main` without these changes and are pre-existing, out of scope for this PR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test: add regression traps for the two highest-leverage gaps Adds two fast unit-level test files that pin the contracts whose absence let the corresponding regressions ship to release: 1. tests/unit/install/test_user_scope_rejection_reason.py Pins the post-#937 / post-#1149 contract for local-path policy at user scope: relative -> reject; absolute -> accept; remote -> accept; git: parent -> reject; project scope -> never reject. Verified to fail when the predicate is reverted to the broken #1149 form (5 of 15 tests fail). 2. tests/unit/test_install_path_declaration_invariant.py Asserts that DependencyReference.get_install_path() and primitives.discovery.get_dependency_declaration_order() agree on the on-disk path for every dependency flavor: regular GitHub, regular ADO, virtual subdirectory GitHub, and -- critically -- virtual subdirectory ADO (collections/<name>) which was the cluster-6 drift in the merge-queue run. Verified to fail when the declaration-order branch is mutated to emit the deleted-pre-#1094 flattened form (2 of 6 tests fail). Both files run in <1s combined, so the next refactor that touches either contract fails one assertion in CI instead of waiting for a slow E2E that may not even be wired in yet. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: address PR #1257 review comments 1. user_scope_rejection_reason now expanduser()s the local path before the absolute check. The rest of the install pipeline (sources.py, phases/resolve.py) already expanduser()s local paths, so without this alignment 'apm install --global ~/pkg' was incorrectly rejected as a relative path even though it expands to an absolute one. Trap added: tests/unit/install/test_user_scope_rejection_reason.py gains test_user_scope_accepts_tilde_local_path (parametrized x2). 2. tests/integration/test_install_dry_run_e2e.py: the temp_project fixture pre-created .github/copilot-instructions.md but _assert_no_install_artifacts asserted that same file is absent after the dry-run -- a structural contradiction that masked real dry-run regressions. Fix: drop the fixture file and set targets: [copilot] explicitly in the generated apm.yml so target detection no longer depends on the marker existing pre-install. 3. tests/integration/test_local_content_audit.py: refresh stale comment to reflect the post-#1154 detection contract (the marker is .github/copilot-instructions.md, not a bare .github/ directory). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Daniel Meppiel <copilot-rework@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
refactor(tests): retire script enumeration; pytest discovers tests/integration/
TL;DR
PR2 of #1166 collapses
scripts/test-integration.shfrom 778 → 402 lines by deleting the 28 hand-maintainedpytest tests/integration/test_X.pyinvocations and replacing them with a singlepytest tests/integration/call. The marker registry shipped in PR1 (#1167) handles per-test gating, so new test files dropped into the directory are picked up automatically. 11 test files get therequires_*markers they were previously running on the honor system.Note
Builds on the marker registry merged in #1167 (PR1). Closes the original wiring bug from #1166 — the script previously enumerated only 28 of 73 integration files, leaving ~45 files dependent on contributors remembering to wire them in.
Problem (WHY)
tests/integration/and never ran in CI unless someone remembered to add apytest …test_X.pyblock. Acceptance criterion 2 of Refactor test-integration.sh to fixture-driven pytest discovery (eliminate manual test wiring) #1166: "A new test file dropped intotests/integration/with the right markers automatically runs without script edits."if/then/log_error/exit 1. PR1 (refactor(tests): marker-driven integration discovery (PR1 of #1166) #1167) moved precondition logic into_MARKER_CHECKSintests/integration/conftest.py; the script's per-file scaffolding is now duplicate state.test_skill_bundle_live.py(pytestmark = pytest.mark.live, 9 tests) hits real GitHub repos and was never gated by_MARKER_CHECKS. It was hidden from CI today only because the script never enumerated it; once pytest discovers the directory it would run by default.APM_RUN_INTEGRATION_TESTS=1was set per-invocation in the old script fortest_transport_selection_integration.py. Without lifting it to job env, the marker-gated test would silently skip in CI post-refactor.requires_apm_binarywas missing on 11 files that shell out to the built apm binary. Hermetic on a contributor laptop withapmon PATH; hard failure in CI without the binary or token.Why these matter: the script-as-source-of-truth regime contradicts the marker-driven design intent of #1166 ("script no longer enumerates individual
pytest test_*.pycalls; it invokespytest tests/integration/with marker selection"). Acceptance criterion 4 also requires: "Conditional gating (ADO PAT, GitHub PAT, runtime presence) preserved via markers + skipif fixtures, not bash branches."Approach (WHAT)
run_e2e_tests()body (28 enumerated blocks, ~415 lines) with a singlepytest tests/integration/invocation.requires_apm_binary/requires_github_token/requires_runtime_copilotmarkers to 11 files that previously ran on the honor system.'live'toaddoptsdeselect list inpyproject.tomlsotest_skill_bundle_live.pypreserves its current "opt-in only" behaviour without inventing a new marker name.APM_RUN_INTEGRATION_TESTS=1to job env inci-integration.yml; bump timeout 20 → 30 min to absorb the newly-discovered tests.docs/.../integration-testing.md: the script is the CI orchestrator, no longer "legacy".main(); pytest reports per-test results.Implementation (HOW)
scripts/test-integration.sh— removed lines 300-715 (run_e2e_tests()body) and lines 743-766 (decorative success echoes); replaced with a thin function that exportsAPM_E2E_TESTS=1, prints the env probe, sources.venvif present, and runspytest tests/integration/ -v --tb=short. Header banner reframed from "DEPRECATED" to "thin orchestrator". Token resolve / platform detect / binary build / runtime setup / dep install untouched.tests/integration/conftest.py— not modified. The_MARKER_CHECKSregistry from PR1 is the contract this PR depends on.pyproject.toml— one-char change:addopts = "-m 'not benchmark and not live'". Preserves opt-in viapytest -m live..github/workflows/ci-integration.yml— addedAPM_RUN_INTEGRATION_TESTS: "1"to theintegration-testsjob env;timeout-minutes: 20→30.pytestmark = pytest.mark.requires_apm_binary(or a list when stacking withrequires_github_token/requires_runtime_copilot). Multi-marker files usepytestmark = [pytest.mark.X, pytest.mark.Y](pytest's documented list form). See test_mcp_env_var_copilot_e2e.py for the canonical multi-marker shape.docs/src/content/docs/contributing/integration-testing.md— rewrote the "Legacy: scripts/test-integration.sh" section as "CI orchestrator: scripts/test-integration.sh". No removal of contributor-facing pytest invocations on the same page.CHANGELOG.md— one Unreleased line under Changed, ending in(#PR_NUMBER).Diagrams
Legend: pre-PR2 vs post-PR2 control flow inside
scripts/test-integration.sh. The boxed nodes are the new behaviour; everything else was already there.flowchart TD A[main] --> B[check_prerequisites] B --> C[detect_platform] C --> D[detect_environment] D --> E[build_binary or use CI artifact] E --> F[setup_binary_for_testing] F --> G[setup_runtimes codex copilot llm] G --> H[install_test_dependencies] H --> I[run_e2e_tests] I --> J["pytest tests/integration/ NEW: single invocation"] J -->|exit 0| K[log_success Ready for release validation] J -->|exit non-zero| L[log_error exit 1] style J fill:#dff,stroke:#06c,stroke-width:2pxLegend: how a single test reaches a skip vs run decision. The
_MARKER_CHECKSregistry is the gate; the script no longer participates.sequenceDiagram participant Script as scripts/test-integration.sh participant Pytest as pytest tests/integration/ participant Conftest as conftest.py _MARKER_CHECKS participant Test as test_marketplace_e2e.py Script->>Pytest: invoke once Pytest->>Conftest: pytest_collection_modifyitems items Conftest->>Test: get_closest_marker requires_apm_binary Test-->>Conftest: marker present Conftest->>Conftest: _has_apm_binary alt binary resolved Conftest-->>Pytest: leave item Pytest->>Test: run else binary missing Conftest-->>Pytest: add skip marker apm binary not found Pytest->>Test: skip endTrade-offs
addopts = "-m 'not benchmark and not live'"over a newrequires_live_testsmarker. Smaller diff; preserves the existingpytest.mark.livedeclaration inpyproject.toml. Rejected adding to_MARKER_CHECKSbecause that would force a marker rename acrosstest_skill_bundle_live.pyfor no functional gain.Benefits
_MARKER_CHECKSis now the single source of precondition truth — bash branches no longer race the registry.test_skill_bundle_live.pyno longer at risk of accidentally hitting live GitHub in CI when discovery widens.Validation
uv run --extra dev ruff check src/ tests/:uv run --extra dev ruff format --check src/ tests/:uv run pytest tests/unit -q(full unit suite):uv run pytest tests/integration --collect-only -q | tail -3:The
16 deselectedaretest_skill_bundle_live.py's 9 live tests + 7benchmark-marked items — exactly the set we wanted gated out by default.Script syntax check
Scenario Evidence
tests/integration/conftest.py::pytest_collection_modifyitems(the discovery contract) + the newpytest tests/integration/invocation inscripts/test-integration.shtests/integration/conftest.py::_has_apm_binary+ 11 newly-marked filestest_skill_bundle_live.pydoes NOT execute in default CIpyproject.toml addopts = "-m 'not benchmark and not live'"; verified by--collect-onlyshowing 16 deselectedtest_transport_selection_integration.py(network-integration) RUNS in CIci-integration.ymladdsAPM_RUN_INTEGRATION_TESTS: "1"to job envHow to test
uv run pytest tests/integration --collect-only -q | tail -3→ expect664/680 tests collected (16 deselected).APM_E2E_TESTS=1 uv run pytest tests/integration/test_compile_constitution_injection.py -v→ all pass without GitHub token or built binary.GITHUB_APM_PAT+ built binary):uv run ./scripts/test-integration.sh→ expect singlepytest tests/integration/invocation in the log; preconditions skip cleanly per marker.integration-testsjob in.github/workflows/ci-integration.ymlshould pass within the 30-min timeout.tests/integration/test_drift_check_pr2.pywith a singledef test_pass(): assert True; without editing the script, confirm--collect-onlypicks it up.Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com