docs(tests): codify integration-test marker procedure as APM instructions primitive #1249
Merged
danielmeppiel merged 6 commits into main on May 10, 2026
…tegration/
PR2 of #1166. Per acceptance criteria:
- scripts/test-integration.sh no longer enumerates individual pytest files; it invokes pytest tests/integration/ once and lets the marker registry (PR1, #1167) handle per-test gating. Script shrinks 778 -> 402 lines.
- New test files dropped into tests/integration/ are picked up automatically; the only contract is to add the right requires_* marker (or hermetic = no marker).
- Adds requires_apm_binary / requires_github_token / requires_runtime_copilot pytestmarks to 11 files that needed them now that pytest discovers the full directory (test-coverage-expert audit).
- Adds 'live' to the addopts deselect list so test_skill_bundle_live.py preserves its current 'opt-in only' behaviour.
- Adds APM_RUN_INTEGRATION_TESTS=1 to ci-integration.yml so network-integration tests (transport selection) actually run in CI.
- CI integration-tests job timeout 20 -> 30 min to absorb the newly discovered tests.
- Docs (integration-testing.md) reframe the script as the CI orchestrator, not legacy.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- test_registry_client_integration.py: add requires_network_integration marker (was relying on runtime self-skip; live HTTP to api.mcp.github.com would fire on any dev pytest invocation). Sec panel finding.
- test_runtime_smoke.py: add requires_e2e_mode marker (was unmarked; downloads real codex/llm binaries). Test-coverage panel finding.
- docs/integration-testing.md: drop pre-fix/post-fix phrasing per docs-current-behaviour-only convention; document the live marker in the registry table. Doc-writer + devx-ux findings.
- scripts/test-integration.sh: export APM_RUN_INTEGRATION_TESTS for symmetry with APM_E2E_TESTS; update banner to reflect thin-orchestrator identity; drop vestigial 'Integration Testing Coverage!' comment. Py-arch + cli-log nits.
- ci-integration.yml: add inline comment on the 20->30 timeout bump.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ions primitive
Issue #1166 / follow-up to PR #1247.
PR #1247 retired the manual per-file enumeration in scripts/test-integration.sh and shifted gating to declarative pytest markers, but the procedure ("where to drop a new integration test, which marker to use, what NOT to do") lived only in PR comments and the test-coverage-expert agent's tribal knowledge. This PR extracts that procedure into a versioned APM primitive that applies to both human contributors (via .github/instructions/) and the test-coverage-expert persona (via a cross-reference at step 5 of its review procedure), AND adds a hermetic regression-trap test that guarantees the rule's pointers do not silently drift.
Changes:
- .apm/instructions/tests.instructions.md: new H2 "Integration tests: placement and markers" with procedure, the canonical marker quick-map (7 rows), and anti-patterns. Mirrored to .github/.
- .apm/agents/test-coverage-expert.agent.md: step 5 now links to the instructions file and explicitly instructs the persona to flag ungated live-network/runtime-binary calls. Mirrored to .github/.
- tests/integration/test_marker_registry_sync.py: 7 hermetic assertions verifying that pyproject.toml markers, tests/integration/conftest.py::_MARKER_CHECKS, the docs registry table, and the instructions rule all stay in sync; plus a static lint forbidding new os.getenv-based runtime self-skips on gate env vars in tests/integration/.
- CHANGELOG.md: one line under [Unreleased] Changed.
- .github/copilot-instructions.md: auto-refreshed by apm compile.
Lint (uv run --extra dev ruff check + format --check): silent.
Tests (uv run pytest tests/integration/test_marker_registry_sync.py tests/integration/test_drift_check.py): 40 passed.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Base automatically changed from refactor/test-integration-script-retire-pr2 to main May 10, 2026 18:32
apm install rewrites cross-primitive links from .apm/ source paths to the equivalent paths from the .github/ deployment root. The manual mirror in the previous commit shipped the unrewritten path, which caused APM Self-Check drift.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Codifies the integration-test marker placement procedure as an APM instructions primitive (mirrored into .github/), wires the guidance into the test-coverage-expert agent persona, and adds a hermetic “registry sync” integration test to prevent drift between pyproject.toml markers, tests/integration/conftest.py::_MARKER_CHECKS, and the docs marker registry.
Changes:
- Add `tests/integration/test_marker_registry_sync.py` to assert marker registry invariants across `pyproject.toml`, docs, `.apm/instructions/`, and integration conftest wiring.
- Extend `.apm/instructions/tests.instructions.md` and `.github/instructions/tests.instructions.md` with an explicit “Integration tests: placement and markers” procedure + quick-map + anti-patterns.
- Update `test-coverage-expert` persona docs to point at the new instruction contract; add a changelog entry and refresh deployed-file hashes.
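One anti-pattern this PR targets is a runtime self-skip on a gate env var in place of a declarative marker. A static scan for it could look roughly like the sketch below. The regex, the `GATE_ENV_VARS` tuple, the helper name `find_self_skips`, and the two sample sources are illustrative assumptions, not the code in `test_marker_registry_sync.py`.

```python
import re

# Gate env vars named in this PR; grouping them in a tuple is an assumption.
GATE_ENV_VARS = ("APM_E2E_TESTS", "APM_RUN_INTEGRATION_TESTS")

_GATES = "|".join(re.escape(name) for name in GATE_ENV_VARS)

# os.getenv("<gate var>") followed by pytest.skip( within 400 characters.
_SELF_SKIP_RE = re.compile(
    r"os\.getenv\(\s*['\"](?:" + _GATES + r")['\"][\s\S]{0,400}?pytest\.skip\("
)


def find_self_skips(sources):
    """Return the sorted names of sources that contain a runtime self-skip.

    `sources` maps a file name to its text, so the check can be exercised
    without touching the file system.
    """
    return sorted(
        name for name, text in sources.items() if _SELF_SKIP_RE.search(text)
    )


# Compliant: gating is declared once, at module level.
GOOD = '''
import pytest

pytestmark = pytest.mark.requires_e2e_mode


def test_runtime_smoke():
    assert True
'''

# Anti-pattern: the test reads the gate env var and skips itself at runtime.
BAD = '''
import os
import pytest


def test_runtime_smoke():
    if not os.getenv("APM_E2E_TESTS"):
        pytest.skip("e2e mode is off")
'''

print(find_self_skips({"test_good.py": GOOD, "test_bad.py": BAD}))
```

A text-proximity check like this is cheap but can in principle flag an unrelated `pytest.skip` that happens to sit near an `os.getenv` call on a gate variable.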
Show a summary per file
| File | Description |
|---|---|
| `tests/integration/test_marker_registry_sync.py` | New regression-trap tests to keep marker declarations, docs, and conftest predicates in sync. |
| `CHANGELOG.md` | Adds an Unreleased “Changed” entry describing the new instructions + regression-trap. |
| `apm.lock.yaml` | Updates hashes for modified deployed `.github/` instruction/agent files. |
| `.github/instructions/tests.instructions.md` | Mirrors the new integration-test marker procedure for GitHub-integration consumers. |
| `.apm/instructions/tests.instructions.md` | Source primitive documenting the integration-test marker contract. |
| `.github/agents/test-coverage-expert.agent.md` | Updates persona step to reference the marker placement contract and flag ungated self-skips. |
| `.apm/agents/test-coverage-expert.agent.md` | Source primitive mirroring the persona update. |
| `.github/copilot-instructions.md` | Regenerated header/version metadata. |
Copilot's findings
Comments suppressed due to low confidence (1)
tests/integration/test_marker_registry_sync.py:173
- In `test_apm_rule_only_names_declared_markers`, the `token_re` / first `cited = ...` block and the follow-up comment about the regex capturing only the suffix are redundant/inaccurate (the pattern has no capturing group, so `findall()` returns the full token). This is easy to misread and looks like leftover experimentation; consider removing the first regex + assignment and keeping a single pattern / `cited` derivation.
# Match identifiers OR the placeholder shape inside backticks / plain.
token_re = re.compile(r"\brequires_(?:runtime_<name>|[a-z][a-z0-9_]*)")
cited = set(token_re.findall(body))
# Re-prefix because the regex captured only the suffix; rebuild full names.
# (The regex above intentionally returns the full token thanks to \b
# anchoring and the group inside; re-derive with findall on full pattern.)
full_re = re.compile(r"\brequires_(?:runtime_<name>|[a-z][a-z0-9_]+)")
cited = set(full_re.findall(body))
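The `findall()` behaviour behind this finding is easy to check in isolation. The sketch below uses a made-up `body` string rather than the real instructions file, and the second pattern (`suffix_re`) is added purely for contrast: with only a non-capturing group `(?:...)`, `re.findall` returns whole matches, whereas a capturing group would return just the suffix.

```python
import re

# Hypothetical sample text standing in for the instructions file body.
body = "Use `requires_apm_binary` or `requires_runtime_<name>`; never bare requires_."

# Same shape as the pattern under review: only a non-capturing group.
token_re = re.compile(r"\brequires_(?:runtime_<name>|[a-z][a-z0-9_]*)")
cited = set(token_re.findall(body))

# Contrast case (not in the PR): a capturing group makes findall() return
# only the captured suffix, which is what the stale comment describes.
suffix_re = re.compile(r"\brequires_(runtime_<name>|[a-z][a-z0-9_]*)")
suffixes = set(suffix_re.findall(body))

print(sorted(cited))
print(sorted(suffixes))
```

So a single non-capturing pattern is enough to collect full marker names, which is the simplification the comment suggests.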
- Files reviewed: 8/8 changed files
- Comments generated: 2
import sys
from pathlib import Path

import tomllib
- Integration tests now use marker-driven discovery: 21 `pytestmark = pytest.mark.skipif(...)` chains across `tests/integration/` are replaced with declarative `requires_*` markers, with precondition logic centralized in `tests/integration/conftest.py` and auto-skipping at collection time. PR1 of #1166. (#1167)
- Integration test apm-binary resolution now prefers the local build (`./dist/apm-<os>-<arch>/apm`) over a system-wide `apm` on `PATH`, so contributors validating the binary under test are not silently shadowed by a global install; the bearer-token marker (`requires_ado_bearer`) discards the captured JWT immediately and persists only the boolean outcome. (#1167)
- `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#1247)
- Integration-test marker procedure codified as `.apm/instructions/tests.instructions.md` (wired into `test-coverage-expert` persona) and guarded by a regression-trap test that asserts `pyproject.toml`, `tests/integration/conftest.py::_MARKER_CHECKS`, the docs registry table, and the instructions rule stay in sync. (#1166)
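The marker-driven gating these changelog entries describe can be pictured as a registry of predicates consulted at collection time. The sketch below is an illustration, not the repository's `conftest.py`: the predicate bodies, the skip reasons, the pairing of each marker with an env var, and the helper name `first_unmet` are all assumptions. Only the `_MARKER_CHECKS` name, the marker names, and the env var names come from this PR.

```python
import os

# Marker name -> (predicate, skip reason). Predicates take the environment
# mapping as an argument so they can be exercised without os.environ.
# NOTE: predicate bodies and marker/env-var pairings are assumptions.
_MARKER_CHECKS = {
    "requires_e2e_mode": (
        lambda env: env.get("APM_E2E_TESTS") == "1",
        "set APM_E2E_TESTS=1 to run",
    ),
    "requires_network_integration": (
        lambda env: env.get("APM_RUN_INTEGRATION_TESTS") == "1",
        "set APM_RUN_INTEGRATION_TESTS=1 to run",
    ),
}


def first_unmet(marker_names, env):
    """Return the skip reason for the first unmet gating marker, or None."""
    for name in marker_names:
        check = _MARKER_CHECKS.get(name)
        if check is not None and not check[0](env):
            return f"{name}: {check[1]}"
    return None


def pytest_collection_modifyitems(config, items):
    """Attach a skip marker at collection time to tests whose gate is unmet."""
    import pytest  # imported lazily so the helpers above work without pytest

    for item in items:
        names = [mark.name for mark in item.iter_markers()]
        reason = first_unmet(names, os.environ)
        if reason is not None:
            item.add_marker(pytest.mark.skip(reason=reason))
```

Under this shape a test file only declares `pytestmark = pytest.mark.requires_e2e_mode`; markers without a registry entry (taxonomy markers such as `integration`) are ignored by the gate.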
TL;DR
Follow-up to #1247: codifies the integration-test marker procedure as a versioned APM primitive that applies to humans (`.github/instructions/`) AND to the `test-coverage-expert` persona, then locks the rule in place with a hermetic regression-trap test that asserts `pyproject.toml`, `tests/integration/conftest.py::_MARKER_CHECKS`, the docs registry table, and the instructions rule all stay in sync. Stacked on #1247.

Note
Base branch: `refactor/test-integration-script-retire-pr2` (PR #1247). Rebase to `main` once #1247 merges.

Problem (WHY)
PR #1247 retired the manual per-file enumeration in `scripts/test-integration.sh` and moved gating to declarative `pytestmark = pytest.mark.requires_*` markers. The mechanics are real and live, but the procedure for using them ("where do I drop a new integration test? which marker do I pick? what must I NOT do?") survived only in:
- PR comments
- the `test-coverage-expert` persona

That makes the rule invisible to future contributors and to fresh agent sessions. Three concrete failure modes follow:
- A new test file lands in `tests/integration/` with no marker → it runs unconditionally and breaks CI on machines that lack the runtime/token it needs.
- A test gates itself with `if not os.getenv("APM_E2E_TESTS"): pytest.skip(...)` inside the test body, and the `test-coverage-expert` reviewer might not flag it because the rule is not declared anywhere it loads.
- Someone adds a `requires_X` marker to `pyproject.toml`, registers the predicate in `conftest.py`, but forgets to update the docs registry table → the marker becomes invisible to anyone who reads the docs as the source of truth.

The encoding rule that this repo enforces for source files is the right precedent: write the contract down once, in `.apm/instructions/`, and let it apply by `applyTo:` glob.

Approach (WHAT)
- `.apm/instructions/tests.instructions.md`: new "Integration tests: placement and markers" section (drop file in `tests/integration/`, declare `pytestmark`, pick from registry; no script edit).
- `.apm/agents/test-coverage-expert.agent.md`: step 5 ("Probe the test tree") now links to the instructions file and explicitly instructs the persona to flag ungated `os.getenv("APM_E2E_TESTS")` self-skips.
- `tests/integration/test_marker_registry_sync.py`: 7 hermetic invariants over `pyproject.toml`, `conftest.py`, docs, and the rule.
- `CHANGELOG.md`: one line under `## [Unreleased]` → `Changed`.

Important
The `.github/instructions/` and `.github/agents/` mirror files are maintained manually, not regenerated by `apm compile`. Verified: deleting `.github/instructions/tests.instructions.md` and re-running `apm compile --clean` does not recreate it. Repository convention (commits `095424fd`, `6c99c5b0`) updates `.apm/` and `.github/` copies in lockstep within the same commit.

Implementation (HOW)
- `.apm/instructions/tests.instructions.md`
- `.github/instructions/tests.instructions.md`
- `.apm/agents/test-coverage-expert.agent.md`
- `.github/agents/test-coverage-expert.agent.md`
- `.github/copilot-instructions.md`: refreshed by `apm compile` (Build ID + version bump).
- `tests/integration/test_marker_registry_sync.py`
- `CHANGELOG.md`: `[Unreleased]` → `Changed`.

What the regression-trap test asserts
flowchart LR
    A[pyproject.toml<br/>markers list] -- "every requires_*" --> B[conftest.py<br/>_MARKER_CHECKS]
    A -- "every requires_* / live" --> C[docs registry table<br/>integration-testing.md]
    C -- "no phantom markers" --> A
    D[.apm/instructions/<br/>tests.instructions.md] -- "every cited marker" --> A
    E[tests/integration/<br/>test_*.py files] -- "no runtime os.getenv<br/>self-skips on gate env vars" --> F[lint guard]
    B -- "no orphan predicates" --> A

The 7 invariants (test names mirror the assertions verbatim):
1. `test_every_pyproject_gating_marker_has_conftest_predicate`: a declared `requires_X` without a predicate is dead config.
2. `test_every_conftest_predicate_is_declared_in_pyproject`: a predicate without a declaration emits `PytestUnknownMarkWarning`.
3. `test_every_gating_marker_is_documented_in_registry`: `requires_*` and `live` MUST appear in the docs table.
4. `test_docs_registry_only_names_declared_markers`: no phantom markers in the docs table.
5. `test_apm_rule_only_names_declared_markers`: marker names cited in the rule body must really exist.
6. `test_integration_tests_use_pytestmark_not_runtime_self_skip`: static lint over `tests/integration/test_*.py` forbidding `os.getenv("APM_E2E_TESTS")` + `pytest.skip` patterns.
7. `test_tomllib_available`: sanity guard on the Python version assumption (stdlib `tomllib` requires 3.11+).

Trade-offs
- `.github/` mirror sync, no automation. A `pre-commit` hook could enforce parity, but adding one is out of scope; the regression-trap test catches drift in marker names, and reviewers can spot `.apm/` ↔ `.github/` divergence in the PR diff.
- Static lint on `os.getenv` + `pytest.skip`. A gate env var read with `pytest.skip` somewhere within 400 chars could in principle false-positive. Current count of such patterns under `tests/integration/`: zero. The test itself is exempt via path comparison.
- `integration`, `slow`, and `benchmark` carved out of the docs-sync direction. They are taxonomy markers (not gates) and are intentionally NOT required to appear in the docs registry; the invariant only goes one direction for them (docs must not advertise phantom markers).

Validation
How to test
1. `uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/` (both silent).
2. `uv run pytest tests/integration/test_marker_registry_sync.py -v` (all 7 pass).
3. In `.github/agents/test-coverage-expert.agent.md`, step 5 ("Probe the test tree") links to `../instructions/tests.instructions.md` and names the anti-pattern explicitly.
4. Alter a `requires_*` marker in `pyproject.toml` (for example, rename it) and re-run test #1: it MUST fail with a clear message naming the missing marker.
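The failure exercised in the last step comes down to a set comparison between declared markers and registered predicates. The sketch below shows that comparison under stated assumptions: the marker lines and predicate names are sample data, and the helper names `gating_markers` and `registry_drift` are hypothetical. The real test reads `pyproject.toml` and `tests/integration/conftest.py` instead.

```python
def gating_markers(marker_lines):
    """Extract requires_* names from pytest 'markers' entries ("name: description")."""
    names = {
        line.split(":", 1)[0].split("(", 1)[0].strip() for line in marker_lines
    }
    return {name for name in names if name.startswith("requires_")}


def registry_drift(marker_lines, predicate_names):
    """Return (declared markers lacking a predicate, predicates lacking a declaration)."""
    declared = gating_markers(marker_lines)
    predicates = set(predicate_names)
    return sorted(declared - predicates), sorted(predicates - declared)


# Sample data (assumed): declarations and predicates in sync.
MARKERS = [
    "integration: taxonomy marker, not a gate",
    "requires_apm_binary: needs a built apm binary",
    "requires_e2e_mode: needs APM_E2E_TESTS=1",
]
PREDICATES = ["requires_apm_binary", "requires_e2e_mode"]
print(registry_drift(MARKERS, PREDICATES))

# Renaming a declared marker breaks both directions at once: the new name
# has no predicate, and the old predicate no longer has a declaration.
RENAMED = [m.replace("requires_e2e_mode", "requires_e2e") for m in MARKERS]
missing_predicate, undeclared = registry_drift(RENAMED, PREDICATES)
print(missing_predicate, undeclared)
```

Returning the offending names, rather than a bare boolean, is what lets an assertion message name the marker that drifted.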