Add failing tests for #1403 #1432
Draft
prompt-driven-github[bot] wants to merge 3 commits into
…est_extend coverage churn Fixes #1403
Superseded by #1440, which carries the complete fix for #1403 (and reuses/adapts the regression tests authored here). This PR's code missed the |
gltanaka pushed a commit that referenced this pull request Jun 6, 2026
#1403) (#1440)

* fix(ci): scope-preserving PR auto-heal - never escalate to test_extend (#1403)

PR auto-heal was re-bloating narrow fix PRs: for a Python module whose tests pass but coverage is below target, `sync_determine_operation` returns `test_extend`, and `heal_module` routes verify/generate/test/crash through `pdd sync`, which re-derives the same coverage gap internally and appends unrelated generated tests (rewriting `.pdd/meta` command to `test_extend`). This made narrow PRs non-mergeable (e.g. #1390).

Add a single env-var signal, `PDD_DISABLE_TEST_EXTEND`, set only in PR auto-heal mode (`not skip_ci`) and enforced at two layers:

- Detection (`sync_determine_operation.test_extend_disabled`): the coverage-gap branch returns the existing `all_synced` no-op for all languages when the flag is set. Because this function is called by both the in-process `detect_drift` and the nested `pdd sync`, one branch covers both the detection and execution paths the issue requires.
- Execution backstop (`sync_orchestration`): mirror the existing non-Python `test_extend` skip: log `test_extend_skipped`, accept the current state, and write no test file.
- `ci_drift_heal.main` sets the flag on `os.environ` only around the in-process `detect_drift` call (restored in `finally`, no leak) and passes it explicitly in the `pdd sync` subprocess env. Push-to-main (`--skip-ci`) is unaffected; whole-module coverage growth still runs.
- `detect_drift` now treats `all_synced` as "no drift" (alongside nothing/synced) so the guarded no-op is a clean skip, not an "unknown operation" heal failure.

Prompts updated to match (source of truth). Regression tests prove: PR mode suppresses + propagates the flag, push-to-main keeps test_extend, the orchestrator never appends tests / writes a `test_extend` fingerprint when suppressed, and an e2e run proves the parent→child env contract. The flag is default-off, so unset behavior is byte-for-byte unchanged.

Supersedes #1416 (prompts-only, code never regenerated) and #1432 (tests + partial code; missed the all_synced filter and env restore).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(ci): address review - all_synced skip after reclassification; rename helper (#1403)

Codex review round 1 found a real regression: detect_drift skipped 'all_synced' BEFORE the git-based reclassification, so a PR that changed code without its prompt (and had a low-coverage passing run_report) was silently dropped instead of being promoted to 'update'. This also regressed the pre-existing non-Python all_synced coverage-gap path under --diff-base.

- detect_drift: move the 'all_synced' no-drift skip to AFTER git reclassification, so an all_synced module whose code changed without its prompt is still promoted to 'update'; only a still-terminal all_synced is dropped.
- Rename helper test_extend_disabled() -> is_test_extend_disabled() so pytest does not collect it as a test when imported into a test module, and so the name reads as a predicate.

New regression tests:

- detect_drift: all_synced + code-only change -> update (not dropped); terminal all_synced -> clean skip (never an unknown-operation failure).
- is_test_extend_disabled truthiness incl. falsey (0/false/off/'') and whitespace; unset -> False.
- main(): os.environ flag restored even when detect_drift raises, and a pre-existing value is restored exactly (not clobbered).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Test User <test@test.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
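The flag handling described in the commit message above (a predicate over `PDD_DISABLE_TEST_EXTEND`, plus setting the flag only around one call and restoring it in `finally`) can be sketched roughly as follows. This is an illustration, not the code from #1440: the body of `is_test_extend_disabled`, the `suppress_test_extend` helper, and the choice to treat every value outside the listed falsey strings as truthy are all assumptions.

```python
import os
from contextlib import contextmanager

FLAG = "PDD_DISABLE_TEST_EXTEND"
# Falsey spellings listed in the commit message; anything else counts as
# "set" in this sketch (an assumption, the real parsing may differ).
_FALSEY = {"", "0", "false", "off"}


def is_test_extend_disabled(environ=None):
    """Return True when the flag is set to a truthy value; unset -> False."""
    env = os.environ if environ is None else environ
    raw = env.get(FLAG)
    if raw is None:
        return False
    return raw.strip().lower() not in _FALSEY


@contextmanager
def suppress_test_extend(environ=None):
    """Hypothetical helper: set the flag for one block, then restore it.

    A pre-existing value is put back exactly; an unset flag is removed
    again, even if the wrapped block raises.
    """
    env = os.environ if environ is None else environ
    missing = object()
    previous = env.get(FLAG, missing)
    env[FLAG] = "1"
    try:
        yield
    finally:
        if previous is missing:
            env.pop(FLAG, None)
        else:
            env[FLAG] = previous
```

Passing an explicit mapping instead of `os.environ` keeps the sketch easy to exercise in isolation; wrapping the in-process detection call in `with suppress_test_extend():` would give the "no leak" behavior the regression tests check for.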
Summary
Adds failing tests that detect the bug reported in #1403.

Test Files
- tests/test_ci_drift_heal.py
- tests/test_sync_determine_operation.py
- tests/test_sync_orchestration.py
- tests/test_ci_drift_heal_e2e.py

Prompt Files
- pdd/prompts/ci_drift_heal_python.prompt
- pdd/prompts/sync_determine_operation_python.prompt
- pdd/prompts/sync_orchestration_python.prompt

What This PR Contains
`ci_drift_heal` process and nested `pdd sync` subprocess boundary

Root Cause
PR auto-heal reuses whole-module sync semantics without a PR-scope guard. Detection can classify Python low-coverage modules as `test_extend`, and `heal_module()` can route broad operations through plain `pdd sync`, allowing a child sync process to append unrelated generated tests. The fix should set and honor a `PDD_DISABLE_TEST_EXTEND` guard in PR mode (`skip_ci=False`) at both detection and execution time while leaving push-to-main (`skip_ci=True`) coverage growth unaffected.

Next Steps
Fixes #1403

Generated by PDD agentic bug workflow
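To make the guard described under Root Cause concrete, here is a minimal sketch of the parent→child contract: the flag is added to the child environment only in PR mode, and a toy coverage-gap decision honors it. The function names `build_child_env` and `decide_coverage_gap_operation` are hypothetical, the decision function assumes the module's tests already pass, and neither reflects the actual `ci_drift_heal` or `sync_determine_operation` code.

```python
import os

FLAG = "PDD_DISABLE_TEST_EXTEND"
_FALSEY = {"", "0", "false", "off"}  # assumed falsey spellings


def build_child_env(skip_ci, base_env=None):
    """Return a copy of the environment for a nested sync subprocess.

    PR auto-heal (skip_ci=False) adds the guard; push-to-main
    (skip_ci=True) leaves the inherited environment untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    if not skip_ci:
        env[FLAG] = "1"
    return env


def decide_coverage_gap_operation(coverage, target, env):
    """Toy stand-in for the coverage-gap branch (tests assumed passing)."""
    disabled = env.get(FLAG, "").strip().lower() not in _FALSEY
    if coverage < target and not disabled:
        return "test_extend"
    # Either coverage is fine or the guard turns the gap into a no-op.
    return "all_synced"
```

In this sketch the returned mapping would be handed to the child via `subprocess.run(..., env=env)`, so the nested process reaches the same decision as the in-process detection without relying on a mutated global environment.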