
Add failing tests for #1403 #1432

Draft
prompt-driven-github[bot] wants to merge 3 commits into main from fix/issue-1403



@prompt-driven-github
Contributor

Summary

Adds failing tests that detect the bug reported in #1403.

Test Files

  • Unit tests: tests/test_ci_drift_heal.py
  • Unit tests: tests/test_sync_determine_operation.py
  • Unit tests: tests/test_sync_orchestration.py
  • E2E test: tests/test_ci_drift_heal_e2e.py

Prompt Files

  • Prompt file fixed in Step 7: pdd/prompts/ci_drift_heal_python.prompt
  • Prompt file fixed in Step 7: pdd/prompts/sync_determine_operation_python.prompt
  • Prompt file fixed in Step 7: pdd/prompts/sync_orchestration_python.prompt

What This PR Contains

  • Failing unit tests that reproduce the reported PR auto-heal scope bug
  • A failing E2E test that verifies the bug across the parent ci_drift_heal process and nested pdd sync subprocess boundary
  • Prompt file fixes from Step 7 that specify the intended PR-scope behavior
  • Tests verified to fail on current code and expected to pass once the bug is fixed

Root Cause

PR auto-heal reuses whole-module sync semantics without a PR-scope guard. Detection can classify Python low-coverage modules as test_extend, and heal_module() can route broad operations through plain pdd sync, allowing a child sync process to append unrelated generated tests. The fix should set and honor a PDD_DISABLE_TEST_EXTEND guard in PR mode (skip_ci=False) at both detection and execution time while leaving push-to-main (skip_ci=True) coverage growth unaffected.
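
A minimal sketch of the detection-time guard described above, not the actual `sync_determine_operation` code. The function name, its parameters, and the `"nothing"` result for a module that already meets its target are assumptions made for illustration; only the `test_extend` and `all_synced` operation names come from this PR.

```python
def operation_for_passing_module(coverage: float, target: float,
                                 test_extend_disabled: bool) -> str:
    """Pick an operation for a module whose tests already pass (hypothetical helper)."""
    if coverage >= target:
        # Assumption: a module at or above target needs no work.
        return "nothing"
    # Coverage gap: in PR mode (skip_ci=False) the guard is set, so the
    # branch degrades to a no-op instead of growing tests.
    return "all_synced" if test_extend_disabled else "test_extend"


# PR auto-heal: guard set, coverage gap is accepted as-is.
print(operation_for_passing_module(40.0, 80.0, test_extend_disabled=True))
# Push-to-main: guard unset, coverage growth still runs.
print(operation_for_passing_module(40.0, 80.0, test_extend_disabled=False))
```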

Next Steps

  1. Implement the fix at the identified location
  2. Verify the unit tests pass
  3. Verify the E2E test passes
  4. Run full test suite
  5. Mark PR as ready for review

Fixes #1403


Generated by PDD agentic bug workflow

@Serhan-Asad
Collaborator

Superseded by #1440, which carries the complete fix for #1403 (and reuses/adapts the regression tests authored here). This PR's code missed the detect_drift all_synced filter (a guarded coverage-gap module became an "unknown operation" heal failure) and left os.environ mutated; #1440 fixes both and adds the regression coverage. Recommend closing in favor of #1440.

gltanaka pushed a commit that referenced this pull request Jun 6, 2026
#1403) (#1440)

* fix(ci): scope-preserving PR auto-heal — never escalate to test_extend (#1403)

PR auto-heal was re-bloating narrow fix PRs: for a Python module whose
tests pass but coverage is below target, `sync_determine_operation`
returns `test_extend`, and `heal_module` routes verify/generate/test/crash
through `pdd sync`, which re-derives the same coverage gap internally and
appends unrelated generated tests (rewriting `.pdd/meta` command to
`test_extend`). This made narrow PRs non-mergeable (e.g. #1390).

Add a single env-var signal, `PDD_DISABLE_TEST_EXTEND`, set only in PR
auto-heal mode (`not skip_ci`) and enforced at two layers:

- Detection (`sync_determine_operation.test_extend_disabled`): the
  coverage-gap branch returns the existing `all_synced` no-op for all
  languages when the flag is set. Because this function is called by both
  the in-process `detect_drift` and the nested `pdd sync`, one branch
  covers both the detection and execution paths the issue requires.
- Execution backstop (`sync_orchestration`): mirror the existing
  non-Python `test_extend` skip — log `test_extend_skipped`, accept the
  current state, and write no test file.
- `ci_drift_heal.main` sets the flag on `os.environ` only around the
  in-process `detect_drift` call (restored in `finally`, no leak) and
  passes it explicitly in the `pdd sync` subprocess env. Push-to-main
  (`--skip-ci`) is unaffected — whole-module coverage growth still runs.
- `detect_drift` now treats `all_synced` as "no drift" (alongside
  nothing/synced) so the guarded no-op is a clean skip, not an
  "unknown operation" heal failure.

Prompts updated to match (source of truth). Regression tests prove: PR
mode suppresses + propagates the flag, push-to-main keeps test_extend,
the orchestrator never appends tests / writes a `test_extend` fingerprint
when suppressed, and an e2e run proves the parent→child env contract.
The flag is default-off, so unset behavior is byte-for-byte unchanged.
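
A minimal sketch of the parent-to-child env contract described above, assuming the flag value `"1"` and the helper names `scoped_env_flag` and `child_env` (both hypothetical; the real `ci_drift_heal.main` may be structured differently). It shows the two halves: a temporary `os.environ` mutation that is restored in `finally`, and an explicit env mapping for the nested subprocess.

```python
import os
from contextlib import contextmanager

FLAG = "PDD_DISABLE_TEST_EXTEND"


@contextmanager
def scoped_env_flag(name: str, value: str):
    """Set os.environ[name] inside the block, then restore the prior state exactly."""
    missing = object()
    previous = os.environ.get(name, missing)
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is missing:
            os.environ.pop(name, None)   # was unset before: unset it again
        else:
            os.environ[name] = previous  # was set before: put the old value back


def child_env(skip_ci: bool, base=None) -> dict:
    """Build the env mapping handed to the nested sync subprocess."""
    env = dict(os.environ if base is None else base)
    if not skip_ci:
        env[FLAG] = "1"  # assumed value; PR mode only
    return env
```

The mapping returned by `child_env` would be passed as the `env=` argument of `subprocess.run`, so the child sees the flag even though the parent's own `os.environ` has already been restored.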

Supersedes #1416 (prompts-only, code never regenerated) and #1432
(tests + partial code; missed the all_synced filter and env restore).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(ci): address review — all_synced skip after reclassification; rename helper (#1403)

Codex review round 1 found a real regression: detect_drift skipped
'all_synced' BEFORE the git-based reclassification, so a PR that changed
code without its prompt (and had a low-coverage passing run_report) was
silently dropped instead of being promoted to 'update'. This also
regressed the pre-existing non-Python all_synced coverage-gap path under
--diff-base.

- detect_drift: move the 'all_synced' no-drift skip to AFTER git
  reclassification, so an all_synced module whose code changed without its
  prompt is still promoted to 'update'; only a still-terminal all_synced
  is dropped.
- Rename helper test_extend_disabled() -> is_test_extend_disabled() so
  pytest does not collect it as a test when imported into a test module,
  and so the name reads as a predicate.
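
A minimal sketch of the ordering fix, assuming a hypothetical `resolve_operation` helper and two boolean inputs standing in for the git-based change detection. The real `detect_drift` reclassification rules are more involved; this only illustrates why the skip has to come second.

```python
from typing import Optional


def resolve_operation(operation: str, code_changed: bool,
                      prompt_changed: bool) -> Optional[str]:
    """Return the operation to heal, or None when the module has no drift."""
    # Step 1: git-based reclassification. A code-only change promotes an
    # all_synced module to 'update'.
    if operation == "all_synced" and code_changed and not prompt_changed:
        operation = "update"
    # Step 2: only a still-terminal all_synced is dropped as "no drift".
    if operation == "all_synced":
        return None
    return operation
```

Swapping the two steps reproduces the regression: the module is dropped before the code-only change can promote it.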

New regression tests:
- detect_drift: all_synced + code-only change -> update (not dropped);
  terminal all_synced -> clean skip (never an unknown-operation failure).
- is_test_extend_disabled truthiness incl. falsey (0/false/off/'') and
  whitespace; unset -> False.
- main(): os.environ flag restored even when detect_drift raises, and a
  pre-existing value is restored exactly (not clobbered).
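
A minimal sketch of a predicate with the truthiness listed above. The optional `env` parameter, the case-insensitive comparison, and the choice to treat whitespace-only values as falsey are assumptions; the merged helper may differ.

```python
import os

# Falsey spellings named in the regression tests; anything else counts as set.
_FALSEY = {"", "0", "false", "off"}


def is_test_extend_disabled(env=None) -> bool:
    """True when PDD_DISABLE_TEST_EXTEND is set to a non-falsey value."""
    env = os.environ if env is None else env
    raw = env.get("PDD_DISABLE_TEST_EXTEND")
    if raw is None:
        return False  # unset -> False
    # Assumption: surrounding whitespace and letter case are ignored.
    return raw.strip().lower() not in _FALSEY
```

The `is_` prefix matters for collection: with pytest's default settings, a module-level function whose name starts with `test` in a test module is collected as a test, which is what the rename avoids.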

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Test User <test@test.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

PR auto-heal should not escalate narrow fix PRs into unrelated test_extend coverage churn
