Richer core dumps: structured errors, sync_steps, LLM trace, ANSI strip by vishalramvelu · Pull Request #712 · promptdriven/pdd

vishalramvelu · 2026-03-20T02:44:41Z

Summary

Improves PDD debug snapshots (core dumps written to .pdd/core_dumps/pdd-core-*.json) so failures are easier to diagnose without log spelunking.

Structured errors: Non-exception failures can be recorded via record_core_dump_error and appear under errors in the dump.
Dump schema: Bumps to schema_version 2; aligns steps with invoked subcommands and defaults missing model to "unknown".
Context in the JSON: Auto-includes relevant .pdd/meta/*_sync.log / *_run.json content, derives sync_steps from sync logs (skips noise where appropriate), and keeps file_contents for tracked / meta paths.
Terminal output: Strips CSI + OSC (and related) sequences so terminal_output is readable plain text.
Sync failures: On relevant failure paths, attaches details.llm_trace (last prompt/response, redacted/truncated) and test_output_excerpt where applicable.
Budget exhaustion: Surfaces as a structured error suitable for the dump.
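To illustrate the terminal-output item above, here is a minimal sketch of CSI + OSC stripping. It is not the code in `pdd/core/cli.py`; the name `strip_ansi_sketch` and both patterns are assumptions, written from the general shape of ECMA-48 escape sequences.

```python
import re

# CSI: ESC [ + parameter bytes (0x30-0x3F) + intermediate bytes (0x20-0x2F)
# + one final byte (0x40-0x7E). Covers colors, cursor moves, line erases.
_CSI = r"\x1b\[[0-?]*[ -/]*[@-~]"
# OSC: ESC ] + payload, terminated by BEL (0x07) or ST (ESC \).
# Covers window titles and terminal hyperlinks.
_OSC = r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
_ANSI_RE = re.compile(f"{_OSC}|{_CSI}")


def strip_ansi_sketch(text: str) -> str:
    """Return text with CSI and OSC escape sequences removed (hypothetical helper)."""
    return _ANSI_RE.sub("", text)
```

A quick check such as `strip_ansi_sketch("\x1b[31mred\x1b[0m")` should give back plain `red`. Other escape families (the "and related" part of the summary) are not handled in this sketch.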

Test Results

Test_Core_Dump - Passed
Sync_Orchestration - Passed
Test_Core_Errors - Passed
Test_Cli - Passed

Manual testing

From a small fixture repo (or this repo), run a command that fails during sync (e.g. intentional test failure on fix) with core dumps enabled (default on unless --no-core-dump).
Open the newest .pdd/core_dumps/pdd-core-*.json and confirm:
schema_version is 2.
errors includes the expected structured entries when no Python traceback exists.
steps length/order matches the run; missing models show "unknown".
terminal_output has no obvious ANSI/OSC garbage.
sync_steps is present when meta sync logs exist; entries look like the failing operations.
On LLM-using failure paths, details.llm_trace appears when designed.
Run with --no-core-dump and confirm no new dump file (or no write, per existing behavior).
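The manual checks above could be partly scripted. The following is a rough sketch only: the key names come from this description, but the exact shapes (for example, that `steps` is a list of dicts with a `model` key and that `terminal_output` is a string) are assumptions, and `newest_dump` / `check_dump` are hypothetical helpers, not part of PDD.

```python
import json
from pathlib import Path
from typing import List, Optional


def newest_dump(root: str = ".pdd/core_dumps") -> Optional[Path]:
    """Return the most recently modified pdd-core-*.json under root, if any."""
    dumps = sorted(Path(root).glob("pdd-core-*.json"), key=lambda p: p.stat().st_mtime)
    return dumps[-1] if dumps else None


def check_dump(dump: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if dump.get("schema_version") != 2:
        problems.append("schema_version is not 2")
    # Assumed shape: each step is a dict carrying a "model" key.
    for i, step in enumerate(dump.get("steps") or []):
        if not step.get("model"):
            problems.append(f"steps[{i}] has no model")
    # Any remaining ESC byte suggests an escape sequence survived stripping.
    if "\x1b" in (dump.get("terminal_output") or ""):
        problems.append("terminal_output still contains escape sequences")
    return problems
```

To use it, load the file returned by `newest_dump()` with `json.loads(path.read_text())` and pass the result to `check_dump`. The `errors`, `sync_steps`, and `details.llm_trace` items still need a human look, since what counts as "expected" depends on the failure that was triggered.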

Checklist

Code changes limited to PDD CLI / sync / dump behavior (no PDD_CAP prompt files in this PR).
uv run pytest passes for the suites/commands above (or full CI-equivalent run).
No secrets or raw API keys in llm_trace / file snapshots (redaction/truncation behavior reviewed).
Manual spot-check of a real .pdd/core_dumps/*.json after a failed sync.
Prompts submitted to PDD_CAP Repo as PR

Fixes #710

Copilot

Pull request overview

This PR improves PDD “core dump” debug snapshots to make sync failures diagnosable from the JSON alone (structured non-exception errors, per-operation sync steps derived from meta logs, failure-only LLM/test traces, and ANSI/OSC stripping for captured terminal output).

Changes:

Add structured core-dump errors via record_core_dump_error and record logical/non-exception failure paths in sync.
Bump core dump schema to v2 and enrich dumps with auto-included meta artifacts plus derived sync_steps.
Capture failure-only debugging context (ANSI/OSC-clean terminal output, truncated test output excerpts, and last LLM prompt/response pair).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `pdd/core/errors.py` | Adds structured core-dump error recording API. |
| `pdd/core/dump.py` | Bumps schema to v2; auto-includes meta sync/run files; derives `sync_steps`; ensures `steps[*].model` defaults to `"unknown"`. |
| `pdd/core/cli.py` | Expands ANSI stripping to cover CSI + OSC sequences. |
| `pdd/core/llm_trace.py` | Introduces lightweight, redacted/truncated LLM prompt/response trace capture by operation. |
| `pdd/llm_invoke.py` | Records best-effort LLM traces for cloud + LiteLLM paths. |
| `pdd/sync_orchestration.py` | Records logical failures as structured core-dump errors; captures truncated test output excerpts; attaches failure-only LLM traces to operation log entries. |
| `pdd/sync_main.py` | Records budget exhaustion as a structured core-dump error. |
| `tests/test_core_dump.py` | Updates expectations for schema v2; adds tests for meta sync/run auto-inclusion, derived `sync_steps`, and model defaulting. |
| `tests/core/test_cli.py` | Adds tests covering OSC/cursor-sequence stripping. |
| `tests/test_core_errors.py` | Adds unit test for structured error recording. |
| `tests/test_sync_orchestration.py` | Adds tests validating logical-failure error recording, test output excerpt truncation, and LLM trace attachment. |
| `pdd/simple_math.py` | Adds a new module (appears unrelated to the PR's stated scope). |
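As a rough picture of what "redacted/truncated" trace capture can look like, here is a small sketch. It does not reproduce `pdd/core/llm_trace.py`; the function names, the two secret patterns, and the 2000-character default are all assumptions chosen for illustration.

```python
import re

# Assumed patterns: "api_key=<value>" style assignments and "sk-..." style tokens.
_KEY_ASSIGN_RE = re.compile(r"(?i)(api[_-]?key\s*[=:]\s*)\S+")
_TOKEN_RE = re.compile(r"sk-[A-Za-z0-9]{8,}")


def redact(text: str) -> str:
    """Mask values that look like credentials (hypothetical rules)."""
    text = _KEY_ASSIGN_RE.sub(r"\1[REDACTED]", text)
    return _TOKEN_RE.sub("[REDACTED]", text)


def truncate(text: str, max_chars: int = 2000) -> str:
    """Keep the first max_chars characters and note how many were dropped."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + f"... [truncated {len(text) - max_chars} chars]"


def make_llm_trace_sketch(prompt: str, response: str, max_chars: int = 2000) -> dict:
    """Build a trace entry; redact before truncating so a cut cannot split a secret."""
    return {
        "prompt": truncate(redact(prompt), max_chars),
        "response": truncate(redact(response), max_chars),
    }
```

Redacting before truncating matters: if the order were reversed, a key that straddles the cut point could survive as an unrecognized prefix.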

gltanaka · 2026-03-23T17:18:11Z

target 3/24

vishalramvelu · 2026-03-25T00:50:18Z

Unit tests are failing because TestConvergencePromptRequirements expects literal substrings in agentic_e2e_fix_orchestrator_python.prompt, but updating that prompt caused a merge conflict with main on the same file. Leaving the prompt as-is for now. We can align the prompt text with the Issue #903 tests in a follow-up once the branch is rebased cleanly or the file is de-conflicted upstream.

gltanaka · 2026-03-25T17:46:14Z

Hey @vishalramvelu — CI is failing because the PR's version of pdd/prompts/agentic_e2e_fix_orchestrator_python.prompt is missing the Requirement 5a/5b convergence sections ("empty dev-units short-circuit" and "per-cycle file-hash comparison") that the updated tests assert on.

The two failing tests:

test_orchestrator_prompt_contains_empty_dev_units_requirement
test_orchestrator_prompt_contains_file_hash_comparison_requirement

These requirements exist in upstream main but weren't picked up by the fork. Could you rebase onto the latest upstream main? That should bring in the prompt content and resolve the merge conflicts as well.

gltanaka · 2026-03-30T17:23:32Z

target 3/31

gltanaka

Hey @vishalramvelu — the core changes here look solid and well-tested. Just 3 files that need to be removed before we can merge:

pdd/core_dump_smoke.py: PDD-generated test file that runs `from solution import add` and `from z3 import ...`. Neither module exists in the package, so this will break imports. Same issue Copilot flagged for simple_math.py (which you removed); this one slipped through.
context/simple_math_example.py: Another PDD-generated artifact, unrelated to this PR's scope.
uv.lock: The project doesn't use uv (no references in Makefile, CI, or pyproject.toml). This looks like it was committed from your local setup. 3,637 lines we don't need.

Once those are removed, this is good to merge.

vishalramvelu · 2026-04-04T20:09:04Z

Ok just removed those files.

…ase conflict
Combine TestCommand (command + optional cwd) from main with _run_fix_operation_test_subprocess for fix-phase test capture.
Update test_sync_orchestration mocks: TestCommand instances, patch _run_fix_operation_test_subprocess where fix flow replaced subprocess.run.
Made-with: Cursor

gltanaka · 2026-04-06T17:17:55Z

target 4/7

gltanaka

All prior review feedback addressed. Verified end-to-end on an isolated cherry-pick branch against gltanaka/pdd main:

359/359 unit tests pass (core dump, errors, CLI, sync orchestration, update)
74/75 cloud batch tasks pass (1 failure is a prompt include tag assertion unrelated to this PR)
All 8 Issue #710 acceptance criteria covered and verified

LGTM - ready to merge.

gltanaka requested a review from Copilot March 20, 2026 19:11

Copilot started reviewing on behalf of gltanaka March 20, 2026 19:11

Copilot AI reviewed Mar 20, 2026

Comment thread pdd/sync_orchestration.py

Comment thread pdd/simple_math.py Outdated

Comment thread pdd/llm_invoke.py Outdated

vishalramvelu added 5 commits March 25, 2026 14:13

Richer core dumps: structured errors, sync_steps, LLM trace, ANSI strip

2480649

Fix LLM trace staleness on sync ops; remove stray pdd/simple_math; drop unused json import

6f43a78

Document Issue promptdriven#903 convergence requirements in e2e fix orchestrator prompt

1399adf

Revert orchestrator prompt convergence text to avoid merge conflict

b1daa54

Update modify flow and add core dump pro

413cd47

vishalramvelu force-pushed the feature/711 branch from 4bdbb6a to 413cd47 Compare March 25, 2026 19:15

vishalramvelu added 2 commits March 25, 2026 14:27

Align orchestrator and step11 prompts with CI

e206505

Update issue promptdriven#633 meta-test for fixed HTTP method guidance

6689f34

vishalramvelu added 2 commits April 3, 2026 17:07

Merge origin/main into feature/711

f13713b

Made-with: Cursor
# Conflicts:
# pdd/prompts/agentic_bug_step11_e2e_test_LLM.prompt
# pdd/prompts/agentic_e2e_fix_orchestrator_python.prompt
# tests/test_issue_633_reproduction.py

fix(prompt): Step 11 API mock example uses 'GET'/'POST' for test matchers

c7e0f95

Made-with: Cursor

gltanaka requested changes Apr 4, 2026

chore: drop stray core_dump_smoke, context simple_math example, and uv.lock

ebf4c52

vishalramvelu added 2 commits April 4, 2026 15:21

fix(sync): delegate fix-phase test subprocess for reliable test patching

162e573

vishalramvelu added 2 commits April 6, 2026 14:57

Merge origin/main into feature/711 and resolve prompt/csv conflicts

82f3ebc

fix(prompts): restore orchestrator prompt

5e7d533

gltanaka approved these changes Apr 9, 2026

gltanaka merged commit 21e5f0a into promptdriven:main Apr 9, 2026
1 check passed

gltanaka merged 14 commits into promptdriven:main from vishalramvelu:feature/711

3 participants
