Debug snapshots (core dumps) lack actionable diagnostic information for debugging failures

## Problem

When a PDD command fails, the debug snapshot saved to `.pdd/core_dumps/` doesn't contain enough information to diagnose what went wrong. The current message says "attach when reporting bugs," but the snapshot often lacks the detail needed to actually debug the issue.

I think this is high priority, not only to help developers debug failures, but also to brainstorm improvements to the tool.

#### Example: `pdd sync summarize_directory` fails with "5 consecutive fix operations" loop

The core dump shows:
- `"errors": []` — empty, even though the command failed
- `"steps": [{"step": 1, "command": "sync", "cost": 0.74, "model": ""}]` — one opaque entry, no per-attempt breakdown
- No indication of *which* tests failed or *why* generated code was rejected
- No LLM prompts/responses captured

## What's missing

| Gap | Impact |
|-----|--------|
| **Workflow failures not captured in `errors`** | `errors` only logs Python exceptions, not logical failures (fix loops, test failures, budget exhaustion). A "Failed" sync produces `errors: []`. |
| **No per-step breakdown for compound commands** | `sync` internally runs generate→test→fix→test→... but only records one step. Users can't see which sub-operation failed. |
| **Empty `model` field** | The model used is recorded as `""`, making it impossible to know which LLM was involved. |
| **No LLM request/response pairs** | The actual prompts sent to and responses received from the LLM are not captured. This is the single most important diagnostic for generation failures. |
| **No test output per attempt** | When tests fail and trigger fix loops, the test stderr/assertion messages aren't included. |
| **Operation log not bundled** | PDD already writes rich per-operation logs to `.pdd/meta/{basename}_{lang}_sync.log` (with timestamps, costs, success/failure per operation), but this file isn't included in `file_contents`. |
| **Raw ANSI escapes in `terminal_output`** | Makes the output unreadable when reviewing the JSON manually. |
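
For the ANSI row above, a minimal sketch of the cleanup step, assuming `terminal_output` is available as a plain string before the snapshot is written. `strip_ansi` is a hypothetical helper name, not an existing PDD function, and the pattern covers only CSI and OSC sequences:

```python
import re

# CSI sequences (colors, cursor movement) and OSC sequences (titles, hyperlinks).
ANSI_ESCAPE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"              # CSI: parameters, intermediates, final byte
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"   # OSC: terminated by BEL or ESC-backslash
)


def strip_ansi(text: str) -> str:
    """Return text with ANSI escape sequences removed (hypothetical helper)."""
    return ANSI_ESCAPE.sub("", text)


if __name__ == "__main__":
    print(strip_ansi("\x1b[31mFailed\x1b[0m after 5 fix attempts"))
```

If the terminal output is produced by a rendering library, disabling color at capture time may be simpler than stripping it afterwards.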

## Proposed improvements

1. **Capture LLM request/response pairs** — at minimum, log the final prompt sent and the raw LLM response for each failed operation. This is critical for diagnosing "the LLM keeps generating broken code" scenarios. 
2. **Include the operation sync log** in `file_contents`. This file already exists and contains per-operation success/failure/cost/model data. Also record how long each operation took, so developers can see which operations are worth optimizing for speed.
3. **Strip ANSI escapes** from `terminal_output` before saving
4. **Record logical failures** in `errors` — if a sync/fix loop terminates due to max retries, budget exhaustion, or other non-exception failures, log them as structured error entries
5. **Populate the `model` field** in step records
6. **Expand `steps` for sync** — record each internal operation (generate, test, fix) as its own step entry with: operation type, model, cost, success/failure, and a summary of the failure reason
7. **Capture test output** — include the test runner's stderr/stdout for each failed test attempt (truncated to ~5KB)
8. **Include generated code diffs** — show what changed between fix attempts
9. **Include the `.pddrc` and `llm_model.csv` configs** used for the specific run (if they are not already included)
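
Items 4 to 6 could be sketched roughly as follows. This is an illustration only: `StepRecord`, `Snapshot`, and every field other than `step`, `command`, `cost`, `model`, `steps`, and `errors` are hypothetical names, not the current core dump schema, and the model name is a placeholder.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class StepRecord:
    step: int
    command: str                  # parent command, e.g. "sync"
    operation: str                # hypothetical: "generate", "test", or "fix"
    model: str                    # populated instead of ""
    cost: float
    duration_s: float             # hypothetical: wall-clock time of the operation
    success: bool
    failure_reason: Optional[str] = None


@dataclass
class Snapshot:
    steps: list = field(default_factory=list)
    errors: list = field(default_factory=list)

    def record_step(self, **kwargs) -> StepRecord:
        """Append one entry per internal operation, numbered in order."""
        record = StepRecord(step=len(self.steps) + 1, **kwargs)
        self.steps.append(record)
        return record

    def record_logical_failure(self, kind: str, message: str) -> None:
        """Log a non-exception failure (max retries, budget exhaustion, ...)."""
        self.errors.append(
            {"type": "logical_failure", "kind": kind, "message": message}
        )

    def to_json(self) -> str:
        payload = {"steps": [asdict(s) for s in self.steps], "errors": self.errors}
        return json.dumps(payload, indent=2)


# Usage: a sync run that generates code, fails a test, then gives up.
snapshot = Snapshot()
snapshot.record_step(command="sync", operation="generate", model="example-model",
                     cost=0.12, duration_s=8.4, success=True)
snapshot.record_step(command="sync", operation="test", model="",
                     cost=0.0, duration_s=2.1, success=False,
                     failure_reason="1 test failed")
snapshot.record_logical_failure("max_retries", "5 consecutive fix operations")

if __name__ == "__main__":
    print(snapshot.to_json())
```

With entries like these, a "Failed" sync would no longer produce `errors: []`, and the failing sub-operation would be visible in `steps`.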

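For items 7 and 8, a rough sketch using only the standard library. `truncate_output` and `attempt_diff` are hypothetical helper names. Keeping the tail of the output rather than the head is an assumption on my part (failure summaries tend to appear at the end of test runner output), so the cut point may need adjusting.

```python
import difflib

MAX_OUTPUT_BYTES = 5 * 1024  # the ~5KB cap suggested in item 7


def truncate_output(output: str, limit: int = MAX_OUTPUT_BYTES) -> str:
    """Keep at most the last `limit` bytes of test output (hypothetical helper)."""
    data = output.encode("utf-8")
    if len(data) <= limit:
        return output
    # errors="ignore" drops a multi-byte character split by the cut.
    tail = data[-limit:].decode("utf-8", errors="ignore")
    return f"[truncated {len(data) - limit} bytes]\n{tail}"


def attempt_diff(before: str, after: str, name: str = "generated_code") -> str:
    """Unified diff of the generated code between two consecutive fix attempts."""
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"{name} (previous attempt)",
        tofile=f"{name} (next attempt)",
    )
    return "".join(lines)


if __name__ == "__main__":
    print(truncate_output("x" * 6000)[:40])
    print(attempt_diff("total = a - b\n", "total = a + b\n"))
```

An empty diff between two attempts would itself be a useful signal, since it would show the fix loop regenerating identical code.
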
## Context

Related: #230 (add LLM model CSV to core dump), #391 (confusing error output in core dumps)
