agentic_e2e_fix: Steps 1 & 8 pass unnecessary log file to pdd fix --loop causing workflow failures #360

@Serhan-Asad

Bug Description

Steps 1 and 8 of the agentic_e2e_fix workflow (pdd fix <URL>) instruct the LLM agent to pass an unnecessary log file argument to pdd fix --manual --loop, causing pytest to fail with collection errors and the workflow to crash.

Affected Files

  1. pdd/prompts/agentic_e2e_fix_step1_unit_tests_LLM.prompt (line 41)
  2. pdd/prompts/agentic_e2e_fix_step8_run_pdd_fix_LLM.prompt (line 63)

Current Behavior

Both prompts instruct the LLM agent to run:

# Create a temp error file for pdd fix
touch /tmp/pdd_fix_errors_{{dev_unit}}.log
pdd fix --manual --loop PROMPT CODE TEST /tmp/pdd_fix_errors_{{dev_unit}}.log

What actually happens:

  1. touch creates the log file
  2. pdd fix is invoked with --loop flag
  3. In loop mode, pdd/commands/fix.py treats ALL positional args after CODE as test files:
    if loop:
        unit_test_files = args[2:]  # [TEST, /tmp/pdd_fix_errors_{{dev_unit}}.log]
        error_file = None           # ERROR_FILE is ignored in loop mode
  4. pytest attempts to collect tests from both files
  5. pytest fails to collect from the .log file
  6. Result: RuntimeError: Process exited with code 1

Evidence from Production Run

Test case: Ran pdd fix https://github.com/Serhan-Asad/pdd/issues/2

Core dump: /Users/.../fix-issue-2/.pdd/core_dumps/pdd-core-20260121T165934Z.json

Actual command executed (from core dump argv):

[
  "fix",
  "--manual",
  "--protect-tests",
  "--loop",
  "--verification-program",
  "context/math_helper_example.py",
  "--max-attempts",
  "5",
  "prompts/math_helper_python.prompt",
  "pdd/math_helper.py",
  "tests/test_math_helper.py",
  "/tmp/pdd_fix_errors_math_helper.log"
]

Error from core dump:

{
  "type": "RuntimeError",
  "message": "Process exited with code 1",
  "traceback": "RuntimeError: Process exited with code 1\n"
}

Terminal output:

Processing test file 1/2: tests/test_math_helper.py

The "1/2" shows that two files were treated as test files, the second being the log file.

Root Cause

Loop mode doesn't use the ERROR_FILE parameter: it runs the tests iteratively and captures errors internally. Because every positional argument after CODE is taken as a test file in loop mode, a trailing log file is handed to pytest as if it were a test file.

From pdd/commands/fix.py (lines 120-125):

if loop:
    unit_test_files = args[2:]  # All args after code_file
    error_file = None           # ERROR_FILE not used!
else:
    unit_test_files = args[2:-1]
    error_file = args[-1]
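
To make the effect concrete, here is a minimal sketch that copies the branch above into a standalone function and feeds it the four positional arguments from the core dump. The function name split_fix_args is hypothetical and is not part of pdd; only the slicing logic comes from the excerpt above.

```python
from typing import List, Optional, Tuple


def split_fix_args(args: List[str], loop: bool) -> Tuple[List[str], Optional[str]]:
    """Hypothetical helper mirroring the quoted branch in pdd/commands/fix.py.

    args is [PROMPT, CODE, ...rest]; returns (unit_test_files, error_file).
    """
    if loop:
        # Loop mode: everything after CODE is a test file, no error file.
        return list(args[2:]), None
    # Non-loop mode: the last positional argument is the error file.
    return list(args[2:-1]), args[-1]


# Positional arguments as they appear at the end of the core dump argv.
positionals = [
    "prompts/math_helper_python.prompt",
    "pdd/math_helper.py",
    "tests/test_math_helper.py",
    "/tmp/pdd_fix_errors_math_helper.log",
]

loop_tests, loop_error = split_fix_args(positionals, loop=True)
plain_tests, plain_error = split_fix_args(positionals, loop=False)

print("loop mode test files:", loop_tests)
print("loop mode error file:", loop_error)
print("non-loop test files: ", plain_tests)
print("non-loop error file: ", plain_error)
```

With --loop, the log file lands in the test-file list; without it, the same argument is read as ERROR_FILE, which is presumably what the prompt authors had in mind.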

Who Is Affected

ONLY affects:

  • pdd fix <URL> - The agentic_e2e_fix workflow
  • When the LLM agent executes commands based on these internal prompts

DOES NOT affect:

  • pdd fix --manual --loop ... - When users run it manually from a terminal without appending an error file
  • Manual usage in general - Users don't see or use these internal prompts

Proposed Fix

Remove the log file from both prompt templates:

Before:

# Create a temp error file for pdd fix
touch /tmp/pdd_fix_errors_{{dev_unit}}.log
pdd fix --manual --loop PROMPT CODE TEST /tmp/pdd_fix_errors_{{dev_unit}}.log

After:

pdd fix --manual --loop PROMPT CODE TEST

Rationale:

  • Loop mode runs tests automatically and captures errors internally
  • No external error file is needed
  • Including it causes pytest collection errors
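
One way to keep the prompts from regressing is a small check that scans prompt text for a pdd fix --loop command ending in a .log argument. The sketch below is an assumption about how such a check could look, not existing pdd code; find_loop_commands_with_log_arg is a hypothetical name, and the two sample strings are the Before and After snippets from this issue.

```python
from typing import List


def find_loop_commands_with_log_arg(prompt_text: str) -> List[str]:
    """Return `pdd fix ... --loop ...` lines whose last argument is a .log file.

    Hypothetical helper for a regression test; it only inspects lines that
    start with `pdd fix` and splits them on whitespace.
    """
    offenders = []
    for line in prompt_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("pdd fix"):
            continue
        tokens = stripped.split()
        if "--loop" in tokens and tokens[-1].endswith(".log"):
            offenders.append(stripped)
    return offenders


# The "Before" and "After" snippets from this issue.
before = (
    "# Create a temp error file for pdd fix\n"
    "touch /tmp/pdd_fix_errors_{{dev_unit}}.log\n"
    "pdd fix --manual --loop PROMPT CODE TEST /tmp/pdd_fix_errors_{{dev_unit}}.log\n"
)
after = "pdd fix --manual --loop PROMPT CODE TEST\n"

print(find_loop_commands_with_log_arg(before))
print(find_loop_commands_with_log_arg(after))
```

A test could read the two affected prompt files and assert that this function returns an empty list for each.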

Reproduction

We've confirmed this bug through:

  1. Manual reproduction test running pytest on a .log file (pytest exit code 4, a command-line usage error)
  2. Code analysis of pdd/commands/fix.py showing loop mode behavior
  3. Full workflow test showing actual command execution and failure
  4. Core dump analysis proving the bug manifests in production
