22 changes: 11 additions & 11 deletions README.md
@@ -1256,10 +1256,10 @@ Arguments:
 - `CODE_FILE`: The filename of the code file to be tested.

 Options:
-- `--output LOCATION`: Specify where to save the generated test file. The default file name is `test_<basename>.<language_file_extension>`.
+- `--output LOCATION`: Specify where to save the generated test file. The default file name is `test_<basename>.<language_file_extension>`. If an output file with the specified name already exists, a new file with a numbered suffix (e.g., `test_calculator_1.py`) will be created instead of overwriting.
 - `--language`: Specify the programming language. Defaults to the language specified by the prompt file name.
 - `--coverage-report PATH`: Path to the coverage report file for existing tests. When provided, generates additional tests to improve coverage.
-- `--existing-tests PATH`: Path to the existing unit test file. Required when using --coverage-report.
+- `--existing-tests PATH [PATH...]`: Path(s) to the existing unit test file(s). Required when using --coverage-report. Multiple paths can be provided by repeating the option.
 - `--target-coverage FLOAT`: Desired code coverage percentage to achieve (default is 90.0).
 - `--merge`: When used with --existing-tests, merges new tests with existing test file instead of creating a separate file.

@@ -1288,9 +1288,9 @@ could influence the output of the `pdd test` command when run in the same direct
 pdd [GLOBAL OPTIONS] test --output tests/test_factorial_calculator.py factorial_calculator_python.prompt src/factorial_calculator.py
 ```

-2. Generate additional tests to improve coverage:
+2. Generate additional tests to improve coverage (with multiple existing test files):
 ```
-pdd [GLOBAL OPTIONS] test --coverage-report coverage.xml --existing-tests tests/test_calculator.py --output tests/test_calculator_enhanced.py calculator_python.prompt src/calculator.py
+pdd [GLOBAL OPTIONS] test --coverage-report coverage.xml --existing-tests tests/test_calculator.py --existing-tests tests/test_calculator_edge_cases.py --output tests/test_calculator_enhanced.py calculator_python.prompt src/calculator.py
 ```

 3. Improve coverage and merge with existing tests:
@@ -1407,11 +1407,11 @@ pdd [GLOBAL OPTIONS] fix [OPTIONS] PROMPT_FILE CODE_FILE UNIT_TEST_FILE ERROR_FI
 Arguments:
 - `PROMPT_FILE`: The filename of the prompt file that generated the code under test.
 - `CODE_FILE`: The filename of the code file to be fixed.
-- `UNIT_TEST_FILE`: The filename of the unit test file.
+- `UNIT_TEST_FILES`: The filename(s) of the unit test file(s). Multiple files can be provided, and each will be processed individually.
 - `ERROR_FILE`: The filename containing the unit test runtime error messages. Optional and does not need to exist when used with the `--loop` option.

 Options:
-- `--output-test LOCATION`: Specify where to save the fixed unit test file. The default file name is `test_<basename>_fixed.<language_file_extension>`. If an environment variable `PDD_FIX_TEST_OUTPUT_PATH` is set, the file will be saved in that path unless overridden by this option.
+- `--output-test LOCATION`: Specify where to save the fixed unit test file. The default file name is `test_<basename>_fixed.<language_file_extension>`. **Warning: If multiple `UNIT_TEST_FILES` are provided along with this option, only the fixed content of the last processed test file will be saved to this location, overwriting previous results. For individual fixed files, omit this option.**
 - `--output-code LOCATION`: Specify where to save the fixed code file. The default file name is `<basename>_fixed.<language_file_extension>`. If an environment variable `PDD_FIX_CODE_OUTPUT_PATH` is set, the file will be saved in that path unless overridden by this option.
 - `--output-results LOCATION`: Specify where to save the results of the error fixing process. The default file name is `<basename>_fix_results.log`. If an environment variable `PDD_FIX_RESULTS_OUTPUT_PATH` is set, the file will be saved in that path unless overridden by this option.
 - `--loop`: Enable iterative fixing process.
@@ -1423,8 +1423,8 @@ Options:
 When the `--loop` option is used, the fix command will attempt to fix errors through multiple iterations. It will use the specified verification program to check if the code runs correctly after each fix attempt. The process will continue until either the errors are fixed, the maximum number of attempts is reached, or the budget is exhausted.

 Outputs:
-- Fixed unit test file
-- Fixed code file
+- Fixed unit test file(s).
+- Fixed code file.
 - Results file containing the LLM model's output with unit test results.
 - Printout of results when using `--loop` containing:
   - Success status (boolean)
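
The `--loop` paragraph in the hunk above names three stopping conditions: the errors are fixed, the attempt limit is reached, or the budget is exhausted. A minimal sketch of that control flow follows. It is not pdd's implementation: `run_fix_loop`, `LoopOutcome`, `attempt_fix`, and `verify` are hypothetical names, and checking the budget before each attempt (rather than after) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopOutcome:
    """Hypothetical summary of one iterative fixing run."""
    success: bool
    attempts: int
    total_cost: float


def run_fix_loop(
    attempt_fix: Callable[[], float],  # hypothetical: applies one fix, returns its cost
    verify: Callable[[], bool],        # hypothetical: True once the code runs correctly
    max_attempts: int,
    budget: float,
) -> LoopOutcome:
    attempts = 0
    total_cost = 0.0
    # Keep trying while both the attempt limit and the budget allow it.
    while attempts < max_attempts and total_cost < budget:
        total_cost += attempt_fix()
        attempts += 1
        # Stop as soon as the verification step passes.
        if verify():
            return LoopOutcome(True, attempts, total_cost)
    return LoopOutcome(False, attempts, total_cost)
```

Passing the fix and verification steps in as callables keeps the stopping logic testable without an LLM call or a real verification program.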
@@ -1434,9 +1434,9 @@ Outputs:

 Example:
 ```
-pdd [GLOBAL OPTIONS] fix --output-test tests/test_factorial_calculator_fixed.py --output-code src/factorial_calculator_fixed.py --output-results results/factorial_fix_results.log factorial_calculator_python.prompt src/factorial_calculator.py tests/test_factorial_calculator.py errors.log
+pdd [GLOBAL OPTIONS] fix --output-code src/factorial_calculator_fixed.py --output-results results/factorial_fix_results.log factorial_calculator_python.prompt src/factorial_calculator.py tests/test_factorial_calculator.py tests/test_factorial_calculator_edge_cases.py errors.log
 ```
-In this example, `factorial_calculator_python.prompt` is the prompt file that originally generated the code under test.
+In this example, `pdd fix` will be run for each test file, and the fixed test files will be saved as `tests/test_factorial_calculator_fixed.py` and `tests/test_factorial_calculator_edge_cases_fixed.py`.

 ### 7. split

@@ -1638,7 +1638,7 @@ Arguments:
 - `DESIRED_OUTPUT_FILE`: File containing the desired (correct) output of the program.

 Options:
-- `--output LOCATION`: Specify where to save the generated unit test. The default file name is `test_<basename>_bug.<language_extension>`.
+- `--output LOCATION`: Specify where to save the generated unit test. The default file name is `test_<basename>_bug.<language_extension>`. If an output file with the specified name already exists, a new file with a numbered suffix (e.g., `test_calculator_bug_1.py`) will be created instead of overwriting.
 - `--language`: Specify the programming language for the unit test (default is "Python").

 Example:
66 changes: 38 additions & 28 deletions pdd/cli.py
@@ -776,9 +776,11 @@ def example(
 )
 @click.option(
     "--existing-tests",
+    "existing_tests",
+    multiple=True,
     type=click.Path(exists=True, dir_okay=False),
     default=None,
-    help="Path to the existing unit test file.",
+    help="Path to the existing unit test file(s).",
 )
 @click.option(
     "--target-coverage",
@@ -801,7 +803,7 @@ def test(
     output: Optional[str],
     language: Optional[str],
     coverage_report: Optional[str],
-    existing_tests: Optional[str],
+    existing_tests: Optional[Tuple[str, ...]],
     target_coverage: Optional[float],
     merge: bool,
 ) -> Optional[Tuple[str, float, str]]:
@@ -814,7 +816,7 @@ def test(
             output=output,
             language=language,
             coverage_report=coverage_report,
-            existing_tests=existing_tests,
+            existing_tests=list(existing_tests) if existing_tests else None,
             target_coverage=target_coverage,
             merge=merge,
         )
@@ -897,7 +899,7 @@ def preprocess(
 @cli.command("fix")
 @click.argument("prompt_file", type=click.Path(exists=True, dir_okay=False))
 @click.argument("code_file", type=click.Path(exists=True, dir_okay=False))
-@click.argument("unit_test_file", type=click.Path(exists=True, dir_okay=False))
+@click.argument("unit_test_files", nargs=-1, type=click.Path(exists=True, dir_okay=False))
 @click.argument("error_file", type=click.Path(dir_okay=False)) # Allow non-existent for loop mode
 @click.option(
     "--output-test",
@@ -955,7 +957,7 @@ def fix(
     ctx: click.Context,
     prompt_file: str,
     code_file: str,
-    unit_test_file: str,
+    unit_test_files: Tuple[str, ...],
     error_file: str,
     output_test: Optional[str],
     output_code: Optional[str],
@@ -968,29 +970,37 @@ def fix(
 ) -> Optional[Tuple[Dict[str, Any], float, str]]:
     """Fix code based on a prompt and unit test errors."""
     try:
-        # The actual logic is in fix_main
-        success, fixed_unit_test, fixed_code, attempts, total_cost, model_name = fix_main(
-            ctx=ctx,
-            prompt_file=prompt_file,
-            code_file=code_file,
-            unit_test_file=unit_test_file,
-            error_file=error_file,
-            output_test=output_test,
-            output_code=output_code,
-            output_results=output_results,
-            loop=loop,
-            verification_program=verification_program,
-            max_attempts=max_attempts,
-            budget=budget,
-            auto_submit=auto_submit,
-        )
-        result = {
-            "success": success,
-            "fixed_unit_test": fixed_unit_test,
-            "fixed_code": fixed_code,
-            "attempts": attempts,
-        }
-        return result, total_cost, model_name
+        all_results = []
+        total_cost = 0.0
+        model_name = ""
+
+        for unit_test_file in unit_test_files:
+            success, fixed_unit_test, fixed_code, attempts, cost, model = fix_main(
+                ctx=ctx,
+                prompt_file=prompt_file,
+                code_file=code_file,
+                unit_test_file=unit_test_file,
+                error_file=error_file,
+                output_test=output_test,
+                output_code=output_code,
+                output_results=output_results,
+                loop=loop,
+                verification_program=verification_program,
+                max_attempts=max_attempts,
+                budget=budget,
+                auto_submit=auto_submit,
+            )
+            all_results.append({
+                "success": success,
+                "fixed_unit_test": fixed_unit_test,
+                "fixed_code": fixed_code,
+                "attempts": attempts,
+            })
+            total_cost += cost
+            model_name = model
+
+        return {"results": all_results}, total_cost, model_name
+
     except Exception as exception:
         handle_error(exception, "fix", ctx.obj.get("quiet", False))
         return None
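
The aggregation pattern in the hunk above can be exercised on its own. The sketch below mirrors the loop but swaps `fix_main` for an injected callable; `fix_each`, `stub_fix`, and the `FixResult` alias are hypothetical names introduced for illustration only.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

# (success, fixed_unit_test, fixed_code, attempts, cost, model), as unpacked in the diff
FixResult = Tuple[bool, str, str, int, float, str]


def fix_each(
    unit_test_files: Sequence[str],
    fix_one: Callable[[str], FixResult],  # hypothetical stand-in for fix_main
) -> Tuple[Dict[str, List[Dict[str, Any]]], float, str]:
    all_results: List[Dict[str, Any]] = []
    total_cost = 0.0
    model_name = ""
    for path in unit_test_files:
        success, fixed_test, fixed_code, attempts, cost, model = fix_one(path)
        all_results.append({
            "success": success,
            "fixed_unit_test": fixed_test,
            "fixed_code": fixed_code,
            "attempts": attempts,
        })
        total_cost += cost   # costs accumulate across files
        model_name = model   # only the last file's model name is kept
    return {"results": all_results}, total_cost, model_name


def stub_fix(path: str) -> FixResult:
    """Hypothetical fixer that pretends every file was fixed in one attempt."""
    return True, f"# fixed {path}", "# fixed code", 1, 0.25, "stub-model"
```

Two things follow from this shape: the returned dictionary now nests per-file results under `"results"` instead of exposing `success` and the other keys at the top level, and an empty `unit_test_files` tuple yields an empty result list with zero cost and an empty model name.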
38 changes: 34 additions & 4 deletions pdd/cmd_test_main.py
@@ -20,7 +20,7 @@ def cmd_test_main(
     output: str | None,
     language: str | None,
     coverage_report: str | None,
-    existing_tests: str | None,
+    existing_tests: list[str] | None,
     target_coverage: float | None,
     merge: bool | None,
 ) -> tuple[str, float, str]:
@@ -37,7 +37,7 @@ def cmd_test_main(
         output (str | None): Path to save the generated test file.
         language (str | None): Programming language.
         coverage_report (str | None): Path to the coverage report file.
-        existing_tests (str | None): Path to the existing unit test file.
+        existing_tests (list[str] | None): Paths to the existing unit test files.
         target_coverage (float | None): Desired code coverage percentage.
         merge (bool | None): Whether to merge new tests with existing tests.

@@ -73,7 +73,7 @@ def cmd_test_main(
         if coverage_report:
             input_file_paths["coverage_report"] = coverage_report
         if existing_tests:
-            input_file_paths["existing_tests"] = existing_tests
+            input_file_paths["existing_tests"] = existing_tests[0]

         command_options = {
             "output": output,
@@ -90,6 +90,14 @@ def cmd_test_main(
             command_options=command_options,
             context_override=ctx.obj.get('context')
         )
+
+        if existing_tests:
+            existing_tests_content = ""
+            for test_file in existing_tests:
+                with open(test_file, 'r') as f:
+                    existing_tests_content += f.read() + "\n"
+            input_strings["existing_tests"] = existing_tests_content
+
     except Exception as exception:
         # Catching a general exception is necessary here to handle a wide range of
         # potential errors during file I/O and path construction, ensuring the
@@ -115,7 +123,7 @@ def cmd_test_main(
         else:
             output_file = output
     if merge and existing_tests:
-        output_file = existing_tests
+        output_file = existing_tests[0]

     if not output_file:
         print("[bold red]Error: Output file path could not be determined.[/bold red]")
@@ -176,6 +184,28 @@ def cmd_test_main(
         ctx.exit(1)
         return "", 0.0, ""

+    # Handle output - if output is a directory, use resolved file path from construct_paths
+    resolved_output = output_file_paths["output"]
+    if output is None:
+        output_file = resolved_output
+    else:
+        try:
+            is_dir_hint = output.endswith('/')
+        except Exception:
+            is_dir_hint = False
+        # Prefer resolved file if user passed a directory path
+        if is_dir_hint or (Path(output).exists() and Path(output).is_dir()):
+            output_file = resolved_output
+        else:
+            output_file = output
+    if merge and existing_tests:
+        output_file = existing_tests[0] if existing_tests else None
+
+    if not output_file:
+        print("[bold red]Error: Output file path could not be determined.[/bold red]")
+        ctx.exit(1)
+        return "", 0.0, ""
+
     # Check if unit_test content is empty
     if not unit_test or not unit_test.strip():
         print(f"[bold red]Error: Generated unit test content is empty or whitespace-only.[/bold red]")
89 changes: 59 additions & 30 deletions pdd/construct_paths.py
@@ -265,6 +265,24 @@ def _extract_basename(
     """
     Deduce the project basename according to the rules explained in *Step A*.
     """
+    # Handle 'fix' command specifically to create a unique basename per test file
+    if command == "fix":
+        prompt_path = _candidate_prompt_path(input_file_paths)
+        if not prompt_path:
+            raise ValueError("Could not determine prompt file for 'fix' command.")
+
+        prompt_basename = _strip_language_suffix(prompt_path)
+
+        unit_test_path = input_file_paths.get("unit_test_file")
+        if not unit_test_path:
+            # Fallback to just the prompt basename if no unit test file is provided
+            # This might happen in some edge cases, but 'fix' command structure requires it
+            return prompt_basename
+
+        # Use the stem of the unit test file to make the basename unique
+        test_basename = Path(unit_test_path).stem
+        return f"{prompt_basename}_{test_basename}"
+
     # Handle conflicts first due to its unique structure
     if command == "conflicts":
         key1 = "prompt1"
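
To see what basename the new `fix` branch produces, the sketch below reimplements it in isolation. The diff does not show `_strip_language_suffix`, so `strip_language_suffix` here is a simplified, hypothetical stand-in that drops the `.prompt` extension and a trailing `_<language>` taken from a small assumed list; the real helper may behave differently.

```python
from pathlib import Path
from typing import Optional

# Assumption: a tiny illustrative language list, not pdd's actual table.
KNOWN_LANGUAGES = ("python", "javascript", "java")


def strip_language_suffix(prompt_path: str) -> str:
    """Hypothetical stand-in: 'foo_python.prompt' -> 'foo'."""
    stem = Path(prompt_path).stem  # drops the '.prompt' extension
    for lang in KNOWN_LANGUAGES:
        suffix = f"_{lang}"
        if stem.endswith(suffix):
            return stem[: -len(suffix)]
    return stem


def fix_basename(prompt_path: str, unit_test_path: Optional[str] = None) -> str:
    """Mirror of the 'fix' branch: prompt basename plus the test file's stem."""
    prompt_basename = strip_language_suffix(prompt_path)
    if not unit_test_path:
        return prompt_basename
    return f"{prompt_basename}_{Path(unit_test_path).stem}"


print(fix_basename("factorial_calculator_python.prompt",
                   "tests/test_factorial_calculator.py"))
# prints "factorial_calculator_test_factorial_calculator"
```

Because the test file's stem is appended, two test files for the same prompt get distinct basenames, which is what lets default output names differ per test file.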
@@ -724,36 +742,47 @@ def construct_paths(
         raise # Re-raise the ValueError

     # ------------- Step 4: overwrite confirmation ------------
-    # Check if any output *file* exists (operate on Path objects)
-    existing_files: Dict[str, Path] = {}
-    for k, p_obj in output_paths_resolved.items():
-        # p_obj = Path(p_val) # Conversion now happens earlier
-        if p_obj.is_file():
-            existing_files[k] = p_obj # Store the Path object
-
-    if existing_files and not force:
-        if not quiet:
-            # Use the Path objects stored in existing_files for resolve()
-            # Print without Rich tags for easier testing
-            paths_list = "\n".join(f" • {p.resolve()}" for p in existing_files.values())
-            console.print(
-                f"Warning: The following output files already exist and may be overwritten:\n{paths_list}",
-                style="warning"
-            )
-        # Use click.confirm for user interaction
-        try:
-            if not click.confirm(
-                click.style("Overwrite existing files?", fg="yellow"), default=True, show_default=True
-            ):
-                click.secho("Operation cancelled.", fg="red", err=True)
-                sys.exit(1) # Exit if user chooses not to overwrite
-        except Exception as e: # Catch potential errors during confirm (like EOFError in non-interactive)
-            if 'EOF' in str(e) or 'end-of-file' in str(e).lower():
-                # Non-interactive environment, default to not overwriting
-                click.secho("Non-interactive environment detected. Use --force to overwrite existing files.", fg="yellow", err=True)
-            else:
-                click.secho(f"Confirmation failed: {e}. Aborting.", fg="red", err=True)
-            sys.exit(1)
+    if command in ["test", "bug"] and not force:
+        for key, path in output_paths_resolved.items():
+            if path.is_file():
+                base, ext = os.path.splitext(path)
+                i = 1
+                new_path = Path(f"{base}_{i}{ext}")
+                while new_path.exists():
+                    i += 1
+                    new_path = Path(f"{base}_{i}{ext}")
+                output_paths_resolved[key] = new_path
+    else:
+        # Check if any output *file* exists (operate on Path objects)
+        existing_files: Dict[str, Path] = {}
+        for k, p_obj in output_paths_resolved.items():
+            # p_obj = Path(p_val) # Conversion now happens earlier
+            if p_obj.is_file():
+                existing_files[k] = p_obj # Store the Path object
+
+        if existing_files and not force:
+            if not quiet:
+                # Use the Path objects stored in existing_files for resolve()
+                # Print without Rich tags for easier testing
+                paths_list = "\n".join(f" • {p.resolve()}" for p in existing_files.values())
+                console.print(
+                    f"Warning: The following output files already exist and may be overwritten:\n{paths_list}",
+                    style="warning"
+                )
+            # Use click.confirm for user interaction
+            try:
+                if not click.confirm(
+                    click.style("Overwrite existing files?", fg="yellow"), default=True, show_default=True
+                ):
+                    click.secho("Operation cancelled.", fg="red", err=True)
+                    sys.exit(1) # Exit if user chooses not to overwrite
+            except Exception as e: # Catch potential errors during confirm (like EOFError in non-interactive)
+                if 'EOF' in str(e) or 'end-of-file' in str(e).lower():
+                    # Non-interactive environment, default to not overwriting
+                    click.secho("Non-interactive environment detected. Use --force to overwrite existing files.", fg="yellow", err=True)
+                else:
+                    click.secho(f"Confirmation failed: {e}. Aborting.", fg="red", err=True)
+                sys.exit(1)


     # ------------- Final reporting ---------------------------
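
The numbered-suffix branch added for the `test` and `bug` commands can be tried outside `construct_paths`. The sketch below uses `pathlib` instead of `os.path.splitext`, which gives the same split for ordinary file names; `next_free_path` is a hypothetical helper name, not a function in the repository.

```python
from pathlib import Path


def next_free_path(path: Path) -> Path:
    """Return `path` if no file is there, else the first free `<stem>_<i><suffix>`."""
    if not path.is_file():
        return path
    i = 1
    candidate = path.with_name(f"{path.stem}_{i}{path.suffix}")
    # Keep counting until a name is found that does not exist yet.
    while candidate.exists():
        i += 1
        candidate = path.with_name(f"{path.stem}_{i}{path.suffix}")
    return candidate
```

With `test_calculator.py` already on disk this returns `test_calculator_1.py`, then `test_calculator_2.py` once the first suffix is taken. Like the code in the diff, the check and the later write are separate steps, so two concurrent runs could still pick the same name.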
3 changes: 2 additions & 1 deletion pdd/fix_errors_from_unit_tests.py
@@ -114,7 +114,8 @@ def fix_errors_from_unit_tests(
     Fix errors in unit tests using LLM models and log the process.

     Args:
-        unit_test (str): The unit test code
+        unit_test (str): The unit test code, potentially multiple files concatenated
+            with <file name="filename.py">...</file> tags.
         code (str): The code under test
         prompt (str): The prompt that generated the code
         error (str): The error message