Fix Agent Harness diff capture and validator workdirs#474
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR fixes three harness-side issues in the Agent Harness execution workflow: it stages untracked files via
Confidence Score: 4/5Safe to merge for harnesses with a configured repository URL; no-repo harness configs will now fail at the diff step where they previously succeeded. One P1 behavioral regression: the strict exit-code check on backend/internal/workflow/agent_harness_execution_workflow.go — the
|
| Filename | Overview |
|---|---|
| backend/internal/workflow/agent_harness_execution_workflow.go | Adds git add --intent-to-add before diff capture and agentHarnessValidatorWorkdir helper; uses path instead of path/filepath, and the strict exit-code check on git add regresses no-repo harness configs. |
| backend/internal/workflow/agent_harness_execution_workflow_test.go | Tests updated to expect bash instead of sh, assert ordering of git add --intent-to-add before git diff, and verify relative workdir resolution; new TestAgentHarnessValidatorWorkdir table-driven test added — good coverage of happy paths. |
| testing/fix-agent-harness-diff-and-validator-workdir.md | New test-contract documentation describing functional behavior, unit tests, integration tests, and smoke tests; no code changes. |
Sequence Diagram
sequenceDiagram
participant W as Workflow
participant S as Sandbox
participant R as Repository
W->>S: git clone + checkout (if RepositoryURL set)
W->>S: codex exec --full-auto
Note over W,S: NEW: capture untracked files
W->>S: git add --intent-to-add --all
S-->>W: exit code (hard fail if != 0)
W->>S: git diff --binary
S-->>W: binary diff artifact
W->>S: git status --short
S-->>W: changed files artifact
W->>R: TransitionStatus → scoring
loop each command validator
Note over W,S: NEW: relative workdir resolved under repo root
Note over W,S: NEW: bash -lc instead of sh -lc
W->>S: bash -lc command (workdir = resolved)
S-->>W: exit code + output
W->>R: record validator.command.passed/failed
end
W->>R: record scoring.completed
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
backend/internal/workflow/agent_harness_execution_workflow.go:219-223
**Hard failure on `git add --intent-to-add` regresses no-repo harness configs**
When `RepositoryURL` is unset, `workdir` is set to `"/"` (line 207). Running `git add --intent-to-add --all` in `"/"` returns a non-zero exit code ("not a git repository"), which now causes the whole execution to return an error via the strict `result.ExitCode != 0` check here.
Before this PR, `git diff` also ran in `"/"` in that scenario but its exit code was **not** checked (the result was simply recorded and execution continued). The new check creates an asymmetry: no-repo harnesses that previously completed now fail at this step. Consider guarding the `git add --intent-to-add` call behind the same `RepositoryURL != nil` condition that gates the clone step, or skipping diff capture entirely when no repository is configured.
### Issue 2 of 3
backend/internal/workflow/agent_harness_execution_workflow.go:403-406
**Use `path/filepath` for OS filesystem paths**
`path` is designed for URL-style slash-separated paths, while `path/filepath` handles OS-specific filesystem paths. On Linux (where E2B runs) the behaviour is identical, but the idiomatic package for sandbox filesystem operations is `path/filepath`. Swap the import to avoid subtle issues if the host OS or test environment ever differs.
```suggestion
if filepath.IsAbs(workdir) {
return filepath.Clean(workdir)
}
return filepath.Join(defaultWorkdir, workdir)
```
### Issue 3 of 3
backend/internal/workflow/agent_harness_execution_workflow.go:368-369
**Redundant `TrimSpace` before `agentHarnessValidatorWorkdir`**
`validator.WorkingDirectory` is trimmed here and then trimmed again inside `agentHarnessValidatorWorkdir` (line 399). The outer trim is redundant; the helper already handles whitespace-only inputs by returning `defaultWorkdir`. Consider removing the pre-trim and letting the helper be the single source of truth.
```suggestion
workdir := agentHarnessValidatorWorkdir(defaultWorkdir, validator.WorkingDirectory)
```
Reviews (1): Last reviewed commit: "fix(agent-harness): capture new files an..." | Re-trigger Greptile
| if result, err := a.execHarnessCommand(ctx, execution.ID, session, "git.add_intent", []string{"git", "add", "--intent-to-add", "--all"}, workdir, 60*time.Second, env); err != nil { | ||
| return err | ||
| } else if result.ExitCode != 0 { | ||
| return fmt.Errorf("git add intent failed with exit code %d", result.ExitCode) | ||
| } |
There was a problem hiding this comment.
Hard failure on
git add --intent-to-add regresses no-repo harness configs
When RepositoryURL is unset, workdir is set to "/" (line 207). Running git add --intent-to-add --all in "/" returns a non-zero exit code ("not a git repository"), which now causes the whole execution to return an error via the strict result.ExitCode != 0 check here.
Before this PR, git diff also ran in "/" in that scenario but its exit code was not checked (the result was simply recorded and execution continued). The new check creates an asymmetry: no-repo harnesses that previously completed now fail at this step. Consider guarding the git add --intent-to-add call behind the same RepositoryURL != nil condition that gates the clone step, or skipping diff capture entirely when no repository is configured.
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/internal/workflow/agent_harness_execution_workflow.go
Line: 219-223
Comment:
**Hard failure on `git add --intent-to-add` regresses no-repo harness configs**
When `RepositoryURL` is unset, `workdir` is set to `"/"` (line 207). Running `git add --intent-to-add --all` in `"/"` returns a non-zero exit code ("not a git repository"), which now causes the whole execution to return an error via the strict `result.ExitCode != 0` check here.
Before this PR, `git diff` also ran in `"/"` in that scenario but its exit code was **not** checked (the result was simply recorded and execution continued). The new check creates an asymmetry: no-repo harnesses that previously completed now fail at this step. Consider guarding the `git add --intent-to-add` call behind the same `RepositoryURL != nil` condition that gates the clone step, or skipping diff capture entirely when no repository is configured.
How can I resolve this? If you propose a fix, please make it concise.| if path.IsAbs(workdir) { | ||
| return path.Clean(workdir) | ||
| } | ||
| return path.Join(defaultWorkdir, workdir) |
There was a problem hiding this comment.
Use
path/filepath for OS filesystem paths
path is designed for URL-style slash-separated paths, while path/filepath handles OS-specific filesystem paths. On Linux (where E2B runs) the behaviour is identical, but the idiomatic package for sandbox filesystem operations is path/filepath. Swap the import to avoid subtle issues if the host OS or test environment ever differs.
| if path.IsAbs(workdir) { | |
| return path.Clean(workdir) | |
| } | |
| return path.Join(defaultWorkdir, workdir) | |
| if filepath.IsAbs(workdir) { | |
| return filepath.Clean(workdir) | |
| } | |
| return filepath.Join(defaultWorkdir, workdir) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/internal/workflow/agent_harness_execution_workflow.go
Line: 403-406
Comment:
**Use `path/filepath` for OS filesystem paths**
`path` is designed for URL-style slash-separated paths, while `path/filepath` handles OS-specific filesystem paths. On Linux (where E2B runs) the behaviour is identical, but the idiomatic package for sandbox filesystem operations is `path/filepath`. Swap the import to avoid subtle issues if the host OS or test environment ever differs.
```suggestion
if filepath.IsAbs(workdir) {
return filepath.Clean(workdir)
}
return filepath.Join(defaultWorkdir, workdir)
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| workdir := strings.TrimSpace(validator.WorkingDirectory) | ||
| if workdir == "" { | ||
| workdir = defaultWorkdir | ||
| } | ||
| workdir = agentHarnessValidatorWorkdir(defaultWorkdir, workdir) |
There was a problem hiding this comment.
Redundant
TrimSpace before agentHarnessValidatorWorkdir
validator.WorkingDirectory is trimmed here and then trimmed again inside agentHarnessValidatorWorkdir (line 399). The outer trim is redundant; the helper already handles whitespace-only inputs by returning defaultWorkdir. Consider removing the pre-trim and letting the helper be the single source of truth.
| workdir := strings.TrimSpace(validator.WorkingDirectory) | |
| if workdir == "" { | |
| workdir = defaultWorkdir | |
| } | |
| workdir = agentHarnessValidatorWorkdir(defaultWorkdir, workdir) | |
| workdir := agentHarnessValidatorWorkdir(defaultWorkdir, validator.WorkingDirectory) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/internal/workflow/agent_harness_execution_workflow.go
Line: 368-369
Comment:
**Redundant `TrimSpace` before `agentHarnessValidatorWorkdir`**
`validator.WorkingDirectory` is trimmed here and then trimmed again inside `agentHarnessValidatorWorkdir` (line 399). The outer trim is redundant; the helper already handles whitespace-only inputs by returning `defaultWorkdir`. Consider removing the pre-trim and letting the helper be the single source of truth.
```suggestion
workdir := agentHarnessValidatorWorkdir(defaultWorkdir, validator.WorkingDirectory)
```
How can I resolve this? If you propose a fix, please make it concise.
Summary
Testing
Retry notes
The latest hosted retry got past Codex execution and created the requested docs file. The remaining failure was validator configuration/runtime plumbing: the root repo is not a Go module, new untracked files were absent from the diff artifact, and validators were launched through sh. This PR fixes the reusable harness-side pieces; a future harness config should use module-specific validators such as backend and cli workdirs.