Improve Factory AI Droid pre/post tool call E2E tests#959
Conversation
Add IsTaskCheckpoint/ToolUseID to RewindPoint struct and verify task checkpoints exist both before and after commit, including validation that pre-existing untracked files don't leak into committed checkpoint metadata. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: b669a7241c09
There was a problem hiding this comment.
Pull request overview
This PR strengthens Factory AI Droid E2E coverage around pre/post tool-call task checkpoints by making rewind points expose task-checkpoint identity and asserting task checkpoint metadata behavior before and after a commit.
Changes:
- Extend the E2E
RewindPointmodel to includeIsTaskCheckpointandToolUseIDfromentire rewind --list. - Replace the prior “hooks do not fail” test with a test that asserts a task rewind point exists pre-commit and includes a
tool_use_id. - Add an E2E test ensuring committed checkpoint metadata includes worker-created files while excluding pre-existing files from
files_touched.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
e2e/tests/factory_hooks_test.go |
Adds/renames Factory AI Droid E2E tests and helper assertions around task rewind points and committed checkpoint metadata. |
e2e/entire/entire.go |
Updates the E2E JSON model for rewind --list output to surface task-checkpoint fields used by tests. |
IsTaskCheckpoint + non-empty ToolUseID already capture the semantic intent. The "/tasks/" substring check coupled the test to internal metadata path layout unnecessarily. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: f48eb1bdc04a
|
Removed brittle |
waitForTaskRewindPoint already guarantees ToolUseID != "" before returning — no need to re-assert at the call site. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: 7d105af511a8
|
Removed redundant |
Test name says "ExcludesPreExistingUntrackedFiles" but `git add .` was staging the sentinel too. Now only the worker-created file is staged, keeping setup consistent with the test name and assertions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: f1dc2dd7fdec
|
Stage only |
assertCommittedTaskCheckpointExists tested functionality that doesn't exist yet — task checkpoints are saved to shadow branches but never transferred to entire/checkpoints/v1 during condensation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Entire-Checkpoint: e5cb2f13a018
|
Removed |
Summary
IsTaskCheckpointandToolUseIDfields toRewindPointstruct for task checkpoint visibility in E2E testsTestFactoryTaskHooksDoNotFail→TestFactoryTaskCheckpointExistsBeforeCommitwith stronger assertions (verify task rewind point exists withToolUseID)TestFactoryCommittedCheckpointExcludesPreExistingUntrackedFiles— verifies committed task checkpoints exist post-commit and pre-existing untracked files don't leak into checkpoint metadataTest plan
mise run test:cipasses (canary + unit + integration)mise run test:e2e --agent factoryai-droid TestFactoryTaskCheckpointExistsBeforeCommitpassesmise run test:e2e --agent factoryai-droid TestFactoryCommittedCheckpointExcludesPreExistingUntrackedFilespasses🤖 Generated with Claude Code
Note
Medium Risk
Moderate risk because it adds stricter E2E assertions around checkpoint/task metadata and committed checkpoint blobs, which may expose timing/flake issues but doesn’t change production logic.
Overview
Tightens factoryai-droid E2E coverage by extending
entire.RewindPointto surface task checkpoint identity (is_task_checkpoint,tool_use_id) and using it in tests.Replaces the prior “hooks do not fail” check with an assertion that a non-logs task rewind point with a
ToolUseIDappears before commit, and adds a new regression test that verifies a committed task checkpoint blob exists and that checkpoint metadata includes worker-created files while excluding pre-existing untracked sentinel files.Reviewed by Cursor Bugbot for commit 0ece4ec. Configure here.