Skip to content

Improve Factory AI Droid pre/post tool call E2E tests#959

Merged
gtrrz-victor merged 6 commits intomainfrom
improve-droid-postpre-tool-call-e2e-test
Apr 15, 2026
Merged

Improve Factory AI Droid pre/post tool call E2E tests#959
gtrrz-victor merged 6 commits intomainfrom
improve-droid-postpre-tool-call-e2e-test

Conversation

@gtrrz-victor
Copy link
Copy Markdown
Contributor

@gtrrz-victor gtrrz-victor commented Apr 15, 2026

Summary

  • Add IsTaskCheckpoint and ToolUseID fields to RewindPoint struct for task checkpoint visibility in E2E tests
  • Rename TestFactoryTaskHooksDoNotFailTestFactoryTaskCheckpointExistsBeforeCommit with stronger assertions (verify task rewind point exists with ToolUseID)
  • Add TestFactoryCommittedCheckpointExcludesPreExistingUntrackedFiles — verifies committed task checkpoints exist post-commit and pre-existing untracked files don't leak into checkpoint metadata

Test plan

  • mise run test:ci passes (canary + unit + integration)
  • mise run test:e2e --agent factoryai-droid TestFactoryTaskCheckpointExistsBeforeCommit passes
  • mise run test:e2e --agent factoryai-droid TestFactoryCommittedCheckpointExcludesPreExistingUntrackedFiles passes

🤖 Generated with Claude Code


Note

Medium Risk
Moderate risk because it adds stricter E2E assertions around checkpoint/task metadata and committed checkpoint blobs, which may expose timing/flake issues but doesn’t change production logic.

Overview
Tightens factoryai-droid E2E coverage by extending entire.RewindPoint to surface task checkpoint identity (is_task_checkpoint, tool_use_id) and using it in tests.

Replaces the prior “hooks do not fail” check with an assertion that a non-logs task rewind point with a ToolUseID appears before commit, and adds a new regression test that verifies a committed task checkpoint blob exists and that checkpoint metadata includes worker-created files while excluding pre-existing untracked sentinel files.

Reviewed by Cursor Bugbot for commit 0ece4ec. Configure here.

Add IsTaskCheckpoint/ToolUseID to RewindPoint struct and verify
task checkpoints exist both before and after commit, including
validation that pre-existing untracked files don't leak into
committed checkpoint metadata.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b669a7241c09
@gtrrz-victor gtrrz-victor requested a review from a team as a code owner April 15, 2026 09:57
Copilot AI review requested due to automatic review settings April 15, 2026 09:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR strengthens Factory AI Droid E2E coverage around pre/post tool-call task checkpoints by making rewind points expose task-checkpoint identity and asserting task checkpoint metadata behavior before and after a commit.

Changes:

  • Extend the E2E RewindPoint model to include IsTaskCheckpoint and ToolUseID from entire rewind --list.
  • Replace the prior “hooks do not fail” test with a test that asserts a task rewind point exists pre-commit and includes a tool_use_id.
  • Add an E2E test ensuring committed checkpoint metadata includes worker-created files while excluding pre-existing files from files_touched.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
e2e/tests/factory_hooks_test.go Adds/renames Factory AI Droid E2E tests and helper assertions around task rewind points and committed checkpoint metadata.
e2e/entire/entire.go Updates the E2E JSON model for rewind --list output to surface task-checkpoint fields used by tests.

Comment thread e2e/tests/factory_hooks_test.go Outdated
Comment thread e2e/tests/factory_hooks_test.go Outdated
Comment thread e2e/tests/factory_hooks_test.go Outdated
Comment thread e2e/tests/factory_hooks_test.go
IsTaskCheckpoint + non-empty ToolUseID already capture the semantic
intent. The "/tasks/" substring check coupled the test to internal
metadata path layout unnecessarily.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f48eb1bdc04a
@gtrrz-victor
Copy link
Copy Markdown
Contributor Author

Removed brittle MetadataDir path substring check ("/tasks/") from waitForTaskRewindPoint. IsTaskCheckpoint + non-empty ToolUseID already capture the semantic intent — the path check coupled the test to internal metadata directory layout unnecessarily.

waitForTaskRewindPoint already guarantees ToolUseID != "" before
returning — no need to re-assert at the call site.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 7d105af511a8
@gtrrz-victor
Copy link
Copy Markdown
Contributor Author

Removed redundant assert.NotEmpty(t, point.ToolUseID, ...)waitForTaskRewindPoint already guarantees ToolUseID != "" before returning. Keeps responsibility in the helper, callers just trust it.

Test name says "ExcludesPreExistingUntrackedFiles" but `git add .`
was staging the sentinel too. Now only the worker-created file is
staged, keeping setup consistent with the test name and assertions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: f1dc2dd7fdec
@gtrrz-victor
Copy link
Copy Markdown
Contributor Author

Stage only docs/factory-prehook-worker.md instead of git add . — sentinel file now remains truly untracked, matching the test name and assertion messages.

assertCommittedTaskCheckpointExists tested functionality that doesn't
exist yet — task checkpoints are saved to shadow branches but never
transferred to entire/checkpoints/v1 during condensation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: e5cb2f13a018
@gtrrz-victor
Copy link
Copy Markdown
Contributor Author

Removed assertCommittedTaskCheckpointExists — task checkpoint condensation to entire/checkpoints/v1 isn't implemented yet (shadow branch only). The assertion was testing unimplemented functionality. E2E test now passes.

@gtrrz-victor gtrrz-victor enabled auto-merge April 15, 2026 10:42
@gtrrz-victor gtrrz-victor merged commit 3f03b5e into main Apr 15, 2026
9 checks passed
@gtrrz-victor gtrrz-victor deleted the improve-droid-postpre-tool-call-e2e-test branch April 15, 2026 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants