Skip to content

feat: surface invalid trajectory files via status -v and trail doctor#26

Merged
khaliqgant merged 2 commits into
mainfrom
feat/trail-doctor
May 8, 2026
Merged

feat: surface invalid trajectory files via status -v and trail doctor#26
khaliqgant merged 2 commits into
mainfrom
feat/trail-doctor

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

Summary

  • Reconcile previously skipped files that failed to load (malformed JSON or schema violations) with no way to find or repair them. ReconcileSummary now carries a failures: ReconcileFailure[] array (path + reason + first validation error, pre-rendered).
  • New trail doctor command lists invalid trajectories and --quarantine moves them into .trajectories/invalid/ so reconcile stops scanning them. Only parse/schema failures are quarantined; transient io_error failures are left in place.
  • New trail status -v / --verbose prints the same diagnostic inline so the existing `reconciled X/Y, invalid: N` warning has an actionable next step.

Test plan

  • `npx tsc --noEmit` clean
  • `npx vitest run` — 238/238 pass (including existing reconcile coverage)
  • Smoke test: created malformed and schema-invalid fixtures, ran `trail doctor`, `trail status -v`, and `trail doctor --quarantine` against them; files moved as expected and subsequent runs report a clean state

🤖 Generated with Claude Code

Reconcile silently skipped files that failed to load, leaving users
with no path to identify or repair them. ReconcileSummary now carries
per-file failures (path + reason + first validation error). The new
`trail doctor` command lists them and `--quarantine` moves them into
.trajectories/invalid/ so reconcile stops scanning. `trail status -v`
prints the same diagnostic inline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds per-file reconcile failure tracking and caching in storage, a quarantine routine to move non-IO invalid trajectories, a trail doctor CLI command (with --quarantine), and a --verbose option on trail status to display reconcile failures.

Changes

Diagnostic Failure Tracking and Quarantine System

Layer / File(s) Summary
Failure Type Contracts
src/storage/file.ts
Defines ReconcileFailureReason type, ReconcileFailure interface, and extends ReconcileSummary with a failures array.
Failure Description Helper
src/storage/file.ts
Implements describeReadFailure to produce single-line messages (extracts Zod issues for schema violations) and adjusts imports used for rendering.
Storage Failure Tracking and Caching
src/storage/file.ts
Initializes failures in reconcileIndex(), records structured ReconcileFailure entries, caches the summary on the FileStorage instance, and exposes getLastReconcileSummary().
Quarantine Implementation
src/storage/file.ts
Implements quarantineInvalid() to rerun reconcile, create .trajectories/invalid/, compute safe destinations (avoiding basename collisions), and move non-IO failure files; returns moved records and targetDir.
Doctor Command Implementation
src/cli/commands/doctor.ts
Adds trail doctor command that lists reconcile failures and, with --quarantine, calls quarantineInvalid() and reports moved file count or no-op.
Status Command Integration
src/cli/commands/status.ts
Adds --verbose flag to status which prints reconcile failures from storage.getLastReconcileSummary() or a clean-load message with remediation hint.
Command Registration
src/cli/commands/index.ts
Imports and invokes registerDoctorCommand(program) within registerCommands() to register the doctor command.
Tests
tests/storage/reconcile-real-data.test.ts
Adds a Vitest test that verifies quarantineInvalid() preserves both files when basenames collide across different subtrees and removes originals.
Manifest
package.json
Minor formatting change to the files array (single-line); no behavioral change.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through logs with gentle paws,
Finding broken traces and listing the cause.
I tuck the invalid into an invalid/ nest,
Keep each file safe—no collision unrest.
Quietly I guard the trajectory quest.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main changes: introducing the trail doctor command and adding verbose output to status for surfacing invalid trajectory files.
Description check ✅ Passed The description is well-related to the changeset, explaining the reconcile failures tracking, the new trail doctor command with --quarantine flag, and the trail status -v verbose output feature.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/trail-doctor

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1ba2391d73

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/storage/file.ts Outdated
Comment on lines +383 to +385
const dest = join(targetDir, basename(failure.path));
try {
await rename(failure.path, dest);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid overwriting quarantined files with matching names

When invalid files from different scanned directories share the same basename (for example active/traj.json and completed/2026-04/traj.json, or two legacy monthly folders with the same malformed name), this destination collapses them into one .trajectories/invalid/<basename> path. rename will replace an existing file at that path on typical filesystems, so trail doctor --quarantine can silently lose one of the invalid trajectory files instead of preserving both for inspection. Use a collision-safe destination or preserve the relative path under invalid/.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/storage/file.ts`:
- Around line 382-386: The quarantine loop using basename(failure.path) before
rename() can cause collisions when files from different source directories share
names (e.g., failure objects in candidates), risking silent overwrite; modify
the logic in the loop that builds dest (inside the for..of over candidates) to
ensure uniqueness by adding a disambiguating component (for example include a
sanitized directory prefix, a timestamp/UUID, or incremental suffix) or check
for existence and choose a new name before calling rename(failure.path, dest),
and update moved.push(failure) only after a successful, non-colliding rename;
ensure you reference the same symbols (candidates, failure.path, targetDir,
basename, rename, moved) when making the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e57ea9a5-3c29-4601-adf3-b71cc04cb2b1

📥 Commits

Reviewing files that changed from the base of the PR and between 82dd5d6 and 1ba2391.

📒 Files selected for processing (4)
  • src/cli/commands/doctor.ts
  • src/cli/commands/index.ts
  • src/cli/commands/status.ts
  • src/storage/file.ts

Comment thread src/storage/file.ts
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment thread src/storage/file.ts Outdated
await mkdir(targetDir, { recursive: true });
const moved: ReconcileFailure[] = [];
for (const failure of candidates) {
const dest = join(targetDir, basename(failure.path));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Quarantine overwrites files when two failing files share the same basename

quarantineInvalid() uses basename(failure.path) to construct the destination path in the invalid/ directory. If two failing files from different source directories share the same basename (e.g., active/traj_abc.json and completed/2024-01/traj_abc.json), the second rename() at line 385 will silently overwrite the first file already moved into invalid/. This is data loss, contradicting the explicit "never silently lose a file" comment on line 388. The scenario is possible when the same trajectory file exists in both active/ and a completed/ subdirectory and both copies are malformed or schema-invalid.

Prompt for agents
In quarantineInvalid(), the destination path is computed as join(targetDir, basename(failure.path)), which can collide when two failing files from different directories share the same basename. To prevent silent data loss, either: (1) derive a unique destination name that encodes the source subdirectory (e.g., replace path separators with underscores), (2) check if the destination already exists and append a numeric suffix, or (3) use the full relative path from trajectoriesDir as the destination, recreating the subdirectory structure under invalid/. Option 3 is cleanest: replace basename(failure.path) with the relative path from this.trajectoriesDir, and mkdir the parent under targetDir before renaming.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Codex, CodeRabbit, and Devin all flagged the same collision risk:
basename() collapses two invalid files that share a name across
active/ and completed/ onto the same invalid/ path, and rename()
silently overwrites the first one. quarantineInvalid now mirrors the
relative path under invalid/ so both copies survive, with a numeric
suffix fallback if the destination still collides with a previous
quarantine run.

Also formats package.json per biome to unblock the lint check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/storage/reconcile-real-data.test.ts`:
- Around line 237-240: The test currently only asserts the original file was
removed from active/ using the variable activeAfter; add a corresponding check
for completed/2026-04 by reading its directory (e.g., a new completedAfter
variable via readdir(join(trajRoot, "completed", "2026-04"))) and assert
completedAfter does not contain "traj_dup00000_0000.json" so both active and
completed removals are verified in tests/storage/reconcile-real-data.test.ts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 57489f0f-e86e-4134-bd83-007c2bd79bd3

📥 Commits

Reviewing files that changed from the base of the PR and between 1ba2391 and 87cc513.

📒 Files selected for processing (3)
  • package.json
  • src/storage/file.ts
  • tests/storage/reconcile-real-data.test.ts
✅ Files skipped from review due to trivial changes (1)
  • package.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/storage/file.ts

Comment on lines +237 to +240
// And the originals must be gone from active/ and completed/.
const activeAfter = await readdir(join(trajRoot, "active"));
expect(activeAfter).not.toContain("traj_dup00000_0000.json");
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert source removal in completed/2026-04 as well

The intent says both originals are removed, but only active/ is checked (Line 238). Please also assert that completed/2026-04/traj_dup00000_0000.json is gone.

Patch suggestion
     // And the originals must be gone from active/ and completed/.
     const activeAfter = await readdir(join(trajRoot, "active"));
     expect(activeAfter).not.toContain("traj_dup00000_0000.json");
+    const completedAfter = await readdir(join(trajRoot, "completed", "2026-04"));
+    expect(completedAfter).not.toContain("traj_dup00000_0000.json");
   });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// And the originals must be gone from active/ and completed/.
const activeAfter = await readdir(join(trajRoot, "active"));
expect(activeAfter).not.toContain("traj_dup00000_0000.json");
});
// And the originals must be gone from active/ and completed/.
const activeAfter = await readdir(join(trajRoot, "active"));
expect(activeAfter).not.toContain("traj_dup00000_0000.json");
const completedAfter = await readdir(join(trajRoot, "completed", "2026-04"));
expect(completedAfter).not.toContain("traj_dup00000_0000.json");
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/storage/reconcile-real-data.test.ts` around lines 237 - 240, The test
currently only asserts the original file was removed from active/ using the
variable activeAfter; add a corresponding check for completed/2026-04 by reading
its directory (e.g., a new completedAfter variable via readdir(join(trajRoot,
"completed", "2026-04"))) and assert completedAfter does not contain
"traj_dup00000_0000.json" so both active and completed removals are verified in
tests/storage/reconcile-real-data.test.ts.

@khaliqgant khaliqgant merged commit 55bfbe6 into main May 8, 2026
7 checks passed
@khaliqgant khaliqgant deleted the feat/trail-doctor branch May 8, 2026 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant