Skip to content

refactor(sdk): dedupe ingest filesystem walks + per-harness apply boilerplate#423

Merged
willwashburn merged 1 commit into
mainfrom
claude/fix-issue-343-bWMO9
May 21, 2026
Merged

refactor(sdk): dedupe ingest filesystem walks + per-harness apply boilerplate#423
willwashburn merged 1 commit into
mainfrom
claude/fix-issue-343-bWMO9

Conversation

@willwashburn
Copy link
Copy Markdown
Member

Closes #343.

Summary

  • Consolidates the three copies of list_dirs / list_jsonl_files / walk_jsonl into crates/relayburn-sdk/src/ingest/walk.rs. The duplicates in ingest/ingest.rs and ingest/reingest.rs are gone; both modules now import from walk.
  • Fixes the walk_jsonl filter to match .JSONL case-insensitively via Path::extension().eq_ignore_ascii_case("jsonl") (matches the TS adapter and silences the lurking clippy lint). list_jsonl_files uses the same helper.
  • Adds a small DerivedRecords trait + apply_parsed_extras helper covering the trailing 5-bucket "if not empty, append" block (content / events / relationships / tool-result events / user-turns). Implemented for ClaudeParseResult, ClaudeParseIncrementalResult, ParseCodexIncrementalResult, ParseOpencodeIncrementalResult. Call sites in ingest_claude_into, ingest_codex_into, ingest_opencode_into, and ingest_claude_session collapse to one line.
  • Adds run_single_harness to share the cleanup → load-cursors → resolve-content → body → emit-gap-warning → save-cursors skeleton across the three single-harness verbs (ingest_claude_projects, ingest_codex_sessions, ingest_opencode_sessions).
  • Switches the ingest_claude_session cwd encoding from cwd.chars().map(...).collect() to the idiomatic cwd.replace('/', "-").

Test plan

  • cargo build --workspace
  • cargo test --workspace (all 22+48+5+24+667+2+13 tests pass)
  • pnpm install --frozen-lockfile && pnpm run test
  • Added regression tests in ingest/walk.rs: walk_jsonl_matches_uppercase_extension, list_dirs_returns_immediate_children_only, list_jsonl_files_is_non_recursive_and_case_insensitive.

https://claude.ai/code


Generated by Claude Code

…lerplate

Consolidates the three copies of `list_dirs` / `list_jsonl_files` /
`walk_jsonl` into `ingest::walk`, fixes the `walk_jsonl` filter to
match `.JSONL` case-insensitively (matches TS adapter), and collapses
the per-harness append-if-not-empty boilerplate behind a
`DerivedRecords` trait + `apply_parsed_extras` helper. The three
single-harness ingest verbs now share a `run_single_harness` wrapper
for the cleanup/load-cursors/resolve-content/emit-gap/save skeleton.
Also switches the `ingest_claude_session` cwd encoding to the idiomatic
`replace('/', "-")`.

Closes #343
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR implements the refactoring described in issue #343, consolidating duplicate filesystem-walk helpers and harness ingestion boilerplate across the ingest module. Walk helpers are centralized in walk.rs with case-insensitive JSONL matching, harness verbs are unified via run_single_harness wrapper and apply_parsed_extras helper, and local duplicates are removed from ingest.rs and reingest.rs.

Changes

Ingest module consolidation and deduplication

Layer / File(s) Summary
Walk module helpers and case-sensitivity fix
crates/relayburn-sdk/src/ingest/walk.rs
walk_jsonl now matches .jsonl files case-insensitively via has_extension_ci helper. New non-recursive helpers list_dirs and list_jsonl_files added with silent error-to-empty behavior. Unit tests updated and extended to cover case-insensitive matching, new helpers, and missing-root scenarios.
Ingest trait and consolidation helpers
crates/relayburn-sdk/src/ingest/ingest.rs
Introduces DerivedRecords trait (implemented via macro for each parser result type) and apply_parsed_extras function to conditionally append non-empty derived record buckets to the ledger. Imports expanded to include new walk helpers and parser types needed by the consolidation.
Harness verb wrapper and skeleton consolidation
crates/relayburn-sdk/src/ingest/ingest.rs
Adds run_single_harness generic helper standardizing per-harness skeleton: stale pending-stamp cleanup, cursor load/snapshot, content-mode resolution, harness body invocation, gap-warning emission, and cursor persistence. Simplifies Claude session cwd encoding to use replace('/', "-").
Harness verb refactoring with consolidated helpers
crates/relayburn-sdk/src/ingest/ingest.rs
Refactors three per-harness ingestion verbs (ingest_claude_projects, ingest_codex_sessions, ingest_opencode_sessions) and Claude session fast-path to use run_single_harness wrapper and apply_parsed_extras helper. Removes repetitive inline cursor/content-mode/ledger-append logic. Adjusts Codex parsing to split parsed.resume via std::mem::take for borrowing compatibility.
Remove local filesystem helpers from ingest
crates/relayburn-sdk/src/ingest/ingest.rs
Deletes duplicate list_dirs and list_jsonl_files implementations after consolidating into walk.rs and importing from crate::ingest::walk.
Reingest adaptation to consolidated walk helpers
crates/relayburn-sdk/src/ingest/reingest.rs
Updates imports to pull list_dirs and list_jsonl_files from consolidated crate::ingest::walk, removes local duplicate implementations, and reformats test PathBuf construction without changing assertions.
Changelog documentation
CHANGELOG.md
Documents filesystem-walk deduplication, case-insensitive JSONL matching fix, harness append boilerplate consolidation, and removal of per-harness verb skeletons. Notes behavior change is limited to case-sensitivity fix.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • AgentWorkforce/burn#351: Overlaps at cursor-mutation persistence code path in ingest.rs, which the main PR refactors via the new run_single_harness wrapper.

Poem

🐰 Walks consolidated, helpers now shared,
Boilerplate reduced—less code to care!
Case-insensitive matching, fixed at last,
The .JSONL files won't be passed!
Harnesses unified in one clean way, 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main changes: deduplication of ingest filesystem walks and removal of per-harness boilerplate.
Description check ✅ Passed The description is detailed and directly related to the changeset, covering consolidation of duplicated code, case-insensitivity fix, trait implementation, refactoring, and testing.
Linked Issues check ✅ Passed All objectives from #343 are met: filesystem walks consolidated into walk.rs with case-insensitive matching, DerivedRecords trait + apply_parsed_extras replaces repeated append blocks, run_single_harness extracts common verb skeleton, and cwd.replace() improves idiomaticity.
Out of Scope Changes check ✅ Passed All changes directly address the objectives in #343; no out-of-scope modifications detected beyond the stated refactoring and improvements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/fix-issue-343-bWMO9

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Around line 9-14: Split the long implementation-heavy bullet into multiple
concise, impact-first bullets and remove the issue reference: create one short
user-visible bullet stating the .JSONL filename matching is now case-insensitive
(fixes `.JSONL` vs `.jsonl` filtering), another bullet saying filesystem walk
helpers in relayburn-sdk were consolidated (mentioning functions list_dirs,
list_jsonl_files, walk_jsonl collapsed into ingest::walk) and a third noting the
internal boilerplate collapse (apply_parsed_extras and the single-harness
helpers like run_single_harness were simplified) — keep each bullet one terse
sentence and drop the "(`#343`)" reference.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 362ed27c-b301-4c24-ae04-9357f7d6f59d

📥 Commits

Reviewing files that changed from the base of the PR and between 2f07d32 and ecea0ae.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • crates/relayburn-sdk/src/ingest/ingest.rs
  • crates/relayburn-sdk/src/ingest/reingest.rs
  • crates/relayburn-sdk/src/ingest/walk.rs

Comment thread CHANGELOG.md
Comment on lines +9 to +14
- `relayburn-sdk`: dedupe ingest filesystem walks (`list_dirs`,
`list_jsonl_files`, `walk_jsonl`) into `ingest::walk`, fix the
`walk_jsonl` filter to match `.JSONL` case-insensitively, and collapse
the per-harness append boilerplate (`apply_parsed_extras`) and the
three single-harness verb skeletons (`run_single_harness`). No
behavior change beyond the case-sensitivity fix. (#343)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Split this into concise impact-first bullets and drop the issue reference.

This entry is doing too much in one bullet and is implementation-heavy; it’s harder to scan in [Unreleased]. Please break it into short user-visible bullets (e.g., .JSONL matching fix as one bullet, ingest refactor effects as separate bullets) and remove (#343).

As per coding guidelines: “Changelog entries should be concise and impact-first… Prefer one short bullet per user-visible change… Drop issue/PR links…”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` around lines 9 - 14, Split the long implementation-heavy bullet
into multiple concise, impact-first bullets and remove the issue reference:
create one short user-visible bullet stating the .JSONL filename matching is now
case-insensitive (fixes `.JSONL` vs `.jsonl` filtering), another bullet saying
filesystem walk helpers in relayburn-sdk were consolidated (mentioning functions
list_dirs, list_jsonl_files, walk_jsonl collapsed into ingest::walk) and a third
noting the internal boilerplate collapse (apply_parsed_extras and the
single-harness helpers like run_single_harness were simplified) — keep each
bullet one terse sentence and drop the "(`#343`)" reference.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

Open in Devin Review

@willwashburn willwashburn merged commit 34f2e3b into main May 21, 2026
13 checks passed
@willwashburn willwashburn deleted the claude/fix-issue-343-bWMO9 branch May 21, 2026 13:17
willwashburn pushed a commit that referenced this pull request May 21, 2026
….7/2.8.6

Main released 2.9.0, 2.8.7, and 2.8.6 (#421, #422, #423, #426). The
auto-merge against ingest.rs and lib.rs was clean (no behavior
conflict with #423's walk dedupe + run_single_harness refactor); only
CHANGELOG needed manual sorting to keep this branch's items under
[Unreleased] above the new release sections.

https://claude.ai/code/session_011ubB69Zxijqb1BsYVYL9iQ
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Rust refactor: dedupe ingest filesystem walks + per-harness apply boilerplate

2 participants