refactor(sdk): dedupe ingest filesystem walks + per-harness apply boilerplate #423
Consolidates the three copies of `list_dirs` / `list_jsonl_files` /
`walk_jsonl` into `ingest::walk`, fixes the `walk_jsonl` filter to
match `.JSONL` case-insensitively (matches TS adapter), and collapses
the per-harness append-if-not-empty boilerplate behind a
`DerivedRecords` trait + `apply_parsed_extras` helper. The three
single-harness ingest verbs now share a `run_single_harness` wrapper
for the cleanup/load-cursors/resolve-content/emit-gap/save skeleton.
Also switches the `ingest_claude_session` cwd encoding to the idiomatic
`replace('/', "-")`.
Closes #343
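
Two of the small changes described above can be illustrated with a minimal sketch. The helper names `has_jsonl_extension` and `encode_cwd` are hypothetical and are not taken from the PR; only the standard-library calls and the `replace('/', "-")` expression come from the description.

```rust
use std::path::Path;

/// Hypothetical helper: true when `path` has a `jsonl` extension in any
/// ASCII case (`.jsonl`, `.JSONL`, `.JsonL`, ...).
fn has_jsonl_extension(path: &Path) -> bool {
    // `Path::extension` returns `Option<&OsStr>`; `OsStr::eq_ignore_ascii_case`
    // compares without allocating a lowercased copy.
    path.extension()
        .is_some_and(|ext| ext.eq_ignore_ascii_case("jsonl"))
}

/// Hypothetical helper: encode a working directory by replacing every `/`
/// with `-`, as in the `replace('/', "-")` form named in the description.
fn encode_cwd(cwd: &str) -> String {
    cwd.replace('/', "-")
}
```

Under these assumptions, `SESSION.JSONL` and `session.jsonl` both pass the filter, while a file with no extension does not.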
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.md`:
- Around line 9-14: Split the long implementation-heavy bullet into multiple
concise, impact-first bullets and remove the issue reference: create one short
user-visible bullet stating the .JSONL filename matching is now case-insensitive
(fixes `.JSONL` vs `.jsonl` filtering), another bullet saying filesystem walk
helpers in relayburn-sdk were consolidated (mentioning functions list_dirs,
list_jsonl_files, walk_jsonl collapsed into ingest::walk) and a third noting the
internal boilerplate collapse (apply_parsed_extras and the single-harness
helpers like run_single_harness were simplified) — keep each bullet one terse
sentence and drop the "(`#343`)" reference.
📒 Files selected for processing (4)
- `CHANGELOG.md`
- `crates/relayburn-sdk/src/ingest/ingest.rs`
- `crates/relayburn-sdk/src/ingest/reingest.rs`
- `crates/relayburn-sdk/src/ingest/walk.rs`
- `relayburn-sdk`: dedupe ingest filesystem walks (`list_dirs`,
  `list_jsonl_files`, `walk_jsonl`) into `ingest::walk`, fix the
  `walk_jsonl` filter to match `.JSONL` case-insensitively, and collapse
  the per-harness append boilerplate (`apply_parsed_extras`) and the
  three single-harness verb skeletons (`run_single_harness`). No
  behavior change beyond the case-sensitivity fix. (#343)
Split this into concise impact-first bullets and drop the issue reference.
This entry is doing too much in one bullet and is implementation-heavy; it’s harder to scan in [Unreleased]. Please break it into short user-visible bullets (e.g., .JSONL matching fix as one bullet, ingest refactor effects as separate bullets) and remove (#343).
As per coding guidelines: “Changelog entries should be concise and impact-first… Prefer one short bullet per user-visible change… Drop issue/PR links…”
….7/2.8.6 Main released 2.9.0, 2.8.7, and 2.8.6 (#421, #422, #423, #426). The auto-merge against ingest.rs and lib.rs was clean (no behavior conflict with #423's walk dedupe + run_single_harness refactor); only CHANGELOG needed manual sorting to keep this branch's items under [Unreleased] above the new release sections. https://claude.ai/code/session_011ubB69Zxijqb1BsYVYL9iQ
Closes #343.
Summary
- Consolidate `list_dirs` / `list_jsonl_files` / `walk_jsonl` into `crates/relayburn-sdk/src/ingest/walk.rs`. The duplicates in `ingest/ingest.rs` and `ingest/reingest.rs` are gone; both modules now import from `walk`.
- Fix the `walk_jsonl` filter to match `.JSONL` case-insensitively via `Path::extension().eq_ignore_ascii_case("jsonl")` (matches the TS adapter and silences the lurking clippy lint). `list_jsonl_files` uses the same helper.
- Add a `DerivedRecords` trait + `apply_parsed_extras` helper covering the trailing 5-bucket "if not empty, append" block (content / events / relationships / tool-result events / user-turns). Implemented for `ClaudeParseResult`, `ClaudeParseIncrementalResult`, `ParseCodexIncrementalResult`, `ParseOpencodeIncrementalResult`. Call sites in `ingest_claude_into`, `ingest_codex_into`, `ingest_opencode_into`, and `ingest_claude_session` collapse to one line.
- Add `run_single_harness` to share the cleanup → load-cursors → resolve-content → body → emit-gap-warning → save-cursors skeleton across the three single-harness verbs (`ingest_claude_projects`, `ingest_codex_sessions`, `ingest_opencode_sessions`).
- Switch the `ingest_claude_session` cwd encoding from `cwd.chars().map(...).collect()` to the idiomatic `cwd.replace('/', "-")`.

Test plan
- `cargo build --workspace`
- `cargo test --workspace` (all 22+48+5+24+667+2+13 tests pass)
- `pnpm install --frozen-lockfile && pnpm run test`
- Tests in `ingest/walk.rs`: `walk_jsonl_matches_uppercase_extension`, `list_dirs_returns_immediate_children_only`, `list_jsonl_files_is_non_recursive_and_case_insensitive`.

https://claude.ai/code
Generated by Claude Code
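
The `DerivedRecords` + `apply_parsed_extras` item in the summary describes collapsing five repeated "if not empty, append" blocks into one call. A rough sketch of that pattern follows. The trait method names, the `Pending` accumulator, the `Record` alias, and `ToyParseResult` are all assumptions made for illustration; the PR's actual signatures are not shown on this page and may differ.

```rust
/// Stand-in record type (assumption); the SDK likely has one type per bucket.
type Record = String;

/// Hypothetical accumulator that an ingest run appends into.
#[derive(Default)]
struct Pending {
    content: Vec<Record>,
    events: Vec<Record>,
    relationships: Vec<Record>,
    tool_result_events: Vec<Record>,
    user_turns: Vec<Record>,
}

/// Sketch of a trait exposing the five derived buckets of a parse result.
trait DerivedRecords {
    fn content(&mut self) -> &mut Vec<Record>;
    fn events(&mut self) -> &mut Vec<Record>;
    fn relationships(&mut self) -> &mut Vec<Record>;
    fn tool_result_events(&mut self) -> &mut Vec<Record>;
    fn user_turns(&mut self) -> &mut Vec<Record>;
}

/// Move everything from `src` into `dst`, skipping the call when `src` is empty.
fn append_if_not_empty(dst: &mut Vec<Record>, src: &mut Vec<Record>) {
    if !src.is_empty() {
        dst.append(src); // leaves `src` empty
    }
}

/// One call replaces the five per-bucket blocks at each call site.
fn apply_parsed_extras<R: DerivedRecords>(parsed: &mut R, pending: &mut Pending) {
    append_if_not_empty(&mut pending.content, parsed.content());
    append_if_not_empty(&mut pending.events, parsed.events());
    append_if_not_empty(&mut pending.relationships, parsed.relationships());
    append_if_not_empty(&mut pending.tool_result_events, parsed.tool_result_events());
    append_if_not_empty(&mut pending.user_turns, parsed.user_turns());
}

/// Toy implementor standing in for the per-harness parse result types.
#[derive(Default)]
struct ToyParseResult {
    content: Vec<Record>,
    events: Vec<Record>,
    relationships: Vec<Record>,
    tool_result_events: Vec<Record>,
    user_turns: Vec<Record>,
}

impl DerivedRecords for ToyParseResult {
    fn content(&mut self) -> &mut Vec<Record> { &mut self.content }
    fn events(&mut self) -> &mut Vec<Record> { &mut self.events }
    fn relationships(&mut self) -> &mut Vec<Record> { &mut self.relationships }
    fn tool_result_events(&mut self) -> &mut Vec<Record> { &mut self.tool_result_events }
    fn user_turns(&mut self) -> &mut Vec<Record> { &mut self.user_turns }
}
```

With a shape like this, each harness only implements the five accessors, and every call site reduces to `apply_parsed_extras(&mut parsed, &mut pending)`.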