Skip to content

relayburn-ledger: add list_content_session_ids + load_config (#279)#292

Merged
willwashburn merged 2 commits intomainfrom
claude/close-issue-279-ledger-config
May 5, 2026
Merged

relayburn-ledger: add list_content_session_ids + load_config (#279)#292
willwashburn merged 2 commits intomainfrom
claude/close-issue-279-ledger-config

Conversation

@willwashburn
Copy link
Copy Markdown
Member

Summary

  • Adds Ledger::list_content_session_ids() (returns HashSet<String> of distinct ids in content.sqlite, filters malformed ids) so relayburn-ingest::reingest_missing_content ([Rust port] relayburn-ingest: gap-warning state machine + reingest_missing_content #278) can skip already-covered sessions.
  • Adds BurnConfig + load_config / load_config_at mirroring packages/ledger/src/config.ts byte-for-byte: content.store: full|hash-only|off, content.retentionDays: number|'forever', sourced from $RELAYBURN_HOME/config.json with RELAYBURN_CONTENT_STORE / RELAYBURN_CONTENT_TTL_DAYS env overrides. Empty env string falls back to default to avoid the RELAYBURN_CONTENT_TTL_DAYS= → zero-day-retention footgun.
  • Home for Config: kept in relayburn-ledger to mirror @relayburn/ledger (TS source of truth co-locates loadConfig there). No new dependency edge — the crate already depends on relayburn-reader for ContentStoreMode, and relayburn-ingest already depends on relayburn-ledger. Splitting it into a new relayburn-config crate would create a third pole of dependencies for a ~150-line file with no separate consumers, so the layering judgement is to keep it co-located until we have a second non-ledger consumer.

Test plan

  • cargo test -p relayburn-ledger (44 tests; covers both new APIs: list_content_session_ids_returns_distinct_set plus seven config::tests::* cases for default / file-override / env-override / empty-env / malformed-JSON / non-object / unknown-store-fallback).
  • cargo build --workspace clean.
  • cargo test --workspace clean.
  • pnpm run build && pnpm run test clean (873 TS tests pass).

Closes #279.

Adds the two read APIs that `relayburn-ingest` (#277, #278) needs from the
SQLite-only ledger:

- `Ledger::list_content_session_ids()` runs `SELECT DISTINCT session_id
  FROM content` against `content.sqlite` and returns the set, filtering
  malformed ids defensively (mirrors the TS sqlite-adapter). Powers the
  skip filter in `reingest_missing_content`.
- `BurnConfig` + `load_config` mirror `packages/ledger/src/config.ts`:
  `content.store: full | hash-only | off` and
  `content.retentionDays: number | 'forever'`, sourced from
  `$RELAYBURN_HOME/config.json` with `RELAYBURN_CONTENT_STORE` /
  `RELAYBURN_CONTENT_TTL_DAYS` env overrides on top. Empty env strings
  fall back to default to avoid the `RELAYBURN_CONTENT_TTL_DAYS=` →
  zero-day-retention footgun.

Config is co-located in `relayburn-ledger` to match the TS layout
(`@relayburn/ledger` exports `loadConfig`); the crate already depends
on `relayburn-reader` for `ContentStoreMode`, so no new dependency edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

📝 Walkthrough

Walkthrough

Adds a user-level JSON config loader with environment-variable overrides and permissive parsing to relayburn-ledger, changes retention to use fractional days (f64), and adds a Ledger::list_content_session_ids() API (and underlying query) that returns distinct, validated session IDs as a HashSet<String>.

Changes

Configuration System & Session ID Querying

Layer / File(s) Summary
Data Shape
crates/relayburn-ledger/src/config.rs
Introduces DEFAULT_RETENTION_DAYS: f64 = 90.0, pub enum Retention { Days(f64), Forever }, ContentConfig, and BurnConfig (default: ContentStoreMode::Full, Retention::Days(DEFAULT_RETENTION_DAYS)). Retention::as_millis() converts Days(f64) → milliseconds via f64 multiplication and truncation.
Core Implementation — Config Loading & Normalization
crates/relayburn-ledger/src/config.rs
Adds config_path(), load_config() and load_config_at(path). load_config_at reads permissive raw JSON (serde_json::Value) with fail-soft behavior: missing file → defaults; malformed/non-object JSON → warning + defaults. Field-specific pickers apply env var precedence (RELAYBURN_CONTENT_STORE, RELAYBURN_CONTENT_TTL_DAYS), normalizing store strings case-insensitively (full, hash-only, off) and retention (empty env = unset, "forever"/negative → Forever, fractional numbers preserved as Days(f64)).
Core Implementation — Session ID Query
crates/relayburn-ledger/src/content.rs
Adds pub(crate) fn list_session_ids(conn: &Connection) -> Result<HashSet<String>> which SELECT DISTINCT session_id FROM content, skips rows that fail to decode as text, validates IDs with is_valid_session_id, and returns a HashSet<String> of valid IDs.
Wiring / Public Surface
crates/relayburn-ledger/src/lib.rs
Declares config module and re-exports config types/functions (config_path, load_config, load_config_at, BurnConfig, ContentConfig, Retention, DEFAULT_RETENTION_DAYS). Adds pub fn list_content_session_ids(&self) -> Result<HashSet<String>> on Ledger delegating to the content query.
Tests / Documentation
crates/relayburn-ledger/src/config.rs tests, crates/relayburn-ledger/src/tests.rs
Comprehensive unit tests for config loading/normalization (defaults, file vs env precedence, "forever"/negative handling, empty env handling, fractional retention from env and file, malformed/non-object JSON fallback, unknown store fallback, and as_millis() behavior). Integration test list_content_session_ids_returns_distinct_set verifies empty initial set, distinctness with duplicates, and resilience when a corrupted non-TEXT session_id row exists.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble JSON, sniff the env breeze,
Fractional days and stores with ease,
Session IDs gathered, duplicates shed,
Malformed rows skipped — no tears to shed,
The ledger hums tidy as I hop to bed.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: adding list_content_session_ids and load_config functionality to relayburn-ledger, directly matching the issue number and primary objectives.
Description check ✅ Passed The description comprehensively relates to the changeset, explaining the two main APIs added, their purposes, configuration schema, placement decision, and test coverage.
Linked Issues check ✅ Passed All coding requirements from issue #279 are met: list_content_session_ids returns HashSet of distinct session IDs filtering malformed entries; BurnConfig with load_config/load_config_at mirrors TS schema with env overrides and empty-string fallback handling; placement decision justified; comprehensive test coverage provided.
Out of Scope Changes check ✅ Passed All changes align with issue #279 scope: new Ledger::list_content_session_ids method, config module with BurnConfig/load_config, public re-exports for consumption, and comprehensive tests. No unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/close-issue-279-ledger-config

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 91a74f5d5c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/relayburn-ledger/src/config.rs Outdated
if f < 0.0 {
return Some(Retention::Forever);
}
return Some(Retention::Days(f as u64));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve fractional retention days from config/env

The Rust parser truncates fractional TTL values to an integer day (f as u64), which diverges from the TS source behavior in packages/ledger/src/config.ts (normalizeRetention keeps finite numbers as-is). If a user sets RELAYBURN_CONTENT_TTL_DAYS=0.5 or "retentionDays": 0.5, this code resolves to 0 days instead of 12 hours, causing content to be pruned much earlier than intended.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 44f64d3. Switched Retention::Days from u64 to f64 so finite fractional values survive end-to-end — RELAYBURN_CONTENT_TTL_DAYS=0.5 now resolves to Days(0.5) and as_millis() returns Some(43_200_000) (12 hours). New tests cover both the env and file paths.

Comment thread crates/relayburn-ledger/src/content.rs Outdated
pub(crate) fn list_session_ids(conn: &Connection) -> Result<HashSet<String>> {
let mut stmt = conn.prepare("SELECT DISTINCT session_id FROM content")?;
let rows = stmt
.query_map([], |r| r.get::<_, String>(0))?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip malformed session_id rows instead of failing

list_session_ids is documented to filter malformed IDs, but r.get::<_, String>(0) makes the whole call fail if any session_id is stored as a non-text value, so the filter never runs. Because content is not a STRICT table (schema.rs), a single bad row can cause list_content_session_ids() to return an error and break callers that rely on this as a best-effort skip set.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 44f64d3. list_session_ids now iterates with query() and skips rows whose session_id fails to decode as String (via let Ok(...) = row.get::<_, String>(0) else { continue }). New test injects a BLOB-typed session_id row — TEXT affinity preserves BLOBs, and rusqlite returns InvalidType for them — to exercise the skip path; the call returns the three valid ids and ignores the corrupt one.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/relayburn-ledger/src/tests.rs (1)

679-701: ⚡ Quick win

Add one malformed-row coverage check to match the defensive-filtering contract

This test verifies distinctness, but not the “ignore malformed IDs” behavior. Consider injecting one corrupted session_id row (via direct SQL in test) and asserting the method still returns only valid IDs.

Possible test extension
 fn list_content_session_ids_returns_distinct_set() {
@@
     l.append_content(&[
         make_content("ses_a", "m1", "alpha"),
         make_content("ses_a", "m2", "beta"),
         make_content("ses_b", "m1", "gamma"),
         make_content("ses_c", "m1", "delta"),
     ])
     .unwrap();
+
+    // Simulate corruption bypassing writer validation.
+    l.conns
+        .content
+        .execute("UPDATE content SET session_id = 123 WHERE message_id = 'm2'", [])
+        .unwrap();
@@
-    assert_eq!(ids.len(), 3);
+    assert_eq!(ids.len(), 2);
     assert!(ids.contains("ses_a"));
-    assert!(ids.contains("ses_b"));
     assert!(ids.contains("ses_c"));
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-ledger/src/tests.rs` around lines 679 - 701, Extend the
list_content_session_ids_returns_distinct_set test to also insert one malformed
row directly into the underlying SQLite DB and assert the method filters it out:
after setting up l (open_in) and appending valid rows with append_content,
obtain a DB connection (via the same storage used by list_content_session_ids),
execute a direct SQL INSERT that writes a corrupted session_id value (e.g., NULL
or an invalid type/text) into the content table, then call
l.list_content_session_ids() again and assert the returned set still has length
3 and contains only "ses_a", "ses_b", "ses_c"; this ensures
list_content_session_ids() correctly ignores malformed rows.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/relayburn-ledger/src/content.rs`:
- Around line 138-143: The function list_session_ids currently collects
query_map results with .collect::<rusqlite::Result<Vec<_>>>()? which propagates
row decoding errors; change it to iterate the query_map iterator and skip rows
that return Err from row.get rather than returning Err to the caller (e.g.,
replace the collect call with a manual loop or use .filter_map(|res| res.ok())
to only keep successfully decoded Strings), then apply is_valid_session_id and
collect into the HashSet; reference the existing list_session_ids, conn.prepare,
the query_map call, row.get::<_, String>(0), and is_valid_session_id when making
this change.

---

Nitpick comments:
In `@crates/relayburn-ledger/src/tests.rs`:
- Around line 679-701: Extend the list_content_session_ids_returns_distinct_set
test to also insert one malformed row directly into the underlying SQLite DB and
assert the method filters it out: after setting up l (open_in) and appending
valid rows with append_content, obtain a DB connection (via the same storage
used by list_content_session_ids), execute a direct SQL INSERT that writes a
corrupted session_id value (e.g., NULL or an invalid type/text) into the content
table, then call l.list_content_session_ids() again and assert the returned set
still has length 3 and contains only "ses_a", "ses_b", "ses_c"; this ensures
list_content_session_ids() correctly ignores malformed rows.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: eb387229-6a55-49bb-91bb-b4a183275c0d

📥 Commits

Reviewing files that changed from the base of the PR and between 7c7ce35 and 91a74f5.

📒 Files selected for processing (4)
  • crates/relayburn-ledger/src/config.rs
  • crates/relayburn-ledger/src/content.rs
  • crates/relayburn-ledger/src/lib.rs
  • crates/relayburn-ledger/src/tests.rs

Comment thread crates/relayburn-ledger/src/content.rs Outdated
- Preserve fractional retentionDays. Switch `Retention::Days` from
  `u64` to `f64` so `RELAYBURN_CONTENT_TTL_DAYS=0.5` round-trips as a
  12-hour window instead of being silently truncated to 0 days. Mirrors
  TS `normalizeRetention`, which keeps any finite JS number as-is.
- Skip malformed `session_id` rows in `list_session_ids` instead of
  aborting the whole call. The `content` table is non-STRICT, so a
  BLOB-typed value can land in a TEXT-affinity column and would
  otherwise poison the entire skip set.
- Cover both behaviors with new tests: env + file fractional retention
  paths and a BLOB-injected row that exercises the row-decode skip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/relayburn-ledger/src/config.rs (1)

133-145: 💤 Low value

Unused Serialize derive.

RawConfig and RawContent are only deserialized from the config file. The Serialize derive is unused and could be removed, but this is a minor cleanup.

♻️ Suggested cleanup
-#[derive(Debug, Default, Deserialize, Serialize)]
+#[derive(Debug, Default, Deserialize)]
 struct RawConfig {
     #[serde(default)]
     content: Option<RawContent>,
 }

-#[derive(Debug, Default, Deserialize, Serialize)]
+#[derive(Debug, Default, Deserialize)]
 struct RawContent {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-ledger/src/config.rs` around lines 133 - 145, The derives
for Serialize on structs RawConfig and RawContent are unused; remove Serialize
from the #[derive(...)] lists for both RawConfig and RawContent and, if present,
delete any now-unused serde::Serialize import so the compiler no longer warns
about an unused derive/import while keeping Deserialize and Debug/Default
intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/relayburn-ledger/src/config.rs`:
- Around line 133-145: The derives for Serialize on structs RawConfig and
RawContent are unused; remove Serialize from the #[derive(...)] lists for both
RawConfig and RawContent and, if present, delete any now-unused serde::Serialize
import so the compiler no longer warns about an unused derive/import while
keeping Deserialize and Debug/Default intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 51fa5415-5f30-4f32-b6b9-47ddace7ba01

📥 Commits

Reviewing files that changed from the base of the PR and between 91a74f5 and 44f64d3.

📒 Files selected for processing (3)
  • crates/relayburn-ledger/src/config.rs
  • crates/relayburn-ledger/src/content.rs
  • crates/relayburn-ledger/src/tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/relayburn-ledger/src/tests.rs
  • crates/relayburn-ledger/src/content.rs

@willwashburn willwashburn merged commit 233c102 into main May 5, 2026
3 checks passed
@willwashburn willwashburn deleted the claude/close-issue-279-ledger-config branch May 5, 2026 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Rust port] relayburn-ledger: list_content_session_ids + Config / loadConfig surface

1 participant