
feat: cross-encoder reranking, working memory, skill extraction #93

Merged
rohitg00 merged 6 commits into main from feat/issues-57-58-90
Apr 7, 2026
Conversation

rohitg00 (Owner) commented Apr 7, 2026

Summary

Changes

| File | What |
| --- | --- |
| src/state/reranker.ts | Cross-encoder reranking with lazy model loading |
| src/state/hybrid-search.ts | Integrated reranker into search pipeline |
| src/functions/working-memory.ts | Core memory + auto-paging (5 functions) |
| src/functions/skill-extract.ts | Skill extraction + match (3 functions) |
| src/types.ts | Extended ProceduralMemory with new fields |
| src/index.ts | Registered new functions |

Test plan

  • 646 tests passing (19 new)
  • Build succeeds
  • Manual test with RERANK_ENABLED=true and @xenova/transformers installed
  • Manual test of mem::skill-extract on a real completed session

Closes #90, closes #58, closes #57

Summary by CodeRabbit

  • New Features

    • Automatic extraction and storage of procedural skills from sessions with metadata
    • Skill discovery: list and match skills by relevance to queries
    • Core memory management: add/remove/list, prioritize/pin, auto-page to archival, and assemble working context
    • Optional enhanced search reranking for improved relevance
  • Chores

    • Version bumped to 0.8.0; export/import compatibility and plugin manifest updated
  • Tests

    • New test suites for reranker, skill-extract, and working-memory behaviors

…58, #57)

- reranker.ts: opt-in cross-encoder rerank after hybrid search via
  RERANK_ENABLED=true, uses @xenova/transformers MiniLM-L-6-v2,
  graceful fallback when unavailable
- working-memory.ts: core memory (pinned facts always in context) +
  auto-paging of archival memories by importance/recency/access score,
  5 new functions (core-add, core-remove, core-list, working-context,
  auto-page)
- skill-extract.ts: auto-generate reusable procedural skills from
  completed sessions via LLM, with reinforcement on re-extraction,
  skill-match for querying relevant skills (3 new functions)
- ProceduralMemory type extended with expectedOutcome, tags, concepts,
  sourceObservationIds
- 19 new tests, 646 total passing

Closes #90, closes #58, closes #57

coderabbitai Bot commented Apr 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds procedural skill extraction, a KV-backed working-memory subsystem with automatic paging, and an optional cross-encoder reranker; registers new SDK functions, extends types/version to 0.8.0, and includes unit tests and minor export/import compatibility updates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Version & Manifests: package.json, plugin/.claude-plugin/plugin.json, src/version.ts | Bump package/plugin and exported VERSION to 0.8.0. |
| Types & Export Schema: src/types.ts, src/functions/export-import.ts, test/export-import.test.ts | Add "0.8.0" to version unions; extend ProceduralMemory with optional fields (expectedOutcome, sourceObservationIds, tags, concepts); update import/export test expectation. |
| Skill Extraction: src/functions/skill-extract.ts, test/skill-extract.test.ts | Add mem::skill-extract, mem::skill-list, mem::skill-match: build prompts from session summary + observations, call provider.summarize, parse XML, compute deterministic skill IDs, create/reinforce ProceduralMemory in KV, attempt audit writes; tests cover success, no-skill, and error paths. |
| Working Memory: src/functions/working-memory.ts, test/working-memory.test.ts | Add core-memory SDK functions (mem::core-add, mem::core-remove, mem::core-list, mem::working-context, mem::auto-page) to manage pinned/importance/access scoring, assemble context under a token budget, update access metadata, and page low-scoring items to archival KV; tests included. |
| Reranker & Hybrid Search: src/state/reranker.ts, src/state/hybrid-search.ts, test/reranker.test.ts | Introduce an optional cross-encoder reranker that lazy-loads @xenova/transformers, reranks top candidates when RERANK_ENABLED=true within a rerank window (20), caches the pipeline/loading promise, and falls back gracefully; hybrid search updated to optionally invoke the reranker. |
| Integration: src/index.ts | Register working-memory and skill-extract SDK functions during startup using the configured tokenBudget and provider. |
| Other / Tests: test/* | New unit tests for reranker, skill-extract, and working-memory; minor export-import compatibility update. |
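The pinned/importance/recency/access scoring and token-budget assembly summarized for mem::working-context can be sketched as follows. This is an illustrative sketch only: the weights, the decay constant, and the ~4-chars-per-token estimate are assumptions, not the plugin's actual values.

```typescript
interface CoreEntry {
  id: string;
  content: string;
  pinned: boolean;
  importance: number; // 0..1
  accessCount: number;
  lastAccessedAt: string; // ISO timestamp
}

// Hypothetical composite score: pinned entries always win; otherwise blend
// importance, a recency decay, and diminishing access-frequency returns.
function scoreEntry(e: CoreEntry, now: number): number {
  if (e.pinned) return Number.POSITIVE_INFINITY;
  const ageDays = (now - Date.parse(e.lastAccessedAt)) / 86_400_000;
  const recency = Math.exp(-ageDays / 7); // illustrative one-week decay
  const access = Math.log1p(e.accessCount) / 5; // diminishing returns
  return 0.5 * e.importance + 0.3 * recency + 0.2 * Math.min(access, 1);
}

// Rough token estimate (~4 characters per token), common for budgeting.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

// Assemble context under a token budget: pinned first, then best-scoring.
// Oversized unpinned entries are skipped, not a hard stop, so later smaller
// entries can still fit.
function assembleContext(entries: CoreEntry[], budget: number, now: number): string[] {
  const ordered = [...entries].sort((a, b) => scoreEntry(b, now) - scoreEntry(a, now));
  const lines: string[] = [];
  let used = 0;
  for (const e of ordered) {
    const tokens = estimateTokens(e.content);
    if (used + tokens > budget && !e.pinned) continue;
    lines.push(`- ${e.content}`);
    used += tokens;
  }
  return lines;
}
```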

Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Session as Completed Session
    participant Extract as mem::skill-extract
    participant KV as KV Storage
    participant Provider as LLM Provider
    participant Audit as Audit Logger

    Session->>Extract: invoke mem::skill-extract(sessionId)
    Extract->>KV: load session, summary, observations
    Extract->>Provider: summarize(prompt with summary + observations)
    Provider->>Extract: return XML
    Extract->>Extract: parse XML -> skill or <no-skill/>
    alt skill extracted
        Extract->>Extract: compute fingerprint ID
        Extract->>KV: get existing ProceduralMemory
        alt exists
            Extract->>KV: increment strength/frequency, merge sources, persist
            Extract->>Audit: attempt recordAudit("skill_extract", ...)
            Extract->>Session: return reinforced skill result
        else
            Extract->>KV: create new ProceduralMemory, persist
            Extract->>Audit: attempt recordAudit("skill_extract", ...)
            Extract->>Session: return new skill result
        end
    else no skill
        Extract->>Session: return { extracted: false, reason }
    end
```
```mermaid
sequenceDiagram
    participant Agent as Context Requestor
    participant WorkMem as mem::working-context
    participant KV as KV Storage
    participant Scorer as Scoring Logic

    Agent->>WorkMem: request working-context (budget)
    WorkMem->>KV: list core-memory entries
    WorkMem->>Scorer: compute scores (pinned, importance, recency, access)
    Scorer->>WorkMem: select pinned + top unpinned within core budget
    WorkMem->>KV: update accessCount/lastAccessedAt for included cores
    WorkMem->>KV: list archival memories
    WorkMem->>WorkMem: assemble combined context until budget reached
    WorkMem->>Agent: return context, core/archival entries, tokens, pagedOut
```
```mermaid
sequenceDiagram
    participant Client as Search Query
    participant Hybrid as HybridSearch
    participant Retriever as BM25/Vector/Graph
    participant Reranker as rerank()
    participant Model as Cross-encoder Model

    Client->>Hybrid: tripleStreamSearch(query, limit)
    Hybrid->>Retriever: initial retrieval
    Retriever->>Hybrid: combined results
    Hybrid->>Hybrid: diversifyBySession() + enrichResults()
    alt RERANK_ENABLED && results.length > 1
        Hybrid->>Reranker: rerank(query, head, rerankWindow)
        Reranker->>Model: lazy-load @xenova/transformers (if needed)
        Reranker->>Model: score candidates sequentially
        Model->>Reranker: per-candidate scores
        Reranker->>Hybrid: return reranked results
    else
        Hybrid->>Hybrid: use enriched results
    end
    Hybrid->>Client: return final ranked results
```
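The lazy-load-and-fallback flow in the diagram above can be sketched in TypeScript. This is an illustrative sketch, not the plugin's actual reranker: the transformers.js pipeline call and input format are assumptions, and any import failure simply preserves the hybrid-search order.

```typescript
type Candidate = { id: string; text: string; score: number };
type Scorer = (query: string, text: string) => Promise<number>;

// undefined = not tried yet, null = tried and unavailable.
let scorer: Scorer | null | undefined;

async function loadScorer(): Promise<Scorer | null> {
  if (scorer !== undefined) return scorer;
  try {
    // The real code imports @xenova/transformers; the module name is held in
    // a variable so a missing optional dependency fails at runtime only.
    const moduleName = "@xenova/transformers";
    const mod: any = await import(moduleName);
    // Hypothetical cross-encoder usage; the exact transformers.js API may differ.
    const pipe = await mod.pipeline("text-classification", "Xenova/ms-marco-MiniLM-L-6-v2");
    scorer = async (query, text) => (await pipe(`${query} [SEP] ${text}`))[0].score;
  } catch {
    scorer = null; // graceful fallback: remember the failure, don't retry
  }
  return scorer;
}

async function rerank(query: string, candidates: Candidate[]): Promise<Candidate[]> {
  const s = await loadScorer();
  if (!s || candidates.length < 2) return candidates; // model unavailable: keep original order
  const scored = await Promise.all(
    candidates.map(async (c) => ({ ...c, score: await s(query, c.text) })),
  );
  return scored.sort((a, b) => b.score - a.score);
}
```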

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Poem

🐇 I hopped through sessions and found a trail,

Parsed triggers, steps, and tucked them in a pail.
I guard core facts close, page the fluff away—
I nudge a model’s ranks and help recall the day.
Version 0.8.0 — the burrow hums hooray.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title concisely summarizes the three main features added: cross-encoder reranking, working memory, and skill extraction, matching the core changes in the changeset. |
| Linked Issues check | ✅ Passed | All primary coding requirements from linked issues are met: reranking (#90) with RERANK_ENABLED, working memory with core/archival (#58), skill extraction with matching (#57), and idempotent reinforcement. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the three linked issues; version bumps, type extensions, and test additions are standard maintenance for feature releases. |




coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/state/hybrid-search.ts (1)

23-32: ⚠️ Potential issue | 🟠 Major

Match the codebase's boolean env flag pattern.

Line 31 uses !!process.env.RERANK_ENABLED, which treats any non-empty value (including "false" or "0") as enabled. The codebase consistently uses explicit string comparison (!== "false") for feature flags elsewhere; apply the same pattern here.

🔧 Suggested change

```diff
-    private rerankEnabled = !!process.env.RERANK_ENABLED,
+    private rerankEnabled = process.env.RERANK_ENABLED !== "false",
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state/hybrid-search.ts` around lines 23 - 32, The constructor parameter
rerankEnabled currently initializes with !!process.env.RERANK_ENABLED which
treats any non-empty string (including "false") as true; change it to follow the
codebase pattern by initializing rerankEnabled using explicit string comparison
(e.g., process.env.RERANK_ENABLED !== "false") so only the literal "false"
disables the feature; update the constructor signature where rerankEnabled is
declared in the HybridSearch class/constructor accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/skill-extract.ts`:
- Around line 120-125: After fetching the session in skill-extract.ts (the
session variable returned by kv.get<Session>(KV.sessions, data.sessionId)), add
a guard that rejects sessions that are not completed by checking the session
state fields (e.g., require session.active === true and session.abandoned ===
false or a dedicated session.completed/status === 'complete' if present); if the
session is not in the completed state return { success: false, error: "session
not complete" } early to prevent persisting procedural memory from unfinished
work.
- Around line 164-172: The reinforcement branch that updates an existing skill
(the block that mutates existing.strength, existing.frequency,
existing.updatedAt, existing.sourceSessionIds and then calls
kv.set(KV.procedural, existing.id, existing)) must also call recordAudit() to
register a "skill_extract" audit entry (include context like skill id,
sessionId, before/after state and operation = "skill_extract"); add the same
recordAudit call for the other reinforcement path in this file as well. Also
update the AuditEntry.operation union in src/types.ts to include the new literal
"skill_extract" so the audit records type-check, and ensure the recordAudit
invocation validates inputs and uses the existing kv.get/kv.set flow for
state-changing ops.
- Around line 159-162: The fingerprinting currently uses only parsed.title which
can collide across different procedures; change the fingerprint input passed to
fingerprintId in the skill creation path to a canonical representation of the
whole procedure (e.g., combine parsed.title, parsed.description/summary, and the
ordered steps or actions — stringify a normalized object or use a deterministic
serializer) instead of parsed.title.toLowerCase(), so that
fingerprintId("skill", ...) de-duplicates by procedure content; keep using
generateId() where you need a unique ID (do not replace generateId with
fingerprintId).
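The whole-procedure fingerprinting this comment asks for can be sketched with Node's crypto module. The fingerprintId helper and the parsed field names follow the review's wording, but the hashing details here are an assumption, not the plugin's implementation:

```typescript
import { createHash } from "node:crypto";

// Deterministic ID derived from content, so re-extracting the same procedure
// yields the same ID (for reinforcement) while different procedures that
// merely share a title no longer collide.
function fingerprintId(prefix: string, payload: string): string {
  const digest = createHash("sha256").update(payload).digest("hex").slice(0, 16);
  return `${prefix}_${digest}`;
}

// Canonicalize the whole procedure (title + trigger + ordered steps), not
// just the title, before hashing.
function skillFingerprint(p: { title: string; trigger: string; steps: string[] }): string {
  const canonical = JSON.stringify({
    title: p.title.toLowerCase().trim(),
    trigger: p.trigger.toLowerCase().trim(),
    steps: p.steps.map((s) => s.toLowerCase().trim()),
  });
  return fingerprintId("skill", canonical);
}
```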

In `@src/functions/working-memory.ts`:
- Around line 37-62: The mutating handlers (registered under "mem::core-add",
"mem::core-remove", the access-count updates in "mem::working-context", and
deletions in "mem::auto-page") currently persist changes via kv.set/kv.delete
without recording audits; modify each to call recordAudit() immediately after a
successful kv.set/kv.delete (or after batched mutations) with a clear audit
payload (e.g., action:
"core-add"/"core-remove"/"access-count-update"/"auto-page-delete", id(s),
scope/type, relevant deltas or content snippet, and timestamp/user if available)
so all state-changing operations are captured in the audit trail; ensure you
locate the exact post-write spots in the registered functions and include audit
on success only.
- Around line 218-223: The loop in mem::auto-page is deleting low-value core
entries (using kv.delete on CORE_SCOPE) which makes them irretrievable by
mem::working-context; instead move entries into the archival store (KV.memories)
before removing them from CORE_SCOPE. Replace the kv.delete(CORE_SCOPE,
entry.id) call with a two-step move: write the entry and its metadata (id,
content, timestamps, tags) into the archival scope/key used by
mem::working-context (e.g., KV.memories or ARCHIVE_SCOPE) using kv.put, then
remove the original core entry; keep use of estimateTokens(entry.content), and
still decrement totalTokens and increment paged after the move so behavior and
counters remain unchanged.
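The archive-then-delete move this comment describes can be sketched against a minimal in-memory KV surface. The KVStore interface below is an assumption for illustration; the real plugin's kv API may differ:

```typescript
interface CoreEntry {
  id: string;
  content: string;
  createdAt: string;
  tags?: string[];
}

// Minimal KV surface assumed for this sketch.
interface KVStore {
  set(scope: string, id: string, value: unknown): Promise<void>;
  delete(scope: string, id: string): Promise<void>;
}

// Write the entry to archival storage under a fresh id first, then remove it
// from core. A crash between the two steps leaves a recoverable duplicate
// rather than a lost memory.
async function pageOut(
  kv: KVStore,
  coreScope: string,
  archiveScope: string,
  entry: CoreEntry,
  archivedId: string,
): Promise<void> {
  await kv.set(archiveScope, archivedId, { ...entry, id: archivedId, archivedFrom: entry.id });
  await kv.delete(coreScope, entry.id);
}
```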

In `@src/state/hybrid-search.ts`:
- Around line 221-231: The current logic calls enrichResults(diversified, limit)
and then runs rerank only over that truncated set, which prevents deeper
candidates from being promoted; change to build a larger candidate pool for
reranking (e.g., compute rerankDepth = Math.max(limit,
SOME_RERANK_WINDOW_OR_CONST_20)), call enriched = await
this.enrichResults(diversified, rerankDepth), then if this.rerankEnabled and
enriched.length > 1 run rerank(query, enriched.slice(0, rerankDepth),
rerankDepth) (or run rerank over the first rerankDepth items), merge the
reranked top portion with the untouched suffix (enriched.slice(rerankDepth)),
and finally return the first limit items so the returned array is still limited
to limit while allowing promotion from deeper candidates; update references to
enriched, rerank, limit, rerankEnabled, diversified and query accordingly.

In `@src/state/reranker.ts`:
- Around line 10-26: The current async initializer (pipelineLoading) clears
itself on any outcome, so failures aren't cached and every rerank retries the
import; change the catch/finally logic so a load failure is remembered: in the
catch block set pipeline = null and set pipelineLoading = Promise.resolve(null)
(or set a pipelineUnavailable flag) to represent a settled failed state, and
modify the finally (or remove the unconditional pipelineLoading = null) so you
only clear pipelineLoading on a successful createPipeline return; reference
symbols: pipelineLoading, pipeline, createPipeline, and the
import("@xenova/transformers") block.
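The settled-failure caching this comment describes reduces to a small generic pattern: cache the loading promise, convert a rejected load into a cached resolved null, and never clear the cache unconditionally. In this sketch, loadImpl stands in for the real import("@xenova/transformers") block:

```typescript
type Pipeline = (input: string) => Promise<number>;

// Cached once per process: resolves to the pipeline on success, or to null
// when the optional dependency is unavailable. Failure is remembered, so a
// broken dependency is only attempted once.
let pipelineLoading: Promise<Pipeline | null> | null = null;

function getPipeline(loadImpl: () => Promise<Pipeline>): Promise<Pipeline | null> {
  if (!pipelineLoading) {
    // Note: no finally-clear here. Both outcomes stay cached.
    pipelineLoading = loadImpl().catch(() => null);
  }
  return pipelineLoading;
}
```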


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 07f25e35-96f5-4563-b60c-96c291717364

📥 Commits

Reviewing files that changed from the base of the PR and between 1fa0dce and 34c60cd.

📒 Files selected for processing (11)
  • package.json
  • src/functions/skill-extract.ts
  • src/functions/working-memory.ts
  • src/index.ts
  • src/state/hybrid-search.ts
  • src/state/reranker.ts
  • src/types.ts
  • src/version.ts
  • test/reranker.test.ts
  • test/skill-extract.test.ts
  • test/working-memory.test.ts

- skill-extract: reject non-completed sessions, fingerprint full
  procedure (title+trigger+steps) not just title, audit reinforcement
- working-memory: add recordAudit for core-add/core-remove/auto-page,
  auto-page now archives to KV.memories instead of deleting
- reranker: cache pipelineUnavailable to avoid repeated import retries
- hybrid-search: rerank over larger pool (max(limit,20)) then slice
- types: add skill_extract/core_add/core_remove/auto_page to AuditEntry
coderabbitai Bot left a comment
Actionable comments posted: 4

🧹 Nitpick comments (3)
src/functions/working-memory.ts (2)

144-146: Capture timestamp once before the loop.

Creating a new timestamp on each iteration is inefficient and produces inconsistent timestamps across entries updated in the same operation.

♻️ Suggested fix

```diff
+      const accessedAt = new Date().toISOString();
       for (const entry of [...pinned, ...unpinned]) {
         const tokens = estimateTokens(entry.content);
         if (usedTokens + tokens > coreBudget && !entry.pinned) break;
         coreLines.push(`- ${entry.content}`);
         usedTokens += tokens;

         entry.accessCount++;
-        entry.lastAccessedAt = new Date().toISOString();
+        entry.lastAccessedAt = accessedAt;
         await kv.set(CORE_SCOPE, entry.id, entry);
       }
```

As per coding guidelines: "Capture timestamps once with new Date().toISOString() and reuse rather than calling repeatedly."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/working-memory.ts` around lines 144 - 146, The loop updates
call new Date().toISOString() for each entry (see entry.lastAccessedAt
assignment) which is inefficient and inconsistent; capture a single timestamp
before the loop (e.g., const now = new Date().toISOString()) and replace
per-iteration new Date() calls with entry.lastAccessedAt = now, leaving other
updates (entry.accessCount++ and await kv.set(CORE_SCOPE, entry.id, entry))
unchanged so all entries share the same timestamp.

63-69: Consider logging audit failures instead of silently swallowing them.

The empty catch {} block means audit failures are completely invisible. While this prevents audit issues from breaking the main operation, it could mask persistent problems.

💡 Suggested improvement

```diff
       try {
         await recordAudit(kv, "core_add", "mem::core-add", [entry.id], {
           content: entry.content.slice(0, 100),
           importance: entry.importance,
           pinned: entry.pinned,
         });
-      } catch {}
+      } catch (err) {
+        ctx.logger.warn("Audit recording failed for core_add", {
+          entryId: entry.id,
+          error: err instanceof Error ? err.message : String(err),
+        });
+      }
```

Note: You'd need to call const ctx = getContext(); at the start of the handler.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/working-memory.ts` around lines 63 - 69, The empty catch
swallows audit errors—call const ctx = getContext() at the start of the handler,
change the catch to catch (err) and log the failure (including err and
entry.id/context) via ctx.log.error (or your existing logger) when calling
recordAudit(kv, "core_add", "mem::core-add", ...); this preserves the non-fatal
behavior while making audit failures visible for debugging.
src/functions/skill-extract.ts (1)

165-172: Consider including expectedOutcome in the fingerprint.

The fingerprint uses title, trigger, and steps, but expectedOutcome is stored on new skills (line 212). If the same procedure is extracted with a different expected outcome description, it will reinforce the existing skill without updating the outcome.

This may be intentional (outcome is considered supplementary), but if outcomes should differentiate skills, include it in the fingerprint.

💡 If outcomes should distinguish skills

```diff
         const fp = fingerprintId(
           "skill",
           JSON.stringify({
             title: parsed.title.toLowerCase(),
             trigger: parsed.trigger.toLowerCase(),
             steps: parsed.steps.map((s) => s.toLowerCase().trim()),
+            expectedOutcome: parsed.expectedOutcome.toLowerCase(),
           }),
         );
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/skill-extract.ts` around lines 165 - 172, The fingerprint
currently builds from title, trigger, and steps in the fingerprintId call inside
skill-extract.ts; include parsed.expectedOutcome (e.g.,
parsed.expectedOutcome.toLowerCase().trim()) in that JSON used by fingerprintId
so different expectedOutcome text produces a different fingerprint, ensuring the
fingerprintId(...) call includes an "expectedOutcome" property alongside title,
trigger, and steps.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/skill-extract.ts`:
- Line 256: The parameter object declares an unused optional property project in
the anonymous async handler (async (data: { project?: string; limit?: number })
=> { ... }), so remove project from the parameter type (change to data: {
limit?: number }) and eliminate any dead references to data.project; if callers
pass project, update them to stop doing so or keep backward-compatible parsing
in the caller, but do not introduce project-scoped behavior here until
implemented.

In `@src/functions/working-memory.ts`:
- Around line 117-122: The async arrow function defined as async (data: {
sessionId: string; project: string; budget?: number; query?: string; }) => in
working-memory.ts declares sessionId, project and query but never uses them; fix
by removing those unused properties from the parameter type and signature (or
destructure only the used field, e.g., { budget }) so the function only accepts
what it uses, or if the params must remain for API compatibility, prefix them
with an underscore (e.g., _sessionId, _project, _query) to mark them
intentionally unused and adjust any callers/types accordingly.
- Around line 237-251: archivalMemory currently reuses entry.id which can
collide in KV.memories; change archivalMemory.id to a generated unique
identifier (e.g. a UUID or a deterministic fingerprint/prefixed key) instead of
entry.id and update the kv.set call to use that new id; locate the
archivalMemory construction and the kv.set(KV.memories, archivalMemory.id,
archivalMemory) call and replace the id assignment with a generated unique value
(or a safe prefixed form like `${entry.id}-archived-${timestamp}` or a hash) so
archived entries cannot overwrite existing memories.

In `@src/state/hybrid-search.ts`:
- Line 31: The field rerankEnabled in src/state/hybrid-search.ts is currently
set via truthy coercion (private rerankEnabled = !!process.env.RERANK_ENABLED)
which treats values like "false" or "0" as true; change the initialization to
explicitly parse the environment variable (e.g., compare
process.env.RERANK_ENABLED to "true" or use a small parser that accepts
"true"/"1" case-insensitively) so rerankEnabled only becomes true when
RERANK_ENABLED is intentionally set to true; update any related tests or usages
of rerankEnabled in the class to reflect the stricter boolean semantics.
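A small parser with the stricter semantics this comment asks for might look like the sketch below. Note this is illustrative: the follow-up commit in this PR ultimately parses RERANK_ENABLED with === "true", and accepting "1" here is a design choice of the sketch, not the repo's behavior.

```typescript
// Accept only explicit opt-in values, case-insensitively; anything else
// (unset, "", "false", "0", garbage) disables the feature.
function envFlag(value: string | undefined): boolean {
  if (!value) return false;
  return ["true", "1"].includes(value.trim().toLowerCase());
}

// e.g. private rerankEnabled = envFlag(process.env.RERANK_ENABLED)
```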

---

Nitpick comments:
In `@src/functions/skill-extract.ts`:
- Around line 165-172: The fingerprint currently builds from title, trigger, and
steps in the fingerprintId call inside skill-extract.ts; include
parsed.expectedOutcome (e.g., parsed.expectedOutcome.toLowerCase().trim()) in
that JSON used by fingerprintId so different expectedOutcome text produces a
different fingerprint, ensuring the fingerprintId(...) call includes an
"expectedOutcome" property alongside title, trigger, and steps.

In `@src/functions/working-memory.ts`:
- Around line 144-146: The loop updates call new Date().toISOString() for each
entry (see entry.lastAccessedAt assignment) which is inefficient and
inconsistent; capture a single timestamp before the loop (e.g., const now = new
Date().toISOString()) and replace per-iteration new Date() calls with
entry.lastAccessedAt = now, leaving other updates (entry.accessCount++ and await
kv.set(CORE_SCOPE, entry.id, entry)) unchanged so all entries share the same
timestamp.
- Around line 63-69: The empty catch swallows audit errors—call const ctx =
getContext() at the start of the handler, change the catch to catch (err) and
log the failure (including err and entry.id/context) via ctx.log.error (or your
existing logger) when calling recordAudit(kv, "core_add", "mem::core-add", ...);
this preserves the non-fatal behavior while making audit failures visible for
debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 594e16ff-e0c8-4e38-aff8-f06307d25a58

📥 Commits

Reviewing files that changed from the base of the PR and between 0cb4799 and 25c530e.

📒 Files selected for processing (7)
  • src/functions/skill-extract.ts
  • src/functions/working-memory.ts
  • src/state/hybrid-search.ts
  • src/state/reranker.ts
  • src/types.ts
  • test/skill-extract.test.ts
  • test/working-memory.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/skill-extract.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/working-memory.test.ts
  • src/state/reranker.ts

rohitg00 added 2 commits April 7, 2026 19:17
- Parse RERANK_ENABLED with === "true" instead of truthy coercion
- Remove unused project param from mem::skill-list
- Remove unused sessionId/project/query params from mem::working-context
- Use generateId("mem") for archived core entries to avoid ID collision
coderabbitai Bot left a comment
Actionable comments posted: 3

♻️ Duplicate comments (1)
src/state/hybrid-search.ts (1)

221-228: ⚠️ Potential issue | 🟠 Major

The rerank window is still coupled to limit.

Math.max(limit, 20) fixes the shallow-page case, but it still reranks every requested result when limit > 20. Large pages now scale cross-encoder cost with page size instead of the intended fixed ~20-candidate window. Keep a larger retrieval/enrichment pool for pagination, rerank only the first ~20, then append the untouched tail before the final slice.

♻️ Suggested change

```diff
-    const rerankDepth = this.rerankEnabled ? Math.max(limit, 20) : limit;
-    const diversified = this.diversifyBySession(combined, rerankDepth);
-    const enriched = await this.enrichResults(diversified, rerankDepth);
+    const candidateDepth = this.rerankEnabled ? Math.max(limit, 20) : limit;
+    const rerankWindow = this.rerankEnabled ? 20 : limit;
+    const diversified = this.diversifyBySession(combined, candidateDepth);
+    const enriched = await this.enrichResults(diversified, candidateDepth);

     if (this.rerankEnabled && enriched.length > 1) {
       try {
-        const reranked = await rerank(query, enriched, rerankDepth);
-        return reranked.slice(0, limit);
+        const rerankedHead = await rerank(
+          query,
+          enriched.slice(0, rerankWindow),
+          rerankWindow,
+        );
+        return [...rerankedHead, ...enriched.slice(rerankWindow)].slice(0, limit);
       } catch {
         return enriched.slice(0, limit);
       }
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state/hybrid-search.ts` around lines 221 - 228, The rerank window is tied
to limit; instead decouple retrieval/enrichment depth from the rerank window:
compute a retrievalDepth = Math.max(limit, 20) (used for diversifyBySession and
enrichResults) and a rerankWindow = this.rerankEnabled ? 20 : limit; call
this.diversifyBySession(combined, retrievalDepth) and await
this.enrichResults(..., retrievalDepth); if rerank is enabled, split enriched
into head = enriched.slice(0, rerankWindow) and tail =
enriched.slice(rerankWindow), call rerank(query, head, rerankWindow) and then
return reranked.concat(tail).slice(0, limit); reference symbols: rerankDepth
(replace), retrievalDepth, rerankWindow, diversifyBySession, enrichResults,
rerank, enriched, combined, limit.
🧹 Nitpick comments (1)
src/functions/skill-extract.ts (1)

133-145: Load summary and observations concurrently.

These reads are independent after the completed-session guard, so the current sequence adds an extra KV round-trip to every extraction.

⚡ Suggested change

```diff
-      const summary = await kv
-        .get<SessionSummary>(KV.summaries, data.sessionId)
-        .catch(() => null);
+      const [summary, observations] = await Promise.all([
+        kv
+          .get<SessionSummary>(KV.summaries, data.sessionId)
+          .catch(() => null),
+        kv
+          .list<CompressedObservation>(KV.observations(data.sessionId))
+          .catch(() => []),
+      ]);
       if (!summary) {
         return {
           success: false,
           error: "no summary — run mem::summarize first",
         };
       }
-
-      const observations = await kv
-        .list<CompressedObservation>(KV.observations(data.sessionId))
-        .catch(() => []);
       if (observations.length < 3) {
         return { success: false, error: "too few observations for skill extraction" };
       }
```
As per coding guidelines, "Perform parallel operations where possible using Promise.all for independent kv writes/reads in TypeScript."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/skill-extract.ts` around lines 133 - 145, The code fetches
summary then observations sequentially causing an extra KV round-trip; after you
confirm the completed-session guard, load both by running kv.get(KV.summaries,
data.sessionId) and kv.list(KV.observations(data.sessionId)) in parallel with
Promise.all, assign the results to the existing summary and observations
variables (handling the same .catch fallback for each as currently used) so the
logic around summary existence and subsequent use of observations is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/functions/skill-extract.ts`:
- Around line 177-184: The current branch always increments existing.strength
and existing.frequency and adds data.sessionId to existing.sourceSessionIds;
make this idempotent by first checking whether data.sessionId is already present
in existing.sourceSessionIds and only then increment strength/frequency and
append the session id (and update updatedAt); if the sessionId is already
present, avoid changing strength/frequency (you may still update other
non-reinforcement fields if needed) before calling kv.set(KV.procedural,
existing.id, existing). Target the logic around existing,
existing.sourceSessionIds, strength, frequency, data.sessionId and the kv.set
call inside the skill-extract handler.

In `@src/functions/working-memory.ts`:
- Around line 133-136: In the loop inside working-memory.ts that iterates over
[...pinned, ...unpinned], don't use break when an unpinned entry would exceed
coreBudget; replace the break with continue so the loop skips oversized unpinned
entries but still considers later smaller entries; specifically modify the block
that calls estimateTokens(entry.content) and checks if (usedTokens + tokens >
coreBudget && !entry.pinned) to use continue instead of break, ensuring
coreLines and usedTokens updates still occur only when an entry is added.
- Around line 139-141: In mem::working-context, the inline awaited kv.set for
each entry makes read-path bookkeeping fragile and slow; instead, increment
entry.accessCount and set entry.lastAccessedAt using a single timestamp, push
the modified entries (or their ids/objects) into a local buffer during the loop,
and after the loop persist them with Promise.allSettled
(non-blocking/best-effort) so transient write failures don't fail context
assembly and you avoid per-entry latency; keep these writes non-audited as
intended.
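The two working-memory fixes above (skip-but-continue on oversized unpinned entries, and buffered best-effort bookkeeping writes) combine into one loop. This is a minimal sketch under assumed types; the `Kv` interface and `CoreEntry` shape only mirror the symbols in the prompts, not the plugin's real definitions:

```typescript
interface CoreEntry {
  id: string;
  content: string;
  pinned: boolean;
  accessCount: number;
  lastAccessedAt: string;
}

// Assumed minimal KV surface mirroring the kv.set calls in the prompts.
interface Kv {
  set(ns: string, id: string, value: CoreEntry): Promise<void>;
}

const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

async function assembleContext(
  kv: Kv,
  entries: CoreEntry[],
  coreBudget: number,
): Promise<string[]> {
  const pinned = entries.filter((e) => e.pinned);
  const unpinned = entries.filter((e) => !e.pinned);
  const now = new Date().toISOString(); // one timestamp for the whole pass
  const lines: string[] = [];
  const touched: CoreEntry[] = [];
  let used = 0;

  for (const entry of [...pinned, ...unpinned]) {
    const tokens = estimateTokens(entry.content);
    // `continue`, not `break`: a later, smaller unpinned entry may still fit.
    if (used + tokens > coreBudget && !entry.pinned) continue;
    lines.push(entry.content);
    used += tokens;
    entry.accessCount++;
    entry.lastAccessedAt = now;
    touched.push(entry);
  }

  // Best-effort bookkeeping: a transient write failure must not
  // fail context assembly, and writes happen once, after the loop.
  await Promise.allSettled(touched.map((e) => kv.set("core", e.id, e)));
  return lines;
}
```

With a 15-token budget, a 50-token unpinned entry is skipped while a later 2-token entry still lands in the context, and only included entries get their access counters written back.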


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 373ad586-7a71-404d-bb46-257fd8b5d283

📥 Commits

Reviewing files that changed from the base of the PR and between 25c530e and 9b09d9b.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • package.json
  • src/functions/export-import.ts
  • src/functions/skill-extract.ts
  • src/functions/working-memory.ts
  • src/state/hybrid-search.ts
  • src/types.ts
  • src/version.ts
✅ Files skipped from review due to trivial changes (2)
  • package.json
  • src/functions/export-import.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/types.ts
  • src/version.ts

- Make skill reinforcement idempotent (skip if sessionId already present)
- Parallelize summary + observations fetch in skill-extract
- Use continue instead of break for oversized unpinned core entries
- Buffer access-counter writes with Promise.allSettled (non-blocking)
- Decouple rerank window (fixed 20) from retrieval depth

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/functions/working-memory.ts (1)

234-258: Consider capturing timestamp once before the loop.

Line 241 calls new Date().toISOString() inside the loop, creating a different timestamp for each paged entry. While functionally acceptable, capturing the timestamp once before the loop would be more consistent with the guideline to "capture timestamps once and reuse."

♻️ Optional: reuse single timestamp
       let paged = 0;
       const pagedIds: string[] = [];
+      const pagedAt = new Date().toISOString();
       for (const entry of unpinned) {
         if (totalTokens <= coreBudget) break;
         const tokens = estimateTokens(entry.content);

         const archivalMemory: Memory = {
           id: generateId("mem"),
           createdAt: entry.createdAt,
-          updatedAt: new Date().toISOString(),
+          updatedAt: pagedAt,
           type: "fact",

As per coding guidelines: "Capture timestamps once with new Date().toISOString() and reuse the value rather than calling multiple times."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/working-memory.ts` around lines 234 - 258, The loop that
creates archivalMemory currently calls new Date().toISOString() for each entry
producing inconsistent timestamps; capture a single timestamp before the for
(const entry of unpinned) loop (e.g., const now = new Date().toISOString()) and
reuse that variable when setting archivalMemory.updatedAt (and any other
created/updated timestamps) so all paged archival memories share the same
timestamp; update references in the loop where archivalMemory is constructed (id
generation via generateId("mem"), createdAt from entry.createdAt can remain) to
use the captured now value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3dcc4dbb-af8f-493b-8e73-b0334118f720

📥 Commits

Reviewing files that changed from the base of the PR and between 9b09d9b and 221bf89.

📒 Files selected for processing (3)
  • src/functions/skill-extract.ts
  • src/functions/working-memory.ts
  • src/state/hybrid-search.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/state/hybrid-search.ts

Comment on lines +174 to +201
      if (existing) {
        const alreadyReinforced = existing.sourceSessionIds.includes(data.sessionId);
        if (!alreadyReinforced) {
          existing.strength = Math.min(1.0, existing.strength + 0.15);
          existing.frequency++;
          existing.sourceSessionIds = [...existing.sourceSessionIds, data.sessionId];
        }
        existing.updatedAt = new Date().toISOString();
        await kv.set(KV.procedural, existing.id, existing);

        try {
          await recordAudit(kv, "skill_extract", "mem::skill-extract", [], {
            skillId: existing.id,
            reinforced: true,
            sessionId: data.sessionId,
          });
        } catch {}

        ctx.logger.info("Skill reinforced", {
          id: existing.id,
          name: parsed.title,
        });
        return {
          success: true,
          extracted: true,
          reinforced: true,
          skill: existing,
        };

⚠️ Potential issue | 🟡 Minor

Incomplete idempotency: unnecessary write and misleading return when session was already reinforced.

When alreadyReinforced is true, the code still:

  1. Updates updatedAt (line 181)
  2. Calls kv.set (line 182) — wasteful write
  3. Records audit with reinforced: true (line 188) — misleading
  4. Returns reinforced: true (line 199) — incorrect

The return value should reflect the actual outcome, and the write/audit should be skipped if no reinforcement occurred.

🛡️ Suggested fix for complete idempotency
       if (existing) {
-        const alreadyReinforced = existing.sourceSessionIds.includes(data.sessionId);
-        if (!alreadyReinforced) {
+        if (existing.sourceSessionIds.includes(data.sessionId)) {
+          ctx.logger.info("Skill already reinforced from this session", {
+            id: existing.id,
+            sessionId: data.sessionId,
+          });
+          return {
+            success: true,
+            extracted: true,
+            reinforced: false,
+            skill: existing,
+          };
+        }
+
-          existing.strength = Math.min(1.0, existing.strength + 0.15);
-          existing.frequency++;
-          existing.sourceSessionIds = [...existing.sourceSessionIds, data.sessionId];
-        }
+        existing.strength = Math.min(1.0, existing.strength + 0.15);
+        existing.frequency++;
+        existing.sourceSessionIds = [...existing.sourceSessionIds, data.sessionId];
         existing.updatedAt = new Date().toISOString();
         await kv.set(KV.procedural, existing.id, existing);

         try {
           await recordAudit(kv, "skill_extract", "mem::skill-extract", [], {
             skillId: existing.id,
             reinforced: true,
             sessionId: data.sessionId,
           });
         } catch {}

         ctx.logger.info("Skill reinforced", {
           id: existing.id,
           name: parsed.title,
         });
         return {
           success: true,
           extracted: true,
           reinforced: true,
           skill: existing,
         };
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/skill-extract.ts` around lines 174 - 201, The current logic in
the skill reinforcement block still updates updatedAt, calls kv.set, records an
audit with reinforced: true and returns reinforced: true even when
alreadyReinforced is true; change this so that if alreadyReinforced is true you
skip the state mutation (do not set updatedAt or call kv.set), skip recordAudit
or call it with reinforced: false if you must audit, and return reinforced:
false (with extracted: true if the skill exists). Concretely, inside the
existing branch that computes alreadyReinforced from existing.sourceSessionIds,
add a conditional: if (alreadyReinforced) { ctx.logger.info("Skill already
reinforced", { id: existing.id, name: parsed.title }); return { success: true,
extracted: true, reinforced: false, skill: existing }; } else perform the
existing mutation (update strength, frequency, sourceSessionIds, updatedAt),
await kv.set(KV.procedural, existing.id, existing), call recordAudit with
reinforced: true, and then return reinforced: true.
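The idempotency guard the fix above describes reduces to an early return. This standalone sketch uses a simplified `Skill` shape (the real `ProceduralMemory` type has more fields), with persistence and auditing left to the caller:

```typescript
interface Skill {
  id: string;
  strength: number;
  frequency: number;
  sourceSessionIds: string[];
  updatedAt: string;
}

// Returns true only when this session actually reinforced the skill.
// Callers should run kv.set and recordAudit only on a `true` result.
function reinforce(skill: Skill, sessionId: string): boolean {
  if (skill.sourceSessionIds.includes(sessionId)) return false; // already counted
  skill.strength = Math.min(1.0, skill.strength + 0.15);
  skill.frequency++;
  skill.sourceSessionIds = [...skill.sourceSessionIds, sessionId];
  skill.updatedAt = new Date().toISOString();
  return true;
}
```

Calling `reinforce` twice with the same session id mutates the skill exactly once, so re-running `mem::skill-extract` on an already-processed session neither inflates strength nor triggers a redundant write.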

@rohitg00 rohitg00 merged commit 69bb787 into main Apr 7, 2026
3 checks passed
@rohitg00 rohitg00 deleted the feat/issues-57-58-90 branch April 7, 2026 18:42
