feat: add persistent memory with SQLite, embeddings, and hybrid retrieval #10

Merged: hackertron merged 2 commits into main from feat/memory-sqlite-embeddings-hybrid-retrieval on Mar 3, 2026

@hackertron
Owner

Summary

  • Add persistent memory system backed by pure-Go SQLite (modernc.org/sqlite) with WAL mode, FTS5 full-text search, and OpenAI text-embedding-3-small vector embeddings
  • Implement hybrid retrieval combining vector cosine similarity and FTS5 BM25 ranking via reciprocal rank fusion (RRF)
  • Replace the checkContextBudget() stub with real rolling summarization that compacts conversation history when context budget is exceeded
  • Add memory_search (ReadOnly) and memory_save (SideEffecting) tools for LLM-driven memory operations
  • Wire memory DB, embedding backend, session store, and conversation persistence into the CLI run command
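
The reciprocal rank fusion step named in the summary can be sketched as follows. This is a minimal illustration, not the code in internal/memory/retrieval.go: the function name `fuseRRF`, the use of plain string chunk IDs, and the constant k = 60 (a commonly used default) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// fuseRRF merges several best-first rankings of chunk IDs with reciprocal
// rank fusion: each list contributes 1/(k+rank) per ID, with rank 1-based.
// Hypothetical helper for illustration; k is an assumed smoothing constant.
func fuseRRF(k float64, rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first; ties are broken by ID so the output is deterministic.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	vector := []string{"a", "b", "c"} // best-first by cosine similarity
	fts := []string{"b", "c", "d"}    // best-first by BM25
	fmt.Println(fuseRRF(60, vector, fts)) // [b c a d]
}
```

Because RRF only looks at ranks, the cosine scores and BM25 scores never need to be normalized onto a common scale, which is the usual reason for choosing it in hybrid retrieval.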

New files

| File | Purpose |
| --- | --- |
| internal/memory/sqlite.go | DB connection, WAL mode, schema migration (6 tables) |
| internal/memory/embedding.go | Embedding backend factory |
| internal/memory/embedding_openai.go | OpenAI embedder via openai-go SDK |
| internal/memory/retrieval.go | Vector search, FTS5, cosine similarity, RRF, binary encoding |
| internal/memory/store.go | MemoryRetrieval implementation |
| internal/memory/session_store.go | SessionStore CRUD backed by SQLite |
| internal/memory/memory_test.go | 11 test cases |
| internal/tool/memory_search.go | memory_search tool |
| internal/tool/memory_save.go | memory_save tool |
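
To make the "cosine similarity" and "binary encoding" entries for retrieval.go concrete, here is a small sketch of how an embedding could be stored as a BLOB and compared. The PR does not state the on-disk format, so the little-endian float32 layout and the names `encodeVector`, `decodeVector`, and `cosine` are assumptions for illustration only.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// encodeVector packs float32 values into bytes (assumed layout: little-endian, 4 bytes each).
func encodeVector(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

// decodeVector reverses encodeVector and rejects blobs with a truncated element.
func decodeVector(b []byte) ([]float32, error) {
	if len(b)%4 != 0 {
		return nil, errors.New("vector blob length is not a multiple of 4")
	}
	v := make([]float32, len(b)/4)
	for i := range v {
		v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return v, nil
}

// cosine returns the cosine similarity of a and b, or 0 when the lengths
// differ or either vector has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	stored := encodeVector([]float32{1, 0})
	v, err := decodeVector(stored)
	if err != nil {
		panic(err)
	}
	fmt.Println(cosine(v, []float32{1, 0}), cosine(v, []float32{0, 1})) // 1 0
}
```

A brute-force search would decode every stored vector, score it against the query embedding with `cosine`, and keep the top N results.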

Design decisions

  • Pure Go SQLite — no CGO required, FTS5 included out of the box
  • Brute-force cosine similarity — fast enough for MVP scale (<10k chunks)
  • Graceful degradation — missing OPENAI_API_KEY falls back to FTS-only; memory init failure continues without memory
  • Rolling summarization — uses the same LLM provider to generate summaries when context budget is hit
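
The rolling summarization decision can be illustrated with a small compaction sketch. The real CompactWithSummary() may behave differently: the `Message` type, the four-characters-per-token estimate, the "system" role for the summary, and the `keepLast` parameter are all assumptions made for this example.

```go
package main

import "fmt"

// Message is a simplified stand-in for the runtime's message type.
type Message struct {
	Role    string
	Content string
}

// estimateTokens uses a rough heuristic (about 4 characters per token),
// assumed here only to decide when the budget is exceeded.
func estimateTokens(msgs []Message) int {
	total := 0
	for _, m := range msgs {
		total += len(m.Content) / 4
	}
	return total
}

// compactWithSummary replaces everything except the last keepLast messages
// with a single summary message placed at the front of the history.
func compactWithSummary(msgs []Message, summary string, keepLast int) []Message {
	if keepLast < 0 {
		keepLast = 0
	}
	if keepLast > len(msgs) {
		keepLast = len(msgs)
	}
	out := make([]Message, 0, keepLast+1)
	out = append(out, Message{Role: "system", Content: "Summary of earlier conversation: " + summary})
	return append(out, msgs[len(msgs)-keepLast:]...)
}

func main() {
	history := []Message{
		{Role: "user", Content: "Remember my favorite language is Go"},
		{Role: "assistant", Content: "Noted."},
		{Role: "user", Content: "What is my favorite language?"},
	}
	const budget = 10 // deliberately tiny so the example triggers compaction
	if estimateTokens(history) > budget {
		// In the PR the summary text comes from the LLM provider; it is hard-coded here.
		history = compactWithSummary(history, "The user's favorite language is Go.", 1)
	}
	fmt.Println(len(history), history[0].Role) // 2 system
}
```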

Test plan

  • go build ./... passes
  • go vet ./... passes
  • go test ./internal/memory/ -race -count=1 — 11/11 tests pass
  • go test ./... -race -count=1 — all packages pass
  • E2E: yantra run "Remember my favorite language is Go" then yantra run "What is my favorite language?" (requires API keys)
  • Verify .yantra/memory.db is created with correct schema

🤖 Generated with Claude Code

…eval

Implement Step 5 of the architecture — persistent memory backed by SQLite
(pure Go, no CGO) with OpenAI embeddings and hybrid vector+FTS retrieval
using reciprocal rank fusion. Replace the context budget stub with real
rolling summarization.

New packages/files:
- internal/memory/ — SQLite DB layer, OpenAI embedding backend, memory
  store (MemoryRetrieval), session store, vector/FTS retrieval with RRF
- internal/tool/memory_search.go — memory_search tool (ReadOnly)
- internal/tool/memory_save.go — memory_save tool (SideEffecting)

Modified:
- RegisterBuiltins now accepts optional MemoryRetrieval for memory tools
- AgentRuntime gains SetMemory() for persistent conversation history and
  real summarization when context budget is exceeded
- Session gains CompactWithSummary() for context window compaction
- CLI wires memory DB, embedder, session store, and tools

11 new tests covering all memory operations, all passing with -race.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@qodo-code-review

Review Summary by Qodo

Add persistent memory with SQLite, embeddings, and hybrid retrieval

✨ Enhancement

Walkthroughs

Description
• Implement persistent memory system with SQLite, OpenAI embeddings, and hybrid retrieval
• Add rolling summarization to compact conversation history when context budget exceeded
• Introduce memory_search (ReadOnly) and memory_save (SideEffecting) LLM tools
• Wire memory DB, embedder, session store into CLI and runtime with graceful degradation
Diagram
flowchart LR
  A["CLI run command"] -->|opens| B["SQLite DB"]
  A -->|creates| C["OpenAI Embedder"]
  A -->|initializes| D["Memory Store"]
  A -->|creates| E["Session"]
  D -->|hybrid search| F["Vector + FTS Results"]
  F -->|RRF fusion| G["Ranked Chunks"]
  H["Agent Runtime"] -->|stores events| D
  H -->|checks budget| I["Summarization"]
  I -->|compacts| J["Session with Summary"]
  K["memory_search tool"] -->|queries| D
  L["memory_save tool"] -->|persists| D

File Changes

1. internal/memory/sqlite.go ✨ Enhancement +115/-0
   SQLite database layer with WAL mode and schema
2. internal/memory/embedding.go ✨ Enhancement +24/-0
   Embedding backend factory with graceful degradation
3. internal/memory/embedding_openai.go ✨ Enhancement +76/-0
   OpenAI embedder implementation using openai-go SDK
4. internal/memory/retrieval.go ✨ Enhancement +214/-0
   Vector search, FTS5, cosine similarity, and RRF fusion
5. internal/memory/store.go ✨ Enhancement +312/-0
   MemoryRetrieval implementation with hybrid query support
6. internal/memory/session_store.go ✨ Enhancement +132/-0
   SessionStore CRUD operations backed by SQLite
7. internal/memory/memory_test.go 🧪 Tests +420/-0
   Comprehensive test suite for memory operations
8. internal/tool/memory_search.go ✨ Enhancement +75/-0
   Memory search tool for LLM-driven retrieval
9. internal/tool/memory_save.go ✨ Enhancement +62/-0
   Memory save tool for LLM-driven persistence
10. internal/runtime/runtime.go ✨ Enhancement +165/-10
    Integrate memory, session persistence, and rolling summarization
11. internal/runtime/session.go ✨ Enhancement +24/-0
    Add CompactWithSummary method for context window compaction
12. internal/tool/builtin.go ✨ Enhancement +5/-1
    Register memory tools when MemoryRetrieval is available
13. internal/tool/builtin_test.go 🧪 Tests +1/-1
    Update RegisterBuiltins test to pass nil memory parameter
14. cmd/yantra/main.go ✨ Enhancement +50/-3
    Wire memory DB, embedder, and session into CLI run command
15. go.mod Dependencies +12/-2
    Add modernc.org/sqlite and openai-go dependencies
16. go.sum Dependencies +53/-4
    Update checksums for new SQLite and related dependencies

@qodo-code-review

qodo-code-review Bot commented Mar 2, 2026

Code Review by Qodo

🐞 Bugs (6) 📘 Rule violations (0) 📎 Requirement gaps (0)


Action required

1. FTS rank query broken 🐞 Bug ✓ Correctness
Description
ftsSearch selects/orders by rank, but the FTS table schema doesn’t define a rank column. This
can cause FTS-only memory retrieval to fail at runtime and hybrid retrieval to silently degrade.
Code

internal/memory/retrieval.go[R66-75]

+// ftsSearch performs a full-text search using SQLite FTS5.
+func ftsSearch(ctx context.Context, db *DB, query string, topN int) ([]scoredChunk, error) {
+	rows, err := db.conn.QueryContext(ctx,
+		`SELECT f.id, f.content, c.source, c.tags, rank
+		 FROM chunks_fts f
+		 JOIN chunks c ON c.id = f.id
+		 WHERE chunks_fts MATCH ?
+		 ORDER BY rank
+		 LIMIT ?`, query, topN)
+	if err != nil {
Evidence
The FTS table is created with only id and content columns, but the retrieval SQL selects rank
and orders by it, without defining it in the query.

internal/memory/sqlite.go[68-72]
internal/memory/retrieval.go[66-75]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ftsSearch` queries a `rank` value that is not defined by the FTS schema in this repo. This can break FTS-only retrieval and cause hybrid retrieval to silently drop FTS results.
### Issue Context
- The schema creates `chunks_fts` with only `id` and `content` columns.
- The retrieval query selects `rank` and orders by it.
### Fix Focus Areas
- internal/memory/sqlite.go[68-72]
- internal/memory/retrieval.go[66-75]
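
For context when acting on this finding: the SQLite FTS5 documentation describes a hidden `rank` column that exists on every FTS5 table even though it is not declared in `CREATE VIRTUAL TABLE`, so it is worth reproducing the failure before changing the schema. One way to make the ordering explicit is to call the `bm25()` auxiliary function on the unaliased table name. The query below is an untested sketch under that assumption; the constant name `ftsQuery` is hypothetical, and the table and column names are taken from the snippet above.

```go
package main

import "fmt"

// ftsQuery is a hypothetical rewrite of the ftsSearch statement.
// bm25() returns smaller (more negative) values for better matches,
// so ascending order puts the best match first.
const ftsQuery = `
SELECT chunks_fts.id, chunks_fts.content, c.source, c.tags,
       bm25(chunks_fts) AS score
FROM chunks_fts
JOIN chunks c ON c.id = chunks_fts.id
WHERE chunks_fts MATCH ?
ORDER BY score
LIMIT ?`

func main() {
	// With database/sql this would be passed as
	// QueryContext(ctx, ftsQuery, query, topN).
	fmt.Println(ftsQuery)
}
```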


2. Summarization ignores TurnTimeout 🐞 Bug ⛯ Reliability
Description
Rolling summarization calls the provider after the per-turn context is cancelled and uses the parent
context, so it is not bounded by TurnTimeout. A slow/hung summarization call can stall the whole
run.
Code

internal/runtime/runtime.go[R127-139]

 		turnCancel()
 		for _, msg := range toolMsgs {
 			session.Append(msg)
+			r.persistEvent(ctx, msg)
 		}

 		// Check if the parent context was cancelled during tool dispatch.
 		if ctx.Err() != nil {
 			return nil, types.ErrCancelled
 		}

-		r.checkContextBudget(session)
+		r.checkContextBudget(ctx, session, progress)
 	}
Evidence
The per-turn context (turnCtx) is explicitly cancelled before checkContextBudget is invoked, and
checkContextBudget performs a provider call using the parent ctx, bypassing the intended
per-turn timeout protection.

internal/runtime/runtime.go[90-139]
internal/runtime/runtime.go[419-431]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Summarization is executed outside the per-turn timeout budget. This can cause `yantra run` to hang even when `TurnTimeout` is configured.
### Issue Context
- `turnCancel()` is called before `checkContextBudget()`.
- `checkContextBudget()` calls `provider.Complete()` with the parent context.
### Fix Focus Areas
- internal/runtime/runtime.go[90-139]
- internal/runtime/runtime.go[419-431]


3. Persistence ignores TurnTimeout 🐞 Bug ⛯ Reliability
Description
Conversation persistence uses the parent context instead of the per-turn context, so DB stalls/locks
can block the run beyond the configured turn timeout. This affects every message persisted
(user/assistant/tool).
Code

internal/runtime/runtime.go[R144-151]

+// persistEvent stores a message to conversation history if memory is configured.
+func (r *AgentRuntime) persistEvent(ctx context.Context, msg types.Message) {
+	if r.memory == nil || r.sessionID == "" {
+		return
+	}
+	if err := r.memory.StoreConversationEvent(ctx, r.sessionID, msg); err != nil {
+		slog.Warn("failed to persist conversation event", "error", err)
+	}
Evidence
persistEvent accepts a context and is always invoked with the parent ctx in the turn loop, so
persistence is not bounded by the per-turn timeout even though it performs DB writes.

internal/runtime/runtime.go[88-114]
internal/runtime/runtime.go[144-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Conversation persistence bypasses the per-turn timeout and can block the run if SQLite is busy/locked.
### Issue Context
- `persistEvent` is called with the parent `ctx`.
- It performs DB writes via `StoreConversationEvent`.
### Fix Focus Areas
- internal/runtime/runtime.go[88-114]
- internal/runtime/runtime.go[144-151]



Remediation recommended

4. Summary/scratchpad errors swallowed 🐞 Bug ⛯ Reliability
Description
GetSummary and GetScratchpad treat any DB/JSON error as “no data” and return nil/empty state.
This hides real failures (corruption, I/O, unexpected schema issues) and makes memory problems hard
to diagnose.
Code

internal/memory/store.go[R159-196]

+// GetSummary returns the rolling summary for a session.
+func (s *Store) GetSummary(ctx context.Context, sessionID string) (*types.SessionSummary, error) {
+	var summary string
+	var epoch int64
+	err := s.db.conn.QueryRowContext(ctx,
+		`SELECT summary, epoch FROM session_summaries WHERE session_id = ?`, sessionID).
+		Scan(&summary, &epoch)
+	if err != nil {
+		return nil, nil // no summary yet
+	}
+	return &types.SessionSummary{Summary: summary, Epoch: epoch}, nil
+}
+
+// SetSummary updates the rolling summary for a session.
+func (s *Store) SetSummary(ctx context.Context, sessionID string, summary types.SessionSummary) error {
+	_, err := s.db.conn.ExecContext(ctx,
+		`INSERT INTO session_summaries (session_id, summary, epoch) VALUES (?, ?, ?)
+		 ON CONFLICT(session_id) DO UPDATE SET summary = excluded.summary, epoch = excluded.epoch`,
+		sessionID, summary.Summary, summary.Epoch)
+	if err != nil {
+		return &types.MemoryError{Op: "set_summary", Message: "upsert", Err: err}
+	}
+	return nil
+}
+
+// GetScratchpad returns the scratchpad state for a session.
+func (s *Store) GetScratchpad(ctx context.Context, sessionID string) (*types.ScratchpadState, error) {
+	var data string
+	err := s.db.conn.QueryRowContext(ctx,
+		`SELECT data FROM scratchpads WHERE session_id = ?`, sessionID).
+		Scan(&data)
+	if err != nil {
+		return &types.ScratchpadState{Data: make(map[string]string)}, nil
+	}
+	var state types.ScratchpadState
+	if err := json.Unmarshal([]byte(data), &state); err != nil {
+		return &types.ScratchpadState{Data: make(map[string]string)}, nil
+	}
Evidence
Both methods return success values on all errors instead of distinguishing expected no rows from
unexpected failures, preventing callers and logs from seeing actual DB issues.

internal/memory/store.go[159-168]
internal/memory/store.go[185-196]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Memory read APIs currently hide real failures by returning nil/empty values for all errors.
### Issue Context
This can mask DB corruption, permission issues, or unexpected schema problems, making memory behavior silently incorrect.
### Fix Focus Areas
- internal/memory/store.go[159-168]
- internal/memory/store.go[185-196]


5. Event persistence not atomic 🐞 Bug ⛯ Reliability
Description
StoreConversationEvent inserts the event and updates session counters as two independent
statements without a transaction. Partial failures can leave conversation history and
message_count/updated_at inconsistent.
Code

internal/memory/store.go[R230-244]

+	_, err := s.db.conn.ExecContext(ctx,
+		`INSERT INTO conversation_events (session_id, role, content, tool_calls, tool_call_id, tool_name)
+		 VALUES (?, ?, ?, ?, ?, ?)`,
+		sessionID, string(msg.Role), msg.Content, toolCallsJSON, msg.ToolCallID, msg.ToolName)
+	if err != nil {
+		return &types.MemoryError{Op: "store_event", Message: "insert", Err: err}
+	}
+
+	// Bump session message count.
+	_, err = s.db.conn.ExecContext(ctx,
+		`UPDATE sessions SET message_count = message_count + 1, updated_at = datetime('now') WHERE id = ?`,
+		sessionID)
+	if err != nil {
+		return &types.MemoryError{Op: "store_event", Message: "update session count", Err: err}
+	}
Evidence
A failure after the insert but before updating the session leaves the DB in a partially-updated
state; the runtime currently logs and continues, so inconsistency may persist unnoticed.

internal/memory/store.go[230-244]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Conversation event insert and session counter update are not transactional, so the DB can become inconsistent on partial failure.
### Issue Context
The runtime persists events frequently; transactional integrity here reduces hard-to-debug state drift.
### Fix Focus Areas
- internal/memory/store.go[219-245]



Advisory comments

6. rand.Read errors ignored 🐞 Bug ⛯ Reliability
Description
Chunk/session ID generators ignore crypto/rand.Read errors. While rare, failures can produce
weak/unknown randomness and increase collision risk or make debugging ID issues difficult.
Code

internal/memory/store.go[R305-309]

+// generateID creates a random hex ID.
+func generateID() string {
+	b := make([]byte, 12)
+	rand.Read(b)
+	return hex.EncodeToString(b)
Evidence
Both ID generation helpers discard the error return from rand.Read, so callers cannot detect and
react to entropy failures.

internal/memory/store.go[305-309]
internal/memory/session_store.go[126-129]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
ID generation ignores entropy read errors.
### Issue Context
Even if rare, surfacing the error makes the system more diagnosable and avoids silent weak-ID generation.
### Fix Focus Areas
- internal/memory/store.go[305-310]
- internal/memory/session_store.go[126-130]
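
A version of the generator that surfaces the error could look like the sketch below. The changed signature is an assumption about how the fix might be shaped, and callers would need updating. Note that the Go 1.24 release notes state that `crypto/rand.Read` no longer returns an error, so on recent toolchains the check mainly documents intent.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateID creates a random 24-character hex ID from 12 random bytes
// and reports entropy failures instead of discarding them.
func generateID() (string, error) {
	b := make([]byte, 12)
	if _, err := rand.Read(b); err != nil {
		return "", fmt.Errorf("generate id: %w", err)
	}
	return hex.EncodeToString(b), nil
}

func main() {
	id, err := generateID()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(id)) // 24
}
```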


Comment thread internal/memory/retrieval.go
Comment thread internal/runtime/runtime.go Outdated
Comment thread internal/runtime/runtime.go
hackertron merged commit fb6a7d3 into main on Mar 3, 2026
hackertron deleted the feat/memory-sqlite-embeddings-hybrid-retrieval branch on March 3, 2026 07:32