feat: add persistent memory with SQLite, embeddings, and hybrid retrieval by hackertron · Pull Request #10 · hackertron/Yantra

hackertron · 2026-03-02T09:00:12Z

Summary

Add persistent memory system backed by pure-Go SQLite (modernc.org/sqlite) with WAL mode, FTS5 full-text search, and OpenAI text-embedding-3-small vector embeddings
Implement hybrid retrieval combining vector cosine similarity and FTS5 BM25 ranking via reciprocal rank fusion (RRF)
Replace the checkContextBudget() stub with real rolling summarization that compacts conversation history when context budget is exceeded
Add memory_search (ReadOnly) and memory_save (SideEffecting) tools for LLM-driven memory operations
Wire memory DB, embedding backend, session store, and conversation persistence into the CLI run command
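For readers unfamiliar with reciprocal rank fusion, the sketch below shows the general technique on two ranked ID lists. It is not the code in `internal/memory/retrieval.go`: the function name `fuseRRF`, the constant `k = 60`, and the string IDs are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the RRF smoothing constant. 60 is a commonly used default;
// the value used by this PR is not stated, so this is an assumption.
const rrfK = 60.0

// fuseRRF merges several ranked ID lists. Each ID scores the sum of
// 1/(k + rank) over every list it appears in, with rank starting at 1.
func fuseRRF(lists ...[]string) []string {
	scores := make(map[string]float64)
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (rrfK + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break
	})
	return ids
}

func main() {
	vectorHits := []string{"a", "b", "c"} // e.g. ordered by cosine similarity
	ftsHits := []string{"b", "d", "a"}    // e.g. ordered by BM25
	fmt.Println(fuseRRF(vectorHits, ftsHits)) // prints [b a d c]
}
```

Because RRF only looks at ranks, the cosine and BM25 scores never need to be normalized onto a common scale.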

New files

| File | Purpose |
| --- | --- |
| `internal/memory/sqlite.go` | DB connection, WAL mode, schema migration (6 tables) |
| `internal/memory/embedding.go` | Embedding backend factory |
| `internal/memory/embedding_openai.go` | OpenAI embedder via `openai-go` SDK |
| `internal/memory/retrieval.go` | Vector search, FTS5, cosine similarity, RRF, binary encoding |
| `internal/memory/store.go` | `MemoryRetrieval` implementation |
| `internal/memory/session_store.go` | `SessionStore` CRUD backed by SQLite |
| `internal/memory/memory_test.go` | 11 test cases |
| `internal/tool/memory_search.go` | `memory_search` tool |
| `internal/tool/memory_save.go` | `memory_save` tool |

Design decisions

Pure Go SQLite — no CGO required, FTS5 included out of the box
Brute-force cosine similarity — fast enough for MVP scale (<10k chunks)
Graceful degradation — missing OPENAI_API_KEY falls back to FTS-only; memory init failure continues without memory
Rolling summarization — uses the same LLM provider to generate summaries when context budget is hit
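The brute-force approach can be illustrated with a minimal sketch: embeddings are stored as byte blobs, decoded, and compared one by one. The helper names (`encodeVector`, `decodeVector`, `cosine`) and the little-endian `float32` layout are assumptions; the PR only states that `retrieval.go` contains cosine similarity and a binary encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeVector packs float32 values into a byte slice suitable for a BLOB
// column. Little-endian layout is an assumption made for this sketch.
func encodeVector(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(f))
	}
	return buf
}

// decodeVector reverses encodeVector.
func decodeVector(b []byte) []float32 {
	v := make([]float32, len(b)/4)
	for i := range v {
		v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[i*4:]))
	}
	return v
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length, are empty, or either has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	query := []float32{1, 0}
	stored := decodeVector(encodeVector([]float32{0, 1}))
	fmt.Println(cosine(query, query), cosine(query, stored)) // prints 1 0
}
```

A linear scan like this costs one dot product per stored chunk, which is why it is only described as adequate for the stated MVP scale.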

Test plan

go build ./... passes
go vet ./... passes
go test ./internal/memory/ -race -count=1 — 11/11 tests pass
go test ./... -race -count=1 — all packages pass
E2E: yantra run "Remember my favorite language is Go" then yantra run "What is my favorite language?" (requires API keys)
Verify .yantra/memory.db is created with correct schema

🤖 Generated with Claude Code

…eval

Implement Step 5 of the architecture: persistent memory backed by SQLite (pure Go, no CGO) with OpenAI embeddings and hybrid vector+FTS retrieval using reciprocal rank fusion. Replace the context budget stub with real rolling summarization.

New packages/files:
- internal/memory/: SQLite DB layer, OpenAI embedding backend, memory store (MemoryRetrieval), session store, vector/FTS retrieval with RRF
- internal/tool/memory_search.go: memory_search tool (ReadOnly)
- internal/tool/memory_save.go: memory_save tool (SideEffecting)

Modified:
- RegisterBuiltins now accepts optional MemoryRetrieval for memory tools
- AgentRuntime gains SetMemory() for persistent conversation history and real summarization when context budget is exceeded
- Session gains CompactWithSummary() for context window compaction
- CLI wires memory DB, embedder, session store, and tools

11 new tests covering all memory operations, all passing with -race.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

qodo-code-review · 2026-03-02T09:00:28Z

Review Summary by Qodo

Add persistent memory with SQLite, embeddings, and hybrid retrieval

✨ Enhancement

Walkthroughs

Description

• Implement persistent memory system with SQLite, OpenAI embeddings, and hybrid retrieval
• Add rolling summarization to compact conversation history when context budget exceeded
• Introduce memory_search (ReadOnly) and memory_save (SideEffecting) LLM tools
• Wire memory DB, embedder, session store into CLI and runtime with graceful degradation

Diagram

flowchart LR
  A["CLI run command"] -->|opens| B["SQLite DB"]
  A -->|creates| C["OpenAI Embedder"]
  A -->|initializes| D["Memory Store"]
  A -->|creates| E["Session"]
  D -->|hybrid search| F["Vector + FTS Results"]
  F -->|RRF fusion| G["Ranked Chunks"]
  H["Agent Runtime"] -->|stores events| D
  H -->|checks budget| I["Summarization"]
  I -->|compacts| J["Session with Summary"]
  K["memory_search tool"] -->|queries| D
  L["memory_save tool"] -->|persists| D

File Changes

1. internal/memory/sqlite.go ✨ Enhancement +115/-0

SQLite database layer with WAL mode and schema

internal/memory/sqlite.go

2. internal/memory/embedding.go ✨ Enhancement +24/-0

Embedding backend factory with graceful degradation

internal/memory/embedding.go

3. internal/memory/embedding_openai.go ✨ Enhancement +76/-0

OpenAI embedder implementation using openai-go SDK

internal/memory/embedding_openai.go

4. internal/memory/retrieval.go ✨ Enhancement +214/-0

Vector search, FTS5, cosine similarity, and RRF fusion

internal/memory/retrieval.go

5. internal/memory/store.go ✨ Enhancement +312/-0

MemoryRetrieval implementation with hybrid query support

internal/memory/store.go

6. internal/memory/session_store.go ✨ Enhancement +132/-0

SessionStore CRUD operations backed by SQLite

internal/memory/session_store.go

7. internal/memory/memory_test.go 🧪 Tests +420/-0

Comprehensive test suite for memory operations

internal/memory/memory_test.go

8. internal/tool/memory_search.go ✨ Enhancement +75/-0

Memory search tool for LLM-driven retrieval

internal/tool/memory_search.go

9. internal/tool/memory_save.go ✨ Enhancement +62/-0

Memory save tool for LLM-driven persistence

internal/tool/memory_save.go

10. internal/runtime/runtime.go ✨ Enhancement +165/-10

Integrate memory, session persistence, and rolling summarization

internal/runtime/runtime.go

11. internal/runtime/session.go ✨ Enhancement +24/-0

Add CompactWithSummary method for context window compaction

internal/runtime/session.go

12. internal/tool/builtin.go ✨ Enhancement +5/-1

Register memory tools when MemoryRetrieval is available

internal/tool/builtin.go

13. internal/tool/builtin_test.go 🧪 Tests +1/-1

Update RegisterBuiltins test to pass nil memory parameter

internal/tool/builtin_test.go

14. cmd/yantra/main.go ✨ Enhancement +50/-3

Wire memory DB, embedder, and session into CLI run command

cmd/yantra/main.go

15. go.mod Dependencies +12/-2

Add modernc.org/sqlite and openai-go dependencies

go.mod

16. go.sum Dependencies +53/-4

Update checksums for new SQLite and related dependencies

go.sum

qodo-code-review · 2026-03-02T09:00:29Z

Code Review by Qodo

🐞 Bugs (6) 📘 Rule violations (0) 📎 Requirement gaps (0)

1. ~~FTS rank query broken~~ ☑ 🐞 Bug ✓ Correctness

Description

ftsSearch selects/orders by rank, but the FTS table schema doesn’t define a rank column. This
can cause FTS-only memory retrieval to fail at runtime and hybrid retrieval to silently degrade.

Code

internal/memory/retrieval.go[R66-75]

+// ftsSearch performs a full-text search using SQLite FTS5.
+func ftsSearch(ctx context.Context, db *DB, query string, topN int) ([]scoredChunk, error) {
+	rows, err := db.conn.QueryContext(ctx,
+		`SELECT f.id, f.content, c.source, c.tags, rank
+		 FROM chunks_fts f
+		 JOIN chunks c ON c.id = f.id
+		 WHERE chunks_fts MATCH ?
+		 ORDER BY rank
+		 LIMIT ?`, query, topN)
+	if err != nil {

Evidence

The FTS table is created with only id and content columns, but the retrieval SQL selects rank
and orders by it, without defining it in the query.

internal/memory/sqlite.go[68-72]
internal/memory/retrieval.go[66-75]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ftsSearch` queries a `rank` value that is not defined by the FTS schema in this repo. This can break FTS-only retrieval and cause hybrid retrieval to silently drop FTS results.
### Issue Context
- The schema creates `chunks_fts` with only `id` and `content` columns.
- The retrieval query selects `rank` and orders by it.
### Fix Focus Areas
- internal/memory/sqlite.go[68-72]
- internal/memory/retrieval.go[66-75]

2. ~~Summarization ignores TurnTimeout~~ ☑ 🐞 Bug ⛯ Reliability

Description

Rolling summarization calls the provider after the per-turn context is cancelled and uses the parent
context, so it is not bounded by TurnTimeout. A slow/hung summarization call can stall the whole
run.

Code

internal/runtime/runtime.go[R127-139]

  	turnCancel()
  	for _, msg := range toolMsgs {
  		session.Append(msg)
+			r.persistEvent(ctx, msg)
  	}

  	// Check if the parent context was cancelled during tool dispatch.
  	if ctx.Err() != nil {
  		return nil, types.ErrCancelled
  	}

-		r.checkContextBudget(session)
+		r.checkContextBudget(ctx, session, progress)
  }

Evidence

The per-turn context (turnCtx) is explicitly cancelled before checkContextBudget is invoked, and
checkContextBudget performs a provider call using the parent ctx, bypassing the intended
per-turn timeout protection.

internal/runtime/runtime.go[90-139]
internal/runtime/runtime.go[419-431]
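One possible remediation is to give the summarization call its own deadline derived from the parent context. The sketch below assumes that approach; `summarizeWithTimeout`, `summarizeFn`, and `slowProvider` are hypothetical names, and the real provider interface in this repository may look different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// summarizeFn stands in for the provider call made during summarization.
// The signature is hypothetical.
type summarizeFn func(ctx context.Context) (string, error)

// summarizeWithTimeout bounds a summarization call with its own deadline,
// so a hung provider cannot stall the run after the turn context is gone.
func summarizeWithTimeout(parent context.Context, timeout time.Duration, summarize summarizeFn) (string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return summarize(ctx)
}

// slowProvider simulates a provider that takes far longer than the budget
// but does honor context cancellation.
func slowProvider(ctx context.Context) (string, error) {
	select {
	case <-time.After(5 * time.Second):
		return "summary", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// demoTimedOut reports whether the bounded call ended with a deadline error.
func demoTimedOut() bool {
	_, err := summarizeWithTimeout(context.Background(), 20*time.Millisecond, slowProvider)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("timed out:", demoTimedOut()) // prints "timed out: true"
}
```

The same pattern would apply to any work that runs after `turnCancel()`, provided the callee actually observes its context.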

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Summarization is executed outside the per-turn timeout budget. This can cause `yantra run` to hang even when `TurnTimeout` is configured.
### Issue Context
- `turnCancel()` is called before `checkContextBudget()`.
- `checkContextBudget()` calls `provider.Complete()` with the parent context.
### Fix Focus Areas
- internal/runtime/runtime.go[90-139]
- internal/runtime/runtime.go[419-431]

3. ~~Persistence ignores TurnTimeout~~ ☑ 🐞 Bug ⛯ Reliability

Description

Conversation persistence uses the parent context instead of the per-turn context, so DB stalls/locks
can block the run beyond the configured turn timeout. This affects every message persisted
(user/assistant/tool).

Code

internal/runtime/runtime.go[R144-151]

+// persistEvent stores a message to conversation history if memory is configured.
+func (r *AgentRuntime) persistEvent(ctx context.Context, msg types.Message) {
+	if r.memory == nil || r.sessionID == "" {
+		return
+	}
+	if err := r.memory.StoreConversationEvent(ctx, r.sessionID, msg); err != nil {
+		slog.Warn("failed to persist conversation event", "error", err)
+	}

Evidence

persistEvent accepts a context and is always invoked with the parent ctx in the turn loop, so
persistence is not bounded by the per-turn timeout even though it performs DB writes.

internal/runtime/runtime.go[88-114]
internal/runtime/runtime.go[144-151]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Conversation persistence bypasses the per-turn timeout and can block the run if SQLite is busy/locked.
### Issue Context
- `persistEvent` is called with the parent `ctx`.
- It performs DB writes via `StoreConversationEvent`.
### Fix Focus Areas
- internal/runtime/runtime.go[88-114]
- internal/runtime/runtime.go[144-151]

4. ~~Summary/scratchpad errors swallowed~~ ☑ 🐞 Bug ⛯ Reliability

Description

GetSummary and GetScratchpad treat any DB/JSON error as “no data” and return nil/empty state.
This hides real failures (corruption, I/O, unexpected schema issues) and makes memory problems hard
to diagnose.

Code

internal/memory/store.go[R159-196]

+// GetSummary returns the rolling summary for a session.
+func (s *Store) GetSummary(ctx context.Context, sessionID string) (*types.SessionSummary, error) {
+	var summary string
+	var epoch int64
+	err := s.db.conn.QueryRowContext(ctx,
+		`SELECT summary, epoch FROM session_summaries WHERE session_id = ?`, sessionID).
+		Scan(&summary, &epoch)
+	if err != nil {
+		return nil, nil // no summary yet
+	}
+	return &types.SessionSummary{Summary: summary, Epoch: epoch}, nil
+}
+
+// SetSummary updates the rolling summary for a session.
+func (s *Store) SetSummary(ctx context.Context, sessionID string, summary types.SessionSummary) error {
+	_, err := s.db.conn.ExecContext(ctx,
+		`INSERT INTO session_summaries (session_id, summary, epoch) VALUES (?, ?, ?)
+		 ON CONFLICT(session_id) DO UPDATE SET summary = excluded.summary, epoch = excluded.epoch`,
+		sessionID, summary.Summary, summary.Epoch)
+	if err != nil {
+		return &types.MemoryError{Op: "set_summary", Message: "upsert", Err: err}
+	}
+	return nil
+}
+
+// GetScratchpad returns the scratchpad state for a session.
+func (s *Store) GetScratchpad(ctx context.Context, sessionID string) (*types.ScratchpadState, error) {
+	var data string
+	err := s.db.conn.QueryRowContext(ctx,
+		`SELECT data FROM scratchpads WHERE session_id = ?`, sessionID).
+		Scan(&data)
+	if err != nil {
+		return &types.ScratchpadState{Data: make(map[string]string)}, nil
+	}
+	var state types.ScratchpadState
+	if err := json.Unmarshal([]byte(data), &state); err != nil {
+		return &types.ScratchpadState{Data: make(map[string]string)}, nil
+	}

Evidence

Both methods return success values on all errors instead of distinguishing expected no rows from
unexpected failures, preventing callers and logs from seeing actual DB issues.

internal/memory/store.go[159-168]
internal/memory/store.go[185-196]
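The distinction the reviewer asks for can be sketched as follows: only `sql.ErrNoRows` is treated as "no data", and every other error is wrapped and returned. The `summary` type and the `finishLookup` helper are illustrative stand-ins, not types from this repository.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// summary is a stand-in for the session summary type (illustrative only).
type summary struct {
	Text  string
	Epoch int64
}

// finishLookup interprets the error returned by a row Scan.
// A missing row is an expected outcome; anything else is a real failure.
func finishLookup(s summary, scanErr error) (*summary, error) {
	switch {
	case scanErr == nil:
		return &s, nil
	case errors.Is(scanErr, sql.ErrNoRows):
		return nil, nil // no summary stored yet
	default:
		return nil, fmt.Errorf("get_summary: %w", scanErr)
	}
}

// demo exercises both error paths without needing a database.
func demo() (missingIsNil, failureSurfaced bool) {
	s, err := finishLookup(summary{}, sql.ErrNoRows)
	missingIsNil = s == nil && err == nil
	_, err = finishLookup(summary{}, errors.New("disk I/O error"))
	failureSurfaced = err != nil
	return
}

func main() {
	fmt.Println(demo()) // prints "true true"
}
```

In a real store method, `scanErr` would be the result of `QueryRowContext(...).Scan(...)`, and a JSON decode failure would be returned as an error in the same way.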

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Memory read APIs currently hide real failures by returning nil/empty values for all errors.
### Issue Context
This can mask DB corruption, permission issues, or unexpected schema problems, making memory behavior silently incorrect.
### Fix Focus Areas
- internal/memory/store.go[159-168]
- internal/memory/store.go[185-196]

5. ~~Event persistence not atomic~~ ☑ 🐞 Bug ⛯ Reliability

Description

StoreConversationEvent inserts the event and updates session counters as two independent
statements without a transaction. Partial failures can leave conversation history and
message_count/updated_at inconsistent.

Code

internal/memory/store.go[R230-244]

+	_, err := s.db.conn.ExecContext(ctx,
+		`INSERT INTO conversation_events (session_id, role, content, tool_calls, tool_call_id, tool_name)
+		 VALUES (?, ?, ?, ?, ?, ?)`,
+		sessionID, string(msg.Role), msg.Content, toolCallsJSON, msg.ToolCallID, msg.ToolName)
+	if err != nil {
+		return &types.MemoryError{Op: "store_event", Message: "insert", Err: err}
+	}
+
+	// Bump session message count.
+	_, err = s.db.conn.ExecContext(ctx,
+		`UPDATE sessions SET message_count = message_count + 1, updated_at = datetime('now') WHERE id = ?`,
+		sessionID)
+	if err != nil {
+		return &types.MemoryError{Op: "store_event", Message: "update session count", Err: err}
+	}

Evidence

A failure after the insert but before updating the session leaves the DB in a partially-updated
state; the runtime currently logs and continues, so inconsistency may persist unnoticed.

internal/memory/store.go[230-244]
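A transactional version might look like the sketch below. It is written against a small `txn` interface (an assumption made so the example runs without a SQLite driver) that `*sql.Tx` satisfies; the `storeEvent` name and the reduced column list are also illustrative.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// txn is the subset of *sql.Tx used below (hypothetical interface).
type txn interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
	Commit() error
	Rollback() error
}

// storeEvent inserts the event and bumps the session counter atomically:
// any failure rolls back both statements.
func storeEvent(ctx context.Context, tx txn, sessionID, role, content string) (err error) {
	defer func() {
		if err != nil {
			_ = tx.Rollback() // best effort; the original error is returned
		}
	}()
	if _, err = tx.ExecContext(ctx,
		`INSERT INTO conversation_events (session_id, role, content) VALUES (?, ?, ?)`,
		sessionID, role, content); err != nil {
		return fmt.Errorf("insert event: %w", err)
	}
	if _, err = tx.ExecContext(ctx,
		`UPDATE sessions SET message_count = message_count + 1, updated_at = datetime('now') WHERE id = ?`,
		sessionID); err != nil {
		return fmt.Errorf("update session: %w", err)
	}
	return tx.Commit()
}

// fakeTx records calls and can fail on the Nth statement (1-based).
type fakeTx struct {
	failOn, execs         int
	committed, rolledBack bool
}

func (f *fakeTx) ExecContext(context.Context, string, ...any) (sql.Result, error) {
	f.execs++
	if f.execs == f.failOn {
		return nil, errors.New("simulated failure")
	}
	return nil, nil
}
func (f *fakeTx) Commit() error   { f.committed = true; return nil }
func (f *fakeTx) Rollback() error { f.rolledBack = true; return nil }

// demoRollsBack reports whether a failing UPDATE triggers a rollback.
func demoRollsBack() bool {
	f := &fakeTx{failOn: 2}
	err := storeEvent(context.Background(), f, "s1", "user", "hi")
	return err != nil && f.rolledBack && !f.committed
}

// demoCommits reports whether the happy path commits without rollback.
func demoCommits() bool {
	f := &fakeTx{}
	err := storeEvent(context.Background(), f, "s1", "user", "hi")
	return err == nil && f.committed && !f.rolledBack
}

func main() {
	fmt.Println(demoRollsBack(), demoCommits()) // prints "true true"
}
```

With a real database, the caller would obtain the transaction from `db.BeginTx(ctx, nil)` and pass the resulting `*sql.Tx` as `tx`.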

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Conversation event insert and session counter update are not transactional, so the DB can become inconsistent on partial failure.
### Issue Context
The runtime persists events frequently; transactional integrity here reduces hard-to-debug state drift.
### Fix Focus Areas
- internal/memory/store.go[219-245]

6. ~~rand.Read errors ignored~~ ☑ 🐞 Bug ⛯ Reliability

Description

Chunk/session ID generators ignore crypto/rand.Read errors. While rare, failures can produce
weak/unknown randomness and increase collision risk or make debugging ID issues difficult.

Code

internal/memory/store.go[R305-309]

+// generateID creates a random hex ID.
+func generateID() string {
+	b := make([]byte, 12)
+	rand.Read(b)
+	return hex.EncodeToString(b)

Evidence

Both ID generation helpers discard the error return from rand.Read, so callers cannot detect and
react to entropy failures.

internal/memory/store.go[305-309]
internal/memory/session_store.go[126-129]
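A minimal sketch of an ID generator that surfaces the error is shown below. Changing the signature to return `(string, error)` is an assumption about how the fix could be shaped; callers would need to be updated accordingly.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateID creates a random 24-character hex ID and reports any failure
// to read from the system's entropy source instead of discarding it.
func generateID() (string, error) {
	b := make([]byte, 12)
	if _, err := rand.Read(b); err != nil {
		return "", fmt.Errorf("generate id: %w", err)
	}
	return hex.EncodeToString(b), nil
}

func main() {
	id, err := generateID()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(id), id)
}
```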

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
ID generation ignores entropy read errors.
### Issue Context
Even if rare, surfacing the error makes the system more diagnosable and avoids silent weak-ID generation.
### Fix Focus Areas
- internal/memory/store.go[305-310]
- internal/memory/session_store.go[126-130]

qodo-code-review Bot reviewed Mar 2, 2026

Comment thread internal/memory/retrieval.go

Comment thread internal/runtime/runtime.go Outdated

Comment thread internal/runtime/runtime.go

Use this:

ca9cfda

hackertron merged commit fb6a7d3 into main Mar 3, 2026

hackertron deleted the feat/memory-sqlite-embeddings-hybrid-retrieval branch March 3, 2026 07:32

hackertron merged 2 commits into main from feat/memory-sqlite-embeddings-hybrid-retrieval