Add shared long-term memory server (experimental)#5015
Draft
Introduces the core domain types for ToolHive's shared long-term memory system: MemoryEntry, MemoryRevision, typed constants for MemoryType, AuthorType, SourceType, EntryStatus, and ArchiveReason, plus filter and result types used by the store interface. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces pkg/memory with three pluggable interfaces (Store, VectorStore, Embedder), a Service orchestration layer with conflict detection and score-weighted search ranking, SQLite-backed implementations, an Ollama embedder, and gomock mocks for all interfaces.
Key behaviours:
- Conflict detection on write: cosine similarity > 0.85 blocks the write and returns conflicting entries for the agent to resolve
- Trust scoring: author weight × age decay × correction penalty × flag multiplier
- Staleness scoring: access age + flag bonus + correction bonus
- Search ranking: composite score (similarity × trust × staleness penalty) so flagged/stale entries do not rank above fresh, trusted ones
- TypeEpisodic memory type for time-indexed event records
- ListFilter time-range fields (CreatedAfter/CreatedBefore) for timeline queries
- SQLite migration 002 widens the type CHECK constraint to include episodic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
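The conflict and ranking rules from this commit can be sketched as below. This is a minimal illustration, not the code in pkg/memory/service.go: the `Candidate` type and the function names are hypothetical, all scores are assumed to lie in [0, 1], and only the 0.85 threshold and the 0.3 staleness weight are taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// These two constants mirror the initial values quoted in this PR.
// Everything else in this file is a hypothetical illustration.
const (
	conflictThreshold = 0.85
	stalenessWeight   = 0.3
)

// Candidate is a hypothetical search hit: the similarity reported by the
// vector store plus the entry's stored trust and staleness scores.
type Candidate struct {
	ID         string
	Similarity float64
	Trust      float64
	Staleness  float64
}

// CompositeScore applies similarity × trust × (1 − 0.3 × staleness).
func CompositeScore(c Candidate) float64 {
	return c.Similarity * c.Trust * (1 - stalenessWeight*c.Staleness)
}

// Rank returns a copy of cs sorted by descending composite score.
func Rank(cs []Candidate) []Candidate {
	out := append([]Candidate(nil), cs...)
	sort.SliceStable(out, func(i, j int) bool {
		return CompositeScore(out[i]) > CompositeScore(out[j])
	})
	return out
}

// Conflicts returns the candidates whose similarity is strictly above the
// conflict threshold, i.e. the entries that would block a write.
func Conflicts(cs []Candidate) []Candidate {
	var out []Candidate
	for _, c := range cs {
		if c.Similarity > conflictThreshold {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	hits := []Candidate{
		{ID: "stale-flagged", Similarity: 0.95, Trust: 0.4, Staleness: 0.9},
		{ID: "fresh-trusted", Similarity: 0.80, Trust: 0.9, Staleness: 0.1},
	}
	for _, c := range Rank(hits) {
		fmt.Printf("%s %.4f\n", c.ID, CompositeScore(c))
	}
	fmt.Println("conflicting entries:", len(Conflicts(hits)))
}
```

In this example the entry with the higher raw similarity ranks second because its low trust and high staleness pull its composite score down, which is the behaviour the commit message describes.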
Standalone MCP server exposing 9 memory tools over streamable HTTP (/mcp endpoint, /health liveness probe). Wires SQLite store and vector store, Ollama embedder, and a background lifecycle job that runs every 24h to expire TTL'd entries and recompute trust/staleness scores. Tools: memory_remember, memory_search, memory_recall, memory_forget, memory_update, memory_flag, memory_list, memory_consolidate, memory_crystallize. Config via memory-server.yaml with defaults (SQLite + sqlite-vec + Ollama on localhost:11434, listening on 0.0.0.0:8080). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers architecture, MCP tool surface, trust/staleness scoring, conflict detection, Skills relationship, a comparison with LinkedIn's Cognitive Memory Agent, and the recommended three-tier memory activation strategy (session-boundary injection, signal-based mid-session reads, write-on-observation). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation
Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5015 +/- ##
==========================================
- Coverage 69.02% 68.52% -0.51%
==========================================
Files 554 573 +19
Lines 73075 74128 +1053
==========================================
+ Hits 50443 50797 +354
- Misses 19620 20254 +634
- Partials 3012 3077 +65
Summary
ToolHive manages MCPs (tools) and Skills (procedural knowledge as OCI artifacts). The missing primitive is shared long-term memory: a knowledge store that agents can query and contribute to across sessions. Without it, every agent session starts cold, and facts learned in one session are invisible to others.
This PR introduces the memory server core (Plan 1 of 3):
- `pkg/memory/`: domain types (`Entry`, `Revision`, `ListFilter`), three pluggable interfaces (`Store`, `VectorStore`, `Embedder`), a `Service` orchestration layer with conflict detection and score-weighted search ranking, trust/staleness scoring formulas, and gomock mocks
- `pkg/memory/sqlite/`: SQLite-backed `Store` and `VectorStore` (Go-native cosine similarity, no CGo dependency); goose migrations including a `TypeEpisodic` type for time-indexed event records
- `pkg/memory/embedder/ollama/`: Ollama HTTP embedder that probes vector dimensions on startup
- `cmd/thv-memory/`: standalone MCP server binary serving 9 tools over streamable HTTP (`/mcp`), with a `/health` liveness probe, YAML config with sensible defaults, and a background lifecycle job (TTL expiry, score recomputation every 24h)
- `docs/proposals/2026-04-22-shared-memory-server.md`: full design doc covering architecture, tool surface, scoring, conflict detection, Skills relationship, comparison with LinkedIn's Cognitive Memory Agent, and the recommended three-tier memory activation strategy
Key design decisions:
- Search ranking uses `similarity × trust_score × (1 − 0.3 × staleness_score)` so flagged/stale entries don't rank above fresh, trusted ones
- Memory types: `semantic` (aggregated facts), `procedural` (how-to), `episodic` (time-indexed events with `CreatedAfter`/`CreatedBefore` list filters)
Plans 2 (CLI `thv memory` subcommand + system workload integration) and 3 (Kubernetes `MCPMemoryServer` CRD) are follow-up work.
Type of change
Test plan
- `task test`
- `task lint-fix`
- `cmd/thv-memory/integration_test.go` wires real SQLite store + vector store + fake embedder end-to-end: remember → search → access count increment → delete → `ErrNotFound`; conflict detection test verifies force-write path
Changes
| File | Description |
| --- | --- |
| `pkg/memory/types.go` | `Entry`, `Revision`, `ListFilter` (with time-range fields), `VectorFilter`, `Type` (semantic/procedural/episodic), scoring types |
| `pkg/memory/interfaces.go` | `Store`, `VectorStore`, `Embedder` interfaces + mockgen directives |
| `pkg/memory/service.go` | `Service`: conflict detection, `Remember`, `Search` with composite ranking |
| `pkg/memory/scoring.go` | `ComputeTrustScore`, `ComputeStalenessScore` |
| `pkg/memory/sqlite/` | |
| `pkg/memory/embedder/ollama/` | |
| `pkg/memory/mocks/` | |
| `cmd/thv-memory/main.go` | |
| `cmd/thv-memory/server.go` | `/health` |
| `cmd/thv-memory/config.go` | `0.0.0.0:8080` |
| `cmd/thv-memory/lifecycle/job.go` | |
| `cmd/thv-memory/tools/` | |
| `cmd/thv-memory/integration_test.go` | |
| `docs/proposals/2026-04-22-shared-memory-server.md` | |

Does this introduce a user-facing change?
No. This adds a new standalone binary (`cmd/thv-memory`) and supporting packages. Nothing in the existing CLI or operator is modified. The binary is not yet wired into `thv` commands (that is Plan 2).
Implementation plan
Approved implementation plan
This PR was planned and implemented with Claude Code. The design spec is at `docs/proposals/2026-04-22-shared-memory-server.md`. The implementation follows the spec with the following notable adaptations:
- `Entry` not `MemoryEntry`, `Store` not `MemoryStore`, to satisfy the `revive` linter
- `goose.NewProvider` (scoped) used instead of global `goose.SetBaseFS`/`SetDialect` to avoid concurrent-open races
- `server.NewStreamableHTTPServer` + `server.WithStdioContextFunc` used to match actual mcp-go v0.48.0 API
- `VectorStore` interface for datasets > 100K entries
Special notes for reviewers
This is experimental: do not merge until Plans 2 and 3 are ready. Specific areas to scrutinise:
- `pkg/memory/sqlite/vector.go`: the load-all-and-score approach works for small datasets but will not scale past ~100K entries. The `VectorStore` interface is designed to be swapped for Qdrant/pgvector when needed.
- `pkg/memory/service.go`: the conflict threshold (0.85) and staleness penalty weight (0.3) are initial values; they will need tuning against real usage data.
- `cmd/thv-memory/server.go`: no auth middleware on the MCP endpoint yet. Auth will be enforced at the ToolHive proxy layer when the system workload integration lands in Plan 2.
Generated with Claude Code
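For reviewers weighing the scaling note on `pkg/memory/sqlite/vector.go`, a load-all-and-score search can be sketched as follows. This is an illustrative stand-in rather than the code in this PR: the `Scored` type, the `Cosine` and `TopK` names, and the in-memory map used in place of SQLite rows are all assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Scored pairs a hypothetical entry ID with its similarity to the query.
type Scored struct {
	ID    string
	Score float64
}

// Cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ, the vectors are empty, or either vector has zero norm.
func Cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK scores every stored vector against the query and keeps the k best.
// Each search costs O(n·d) for n vectors of dimension d, which is why this
// approach is only suitable for small datasets.
func TopK(query []float32, stored map[string][]float32, k int) []Scored {
	if k < 0 {
		k = 0
	}
	out := make([]Scored, 0, len(stored))
	for id, v := range stored {
		out = append(out, Scored{ID: id, Score: Cosine(query, v)})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID // deterministic tie-break
	})
	if k < len(out) {
		out = out[:k]
	}
	return out
}

func main() {
	stored := map[string][]float32{
		"a": {1, 0},
		"b": {0, 1},
		"c": {1, 1},
	}
	for _, s := range TopK([]float32{1, 0}, stored, 2) {
		fmt.Printf("%s %.4f\n", s.ID, s.Score)
	}
}
```

Because every query touches every stored vector, cost grows linearly with the number of entries, which matches the concern about datasets beyond roughly 100K entries and the rationale for keeping `VectorStore` swappable.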