Context
We would like to broadly support agentic data, such as working memory, conversation history, derived facts, embeddings of past interactions, durable task state, and audit trails of decisions and actions, as native runtime properties.
Today these are buildable on tables, but there is no opinionated primitive. Every team writing an agent on Harper would invent its own conversation schema, compaction policy, and recall logic. This issue proposes ConversationResource as the standard primitive, used by:
GenerateOpts.conversationId binding.
Data model (three auto-created tables)
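The acceptance criteria name three auto-created tables (conversations, turns, conversation_facts), but the schema itself is not reproduced here. As a rough sketch only, the row shapes might look like the following. Field names marked "(source)" appear elsewhere in this proposal; every other name, every type, and all enum values other than "pending" are assumptions made for illustration.

```typescript
// Hypothetical row shapes for the three tables. Not the actual schema.

interface Conversation {
  id: string;
  summary: string;              // assumed column holding the rolling summary
  summarized_through: number;   // (source) name; numeric turn position is assumed
  ttl?: number;                 // (source) name; unit is assumed to be seconds
  metadata: {
    retention?: "audit-only" | "full" | "ephemeral"; // (source)
  };
}

interface Turn {
  id: string;
  conversation_id: string;      // assumed foreign key
  seq: number;                  // assumed ordering column
  actor: string;                // (source)
  role: string;                 // (source)
  content: string;              // (source)
  status: "pending" | "committed" | "aborted"; // only "pending" is from the source
  tokens?: number;              // (source)
  model?: string;               // (source)
  superseded_by?: string;       // (source)
  embedding?: number[];         // assumed storage for the turn embedding
}

interface ConversationFact {
  id: string;
  conversation_id: string;      // assumed foreign key
  fact: string;                 // assumed column for the extracted fact text
  source_turn_id?: string;      // assumed back-reference to the originating turn
}

// A freshly opened streaming turn, before any content has arrived.
const exampleTurn: Turn = {
  id: "t1",
  conversation_id: "c1",
  seq: 1,
  actor: "agent-a",
  role: "assistant",
  content: "",
  status: "pending",
};
```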
API
buildContext() is what scope.models.generate() calls when conversationId is set. It:
Pulls the last N turns verbatim.
Includes the rolling summary for older turns.
Optionally injects semantically-recalled turns and facts via vector search.
Stays under tokenBudget.
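The four steps above can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the proposed implementation: the input and output shapes, the per-turn token counts, and the priority order when the budget is tight (recent turns first, then the summary, then recalled turns) are all hypothetical.

```typescript
interface Turn { seq: number; actor: string; role: string; content: string; tokens: number }

interface ContextInput {
  turns: Turn[];         // full transcript, oldest first
  summary: string;       // rolling summary covering older turns
  summaryTokens: number; // token cost of the summary
  recalled: Turn[];      // vector-search hits, best first (may be empty)
  lastN: number;
  tokenBudget: number;
}

interface Context { summary: string | null; recalled: Turn[]; recent: Turn[]; tokens: number }

function buildContext(input: ContextInput): Context {
  let remaining = input.tokenBudget;

  // 1. Last N turns verbatim. Walk newest to oldest so the newest survive a tight budget.
  const tail = input.lastN > 0 ? input.turns.slice(-input.lastN) : [];
  const recent: Turn[] = [];
  for (let i = tail.length - 1; i >= 0; i--) {
    if (tail[i].tokens > remaining) break;
    recent.unshift(tail[i]);
    remaining -= tail[i].tokens;
  }

  // 2. Rolling summary, only if older turns exist and the summary fits.
  let summary: string | null = null;
  const hasOlder = input.turns.length > recent.length;
  if (hasOlder && input.summary !== "" && input.summaryTokens <= remaining) {
    summary = input.summary;
    remaining -= input.summaryTokens;
  }

  // 3. Semantically recalled turns, skipping duplicates of the recent window.
  const recentSeqs = new Set(recent.map((t) => t.seq));
  const recalled: Turn[] = [];
  for (const t of input.recalled) {
    if (recentSeqs.has(t.seq) || t.tokens > remaining) continue;
    recalled.push(t);
    remaining -= t.tokens;
  }

  // 4. The running `remaining` counter keeps the total under tokenBudget.
  return { summary, recalled, recent, tokens: input.tokenBudget - remaining };
}
```

A participant filter for per-agent views could be applied to `turns` before this function runs; it is left out here to keep the sketch short.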
Compaction
Default policy: when a conversation exceeds 50 turns or 8K cumulative content tokens, summarize the oldest unsummarized half and advance summarized_through. Old turns stay in the table for audit; buildContext() uses the summary instead. The policy is configurable per conversation.
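The trigger and the "oldest unsummarized half" selection might be expressed as below. This is a sketch under assumptions: the thresholds are counted over unsummarized turns only (otherwise a long conversation would compact on every append), "8K" is read as 8000 tokens, summarized_through is treated as a turn sequence number, and the half is rounded down with a minimum of one turn. None of these details are specified in the proposal, and `planCompaction` is a hypothetical helper name.

```typescript
interface CompactionPolicy { maxTurns: number; maxTokens: number }

// Defaults taken from the policy text; 8K is assumed to mean 8000.
const DEFAULT_POLICY: CompactionPolicy = { maxTurns: 50, maxTokens: 8000 };

interface TurnInfo { seq: number; tokens: number }

interface CompactionPlan {
  toSummarize: number[];      // seq numbers of the turns to fold into the summary
  summarizedThrough: number;  // new value for summarized_through
}

// Returns a plan when compaction is due, or null when the conversation is under both limits.
function planCompaction(
  turns: TurnInfo[],
  summarizedThrough: number,
  policy: CompactionPolicy = DEFAULT_POLICY,
): CompactionPlan | null {
  // Only turns after the current watermark count toward the thresholds (assumption).
  const pending = turns
    .filter((t) => t.seq > summarizedThrough)
    .sort((a, b) => a.seq - b.seq);
  const totalTokens = pending.reduce((sum, t) => sum + t.tokens, 0);

  if (pending.length <= policy.maxTurns && totalTokens <= policy.maxTokens) {
    return null;
  }

  // Oldest half of the unsummarized turns, at least one.
  const half = pending.slice(0, Math.max(1, Math.floor(pending.length / 2)));
  return {
    toSummarize: half.map((t) => t.seq),
    summarizedThrough: half[half.length - 1].seq,
  };
}
```

The turns listed in `toSummarize` are not deleted; per the policy they stay in the table for audit and only the watermark moves.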
Multi-actor / multi-agent
Multiple agents and humans append to the same conversation; each turn carries actor + role. buildContext() can filter by participant for per-agent views of a shared transcript. Agent-to-agent coordination is two agents writing to a shared conversation; live updates flow through the existing subscription mechanism on the conversation id, with no new infrastructure.
Integration points
GenerateOpts.conversationId → auto-append + buildContext().
@export: ConversationResource is auto-exposed via REST and MCP.
MCP (HarperFast/mcp-server): conversations are MCP resources; turns are MCP messages; recall() is an MCP tool.
Audit: every turn is in auditStore via the standard transaction log — audit trails come free.
Replication: conversations replicate across Fabric like any table; works for edge/disconnected scenarios.
Subscriptions: transactionBroadcast on conversations.id gives live-updating UIs for free.
Privacy and data minimization
Tenant boundary enforced by existing RBAC on the table.
ttl triggers hard delete via a sweeper job.
extractFacts() can run with a redaction-prompt step to strip PII before persisting.
Per-conversation metadata.retention = 'audit-only' | 'full' | 'ephemeral' controls whether turn content is retained or only hashes after summarization.
Streaming-append protocol
For generateStream:
appendOpen() creates a pending turn row with empty content.
Each chunk updates the row's content (single update transaction batched with a debounce, default 250ms — subscribers see live updates).
appendCommit() finalizes status, tokens, model.
Abort → superseded_by is set, row remains for audit.
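The open / chunk / commit / abort lifecycle could look roughly like this in-memory sketch. It is illustrative only: the class name, the "committed" and "aborted" status values, and the explicit `now` parameter (used instead of real timers so the behaviour is deterministic) are assumptions. The batching here is a simple time-based flush standing in for the debounce described above, and each flush represents one update transaction.

```typescript
type TurnStatus = "pending" | "committed" | "aborted"; // only "pending" is from the proposal

interface TurnRow {
  id: string;
  content: string;
  status: TurnStatus;
  tokens?: number;
  model?: string;
  superseded_by?: string;
}

class StreamingAppend {
  private buffer = "";
  private lastFlush: number;
  writes = 0; // number of update transactions issued so far

  private constructor(readonly row: TurnRow, private debounceMs: number, now: number) {
    this.lastFlush = now;
  }

  // appendOpen(): create a pending row with empty content.
  static open(id: string, now: number, debounceMs = 250): StreamingAppend {
    return new StreamingAppend({ id, content: "", status: "pending" }, debounceMs, now);
  }

  // Buffer the chunk; write at most once per debounce window.
  chunk(text: string, now: number): void {
    this.buffer += text;
    if (now - this.lastFlush >= this.debounceMs) this.flush(now);
  }

  // appendCommit(): write any remaining buffer, then finalize status, tokens, model.
  commit(tokens: number, model: string, now: number): void {
    this.flush(now);
    this.row.status = "committed";
    this.row.tokens = tokens;
    this.row.model = model;
  }

  // Abort: keep the partial row for audit and point it at its replacement.
  abort(replacementId: string, now: number): void {
    this.flush(now);
    this.row.status = "aborted";
    this.row.superseded_by = replacementId;
  }

  private flush(now: number): void {
    if (this.buffer === "") return;
    this.row.content += this.buffer; // stands in for a single update transaction
    this.buffer = "";
    this.lastFlush = now;
    this.writes++;
  }
}
```

With the default 250ms window, chunks arriving at 100ms and 260ms after open produce one write at 260ms containing both, which is the point at which subscribers would see the update.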
Open decisions
Vector index on turn embeddings: global with conversation-id filter, or per-conversation? Suggested: global with filter; simpler and faster to start.
Fact extraction: opt-in cost (default off)? Suggested: yes — off by default, enabled per conversation in metadata.
Edits/regenerations: in-place mutation or superseded_by chain? Suggested: superseded_by chain — preserves audit, mirrors git-style history.
Cross-conversation memory: do we need a global "user memory" layer? Suggested: defer — let apps query across conversations via the standard vector index for v1.
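If the superseded_by chain is chosen for edits and regenerations, readers need a way to find the current version of a turn. A minimal sketch of that lookup follows; `resolveLatest` and the `TurnRef` shape are hypothetical, and the handling of dangling pointers and cycles is an assumption rather than anything the proposal specifies.

```typescript
interface TurnRef { id: string; superseded_by?: string }

// Follow superseded_by links from `id` to the newest version of the turn.
function resolveLatest(turns: Map<string, TurnRef>, id: string): TurnRef {
  const start = turns.get(id);
  if (!start) throw new Error(`unknown turn ${id}`);

  let current: TurnRef = start;
  const seen = new Set<string>();
  while (current.superseded_by !== undefined) {
    // Guard against a corrupted chain that loops back on itself.
    if (seen.has(current.id)) throw new Error(`cycle in superseded_by chain at ${current.id}`);
    seen.add(current.id);

    const next = turns.get(current.superseded_by);
    if (!next) break; // dangling pointer: treat the current row as the head (assumption)
    current = next;
  }
  return current;
}
```

Because older rows are never mutated apart from gaining a pointer, the full edit history remains available for audit by walking the same chain.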
Dependencies
@embed schema directive (from the model-access API, "Add unified model-access API (scope.models)" #510): turn embeddings require it.
summarize() / extractFacts() / semantic recall to use the configured embedding/generative backends. Could ship with hardcoded calls and refactor when #510 lands; cleaner to land in order.
Related work
ConversationResource as its store.
Acceptance
ConversationResource base class lands in core alongside Resource.
Three system tables (conversations, turns, conversation_facts) auto-create.
append, history, recall, summarize, compact, buildContext implemented.
@export mechanism.
GenerateOpts.conversationId binding works end-to-end with the model-access API (#510).
auditStore and replicate to followers.
Out of scope