feat: add history threading and memory-retriever registry spec by aswhitehouse · Pull Request #77 · prime-vector/open-agent-spec

aswhitehouse · 2026-04-13T04:24:29Z

OAS is stateless — it never stores conversation history. This PR implements two clean memory patterns that respect that boundary:

History threading (short-term memory)

history is now a reserved input convention: an array of prior turns in OpenAI wire format [{"role": "user"|"assistant", "content": "…"}]
Python runner extracts history from input_data and passes it through the provider stack; injected between system prompt and current user message at the HTTP payload level
npm runner (OpenAI + Anthropic providers) receives the same treatment
All provider signatures updated: base.IntelligenceProvider.invoke, OpenAIProvider, AnthropicProvider, CodexProvider, CustomProvider, invoke_intelligence registry function
11 new tests in tests/test_history_threading.py; all existing test fake_invoke stubs updated to accept the optional history parameter

prime-vector/memory-retriever registry spec (long-term memory)

New registry spec at Website/public/registry/prime-vector/memory-retriever/
Accepts query, memory_store_url, top_k; calls external store; returns {history: [...turns], memory_count} ready for depends_on chaining
Added to Website/public/registry/index.json manifest

New examples

examples/chat-agent/: spec.yaml + chat.py REPL for session history
examples/memory-chat/: pipeline.yaml chaining memory-retriever → chat via depends_on, plus README with mock memory-store script

docs/REFERENCE.md: new Memory management section covering both patterns, boundary table (what OAS does NOT do), and links to examples

What changed

Why

How tested

All existing tests pass (pytest tests/)
New tests added (if applicable)

Type of change

Breaking changes

Checklist

Code follows project style (ruff check . && ruff format --check .)
Self-review completed
Version bumped if needed (pyproject.toml, templates, docs)

OAS is stateless — it never stores conversation history. This PR implements two clean memory patterns that respect that boundary: History threading (short-term memory) - `history` is now a reserved input convention: an array of prior turns in OpenAI wire format [{"role": "user"|"assistant", "content": "…"}] - Python runner extracts `history` from input_data and passes it through the provider stack; injected between system prompt and current user message at the HTTP payload level - npm runner (OpenAI + Anthropic providers) receives the same treatment - All provider signatures updated: base.IntelligenceProvider.invoke, OpenAIProvider, AnthropicProvider, CodexProvider, CustomProvider, invoke_intelligence registry function - 11 new tests in tests/test_history_threading.py; all existing test fake_invoke stubs updated to accept the optional history parameter prime-vector/memory-retriever registry spec (long-term memory) - New registry spec at Website/public/registry/prime-vector/memory-retriever/ - Accepts query, memory_store_url, top_k; calls external store; returns {history: [...turns], memory_count} ready for depends_on chaining - Added to Website/public/registry/index.json manifest New examples - examples/chat-agent/: spec.yaml + chat.py REPL for session history - examples/memory-chat/: pipeline.yaml chaining memory-retriever → chat via depends_on, plus README with mock memory-store script docs/REFERENCE.md: new Memory management section covering both patterns, boundary table (what OAS does NOT do), and links to examples Made-with: Cursor

vercel · 2026-04-13T04:24:34Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
open-agent-spec	Ready	Preview, Comment	Apr 13, 2026 6:09am

… blocks

sgriffiths · 2026-04-13T04:54:22Z

Code review

Found 2 issues:

History silently dropped when tools are active — history is extracted from input_data on line 673 but only forwarded in the no-tools branch (invoke_intelligence on line 682). The tools branch calls _invoke_with_tools on line 678 without passing history, so any task that declares tools will lose all conversation context. The _invoke_with_tools function signature has no history parameter and its internal message list starts without prior turns.

open-agent-spec/oas_cli/runner.py

Lines 672 to 682 in 06c7a4e

    
           # history is a reserved input convention — never stored by OAS, just forwarded. 
        
           history: list[dict] | None = input_data.get("history") or None 
        
           try: 
        
               tools = resolve_task_tools(spec_data, task_name) 
        
               if tools: 
        
                   raw_output = _invoke_with_tools( 
        
                       system, user, tools, intelligence_config, task_name 
        
                   ) 
        
               else: 
        
                   raw_output = invoke_intelligence(system, user, intelligence_config, history)

memory-retriever spec references template variables that can never be populated — The user prompt in the registry spec uses {memory_store_results} and {top_k_effective}, but neither appears in the task's input.properties (which only defines query, memory_store_url, top_k). OAS template substitution only replaces variables present in input_data — there is no mechanism to make an HTTP call to the memory store or compute intermediate values during prompt rendering. The prompt will be sent to the model with literal unfilled placeholders, making the spec non-functional as published.

https://github.com/prime-vector/open-agent-spec/blob/06c7a4e041756407e08fd94f7d23c65fc9b752ad/Website/public/registry/prime-vector/memory-retriever/1.0.0/spec.yaml#L93-L116

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

Fixes two issues from code review: 1. History silently dropped when tools active (runner.py) _invoke_with_tools had no history parameter so conversation context was always lost for tool-using tasks. - Added history: list[dict] | None = None to _invoke_with_tools - Prior turns injected before the current user turn in the initial message list (mirroring the no-tools path) - Falls back path (text-only providers) also forwards history to invoke_intelligence - Call site in _run_single_task now passes history to both branches - 2 new regression tests in test_history_threading.py 2. memory-retriever spec referenced unfillable template variables {memory_store_results} and {top_k_effective} appeared in the user prompt but were not in input.properties; OAS has no mechanism to make HTTP calls during prompt rendering so the spec was non-functional. Redesigned as an LLM re-ranker: - Input now requires `candidates` (pre-fetched turns from caller's store) instead of performing retrieval itself - {candidates} and {top_k} are always in scope from input_data - Spec description updated to clarify the two-layer architecture: store fetch (application code) + re-rank (this spec) - examples/memory-chat/input.json updated with sample candidates - examples/memory-chat/README.md rewritten with correct architecture diagram and Python integration snippet - docs/REFERENCE.md Pattern 2 updated to match Made-with: Cursor

Made-with: Cursor

sgriffiths · 2026-04-13T05:49:18Z

            from .openai_http import OpenAIProvider

-            return OpenAIProvider().invoke(system=system, user=user, config=config)
+            return OpenAIProvider().invoke(


Minor one: CustomProvider._invoke_class has the same history gap you just fixed in _invoke_with_tools. invoke() accepts history and forwards it on the OpenAI fallback path (line 40-42), but the class-based path calls _invoke_class() on line 44 without it _invoke_class has no history parameter.
Lower impact since custom module providers are less common, but same pattern.

The OpenAI-fallback path in CustomProvider.invoke() already forwarded history correctly, but the class-based path called _invoke_class() without it — same pattern as the _invoke_with_tools gap fixed earlier. Custom router classes use a flat single-string prompt interface, so history turns are serialised and prepended between the system prompt and the current user turn: <system> User: <prior turn> Assistant: <prior turn> … <current user message> This preserves conversation context for custom providers without requiring any changes to the router class interface (run() signature is unchanged). Added two tests in TestCustomProviderHistory: - history_prepended_to_merged_prompt: verifies turns appear in the correct position within the merged prompt string - no_history_omits_history_section: verifies the prompt is unchanged when history is None Made-with: Cursor

sgriffiths

LGTM — all four review items addressed. History threading is now consistent across tools path, custom providers, and the memory-retriever spec.

aswhitehouse requested a review from sgriffiths as a code owner April 13, 2026 04:24

Fix Ruff linting issues by removing unused imports and sorting import…

06c7a4e

… blocks

vercel Bot deployed to Preview April 13, 2026 04:26 View deployment

vercel Bot deployed to Preview April 13, 2026 05:12 View deployment

Fix variable naming convention violations detected by Ruff.

1b6438f

vercel Bot deployed to Preview April 13, 2026 05:13 View deployment

fix: rename FAKE_TOOL to fake_tool to satisfy N806 naming lint

73056e5

Made-with: Cursor

vercel Bot deployed to Preview April 13, 2026 05:38 View deployment

sgriffiths reviewed Apr 13, 2026

View reviewed changes

vercel Bot deployed to Preview April 13, 2026 06:09 View deployment

sgriffiths approved these changes Apr 13, 2026

View reviewed changes

aswhitehouse merged commit f10950e into main Apr 13, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add history threading and memory-retriever registry spec#77

feat: add history threading and memory-retriever registry spec#77
aswhitehouse merged 6 commits into
mainfrom
feat/history-and-memory

aswhitehouse commented Apr 13, 2026

Uh oh!

vercel Bot commented Apr 13, 2026 •

edited

Loading

Uh oh!

sgriffiths commented Apr 13, 2026

Uh oh!

sgriffiths Apr 13, 2026

Uh oh!

sgriffiths left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

aswhitehouse commented Apr 13, 2026

What changed

Why

How tested

Type of change

Breaking changes

Checklist

Uh oh!

vercel Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sgriffiths commented Apr 13, 2026

Code review

Uh oh!

sgriffiths Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

sgriffiths left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vercel Bot commented Apr 13, 2026 •

edited

Loading