Python: Core: add experimental memory harness context provider #5613
eavanvalkenburg merged 3 commits into microsoft:main
Pull request overview
This PR adds an experimental long-term memory context provider to the Python core package’s harness surface. It introduces a file-backed memory store plus topic/index record types so agents can persist durable memories, search transcript history, and run extraction/consolidation flows as part of the context provider pipeline.
Changes:
- Add `MemoryContextProvider`, `MemoryStore`, `MemoryFileStore`, and related record/constant types under the experimental harness surface.
- Implement filesystem-backed topic/index/state/transcript storage plus tool hooks for listing, reading, writing, deleting, searching, and consolidating memory.
- Export the new symbols publicly, add the `HARNESS` experimental feature enum value, and add tests covering the new provider/store behavior.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `python/packages/core/agent_framework/_harness/_memory.py` | Implements the new memory provider, record types, abstract store, and file-backed store/tooling. |
| `python/packages/core/tests/core/test_harness_memory.py` | Adds coverage for record serialization, file-store behavior, provider context injection, tools, consolidation, and experimental metadata. |
| `python/packages/core/agent_framework/_feature_stage.py` | Adds the new `HARNESS` experimental feature flag. |
| `python/packages/core/agent_framework/__init__.py` | Re-exports the new memory harness public APIs from the package root. |
| `python/packages/core/agent_framework/_harness/__init__.py` | Adds the harness package marker for the new namespace. |
Automated Code Review
Reviewers: 2 | Confidence: 92%
✓ Correctness
This PR adds a memory harness system (MemoryIndexEntry, MemoryTopicRecord, MemoryStore, MemoryFileStore, MemoryContextProvider) with transcript-backed extraction, consolidation, and topic management. The code is well-structured and correctly uses the existing framework interfaces (HistoryProvider, SessionContext, FileHistoryProvider, SupportsChatGetResponse). I verified the signatures and behaviors of all referenced APIs against the source. The async tool handling, message grouping, index rebuilding, extraction/consolidation pipelines, and test assertions are all correct. No correctness bugs were found.
✗ Design Approach
The overall direction is promising, but the current design couples durable-memory context injection to the `HistoryProvider` hook in a way that does not compose with the framework’s existing per-service-call history mode. There is also a narrower namespace assumption in transcript search that makes the new `source_id` configurability only partially real. I would request changes before merging because the main provider abstraction is wired into the wrong lifecycle.
Flagged Issues
- `MemoryContextProvider` inherits from `HistoryProvider` and does much more than history loading: its `before_run` adds tools, instructions, selected topic files, and recent transcript context. But the agent explicitly skips `HistoryProvider.before_run` during `require_per_service_call_history_persistence` (`python/packages/core/agent_framework/_agents.py:1421-1425`), and the per-service-call middleware only copies `service_call_context.get_messages(include_input=True)` back into the chat call (`python/packages/core/agent_framework/_sessions.py:611-617`, `:687`). That means this provider silently loses its memory tools/instructions in a supported agent mode. The better approach is to make the memory harness a plain `ContextProvider` that composes with an internal history/archive helper, instead of inheriting the history-provider lifecycle.
Automated review by eavanvalkenburg's agents
Adds MemoryContextProvider with topic-indexed long-term memory and chat-driven compaction. Pluggable MemoryStore backends include MemoryFileStore. Public types: MemoryIndexEntry, MemoryTopicRecord. Behind @experimental(ExperimentalFeature.HARNESS). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- mark MemoryStore as @experimental(HARNESS) for surface consistency
- safely encode owner id and verify path containment (matches FileHistoryProvider pattern)
- namespace MemoryFileStore on-disk layout by source_id to avoid cross-provider collisions
- before_run computes index_entries once and only rewrites MEMORY.md when content changes
- asyncio locks around topic/state read-modify-write to avoid concurrent-write races

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
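The owner-id encoding and path-containment check from the commit above can be sketched roughly as follows. This is an illustrative sketch only, not the code in `_memory.py`: the helper names `encode_owner_id` and `resolve_owner_dir` are hypothetical, and URL-safe base64 is an assumed encoding (the PR only says the id is "safely encoded").

```python
import base64
from pathlib import Path


def encode_owner_id(owner_id: str) -> str:
    """Hypothetical helper: map an arbitrary owner id to a filesystem-safe name."""
    if not owner_id:
        raise ValueError("owner_id must be non-empty")
    # URL-safe base64 only emits A-Z, a-z, 0-9, '-' and '_' (plus '=' padding,
    # which is stripped here), so the result can never contain '/' or '..'.
    raw = base64.urlsafe_b64encode(owner_id.encode("utf-8"))
    return raw.decode("ascii").rstrip("=")


def resolve_owner_dir(base_dir: Path, source_id: str, owner_id: str) -> Path:
    """Hypothetical helper: build <base>/<source_id>/<owner> and verify containment."""
    root = base_dir.resolve()
    candidate = (root / source_id / encode_owner_id(owner_id)).resolve()
    # Containment check: after resolving '..' segments and symlinks, the final
    # path must still live under the store root (Path.is_relative_to, Python 3.9+).
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes the memory root: {candidate}")
    return candidate
```

Encoding the owner id handles hostile ids, while the containment check is a second line of defence for any other path segment (here `source_id`) that is not encoded.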
- Atomic writes via os.replace + temp sibling for topic, state, and index files so crashes/disk-full failures cannot leave a truncated half-written file.
- Stop creating directories on read paths: list_topics/read_state/search_transcripts and get_messages return empty when nothing has been written. mkdir is deferred to the actual save path (write_topic/write_state/save_messages).
- Escape lines that look like markdown headings on render and unescape them on parse, so a memory or summary containing '## Summary'/'## Memories' cannot tamper with the topic file structure.
- Narrow extraction/consolidation chat-client failure handling to ChatClientException, asyncio.TimeoutError, and OSError. Programmer errors (AttributeError, TypeError, ...) now propagate so misconfigured clients fail loudly.
- Log a payload-prefix preview for every silent shape branch in _extract_memories and _consolidate_topic so unparsable extractor output is debuggable instead of invisible.
- Restructure _run_consolidation: read maintenance state and topic snapshot under the state lock, run the LLM consolidation loop without holding the state lock, and only advance last_consolidated_at/sessions_since_consolidation if at least one topic succeeded. Transient consolidation failures now leave the maintenance window in place so the next after_run retries instead of silently sliding forward.
- Add regression tests for: markdown-marker round-trip, atomic-write recovery on os.replace failure, no-mkdir on pure read paths, transient consolidation failure preserves state, and propagation of programmer errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
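Two of the hardening steps above, the atomic write and the heading escape, can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's implementation: the function names are hypothetical, the backslash escape scheme is assumed (the commit does not say which marker is used), and the parser is assumed to recognise headings only at column zero.

```python
import os
import re
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Hypothetical helper: readers see either the old file or the new one."""
    # Directories are created only here, on the save path.
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file is a sibling of the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic swap; the old content survives any earlier failure
    except BaseException:
        try:
            os.unlink(tmp_name)  # drop the partial temp file
        except FileNotFoundError:
            pass
        raise


def escape_heading_lines(text: str) -> str:
    """Prefix one backslash to lines that are '#...' or already-escaped '\\#...'."""
    lines = text.split("\n")
    return "\n".join("\\" + ln if re.match(r"\\*#", ln) else ln for ln in lines)


def unescape_heading_lines(text: str) -> str:
    """Inverse of escape_heading_lines: strip one leading backslash before '#'."""
    lines = text.split("\n")
    return "\n".join(ln[1:] if re.match(r"\\+#", ln) else ln for ln in lines)
```

Escaping lines that already begin with backslashes followed by `#` is what keeps the mapping reversible: a memory that literally contains `\# note` still round-trips unchanged.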
Motivation and Context
Part of the experimental Agent Harness feature; the .NET counterpart work shipped in PR #5310 (.NET: Harness Feature branch) and follow-ups #5404, #5365, #5540.
Unlike sibling PRs #5611 (mode) and #5612 (todo), the Python `MemoryContextProvider` does not mirror a single .NET class one-to-one. It is a distinct take on long-term memory designed for chat-driven, multi-session agents. The closest .NET cousins are:
- `dotnet/src/Microsoft.Agents.AI/Harness/FileMemory/FileMemoryProvider.cs`: session-scoped file working memory; the agent uses `SaveFile`/`ReadFile`/etc. to manage its own file bag.
- `dotnet/src/Microsoft.Agents.AI/Memory/ChatHistoryMemoryProvider.cs`: derives memory from chat history.

Description
Adds `MemoryContextProvider` to the experimental `_harness` namespace: an LLM-managed long-term memory with topic indexing and chat-driven extraction / consolidation. Memory is materialized on disk as a top-level `MEMORY.md` index plus per-topic markdown files, plus a state file for bookkeeping.

Public types:
- `MemoryContextProvider`: the context provider
- `MemoryStore`: abstract backend
- `MemoryFileStore`: JSONL/markdown-on-disk backend
- `MemoryIndexEntry`, `MemoryTopicRecord`: record schemas
- `DEFAULT_MEMORY_SOURCE_ID`

All new public symbols are decorated with `@experimental(ExperimentalFeature.HARNESS)`. If a sibling harness PR has not yet landed, this PR also adds the `HARNESS` value to the `ExperimentalFeature` enum and creates the (empty) `_harness/__init__.py`.

Relationship to .NET
| .NET | Python |
|---|---|
| `FileMemoryProvider` (session-scoped file bag, agent-driven `SaveFile`/`ReadFile`) | |
| `ChatHistoryMemoryProvider` (history-derived) | `MemoryContextProvider` (history-derived, topic-indexed, on-disk index + topic files, LLM extraction & consolidation) |
| `AgentFileStore` (pluggable file backend) | `MemoryStore` / `MemoryFileStore` (pluggable memory backend) |

The Python design choices (topic index, `MEMORY.md`, per-topic files, configurable extraction / consolidation prompts) are intentional and tuned for chat-first, multi-session agents. They are not meant to subsume the .NET FileMemory file-bag pattern; the two can coexist as siblings.
This PR is one of three splitting the `_harness` package work apart for review:
- `AgentModeProvider`
- `TodoProvider`
- `FileMemoryProvider`; closer in spirit to .NET `ChatHistoryMemoryProvider` but with a richer on-disk topic index

The three modules are independent. They share only the `HARNESS` enum entry and the (empty) `_harness/__init__.py`. Mechanical merge conflicts in `__init__.py` `__all__` and `_feature_stage.py` are expected if a sibling lands first.

Contribution Checklist
@experimental(ExperimentalFeature.HARNESS).