Skip to content

Python: Fix per-service-call history persistence with server-storing clients#6310

Open
eavanvalkenburg wants to merge 2 commits into
microsoft:mainfrom
eavanvalkenburg:fix/5798-per-service-call-history-persistence
Open

Python: Fix per-service-call history persistence with server-storing clients#6310
eavanvalkenburg wants to merge 2 commits into
microsoft:mainfrom
eavanvalkenburg:fix/5798-per-service-call-history-persistence

Conversation

@eavanvalkenburg
Copy link
Copy Markdown
Member

Motivation and Context

When an Agent was configured with require_per_service_call_history_persistence=True together with a HistoryProvider, and the underlying chat client stored history server-side by default (e.g. OpenAIChatClient, where STORES_BY_DEFAULT=True), the external history provider was silently never persisted. The per-service-call middleware was skipped because the service was assumed to own history, and the once-per-run path also skipped the provider — so neither persisted.

Fixes #5798

Description

Unify persistence on the per-service-call middleware. When require_per_service_call_history_persistence=True and a HistoryProvider exists, the PerServiceCallHistoryPersistingMiddleware is now always installed and owns persistence. service_stores_history only selects how the middleware behaves, never whether it persists:

  • Service does not store: the middleware loads providers and drives the function loop with a local sentinel conversation id.
  • Service stores: the middleware skips loading (the service owns history) and persists each service call while the real conversation id flows through. A warning is logged for any provider with load_messages=True whose load is bypassed.

The observable contract: with the flag on, persistence happens per service call — in a function-call → final-completion run, the function-call turn is persisted before the second call starts.

Rationalize chat-options handling in _prepare_run_context:

  • _merge_options now skips None overrides and strips remaining None values in a single pass, so an unset store is never forwarded to the client and the service decides its own default (STORES_BY_DEFAULT is only an internal behavior hint).
  • store and conversation_id are resolved once from a single combined view (effective_options) instead of probing both the agent-default and runtime dicts separately. The InMemoryHistoryProvider auto-injection and the per-service-call resolution now agree on conversation_id (an agent-level default is honored consistently).

Tests are added to test_agents.py as a scenario matrix (sync + streaming) that asserts the per-service-call persistence timing across storing/non-storing clients and store overrides.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

When an Agent set require_per_service_call_history_persistence=True together
with a HistoryProvider, and the chat client stored history server-side by
default (e.g. OpenAIChatClient, STORES_BY_DEFAULT=True), the external history
provider was silently never persisted.

Unify persistence on the per-service-call middleware: when the flag is set and
a HistoryProvider exists, the middleware is always installed and owns
persistence. service_stores_history now only selects middleware behavior:
- service does not store: load providers and drive the function loop with a
  local sentinel conversation id, or
- service stores: skip loading (the service owns history) and persist each
  service call while the real conversation id flows through.

Also rationalize chat-options handling in _prepare_run_context:
- _merge_options now skips None overrides and strips remaining None values, so
  an unset `store` is never forwarded and the service decides its own default.
- Resolve `store` and `conversation_id` once from a single combined view
  (effective_options) instead of probing both default and runtime dicts; the
  auto-injection and per-service-call resolution now agree on conversation_id.

Fixes microsoft#5798

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 3, 2026 14:40
@moonbox3 moonbox3 added the python label Jun 3, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a Python SDK bug where require_per_service_call_history_persistence=True combined with an external HistoryProvider could silently skip external persistence when the underlying chat client stores history server-side by default (e.g., OpenAI clients with STORES_BY_DEFAULT=True). The fix centralizes history persistence responsibility in the per-service-call middleware and rationalizes option-merging so unset values (notably store=None) are not forwarded to clients.

Changes:

  • Always installs PerServiceCallHistoryPersistingMiddleware when per-service-call persistence is required and a HistoryProvider is present; service-side storage now only changes how the middleware behaves (load+persist vs persist-only).
  • Updates _prepare_run_context to resolve store/conversation_id from a single merged options view and to treat None as “unset” (not forwarded).
  • Adds a scenario-matrix test suite validating persistence timing across storing/non-storing clients, streaming/non-streaming runs, and store overrides.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
python/packages/core/agent_framework/_agents.py Unifies option resolution and ensures per-service-call persistence is owned by middleware in both local and service-managed cases, with warning logging when load is bypassed.
python/packages/core/agent_framework/_sessions.py Extends per-service-call middleware to support a “service stores history” mode (persist-only; no provider load; no local sentinel behavior).
python/packages/core/agent_framework/_clients.py Updates as_agent() docstring to describe the per-service-call persistence behavior (note: one doc line currently contradicts implementation).
python/packages/core/tests/core/test_agents.py Adds regression tests and a scenario matrix asserting per-service-call persistence timing and store=None non-forwarding.

Comment thread python/packages/core/agent_framework/_clients.py Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _agents.py4185287%465, 477, 532, 813, 1028, 1073, 1146–1150, 1246, 1274, 1311, 1333, 1353–1354, 1359, 1406, 1448, 1474, 1476, 1489, 1540, 1542, 1551–1556, 1561, 1563, 1569–1570, 1577, 1579–1580, 1588–1589, 1592–1594, 1604–1609, 1613, 1618, 1620
   _clients.py137794%325, 376, 532–535, 652
   _sessions.py3913391%102–104, 106–107, 124–125, 127–129, 206–207, 297, 558–562, 611, 617, 651, 710, 714, 724, 857, 873, 1006, 1020–1021, 1044, 1066, 1076, 1118
TOTAL37833442188% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
7543 34 💤 0 ❌ 0 🔥 1m 57s ⏱️

…ce per run

Address PR review: when the client stores history server-side, the
per-service-call middleware still persists after each model call; only
provider loading is skipped. The previous "persist once per run()" wording
contradicted the implementation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 84% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach


Automated review by eavanvalkenburg's agents

Comment on lines +696 to +703
require_per_service_call_history_persistence: When True (and a HistoryProvider is
present), the provider always persists history via per-service-call middleware,
regardless of whether the client stores history server-side. If the client does
not store history, the middleware also loads providers around each model call and
drives the function loop with a local conversation; if it does, loading is skipped
(the service-managed conversation is the source of truth) and the middleware only
persists. A warning is logged for providers with ``load_messages=True`` when
loading is skipped because service-side storage is active.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this becomes sort of confusing. How about when this is true and there isn't a HistoryProvider?

Comment on lines +1232 to +1236
"HistoryProvider '%s' has load_messages=True but the chat client stores history "
"server-side; skipping local history load and relying on the service-managed "
"conversation. Set store=False to load from the provider, or load_messages=False "
"to silence this warning.",
provider.source_id,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not throw an exception at agent initialization?

) -> ChatResponse:
"""Persist a model response and apply the local follow-up sentinel when needed."""
if response.conversation_id is not None and not is_local_history_conversation_id(response.conversation_id):
if (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when service_stores_history is True but the client returns conversation_id=None? This guard is skipped, provider loading was already skipped (_sessions.py _prepare_service_call_context), and session.service_session_id only gets set when conversation_id is truthy (_agents.py:1400). So we persist this turn to the provider but the next run skips loading again with no service id to resume from, dropping cross-turn history silently with no error and no warning. Is a storing client guaranteed to always echo a conversation_id, or should we assert/warn in the storing branch when it comes back empty so the assumption can't fail quietly?

# Without service-side storage the middleware persists locally and drives the function
# loop with a local sentinel, which cannot be reconciled with an existing service-managed
# conversation. When the service stores history, an existing conversation id is expected.
if conversation_id is not None and not service_stores_history:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we covering the allow side of this boundary? The PR adds and not service_stores_history, so flag-on + storing client + an existing conversation_id now resumes instead of raising. Tests cover the raise side (store=False + conversation_id), but I couldn't find one for storing + conversation_id asserting it does NOT raise and that the id propagates to service_session_id. A regression dropping the and not service_stores_history qualifier would wrongly raise on the resume-a-stored-conversation path and nothing would catch it. Worth a test?

# persists each service call while the real conversation id flows through.
# In the service-managed case loading is skipped, so warn for providers that expect to load.
history_providers = self._get_history_providers()
if self.require_per_service_call_history_persistence and history_providers and service_stores_history:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sits in _prepare_run_context, which runs once per run(), so a long-lived agent looping turns against a storing client with a load_messages=True provider re-logs the same WARNING every turn. Could that train users to tune it out? Wondering if we should gate it to once per session/agent (or first run only) so it stays a signal rather than per-turn noise.


if conversation_id is not None:
# A live service-managed session id takes precedence over the resolved conversation id.
if session and session.service_session_id:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we covering a second run on the same session in storing mode? This precedence branch only fires once service_session_id is populated (run 2+), and every per-service-call test does a single run. A two-run test asserting persistence keeps happening, service_session_id stays stable, and loading stays skipped would lock down this path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: OpenAI store=True can silently bypass external HistoryProvider persistence

4 participants