Python: accept AG-UI state data URI parameters#6905
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the Python AG-UI client’s state extraction to accept standards-valid, parameterized JSON data URIs (e.g. including charset=utf-8) while still requiring application/json and base64 encoding, matching the broader data-URI parsing behavior elsewhere in the framework.
Changes:
- Parse the data URI metadata segment (instead of matching a single hard-coded prefix) to support media type parameters.
- Keep AG-UI state extraction constrained to
application/json+base64. - Add a regression test covering
data:application/json;charset=utf-8;base64,...state extraction.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/packages/ag-ui/agent_framework_ag_ui/_client.py | Broadened AG-UI state data-URI parsing to accept parameterized JSON data URIs. |
| python/packages/ag-ui/tests/ag_ui/test_ag_ui_client.py | Added regression coverage for parameterized JSON data URIs (charset=utf-8). |
| if prefix.startswith("data:") and media_type == "application/json" and "base64" in parameters: | ||
| import base64 | ||
|
|
||
| encoded_data = uri.split(",", 1)[1] # type: ignore[union-attr] | ||
| decoded_bytes = base64.b64decode(encoded_data) | ||
| state = json.loads(decoded_bytes.decode("utf-8")) |
There was a problem hiding this comment.
Addressed in 829a628 by validating base64 with base64.b64decode(..., validate=True) and catching binascii.Error in the existing warning/fallthrough path. I also added a malformed-base64 regression test that preserves the original message list and returns state is None.
Validation run locally:
uv run pytest packages/ag-ui/tests/ag_ui/test_ag_ui_client.py -quv run ruff check packages/ag-ui/agent_framework_ag_ui/_client.py packages/ag-ui/tests/ag_ui/test_ag_ui_client.pygit diff --check -- python/packages/ag-ui/agent_framework_ag_ui/_client.py python/packages/ag-ui/tests/ag_ui/test_ag_ui_client.py
|
@copilot-pull-request-reviewer I addressed the review feedback in 829a628 by switching to strict base64 validation, catching Local validation completed:
Could you please re-review when you get a chance? |
Motivation & Context
What Problem This Solves
AG-UI state extraction already gates state carrier content through
Content.from_uri(...),content.type == "data", andcontent.media_type == "application/json". However, after those structured checks passed, the extractor still re-checked the raw URI string against one exact prefix:That made the raw URI metadata stricter than the parsed content model. A valid JSON data URI with media type parameters, such as:
can still parse as data content with
media_type == "application/json", but previously failed state extraction because the raw metadata segment did not exactly match the hard-coded prefix.The mismatch meant a final AG-UI message could carry valid JSON/base64 state, pass the structured content gates, and still leave
stateunset. The fix stays narrow: accept parameterizedapplication/jsondata URIs only when the metadata still includesbase64, then decode and parse that payload as JSON rather than broadening state extraction to arbitrary URI or non-JSON content.Changes
This PR parses the data URI metadata segment instead of matching one exact string prefix.
The state extraction path remains intentionally narrow:
application/json.base64marker.The reviewer follow-up tightened malformed payload behavior by using
base64.b64decode(..., validate=True)and catchingbinascii.Errorin the existing warning/fallthrough path.Evidence
Regression coverage was added for a parameterized JSON state URI:
The test verifies that this form now populates the AG-UI
statefield and removes the state carrier message from the returned message list, matching the existing non-parameterized JSON/base64 behavior.Additional regression coverage was added for malformed base64:
That test verifies malformed base64 does not become state and the original messages are preserved.
This evidence is limited to the focused AG-UI regression tests in
python/packages/ag-ui/tests/ag_ui/test_ag_ui_client.py; it does not claim a full repository CI or typecheck run.Possible call chain / impact
The main impact is that AG-UI state extraction now accepts JSON/base64 state data URIs with media type parameters such as
charset=utf-8. Non-JSON data content, missingbase64metadata, invalid base64, and invalid JSON continue to avoid becoming AG-UI state through the fallback path.This should be a narrow parsing fix for valid JSON/base64 data URI metadata rather than a broad expansion of what can be treated as AG-UI state.
Description & Review Guide
What are the major changes?
application/jsonand base64 encoded.charset=utf-8.What is the impact of these changes?
statefield and are removed from the message list like the existing non-parameterized form.What do you want reviewers to focus on?
Related Issue
Fixes #6902
Contribution Checklist
breaking changelabel (or add "[BREAKING]" to the title prefix, before or after any language prefix) - a workflow keeps the label and title prefix in sync automatically.