feat(sdk): unify structured prompt rendering by mmabrouk · Pull Request #4331 · Agenta-AI/agenta

mmabrouk · 2026-05-14T15:37:14Z

Summary

Implements WP-B2 for prompt runtime unification.

Adds agenta.sdk.utils.rendering with render_messages(...), render_json_like(...), and typed StructuredRenderingError.
Routes PromptTemplate.format(...) through the structured renderer while preserving TemplateFormatError for chat/completion callers.
Routes auto_ai_critique_v0(...) through the shared renderer for prompt messages and judge json_schema rendering.
Aligns judge Jinja failures to raise PromptFormattingV0Error instead of silently sending unrendered content to the LLM.
Adds pure renderer tests and call-site tests for PromptTemplate and LLM-as-a-judge.
Adds/updates WP-B2 design, QA, and status docs.
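
The error routing described above (one shared renderer error, translated at each call site into the exception that callers already expect) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: every class and function name below is a hypothetical stand-in, and the toy renderer uses `str.format_map` rather than the SDK's template engines.

```python
from typing import Callable, Mapping, Type


class SketchStructuredRenderingError(Exception):
    """Hypothetical stand-in for the shared renderer's typed error."""

    def __init__(self, path: str, cause: Exception) -> None:
        super().__init__(f"could not render {path}: {cause!r}")
        self.path = path
        self.cause = cause


class SketchTemplateFormatError(Exception):
    """Hypothetical stand-in for the error chat/completion callers catch."""


class SketchJudgeFormattingError(Exception):
    """Hypothetical stand-in for the error the judge path raises."""


def shared_render(template: str, variables: Mapping[str, str]) -> str:
    """Toy shared renderer: fails loudly instead of returning raw text."""
    try:
        return template.format_map(variables)
    except (KeyError, IndexError, ValueError) as exc:
        # The path is hard-coded here; a real renderer would track it.
        raise SketchStructuredRenderingError("messages[0].content", exc) from exc


def translate(error_type: Type[Exception]) -> Callable[[str, Mapping[str, str]], str]:
    """Build a call-site wrapper that re-raises as the caller's error type."""

    def call(template: str, variables: Mapping[str, str]) -> str:
        try:
            return shared_render(template, variables)
        except SketchStructuredRenderingError as exc:
            raise error_type(str(exc)) from exc

    return call


format_for_chat = translate(SketchTemplateFormatError)
format_for_judge = translate(SketchJudgeFormattingError)
```

Under these assumptions, a missing variable on the judge path surfaces as `SketchJudgeFormattingError` with the renderer error chained as `__cause__`, so the unrendered template never reaches the LLM call.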

Validation

cd sdks/python && uv run ruff format agenta/sdk/utils/rendering.py agenta/sdk/utils/types.py agenta/sdk/engines/running/handlers.py oss/tests/pytest/unit/test_structured_rendering.py oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py oss/tests/pytest/unit/test_prompt_template_extensions.py oss/tests/pytest/unit/test_jinja2_sandbox.py oss/tests/pytest/unit/test_render_template_helper.py
cd sdks/python && uv run ruff check --fix agenta/sdk/utils/rendering.py agenta/sdk/utils/types.py agenta/sdk/engines/running/handlers.py oss/tests/pytest/unit/test_structured_rendering.py oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py oss/tests/pytest/unit/test_prompt_template_extensions.py oss/tests/pytest/unit/test_jinja2_sandbox.py oss/tests/pytest/unit/test_render_template_helper.py
cd sdks/python && uv run pytest oss/tests/pytest/unit -q

Result: 411 passed, 3 warnings.

vercel · 2026-05-14T15:37:20Z

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	May 15, 2026 1:46pm

coderabbitai · 2026-05-14T15:37:49Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 88d425b0-e91f-4459-8a49-5d20a660b8dc

Walkthrough

This PR implements WP-B2 rendering unification by introducing a shared structured rendering layer (render_messages, render_json_like) that unifies prompt message and JSON-like value templating across completion/chat and LLM-as-a-judge paths, with comprehensive RFC/planning documentation, integration into PromptTemplate and handlers, and test coverage verifying correct behavior and error handling.
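
To make the JSON-like half of that layer concrete, here is a minimal sketch of recursive rendering with path tracking. It is an illustration under stated assumptions, not the module added in this PR: the names `render_json_like_sketch` and `SketchRenderingError` are hypothetical, the `{{name}}` substitution is a small regex rather than the SDK's template engines, and rendering mapping keys (which is what makes collisions possible) is assumed behavior.

```python
import re
from typing import Any, Mapping

_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


class SketchRenderingError(Exception):
    """Hypothetical error carrying the failing path and the original cause."""

    def __init__(self, path: str, cause: Exception) -> None:
        super().__init__(f"failed to render {path}: {cause!r}")
        self.path = path
        self.cause = cause


def _render_str(template: str, variables: Mapping[str, Any]) -> str:
    # Replace each {{name}}; a missing name raises KeyError.
    return _VAR.sub(lambda m: str(variables[m.group(1)]), template)


def render_json_like_sketch(value: Any, variables: Mapping[str, Any], path: str = "$") -> Any:
    """Return a rendered copy of value; the input is never mutated."""
    if isinstance(value, str):
        try:
            return _render_str(value, variables)
        except Exception as exc:
            raise SketchRenderingError(path, exc) from exc
    if isinstance(value, list):
        return [
            render_json_like_sketch(item, variables, f"{path}[{i}]")
            for i, item in enumerate(value)
        ]
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            key_path = f"{path}.{key}"
            new_key = (
                render_json_like_sketch(key, variables, key_path)
                if isinstance(key, str)
                else key
            )
            if new_key in out:
                # Two source keys rendered to the same name.
                raise SketchRenderingError(key_path, ValueError(f"key collision on {new_key!r}"))
            out[new_key] = render_json_like_sketch(item, variables, key_path)
        return out
    # Numbers, booleans, and None pass through unchanged.
    return value
```

With this sketch, rendering `{"a": ["{{missing}}"]}` with no variables raises an error whose `path` is `$.a[0]`, which is the kind of location detail a caller can turn into a user-facing message.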

Changes

WP-B2 Rendering Unification

Layer / File(s)	Summary
WP-B2 RFC, research, planning, QA documentation, and status tracking `docs/design/prompt-runtime-unification/README.md`, `docs/design/prompt-runtime-unification/wp-b2-rendering-unification/*`	RFC document, research documentation, design plan, QA strategy, status tracking, and workspace README comprehensively specify the WP-B2 structured rendering unification feature, its four-phase implementation plan, test strategy, validation goals, and implementation progress.
Structured rendering module for messages and JSON-like values `sdks/python/agenta/sdk/utils/rendering.py`	New `rendering.py` module with `render_messages(...)` and `render_json_like(...)` functions that render template strings within message content (text parts only, preserving non-text parts like images and audio) and recursively within JSON-like nested structures (lists/mappings), validating message shapes and detecting key collisions. Errors are wrapped as `StructuredRenderingError` with detailed paths and original error context.
PromptTemplate integration with structured renderers `sdks/python/agenta/sdk/utils/types.py`	`PromptTemplate.format(...)` delegates message rendering to `render_messages(...)` and response-format variable substitution to `render_json_like(...)`. New `_template_error_from_structured_error(...)` helper centralizes conversion of `StructuredRenderingError` to `TemplateFormatError`, with specialized messages for unresolved variables and Jinja2 failures.
Handler integration: _format_with_template and auto_ai_critique_v0 updates `sdks/python/agenta/sdk/engines/running/handlers.py`	`_format_with_template()` removes Jinja2-specific silent-failure behavior. `auto_ai_critique_v0()` removes early response-format initialization, switches prompt formatting to `render_messages(...)`, and renders `json_schema` via `render_json_like(...)` before building response_format, ensuring consistent error handling with completion/chat.
New test module for structured rendering behavior `sdks/python/oss/tests/pytest/unit/test_structured_rendering.py`	New comprehensive test module verifying `render_messages` and `render_json_like` correctly handle message objects, dict-form messages, text-part vs non-text-part preservation, Jinja error wrapping with paths, key collision detection, and input immutability.
Integration tests for PromptTemplate and judge handler rendering `sdks/python/oss/tests/pytest/unit/test_prompt_template_extensions.py`, `sdks/python/oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py`, `sdks/python/oss/tests/pytest/unit/test_jinja2_sandbox.py`, `sdks/python/oss/tests/pytest/unit/test_render_template_helper.py`	Updated and new tests verifying `PromptTemplate.format(...)` renders response-format JSON-schema fields with templated properties, `auto_ai_critique_v0` renders variables inside `json_schema` before LLM calls, template errors raise exceptions without reaching the LLM, and Jinja2 sandbox violations now raise instead of silently returning payloads.
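
The message-rendering rule summarized in the table (render text parts, leave image and audio parts untouched, do not mutate the input) can be sketched as below. This is a simplified illustration that assumes dict-form, OpenAI-style messages; `render_messages_sketch` is a hypothetical name, the `{{name}}` substitution is a toy regex, and the real module's validation and error wrapping are not reproduced.

```python
import copy
import re
from typing import Any, Dict, List, Mapping

_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def _render_text(text: str, variables: Mapping[str, Any]) -> str:
    # Replace each {{name}}; a missing name raises KeyError.
    return _VAR.sub(lambda m: str(variables[m.group(1)]), text)


def render_messages_sketch(
    messages: List[Dict[str, Any]], variables: Mapping[str, Any]
) -> List[Dict[str, Any]]:
    """Return rendered copies of messages, touching only text content."""
    rendered = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict) or "role" not in message:
            raise ValueError(f"messages[{i}] is not a valid message")
        new_message = copy.deepcopy(message)  # keep the caller's input intact
        content = new_message.get("content")
        if isinstance(content, str):
            new_message["content"] = _render_text(content, variables)
        elif isinstance(content, list):
            for part in content:
                # Only text parts are templated; image/audio parts are copied as-is.
                if isinstance(part, dict) and part.get("type") == "text":
                    part["text"] = _render_text(part["text"], variables)
        rendered.append(new_message)
    return rendered
```

In this sketch a `{{thing}}` placeholder inside an `image_url` part stays literal, which mirrors the stated intent of preserving non-text parts rather than templating URLs or binary payloads.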

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Agenta-AI/agenta#4231: WP-B1 refactoring of _format_with_template and low-level render_template that this PR builds upon by introducing the new structured rendering layer above it.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 29.27% which is insufficient. The required threshold is 60.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat(sdk): unify structured prompt rendering' directly and accurately describes the main change: the introduction of unified structured prompt rendering via a new rendering module.
Description check	✅ Passed	The description provides a comprehensive summary of the implementation, explaining key additions (rendering module with structured functions), routing changes (PromptTemplate and auto_ai_critique_v0), error behavior alignment, and test coverage—all directly related to the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

🧹 Nitpick comments (2)

sdks/python/oss/tests/pytest/unit/test_jinja2_sandbox.py (1)
21-22: ⚡ Quick win

Assert a concrete exception type for sandbox violations.

Line 21 uses pytest.raises(Exception), which can pass on unrelated failures. Please assert the actual contract type (or wrapper type) to keep this test precise.
Suggested tightening
+from agenta.sdk.utils.lazy import _load_jinja2
...
 def test_handlers_jinja2_blocks_ssti_payload() -> None:
-    with pytest.raises(Exception):
+    _, TemplateError = _load_jinja2()
+    with pytest.raises(TemplateError):
         _format_with_template(
             content=SSTI_PAYLOAD,
             format="jinja2",
             kwargs={},
         )

sdks/python/oss/tests/pytest/unit/test_render_template_helper.py (1)
786-789: ⚡ Quick win

Use a specific exception assertion here, not Exception.

Line 788 is too broad and may hide unrelated failures. Match the concrete Jinja sandbox exception contract (consistent with the rest of this file).
Suggested tightening
 def test_handlers_format_with_template_jinja2_raises_on_sandbox_violation():
     payload = "{{ lipsum.__globals__['os'].popen('id').read() }}"
-    with pytest.raises(Exception):
+    _, TemplateError = _load_jinja2()
+    with pytest.raises(TemplateError):
         _format_with_template(content=payload, format="jinja2", kwargs={})
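
Why the broad assertion is risky can be shown with a small standard-library sketch. The `raises` helper below only mimics the relevant behavior of `pytest.raises` (matching exceptions are swallowed, others propagate); the error class and both render functions are hypothetical stand-ins, not SDK code.

```python
from typing import Callable, Type


class SketchTemplateError(Exception):
    """Hypothetical stand-in for the template engine's error type."""


def render_blocked(payload: str) -> str:
    # Simulates the sandbox rejecting an unsafe payload.
    raise SketchTemplateError(f"unsafe template rejected: {payload!r}")


def render_with_bug(payload: str) -> str:
    # Simulates an unrelated defect: str has no such method, so this
    # raises AttributeError before any sandbox check could run.
    return payload.no_such_method()


def raises(
    expected: Type[BaseException], fn: Callable[[str], str], payload: str
) -> bool:
    """Return True if fn raises expected; any other exception propagates."""
    try:
        fn(payload)
    except expected:
        return True
    return False
```

Here `raises(Exception, render_with_bug, payload)` returns `True` even though the sandbox was never exercised, while `raises(SketchTemplateError, render_with_bug, payload)` lets the `AttributeError` escape and fail the test, which is the behavior the review comments ask for.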

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a774c3b6-f63f-470d-b5c0-c819b76d7fce

📥 Commits

Reviewing files that changed from the base of the PR and between 0fae4a5 and f7cbeb7.

📒 Files selected for processing (15)

docs/design/prompt-runtime-unification/README.md
docs/design/prompt-runtime-unification/wp-b2-rendering-unification/README.md
docs/design/prompt-runtime-unification/wp-b2-rendering-unification/plan.md
docs/design/prompt-runtime-unification/wp-b2-rendering-unification/qa.md
docs/design/prompt-runtime-unification/wp-b2-rendering-unification/research.md
docs/design/prompt-runtime-unification/wp-b2-rendering-unification/rfc.md
docs/design/prompt-runtime-unification/wp-b2-rendering-unification/status.md
sdks/python/agenta/sdk/engines/running/handlers.py
sdks/python/agenta/sdk/utils/rendering.py
sdks/python/agenta/sdk/utils/types.py
sdks/python/oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py
sdks/python/oss/tests/pytest/unit/test_jinja2_sandbox.py
sdks/python/oss/tests/pytest/unit/test_prompt_template_extensions.py
sdks/python/oss/tests/pytest/unit/test_render_template_helper.py
sdks/python/oss/tests/pytest/unit/test_structured_rendering.py

github-actions · 2026-05-15T11:07:11Z

Railway Preview Environment


Status	Destroyed (PR converted to draft)

Updated at 2026-05-15T11:29:42.394Z

mmabrouk · 2026-05-15T11:25:25Z

junaway

lgtm! @mmabrouk

All comments are non-blocking.

feat(sdk): unify structured prompt rendering

f7cbeb7

vercel Bot had a problem deploying to Preview May 14, 2026 15:38 Failure

mmabrouk commented May 14, 2026

Comment thread sdks/python/agenta/sdk/utils/types.py Outdated

coderabbitai Bot reviewed May 14, 2026

mmabrouk commented May 14, 2026

Comment thread sdks/python/agenta/sdk/utils/types.py Outdated

mmabrouk marked this pull request as ready for review May 14, 2026 15:45

dosubot Bot added the size:XXL, documentation, enhancement, python, and tests labels May 14, 2026

mmabrouk marked this pull request as draft May 14, 2026 15:45

fix(sdk): address structured rendering review

107cf38

vercel Bot had a problem deploying to Preview May 14, 2026 15:47 Failure

Merge remote-tracking branch 'origin/main' into feat/wp-b2-rendering-unification

f47fcb7

vercel Bot deployed to Preview May 15, 2026 10:41 View deployment

mmabrouk marked this pull request as ready for review May 15, 2026 10:44

docs(sdk): clarify structured renderer contracts

4a0aa08

dosubot Bot added the refactoring label May 15, 2026

vercel Bot deployed to Preview May 15, 2026 10:47 View deployment

fix(sdk): preserve OpenAI non-text content parts

0fe0d6d

vercel Bot deployed to Preview May 15, 2026 10:51 View deployment

docs(sdk): clarify structured rendering comments

6b0fbea

vercel Bot deployed to Preview May 15, 2026 10:55 View deployment

mmabrouk requested a review from jp-agenta May 15, 2026 11:26

mmabrouk marked this pull request as draft May 15, 2026 11:29

mmabrouk marked this pull request as ready for review May 15, 2026 11:29

mmabrouk marked this pull request as draft May 15, 2026 11:34