Skip to content

Python: Improve prompt-template msg serialize and sample usage#13738

Open
moonbox3 wants to merge 3 commits intomicrosoft:mainfrom
moonbox3:jinja2-improvements
Open

Python: Improve prompt-template msg serialize and sample usage#13738
moonbox3 wants to merge 3 commits intomicrosoft:mainfrom
moonbox3:jinja2-improvements

Conversation

@moonbox3
Copy link
Copy Markdown
Collaborator

@moonbox3 moonbox3 commented Apr 6, 2026

Motivation and Context

This PR updates the Jinja2 and Handlebars prompt-template helpers to serialize chat messages through the existing XML/message serializer instead of assembling message XML manually. It also aligns the prompt-template samples with the serializer-backed helper and adds regression coverage for common message content.

Description

Prompt template message serialize improvements.

Contribution Checklist

@moonbox3 moonbox3 self-assigned this Apr 6, 2026
Copilot AI review requested due to automatic review settings April 6, 2026 03:13
@moonbox3 moonbox3 requested a review from a team as a code owner April 6, 2026 03:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request improves Python prompt-template message serialization by using XML element serialization (instead of manual string concatenation) for Jinja2 and Handlebars message helpers, updates related samples to use serializer-backed helpers, and adds regression tests to ensure correct escaping and round-tripping of chat history with XML metacharacters.

Changes:

  • Update Jinja2 and Handlebars message helpers to build message XML via ElementTree serialization to correctly escape XML metacharacters.
  • Align Azure chat prompt-template samples to use message_to_prompt for serializer-backed message rendering.
  • Add unit and end to end tests covering escaping and chat history round-trip behavior, including system message preservation.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
python/semantic_kernel/prompt_template/utils/jinja2_system_helpers.py Serialize message() output via XML element serialization to escape metacharacters.
python/semantic_kernel/prompt_template/utils/handlebars_system_helpers.py Serialize message block helper output via XML element serialization to escape metacharacters.
python/tests/unit/prompt_template/test_jinja2_prompt_template.py Add regression test validating escaping and round-trip parsing for message() helper.
python/tests/unit/prompt_template/test_jinja2_prompt_template_e2e.py Add end to end tests for round-trip with metacharacters and system role preservation.
python/tests/unit/prompt_template/test_handlebars_prompt_template.py Add regression test validating escaping and round-trip parsing for message block helper.
python/tests/unit/prompt_template/test_handlebars_prompt_template_e2e.py Add end to end tests for round-trip with metacharacters and system role preservation.
python/samples/concepts/prompt_templates/azure_chat_gpt_api_jinja2.py Update sample template to use message_to_prompt(item) for serializer-backed rendering.
python/samples/concepts/prompt_templates/azure_chat_gpt_api_handlebars.py Update sample template to use {{message_to_prompt}} for serializer-backed rendering.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 3 | Confidence: 89%

✓ Security Reliability

This PR is a security improvement that replaces manual f-string XML construction with xml.etree.ElementTree for generating XML tags in both Handlebars and Jinja2 prompt template helpers. Previously, user content containing XML metacharacters (<, >, &, ") was injected verbatim into XML strings, which could corrupt the XML structure and cause incorrect message reconstruction via ChatHistory.from_rendered_prompt(). The fix uses Element/tostring for proper escaping, consistent with how the rest of the codebase (ChatMessageContent.to_element, ChatHistory.from_rendered_prompt) already handles XML. The use of xml.etree.ElementTree for serialization-only (not parsing) is safe and matches the existing nosec B405 pattern in 12+ other files; defusedxml is correctly used elsewhere for parsing. The sample files are updated to demonstrate the message_to_prompt helper. Test coverage is thorough, including round-trip tests that verify ChatHistory.from_rendered_prompt correctly recovers the original messages after escaping.

✓ Test Coverage

This PR switches the handlebars and jinja2 _message helpers from manual XML string concatenation to xml.etree.ElementTree, properly escaping XML metacharacters in message content and attribute values. The test additions are well-structured: they verify specific escape sequences (&lt;, &amp;, unescaped " in text), and critically verify round-trip fidelity via ChatHistory.from_rendered_prompt. Both template engines (handlebars and jinja2) have symmetric unit and e2e tests. Minor gaps: no test covers None/empty content in _message helpers (the jinja2 helper previously rendered None as the literal string "None" — now fixed but untested), and no test exercises attribute-value escaping in the handlebars _message block helper kwargs path. These are non-blocking.

✓ Design Approach

The PR correctly fixes an XML injection bug in both the Handlebars and Jinja2 _message helpers by switching from manual string concatenation to xml.etree.ElementTree.Element + tostring(), which properly escapes XML metacharacters in content. The approach is sound and the # nosec B405 annotations are appropriate (we are generating XML, not parsing untrusted input). The sample changes steering users toward message_to_prompt are a good idea since it uses to_prompt() / to_element() for full-fidelity serialization. However, the jinja2 _message(item) helper still only serializes item.content (the first TextContent text), silently dropping image items, function-call items, and any other non-text items from the message. Its sibling _message_to_prompt avoids this by delegating to item.to_prompt() / to_element(). Since _message(item) accepts a full ChatMessageContent, calers reasonably expect full serialization — yet the implementation silently produces an incomplete representation. The better approach for _message in jinja2 would be to call item.to_element() and use tostring() on the resulting element, exactly as to_prompt() does, rather than manually setting only the text field.

Suggestions

  • The Jinja2 _message(item) helper sets message.text = item.content, which only captures the first TextContent's text and silently drops non-text items (images, function calls, etc.). Additionally, item.content could be a non-string type, and ElementTree.tostring behavior with non-strings is implementation-defined. The better approach is to delegate to item.to_element() and call tostring() on that—the same way _message_to_prompt works via item.to_prompt()—which would avoid data loss for multi-modal messages and eliminate the type-safety concern.
  • Add a test for None/empty content in the Jinja2 _message helper—the old code rendered None as the literal string "None", and the new code correctly produces an empty element, but this fix is untested. A round-trip test with a ChatMessageContent that has no text items (e.g., a FunctionCallContent-only message) through {{ message(item) }} would lock this in.
  • Consider adding a test for attribute values containing XML metacharacters in the Handlebars _message block helper (e.g., {{#message role=role custom_attr=value}}...{{/message}} where value contains < or "). The Element.set() call correctly escapes these, but there's no test verifying it.
  • The Handlebars sample (azure_chat_gpt_api_handlebars.py) switched from {{#message role=role}}{{~content~}}{{/message}} to {{message_to_prompt}}, which changes the XML serialization format (from bare text to <text> child elements). While both formats round-trip correctly, this sample change is orthogonal to the XML-escaping fix and could be confusing to users who reference the old sample pattern.
  • The Handlebars _message helper now sets message.text = str(options['fn'](this)), treating the entire block output as plain text. For users with allow_dangerously_set_content=True who previously relied on embedding raw XML sub-structure (e.g., image or function-call tags) directly in the block, this is a silent behavior change. It would be worth documenting that _message now always produces plain-text message content, and that message_to_prompt should be used for structured/multi-modal content.

Automated review by moonbox3's agents

MAF Dashboard Bot and others added 2 commits April 6, 2026 04:25
- Use public re-export for AuthorRole (from semantic_kernel.contents) in
  test_jinja2_prompt_template_e2e.py and test_handlebars_prompt_template_e2e.py
- Replace manual XML construction in _message() with item.to_element() to
  properly serialize all content items (images, function calls, etc.)
- Remove unused Enum and Element imports from jinja2_system_helpers.py
- Update test assertion to match correct to_element() output format

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@moonbox3 moonbox3 enabled auto-merge April 6, 2026 04:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants