feat(agent): structured task format and workflow improvements by 0xhis · Pull Request #384 · usestrix/strix

0xhis · 2026-03-21T08:01:59Z

Summary

Replace free-form task descriptions with structured XML format and reduce token overhead in inter-agent communication.

Changes

Replace free-form task description with structured XML format (<scan_task><targets><mode>) in StrixAgent for clearer LLM parsing
Replace verbose <inter_agent_message> with compact <agent_message> format to reduce token overhead
Add corrective message when agents respond with plain text instead of tool calls, enforcing tool-call-only behavior
Simplify thinking_blocks type annotation in AgentState
Add <agent_message> pattern to clean_content() for hidden XML cleanup

Files Changed

strix/agents/StrixAgent/strix_agent.py (+32/-10)
strix/agents/base_agent.py (+17/-24)
strix/agents/state.py (+1/-3)
strix/llm/utils.py (+1)

Split from #328.

greptile-apps · 2026-03-21T08:06:00Z

Greptile Summary

This PR improves inter-agent communication by replacing the verbose <inter_agent_message> format with a compact <agent_message> attribute-based format, restructures the scan task description into a typed XML <scan_task> block for clearer LLM parsing, and adds a corrective feedback message when an agent responds with plain text instead of a tool call.

Structured scan task format (strix_agent.py): Clean improvement — target type, path, and scan mode are now clearly delineated. However, URL and path values are interpolated directly into XML without html.escape(), meaning query strings with & (e.g., https://target.com/?a=1&b=2) will produce technically invalid XML.
Compact inter-agent messages (base_agent.py): Reduces token overhead effectively. The new format embeds message content inline inside the XML element, which makes it susceptible to premature tag closure if content contains the literal string </agent_message> — the same regex in clean_content() that strips these from traces could then strip only part of the content.
Corrective plain-text message (base_agent.py): Good pattern for enforcing tool-call discipline, but the message is in BaseAgent and hardcodes tool names (create_agent, terminal_execute, wait_for_message) that are specific to StrixAgent, making it misleading for other subclasses.
clean_content() update (utils.py) and thinking_blocks simplification (state.py): Both are straightforward and correct.

Confidence Score: 4/5

Safe to merge with minor follow-up fixes; no runtime-breaking bugs in the primary path.
All three substantive issues (unescaped XML values, hardcoded tool names in base class, content embedding without escaping) are P2 style/best-practice concerns rather than blocking bugs. In normal usage, scan targets are well-formed strings and inter-agent content is controlled, so these edge cases are unlikely to cause production failures. The core improvements — structured task format, compact messaging, corrective feedback — are solid and improve LLM reliability.
strix/agents/base_agent.py (corrective message tool names and XML content escaping) and strix/agents/StrixAgent/strix_agent.py (URL escaping in XML targets)

Important Files Changed

Filename	Overview
strix/agents/StrixAgent/strix_agent.py	Replaces free-form task description with structured XML `<scan_task>` format; URL and path values are embedded without HTML escaping, which could produce malformed XML for targets with `&` in query strings.
strix/agents/base_agent.py	Adds plain-text corrective feedback loop and compacts inter-agent message format; corrective message hardcodes StrixAgent-specific tool names in the base class, and message content is not escaped before embedding into the new compact XML format.
strix/agents/state.py	Simplifies `thinking_blocks` type annotation from `list[dict[str, Any]]` to `list`; functionally equivalent and consistent with the rest of the codebase's runtime behaviour.
strix/llm/utils.py	Adds `<agent_message>` regex pattern to `clean_content()` to strip the new compact format from traces; pattern correctly uses `\b` and `[^>]*` to handle attributes, and `re.DOTALL` ensures multi-line content is matched.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: strix/agents/StrixAgent/strix_agent.py
Line: 65-79

Comment:
**XML special characters not escaped in target values**

URLs, repository URLs, and paths are embedded directly into XML element content without escaping. A URL like `https://target.com/api?foo=1&bar=2` will produce `&bar=2` inside the XML, which is technically invalid XML (unescaped `&`). While the LLM is generally tolerant of malformed XML, the `clean_content()` regex in `utils.py` strips patterns matching `<agent_message\b[^>]*>.*?</agent_message>` using `re.DOTALL` — similar stripping logic applied to `<scan_task>` elsewhere could misfire on malformed XML.

Consider HTML-escaping target values before embedding:

```python
import html

target_lines.append(
    f'  <target type="url">{html.escape(url)}</target>'
)
```

The same applies to `repo["url"]`, `code["path"]`, and the IP address entries.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: strix/agents/base_agent.py
Line: 401-411

Comment:
**Corrective message hardcodes StrixAgent-specific tool names in BaseAgent**

The corrective message references `create_agent`, `terminal_execute`, and `wait_for_message` — tools that are specific to `StrixAgent`. Since this logic lives in `BaseAgent`, other subclasses that respond with plain text will receive guidance for tools they don't have, which could confuse those agents or cause them to attempt non-existent tool calls.

Consider making this message generic (e.g., "Use an available tool to proceed") or overriding `_process_iteration` in `StrixAgent` to provide the agent-specific corrective message.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: strix/agents/base_agent.py
Line: 490-496

Comment:
**Message content not escaped in XML attribute/element context**

`sender_name`, `sender_id`, and `message.get("content", "")` are embedded into the `<agent_message>` XML without any escaping. If `sender_name` or `sender_id` contains a double-quote character, the XML attributes will be broken. More importantly, if the message `content` itself contains the literal string `</agent_message>`, the new compact format would cause the regex in `clean_content()` (which matches `<agent_message\b[^>]*>.*?</agent_message>`) to terminate prematurely, leaving the remainder of the content unexpectedly visible in traces.

The verbose `<inter_agent_message>` format wrapped `content` in a `<content>` child element which made it somewhat more resilient to this (though not fully). The new inline format is more susceptible. Consider CDATA-wrapping or escaping the content before embedding.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "feat(agent): structu..."}

greptile-apps · 2026-03-21T08:06:04Z

strix/agents/StrixAgent/strix_agent.py

+                    target_lines.append(f'  <target type="repository">{repo["url"]} (code at: {repo["workspace_path"]})</target>')
                else:
-                    task_parts.append(f"- {repo['url']}")
+                    target_lines.append(f'  <target type="repository">{repo["url"]}</target>')

        if local_code:
-            task_parts.append("\n\nLocal Codebases:")
-            task_parts.extend(
-                f"- {code['path']} (available at: {code['workspace_path']})" for code in local_code
-            )
+            for code in local_code:
+                target_lines.append(f'  <target type="local_code">{code["path"]} (code at: {code["workspace_path"]})</target>')

        if urls:
-            task_parts.append("\n\nURLs:")
-            task_parts.extend(f"- {url}" for url in urls)
+            for url in urls:
+                target_lines.append(f'  <target type="url">{url}</target>')

        if ip_addresses:
-            task_parts.append("\n\nIP Addresses:")
-            task_parts.extend(f"- {ip}" for ip in ip_addresses)
-
-        task_description = " ".join(task_parts)
+            for ip in ip_addresses:
+                target_lines.append(f'  <target type="ip">{ip}</target>')


XML special characters not escaped in target values

URLs, repository URLs, and paths are embedded directly into XML element content without escaping. A URL like https://target.com/api?foo=1&bar=2 will produce &bar=2 inside the XML, which is technically invalid XML (unescaped &). While the LLM is generally tolerant of malformed XML, the clean_content() regex in utils.py strips patterns matching <agent_message\b[^>]*>.*?</agent_message> using re.DOTALL — similar stripping logic applied to <scan_task> elsewhere could misfire on malformed XML.

Consider HTML-escaping target values before embedding:

import html target_lines.append( f' <target type="url">{html.escape(url)}</target>' )

The same applies to repo["url"], code["path"], and the IP address entries.

Prompt To Fix With AI

This is a comment left during a code review. Path: strix/agents/StrixAgent/strix_agent.py Line: 65-79 Comment: **XML special characters not escaped in target values** URLs, repository URLs, and paths are embedded directly into XML element content without escaping. A URL like `https://target.com/api?foo=1&bar=2` will produce `&bar=2` inside the XML, which is technically invalid XML (unescaped `&`). While the LLM is generally tolerant of malformed XML, the `clean_content()` regex in `utils.py` strips patterns matching `<agent_message\b[^>]*>.*?</agent_message>` using `re.DOTALL` — similar stripping logic applied to `<scan_task>` elsewhere could misfire on malformed XML. Consider HTML-escaping target values before embedding: ```python import html target_lines.append( f' <target type="url">{html.escape(url)}</target>' ) ``` The same applies to `repo["url"]`, `code["path"]`, and the IP address entries. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-21T08:06:05Z

strix/agents/base_agent.py

+        corrective_message = (
+            "You responded with plain text instead of a tool call. "
+            "While the agent loop is running, EVERY response MUST be a tool call. "
+            "Do NOT send plain text messages. Act via tools:\n"
+            "- Use the think tool to reason through problems\n"
+            "- Use create_agent to spawn subagents for testing\n"
+            "- Use terminal_execute to run commands\n"
+            "- Use wait_for_message ONLY when waiting for subagent results\n"
+            "Review your task and take action now."
+        )
+        self.state.add_message("user", corrective_message)


Corrective message hardcodes StrixAgent-specific tool names in BaseAgent

The corrective message references create_agent, terminal_execute, and wait_for_message — tools that are specific to StrixAgent. Since this logic lives in BaseAgent, other subclasses that respond with plain text will receive guidance for tools they don't have, which could confuse those agents or cause them to attempt non-existent tool calls.

Consider making this message generic (e.g., "Use an available tool to proceed") or overriding _process_iteration in StrixAgent to provide the agent-specific corrective message.

Prompt To Fix With AI

This is a comment left during a code review. Path: strix/agents/base_agent.py Line: 401-411 Comment: **Corrective message hardcodes StrixAgent-specific tool names in BaseAgent** The corrective message references `create_agent`, `terminal_execute`, and `wait_for_message` — tools that are specific to `StrixAgent`. Since this logic lives in `BaseAgent`, other subclasses that respond with plain text will receive guidance for tools they don't have, which could confuse those agents or cause them to attempt non-existent tool calls. Consider making this message generic (e.g., "Use an available tool to proceed") or overriding `_process_iteration` in `StrixAgent` to provide the agent-specific corrective message. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-21T08:06:06Z

strix/agents/base_agent.py

+                            message_content = f"""<agent_message
+from="{sender_name}"
+id="{sender_id}"
+type="{message.get("message_type", "information")}"
+priority="{message.get("priority", "normal")}">
 {message.get("content", "")}
-    </content>
-    <delivery_info>
-        <note>This message was delivered during your task execution.
-        Please acknowledge and respond if needed.</note>
-    </delivery_info>
-</inter_agent_message>"""
+</agent_message>"""


Message content not escaped in XML attribute/element context

sender_name, sender_id, and message.get("content", "") are embedded into the <agent_message> XML without any escaping. If sender_name or sender_id contains a double-quote character, the XML attributes will be broken. More importantly, if the message content itself contains the literal string </agent_message>, the new compact format would cause the regex in clean_content() (which matches <agent_message\b[^>]*>.*?</agent_message>) to terminate prematurely, leaving the remainder of the content unexpectedly visible in traces.

The verbose <inter_agent_message> format wrapped content in a <content> child element which made it somewhat more resilient to this (though not fully). The new inline format is more susceptible. Consider CDATA-wrapping or escaping the content before embedding.

Prompt To Fix With AI

This is a comment left during a code review. Path: strix/agents/base_agent.py Line: 490-496 Comment: **Message content not escaped in XML attribute/element context** `sender_name`, `sender_id`, and `message.get("content", "")` are embedded into the `<agent_message>` XML without any escaping. If `sender_name` or `sender_id` contains a double-quote character, the XML attributes will be broken. More importantly, if the message `content` itself contains the literal string `</agent_message>`, the new compact format would cause the regex in `clean_content()` (which matches `<agent_message\b[^>]*>.*?</agent_message>`) to terminate prematurely, leaving the remainder of the content unexpectedly visible in traces. The verbose `<inter_agent_message>` format wrapped `content` in a `<content>` child element which made it somewhat more resilient to this (though not fully). The new inline format is more susceptible. Consider CDATA-wrapping or escaping the content before embedding. How can I resolve this? If you propose a fix, please make it concise.

Copilot

Pull request overview

This PR restructures how Strix agents exchange task and inter-agent message content by introducing more compact, LLM-parseable XML blocks and adding enforcement when the model responds without tool calls.

Changes:

Replace free-form scan task text with a structured <scan_task> XML block containing <targets> and <mode> in StrixAgent.
Replace verbose <inter_agent_message> envelopes with compact <agent_message ...> formatting in inter-agent delivery.
Add a corrective “tool-call-only” message when the assistant responds with plain text (no tool invocations), and extend clean_content() to strip <agent_message> blocks.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

File	Description
`strix/agents/StrixAgent/strix_agent.py`	Builds structured `<scan_task>` XML for targets/mode and appends user instructions.
`strix/agents/base_agent.py`	Adds corrective prompt for non-tool responses; changes inter-agent messages to compact `<agent_message>` format.
`strix/agents/state.py`	Simplifies `thinking_blocks` annotation in `AgentState.add_message`.
`strix/llm/utils.py`	Updates hidden-XML cleanup to also strip `<agent_message>` blocks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-21T08:07:43Z

strix/agents/state.py

-    def add_message(
-        self, role: str, content: Any, thinking_blocks: list[dict[str, Any]] | None = None
-    ) -> None:
+    def add_message(self, role: str, content: Any, thinking_blocks: list | None = None) -> None:


thinking_blocks is now annotated as list | None, which introduces an unparameterized generic (list[Any]). With this repo’s strict type-checking (mypy disallow_any_generics = true / pyright strict), this will fail type checking. Keep the previous precise type (e.g., list[dict[str, Any]] | None) or introduce an explicit alias type for thinking blocks.

Suggested change

def add_message(self, role: str, content: Any, thinking_blocks: list | None = None) -> None:

def add_message(self, role: str, content: Any, thinking_blocks: list[dict[str, Any]] | None = None) -> None:

Copilot · 2026-03-21T08:07:43Z

strix/agents/StrixAgent/strix_agent.py

+                    target_lines.append(f'  <target type="repository">{repo["url"]} (code at: {repo["workspace_path"]})</target>')
                else:
-                    task_parts.append(f"- {repo['url']}")
+                    target_lines.append(f'  <target type="repository">{repo["url"]}</target>')

        if local_code:
-            task_parts.append("\n\nLocal Codebases:")
-            task_parts.extend(
-                f"- {code['path']} (available at: {code['workspace_path']})" for code in local_code
-            )
+            for code in local_code:
+                target_lines.append(f'  <target type="local_code">{code["path"]} (code at: {code["workspace_path"]})</target>')


The <target> XML content interpolates URLs/paths directly. If any value contains XML-significant characters (e.g., &, <, >) the <scan_task> becomes malformed and can confuse downstream LLM parsing. Consider XML-escaping target text (and ideally keep structured fields as attributes/elements rather than free-form text in the node body).

Copilot · 2026-03-21T08:07:44Z

strix/agents/StrixAgent/strix_agent.py

+        task_description = (
+            f"<scan_task>\n"
+            f"<targets>\n{targets_block}\n</targets>\n"
+            f"<mode>{mode}</mode>\n"
+            f"<action>Begin security assessment NOW. Your first tool call must be create_agent to spawn context-gathering subagents for the targets listed above. Do NOT call wait_for_message — the targets are already specified.</action>\n"
+            f"</scan_task>"
+        )

        if user_instructions:
            task_description += f"\n\nSpecial instructions: {user_instructions}"


user_instructions are appended outside the <scan_task> block, which undermines the goal of a fully-structured task format and makes parsing less reliable. Consider adding a dedicated element (e.g., <special_instructions>…</special_instructions>) inside <scan_task> (with proper escaping) instead of free-form text after the XML.

Copilot · 2026-03-21T08:07:44Z

strix/agents/StrixAgent/strix_agent.py

+            f"<mode>{mode}</mode>\n"
+            f"<action>Begin security assessment NOW. Your first tool call must be create_agent to spawn context-gathering subagents for the targets listed above. Do NOT call wait_for_message — the targets are already specified.</action>\n"
+            f"</scan_task>"


These f-strings create very long physical lines (notably the <action>…</action> line), which is likely to trip the repo’s Ruff/Black 100-char line-length enforcement (E501). Please wrap/split the string across multiple lines (e.g., implicit concatenation) so formatting/lint passes consistently.

Copilot · 2026-03-21T08:07:44Z

strix/agents/base_agent.py

+                            message_content = f"""<agent_message
+from="{sender_name}"
+id="{sender_id}"
+type="{message.get("message_type", "information")}"
+priority="{message.get("priority", "normal")}">


<agent_message> attribute values are interpolated without escaping. Agent names/IDs can contain quotes or other characters that will break the XML-like structure and reduce LLM parse reliability. Escape attribute values (at least ", <, &) or move metadata into child elements to avoid attribute-escaping issues.

Copilot · 2026-03-21T08:07:44Z

strix/agents/base_agent.py

+            "- Use the think tool to reason through problems\n"
+            "- Use create_agent to spawn subagents for testing\n"
+            "- Use terminal_execute to run commands\n"
+            "- Use wait_for_message ONLY when waiting for subagent results\n"


The corrective prompt says to use wait_for_message ONLY for subagent results, but the tool is also used to wait for user/other agent messages more generally (and earlier guidance in this file references waiting for user/other agents). Consider rewording to avoid an inaccurate restriction that could steer the agent away from the correct tool in non-subagent scenarios.

Suggested change

"- Use wait_for_message ONLY when waiting for subagent results\n"

"- Use agents_graph_actions.wait_for_message to wait for messages from user or other agents\n"

Copilot · 2026-03-21T08:07:44Z

strix/agents/base_agent.py

+priority="{message.get("priority", "normal")}">
 {message.get("content", "")}
-    </content>
-    <delivery_info>
-        <note>This message was delivered during your task execution.
-        Please acknowledge and respond if needed.</note>
-    </delivery_info>
-</inter_agent_message>"""
+</agent_message>"""


message.get("content", "") is inserted verbatim inside <agent_message> without escaping/wrapping. If the message body contains sequences like </agent_message> (or other markup), it can prematurely terminate the block and corrupt the prompt structure. Consider escaping the body content or wrapping it in a dedicated child element and escaping (or using a CDATA-like convention).

- Replace free-form task description with structured XML format (<scan_task><targets><mode>) in StrixAgent for clearer LLM parsing - Replace verbose <inter_agent_message> with compact <agent_message> format to reduce token overhead in inter-agent communication - Add corrective message when agents respond with plain text instead of tool calls, enforcing tool-call-only behavior - Simplify thinking_blocks type annotation in AgentState - Add <agent_message> pattern to clean_content() for hidden XML cleanup

- Add html.escape() to target values in <scan_task> (URLs, paths, IPs) - Escape sender_name/sender_id in <agent_message> attributes - CDATA-wrap message content in <agent_message> to handle any text - Make corrective message generic (no StrixAgent-specific tool names)

0xhis mentioned this pull request Mar 21, 2026

feat: OWASP WSTG methodology alignment & TUI live status #328

Closed

0xhis marked this pull request as ready for review March 21, 2026 08:03

Copilot AI review requested due to automatic review settings March 21, 2026 08:03

Copilot started reviewing on behalf of 0xhis March 21, 2026 08:03 View session

greptile-apps bot reviewed Mar 21, 2026

View reviewed changes

Copilot AI reviewed Mar 21, 2026

View reviewed changes

0xhis force-pushed the feat/agent-workflow-restructure branch from 0e06c9a to 1ca72c0 Compare March 21, 2026 08:17

0xhis force-pushed the feat/agent-workflow-restructure branch from f19c2a9 to 6f71f5f Compare March 21, 2026 08:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agent): structured task format and workflow improvements#384

feat(agent): structured task format and workflow improvements#384
0xhis wants to merge 2 commits intousestrix:mainfrom
0xhis:feat/agent-workflow-restructure

0xhis commented Mar 21, 2026

Uh oh!

greptile-apps bot commented Mar 21, 2026

Uh oh!

greptile-apps bot Mar 21, 2026

Uh oh!

greptile-apps bot Mar 21, 2026

Uh oh!

greptile-apps bot Mar 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Copilot AI Mar 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	def add_message(self, role: str, content: Any, thinking_blocks: list \| None = None) -> None:
	def add_message(self, role: str, content: Any, thinking_blocks: list[dict[str, Any]] \| None = None) -> None:

	"- Use wait_for_message ONLY when waiting for subagent results\n"
	"- Use agents_graph_actions.wait_for_message to wait for messages from user or other agents\n"

Conversation

0xhis commented Mar 21, 2026

Summary

Changes

Files Changed

Uh oh!

greptile-apps bot commented Mar 21, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

greptile-apps bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants