Enable Chat History for WebSocket Messages by ericevans-nv · Pull Request #999 · NVIDIA/NeMo-Agent-Toolkit

ericevans-nv · 2025-10-14T20:15:34Z

Description

This PR updates the WebSocket message handler to pass the correct payload to the agent, enabling chat history for supported workflow types.

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

New Features
- Enhanced WebSocket handling for multi-message chats and unified Chat/Generate streams with consistent outputs.
- Improved human-in-the-loop prompts and responses, returning clearer, consistent message content.
Bug Fixes
- Prevents workflows from starting while another task runs, reducing race conditions.
- More robust message creation and error handling for fewer failed interactions.

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai · 2025-10-14T20:16:00Z

Walkthrough

Introduces new helpers and public models to process WebSocket user messages: extracts last user TextContent, converts UserMessages to ChatRequest or prompt text, updates workflow invocation and streaming to use these results, and shifts human-interaction futures/callbacks to operate on TextContent.

Changes

Cohort / File(s)	Summary
WebSocket message processing & handler `src/nat/front_ends/fastapi/message_handler.py`	- Added `_extract_last_user_message_content(messages: list[UserMessages]) -> TextContent`, `_process_websocket_user_interaction_response_message(user_content: WebSocketUserInteractionResponseMessage) -> TextContent`, and `_process_websocket_user_message(user_content: WebSocketUserMessage) -> ChatRequest
Human interaction types & callback flow `src/nat/front_ends/fastapi/message_handler.py`	- `_user_interaction_response` type changed from `asyncio.Future[HumanResponse]
Message validator signature formatting `src/nat/front_ends/fastapi/message_validator.py`	- Reformatted signature: `create_system_response_token_message(..., content: SystemResponseContent

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client as WebSocket Client
  participant MH as MessageHandler
  participant WF as Workflow Runner

  rect rgba(230,240,255,0.5)
    note right of Client: WebSocketUserMessage (contains UserMessages)
    Client->>MH: send(WebSocketUserMessage)
    MH->>MH: _process_websocket_user_message(...)
    alt CHAT / CHAT_STREAM
      MH->>MH: build ChatRequest
      MH->>WF: start(ChatRequest) [if no task running]
    else GENERATE / GENERATE_STREAM
      MH->>MH: extract last TextContent.text
      MH->>WF: start(prompt text) [if no task running]
    end
  end

  rect rgba(240,255,230,0.5)
    note right of Client: WebSocketUserInteractionResponseMessage
    Client->>MH: send(WebSocketUserInteractionResponseMessage)
    MH->>MH: _process_websocket_user_interaction_response_message(...) -> TextContent
    MH->>MH: convert TextContent -> HumanResponse
    MH-->>WF: resume workflow with HumanResponse
  end

  MH-->>Client: stream/progress/results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title “Enable Chat History for WebSocket Messages” clearly summarizes the main change, uses an imperative verb to describe the action, remains under the 72-character limit, and accurately reflects the PR’s objective of enabling chat history support over WebSocket.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…to bugfix/websocket-chat-history

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

src/nat/front_ends/fastapi/message_handler.py (3)
139-152: Consider using enum constant for role comparison.

Line 148 uses string literal "user" for role comparison. Consider using UserMessageContentRoleType.USER instead for type safety and consistency with the codebase's type system.

Apply this diff:
-        for user_message in messages[::-1]:
-            if user_message.role == "user":
+        for user_message in messages[::-1]:
+            if user_message.role == UserMessageContentRoleType.USER:
You'll need to add the import:
+from nat.data_models.api_server import UserMessageContentRoleType
Minor: Exception message length.

Ruff flags line 152 for a long exception message. While the message is clear, consider extracting it to a constant if this pattern repeats.

154-176: LGTM! Routing logic correctly handles different workflow types.

The method appropriately routes processing based on workflow schema type, returning the correct data structure for each case. The union return type ChatRequest | TextContent | str is intentional and handled correctly by the caller.

Minor: Exception message length.

Ruff flags line 176 for a long exception message, similar to line 152. This is a minor style issue.

178-211: LGTM! Workflow request processing correctly updated.

The method properly uses the new process_user_message_content to get the appropriate message format and passes it to _run_workflow. Error handling is appropriate.

Minor: Unnecessary parentheses around condition.

Line 192 has unnecessary parentheses around the condition if (self._running_workflow_task is None):. While not incorrect, removing them would be more pythonic.

Apply this diff:
-            if (self._running_workflow_task is None):
+            if self._running_workflow_task is None:

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac3ea63 and 52424c5.

📒 Files selected for processing (1)

src/nat/front_ends/fastapi/message_handler.py (6 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

src/nat/front_ends/fastapi/message_handler.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

src/nat/front_ends/fastapi/message_handler.py

src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

src/nat/front_ends/fastapi/message_handler.py

src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

src/nat/front_ends/fastapi/message_handler.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

src/nat/front_ends/fastapi/message_handler.py

{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

src/nat/front_ends/fastapi/message_handler.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

src/nat/front_ends/fastapi/message_handler.py

🧬 Code graph analysis (1)

src/nat/front_ends/fastapi/message_handler.py (2)

src/nat/data_models/api_server.py (14)

ChatRequest (150-197)

ChatResponse (297-347)

ChatResponseChunk (350-438)

Error (535-540)

ErrorTypes (527-532)

Message (119-121)

ResponsePayloadOutput (456-465)

ResponseSerializable (274-282)

SystemResponseContent (605-608)

TextContent (102-106)

UserMessages (508-512)

WebSocketUserMessage (543-560)

WebSocketUserInteractionResponseMessage (563-576)

WorkflowSchemaType (490-497)

src/nat/front_ends/fastapi/message_validator.py (1)

convert_text_content_to_human_response (164-197)

🪛 Ruff (0.14.0)

src/nat/front_ends/fastapi/message_handler.py

152-152: Avoid specifying long messages outside the exception class

(TRY003)

176-176: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (3)

src/nat/front_ends/fastapi/message_handler.py (3)

71-71: LGTM! Type annotation correctly updated.

The change from Future[HumanResponse] to Future[TextContent] aligns with the new flow where the WebSocket receives raw text content first, then converts it to a HumanResponse in the callback where the original prompt context is available.

126-137: LGTM! Clean conversion logic.

The method correctly transforms a list of UserMessages into a ChatRequest with full message history, enabling conversation context for CHAT and CHAT_STREAM workflows.

284-319: LGTM! Human interaction flow correctly refactored.

The callback properly handles the new flow:

Creates a Future[TextContent] to receive the raw user response

Waits for the WebSocket to provide the TextContent

Converts it to the appropriate HumanResponse type using the original prompt.content for context

This design correctly separates concerns: the Future receives raw text, and the conversion to typed response happens where the prompt type information is available.

src/nat/front_ends/fastapi/message_handler.py

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/nat/front_ends/fastapi/message_handler.py (1)

119-122: Check for missing interaction future.

If a client sends a WebSocketUserInteractionResponseMessage out of band (no outstanding prompt), _user_interaction_response is still None, so set_result raises AttributeError and tears down the handler. Please guard against this (e.g., ignore with warning or send protocol error) before calling set_result.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52424c5 and fd0f4dd.

📒 Files selected for processing (1)

src/nat/front_ends/fastapi/message_handler.py (7 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

src/nat/front_ends/fastapi/message_handler.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

src/nat/front_ends/fastapi/message_handler.py

src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

src/nat/front_ends/fastapi/message_handler.py

src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

src/nat/front_ends/fastapi/message_handler.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

src/nat/front_ends/fastapi/message_handler.py

{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

src/nat/front_ends/fastapi/message_handler.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

src/nat/front_ends/fastapi/message_handler.py

🧬 Code graph analysis (1)

src/nat/front_ends/fastapi/message_handler.py (2)

src/nat/data_models/api_server.py (8)

ChatRequest (124-194)

Error (575-580)

Message (119-121)

TextContent (102-106)

UserMessages (548-552)

WebSocketUserMessage (583-600)

WebSocketUserInteractionResponseMessage (603-616)

WorkflowSchemaType (530-537)

src/nat/front_ends/fastapi/message_validator.py (1)

convert_text_content_to_human_response (164-197)

🪛 Ruff (0.14.0)

src/nat/front_ends/fastapi/message_handler.py

160-160: Avoid specifying long messages outside the exception class

(TRY003)

187-187: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

src/nat/front_ends/fastapi/message_handler.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

src/nat/front_ends/fastapi/message_handler.py (2)

154-164: Handle legacy generate payloads.

Line 162 calls user_content.content.messages without checking if the messages attribute exists. Legacy WebSocket generate requests may send minimal payloads like {"content": {"text": "..."}} without a messages array, which would raise AttributeError and break backward compatibility.

Apply this diff to handle both legacy and new payload formats:

     async def _process_websocket_user_message(self, user_content: WebSocketUserMessage) -> ChatRequest | str:
         """
         Processes a WebSocketUserMessage based on schema type.
         """
         if self._workflow_schema_type in [WorkflowSchemaType.CHAT, WorkflowSchemaType.CHAT_STREAM]:
             return ChatRequest(**user_content.content.model_dump(exclude_none=True))
 
         elif self._workflow_schema_type in [WorkflowSchemaType.GENERATE, WorkflowSchemaType.GENERATE_STREAM]:
+            # Handle legacy format: {"content": {"text": "..."}}
+            if hasattr(user_content.content, 'text') and user_content.content.text:
+                return user_content.content.text
+            # Handle new format: {"content": {"messages": [...]}}
+            if not hasattr(user_content.content, 'messages'):
+                raise ValueError("UserMessageContent must have either 'text' or 'messages' field")
             return self._extract_last_user_message_content(user_content.content.messages).text
 
         raise ValueError("Unsupported workflow schema type for WebSocketUserMessage")

Based on learnings from past review comments.

127-145: Guard against non-text user payloads.

The method only extracts TextContent from messages. If a user message contains only non-text content (e.g., images, tool calls), the method raises a generic ValueError even when the payload is legitimate. This will cause WebSocket requests with valid multimodal inputs to fail unnecessarily.

Consider one of these approaches:

Accept the first content item regardless of type (if downstream can handle it):

 def _extract_last_user_message_content(self, messages: list[UserMessages]) -> TextContent:
     """
-    Extracts the last user's TextContent from a list of messages.
+    Extracts the last user's content from a list of messages.

     Args:
         messages: List of UserMessages.

     Returns:
-        TextContent object from the last user message.
+        First content object from the last user message.

     Raises:
-        ValueError: If no user text content is found.
+        ValueError: If no user content is found.
     """
     for user_message in messages[::-1]:
         if user_message.role == UserMessageContentRoleType.USER:
-            for attachment in user_message.content:
-                if isinstance(attachment, TextContent):
-                    return attachment
-    raise ValueError("No user text content found in messages.")
+            if user_message.content:
+                return user_message.content[0]
+    raise ValueError("No user content found in messages.")

Reject non-text payloads earlier with a schema-aware error (if only text is supported):

 def _extract_last_user_message_content(self, messages: list[UserMessages]) -> TextContent:
     """
     Extracts the last user's TextContent from a list of messages.

     Args:
         messages: List of UserMessages.

     Returns:
         TextContent object from the last user message.

     Raises:
         ValueError: If no user text content is found.
     """
     for user_message in messages[::-1]:
         if user_message.role == UserMessageContentRoleType.USER:
             for attachment in user_message.content:
                 if isinstance(attachment, TextContent):
                     return attachment
+                else:
+                    raise ValueError(
+                        f"Non-text content type {type(attachment).__name__} is not supported. "
+                        "Only TextContent is accepted."
+                    )
     raise ValueError("No user text content found in messages.")

Based on learnings from past review comments.

🧹 Nitpick comments (3)

src/nat/front_ends/fastapi/message_validator.py (1)

243-243: Consider using None as default and initializing inside the function.

The linter flags calling SystemResponseContent() in the default argument. While Pydantic models are generally safe (each call creates a new instance), the pattern is still considered a code smell. As per coding guidelines, this is flagged by ruff check and should ideally be addressed.

Apply this diff to follow the recommended pattern:

     async def create_system_response_token_message(
         self,
         message_type: Literal[WebSocketMessageType.RESPONSE_MESSAGE,
                               WebSocketMessageType.ERROR_MESSAGE] = WebSocketMessageType.RESPONSE_MESSAGE,
         message_id: str | None = str(uuid.uuid4()),
         thread_id: str = "default",
         parent_id: str = "default",
         conversation_id: str | None = None,
-        content: SystemResponseContent | Error = SystemResponseContent(),
+        content: SystemResponseContent | Error | None = None,
         status: WebSocketMessageStatus = WebSocketMessageStatus.IN_PROGRESS,
         timestamp: str = str(datetime.datetime.now(datetime.UTC))
     ) -> WebSocketSystemResponseTokenMessage | None:
         """
         Creates a system response token message with default values.
 
         :param message_type: Type of WebSocket message.
         :param message_id: Unique identifier for the message (default: generated UUID).
         :param thread_id: ID of the thread the message belongs to (default: "default").
         :param parent_id: ID of the user message that spawned child messages.
         :param conversation_id: ID of the conversation this message belongs to (default: None).
         :param content: Message content.
         :param status: Status of the message (default: IN_PROGRESS).
         :param timestamp: Timestamp of the message (default: current UTC time).
         :return: A WebSocketSystemResponseTokenMessage instance.
         """
         try:
+            if content is None:
+                content = SystemResponseContent()
             return WebSocketSystemResponseTokenMessage(type=message_type,
                                                        id=message_id,
                                                        thread_id=thread_id,
                                                        parent_id=parent_id,
                                                        conversation_id=conversation_id,
                                                        content=content,
                                                        status=status,
                                                        timestamp=timestamp)

src/nat/front_ends/fastapi/message_handler.py (2)

145-145: Extract exception messages to constants.

Lines 145 and 164 specify long error messages directly in raise statements. Per coding guidelines and Ruff TRY003, consider extracting these to module-level constants for better maintainability.

Add constants at the module level:
# Near the top of the file, after imports
_ERR_NO_USER_TEXT_CONTENT = "No user text content found in messages."
_ERR_UNSUPPORTED_SCHEMA_TYPE = "Unsupported workflow schema type for WebSocketUserMessage"
Then use them:
-        raise ValueError("No user text content found in messages.")
+        raise ValueError(_ERR_NO_USER_TEXT_CONTENT)
-        raise ValueError("Unsupported workflow schema type for WebSocketUserMessage")
+        raise ValueError(_ERR_UNSUPPORTED_SCHEMA_TYPE)
As per coding guidelines.

Also applies to: 164-164

222-224: Simplify attribute access.

Using getattr with a constant attribute name 'id' is unnecessary since you've already verified the attribute exists with hasattr. Direct property access is clearer and equally safe here.

Apply this diff:
             if hasattr(data_model, 'id'):
-                message_id: str = str(getattr(data_model, 'id'))
+                message_id: str = str(data_model.id)
             else:
                 message_id = str(uuid.uuid4())

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd0f4dd and 8d58088.

📒 Files selected for processing (2)

src/nat/front_ends/fastapi/message_handler.py (9 hunks)
src/nat/front_ends/fastapi/message_validator.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

🧬 Code graph analysis (2)

src/nat/front_ends/fastapi/message_validator.py (1)

src/nat/data_models/api_server.py (2)

SystemResponseContent (645-648)

Error (575-580)

src/nat/front_ends/fastapi/message_handler.py (3)

src/nat/data_models/api_server.py (8)

ChatRequest (124-194)

UserMessageContentRoleType (40-46)

UserMessages (548-552)

TextContent (102-106)

WebSocketUserInteractionResponseMessage (603-616)

WebSocketUserMessage (583-600)

WorkflowSchemaType (530-537)

WebSocketMessageStatus (540-545)

src/nat/front_ends/fastapi/message_validator.py (1)

convert_text_content_to_human_response (164-197)

src/nat/runtime/session.py (1)

session (93-128)

🪛 Ruff (0.14.0)

src/nat/front_ends/fastapi/message_validator.py

243-243: Do not perform function call SystemResponseContent in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

src/nat/front_ends/fastapi/message_handler.py

145-145: Avoid specifying long messages outside the exception class

(TRY003)

164-164: Avoid specifying long messages outside the exception class

(TRY003)

223-223: Do not call getattr with a constant attribute value. It is not any safer than normal property access.

Replace getattr with attribute access

(B009)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

🔇 Additional comments (4)

src/nat/front_ends/fastapi/message_handler.py (4)

28-28: LGTM: Import and type annotation updates.

The new imports (ChatRequest, UserMessageContentRoleType, UserMessages) and updated type annotations (_workflow_schema_type, _user_interaction_response, _schema_output_mapping) correctly support the refactored message handling flow for chat history.

Also applies to: 37-39, 70-71, 75-80

147-152: LGTM: User interaction response processing.

The _process_websocket_user_interaction_response_message method correctly extracts TextContent from interaction responses using the helper method.

183-184: LGTM: Unused parameter naming convention.

Renaming the callback parameter from task to _task follows Python convention for intentionally unused parameters.

287-287: LGTM: TextContent-based interaction handling.

The updated interaction callback flow correctly uses TextContent as the intermediate type and converts it to HumanResponse using the validator's conversion method. This aligns with the new message handling architecture.

Also applies to: 303-306

Signed-off-by: Will Killian <wkillian@nvidia.com>

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/nat/front_ends/fastapi/message_handler.py (1)
181-192: Handle concurrent message requests gracefully.

When a workflow is already running (_running_workflow_task is not None), new user messages are silently ignored without notification to the client. This can lead to a confusing user experience where messages appear to be sent but receive no response.

Consider one of these approaches:

Queue messages for sequential processing:
if self._running_workflow_task is None:
    # Start workflow
    self._running_workflow_task = asyncio.create_task(...)
else:
    # Optionally queue or send error
    await self.create_websocket_message(
        data_model=Error(
            code=ErrorTypes.WORKFLOW_BUSY,
            message="A workflow is already in progress. Please wait for completion.",
            details=f"Message {user_message_as_validated_type.id} ignored."
        ),
        message_type=WebSocketMessageType.ERROR_MESSAGE,
        status=WebSocketMessageStatus.COMPLETE
    )
Reject with clear error if concurrent requests aren't supported.

🧹 Nitpick comments (2)

src/nat/front_ends/fastapi/message_handler.py (2)
127-146: Consider more descriptive error for unsupported content types.

The method raises ValueError when no TextContent is found, but this could occur for legitimate non-text payloads (e.g., multimodal content). The error message doesn't clarify whether this is a workflow limitation or a validation issue.

Consider this improvement:
         for user_message in messages[::-1]:
             if user_message.role == UserMessageContentRoleType.USER:
                 for attachment in user_message.content:
                     if isinstance(attachment, TextContent):
                         return attachment
-        raise ValueError("No user text content found in messages.")
+        raise ValueError(
+            "No text content found in user messages. This workflow only supports text-based interactions. "
+            "Multimodal content (images, files, etc.) is not supported."
+        )
222-223: Simplify attribute access pattern.

The combination of hasattr and getattr with a constant attribute name is safe but verbose. After checking with hasattr, you can use direct attribute access.

Consider this simplification:
-            if hasattr(data_model, 'id'):
-                message_id: str = str(getattr(data_model, 'id'))
+            if hasattr(data_model, 'id'):
+                message_id: str = str(data_model.id)
             else:
                 message_id = str(uuid.uuid4())

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d58088 and 809f935.

📒 Files selected for processing (2)

src/nat/front_ends/fastapi/message_handler.py (9 hunks)
src/nat/front_ends/fastapi/message_validator.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

src/nat/front_ends/fastapi/message_validator.py
src/nat/front_ends/fastapi/message_handler.py

🧬 Code graph analysis (2)

src/nat/front_ends/fastapi/message_validator.py (1)

src/nat/data_models/api_server.py (2)

SystemResponseContent (645-648)

Error (575-580)

src/nat/front_ends/fastapi/message_handler.py (4)

src/nat/data_models/api_server.py (8)

ChatRequest (124-194)

UserMessageContentRoleType (40-46)

UserMessages (548-552)

TextContent (102-106)

WebSocketUserInteractionResponseMessage (603-616)

WebSocketUserMessage (583-600)

WorkflowSchemaType (530-537)

WebSocketMessageStatus (540-545)

src/nat/authentication/interfaces.py (1)

FlowHandlerBase (75-96)

src/nat/front_ends/fastapi/message_validator.py (1)

convert_text_content_to_human_response (164-197)

src/nat/runtime/session.py (1)

session (93-128)

🪛 Ruff (0.14.0)

src/nat/front_ends/fastapi/message_validator.py

243-243: Do not perform function call SystemResponseContent in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

src/nat/front_ends/fastapi/message_handler.py

145-145: Avoid specifying long messages outside the exception class

(TRY003)

164-164: Avoid specifying long messages outside the exception class

(TRY003)

223-223: Do not call getattr with a constant attribute value. It is not any safer than normal property access.

Replace getattr with attribute access

(B009)

🔇 Additional comments (2)

src/nat/front_ends/fastapi/message_handler.py (2)

274-312: LGTM! Clean refactoring of interaction handling.

The refactoring to handle user interactions as TextContent internally, then convert to the appropriate HumanResponse type is well-structured. The flow is clear:

Create future for TextContent

Send interaction prompt to client

Await TextContent response

Convert to typed HumanResponse

This separation of concerns improves maintainability.

314-344: LGTM! Enhanced workflow invocation with better typing.

The additions of result_type and output_type parameters provide better type information for workflow execution. The explicit parameter passing in the session context manager improves code clarity and makes the authentication callback handling more obvious.

src/nat/front_ends/fastapi/message_handler.py

src/nat/front_ends/fastapi/message_validator.py

willkill07 · 2025-10-15T01:08:20Z

/merge

Sending proper workflow payload to agent to enable chat history

52424c5

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

ericevans-nv self-assigned this Oct 14, 2025

ericevans-nv requested a review from a team as a code owner October 14, 2025 20:15

ericevans-nv added improvement Improvement to existing functionality non-breaking Non-breaking change labels Oct 14, 2025

ericevans-nv changed the title ~~Enable Chat History for WebSocket~~ Enable Chat History for WebSocket Messages Oct 14, 2025

Merge branch 'release/1.3' of github.com:NVIDIA/NeMo-Agent-Toolkit in…

dda8a52

…to bugfix/websocket-chat-history

coderabbitai bot reviewed Oct 14, 2025

View reviewed changes

willkill07 approved these changes Oct 14, 2025

View reviewed changes

willkill07 reviewed Oct 14, 2025

View reviewed changes

src/nat/front_ends/fastapi/message_handler.py Outdated Show resolved Hide resolved

src/nat/front_ends/fastapi/message_handler.py Outdated Show resolved Hide resolved

Updating docstring style

fd0f4dd

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai bot reviewed Oct 14, 2025

View reviewed changes

src/nat/front_ends/fastapi/message_handler.py Outdated Show resolved Hide resolved

src/nat/front_ends/fastapi/message_handler.py Show resolved Hide resolved

src/nat/front_ends/fastapi/message_handler.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Oct 15, 2025

View reviewed changes

code enhancements for readability and type safety

809f935

Signed-off-by: Will Killian <wkillian@nvidia.com>

willkill07 force-pushed the bugfix/websocket-chat-history branch from 8d58088 to 809f935 Compare October 15, 2025 00:29

coderabbitai bot reviewed Oct 15, 2025

View reviewed changes

src/nat/front_ends/fastapi/message_handler.py Show resolved Hide resolved

src/nat/front_ends/fastapi/message_validator.py Show resolved Hide resolved

rapids-bot bot merged commit 184114b into NVIDIA:release/1.3 Oct 15, 2025
17 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable Chat History for WebSocket Messages#999

Enable Chat History for WebSocket Messages#999
rapids-bot[bot] merged 4 commits intoNVIDIA:release/1.3from
ericevans-nv:bugfix/websocket-chat-history

ericevans-nv commented Oct 14, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 14, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

willkill07 commented Oct 15, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ericevans-nv commented Oct 14, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

By Submitting this PR I confirm:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

willkill07 commented Oct 15, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ericevans-nv commented Oct 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 14, 2025 •

edited

Loading