Enable Chat History for WebSocket Messages #999

Merged
rapids-bot[bot] merged 4 commits into NVIDIA:release/1.3 from ericevans-nv:bugfix/websocket-chat-history
Oct 15, 2025

Conversation

ericevans-nv (Contributor) commented Oct 14, 2025

Description

This PR updates the WebSocket message handler to pass the correct payload to the agent, enabling chat history for supported workflow types.

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features

    • Enhanced WebSocket handling for multi-message chats and unified Chat/Generate streams with consistent outputs.
    • Improved human-in-the-loop prompts and responses, returning clearer, consistent message content.
  • Bug Fixes

    • Prevents workflows from starting while another task runs, reducing race conditions.
    • More robust message creation and error handling for fewer failed interactions.

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
ericevans-nv self-assigned this Oct 14, 2025
ericevans-nv requested a review from a team as a code owner October 14, 2025 20:15
ericevans-nv added the improvement (Improvement to existing functionality) and non-breaking (Non-breaking change) labels Oct 14, 2025
ericevans-nv changed the title from "Enable Chat History for WebSocket" to "Enable Chat History for WebSocket Messages" Oct 14, 2025
coderabbitai bot commented Oct 14, 2025

Walkthrough

Introduces new helpers and public models to process WebSocket user messages: extracts last user TextContent, converts UserMessages to ChatRequest or prompt text, updates workflow invocation and streaming to use these results, and shifts human-interaction futures/callbacks to operate on TextContent.

Changes

Cohort / File(s): Summary

  • WebSocket message processing & handler (src/nat/front_ends/fastapi/message_handler.py): Added _extract_last_user_message_content(messages: list[UserMessages]) -> TextContent, _process_websocket_user_interaction_response_message(user_content: WebSocketUserInteractionResponseMessage) -> TextContent, and _process_websocket_user_message(user_content: WebSocketUserMessage) -> ChatRequest …
  • Human interaction types & callback flow (src/nat/front_ends/fastapi/message_handler.py): _user_interaction_response type changed from asyncio.Future[HumanResponse] …
  • Message validator signature formatting (src/nat/front_ends/fastapi/message_validator.py): Reformatted signature: create_system_response_token_message(..., content: SystemResponseContent …

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client as WebSocket Client
  participant MH as MessageHandler
  participant WF as Workflow Runner

  rect rgba(230,240,255,0.5)
    note right of Client: WebSocketUserMessage (contains UserMessages)
    Client->>MH: send(WebSocketUserMessage)
    MH->>MH: _process_websocket_user_message(...)
    alt CHAT / CHAT_STREAM
      MH->>MH: build ChatRequest
      MH->>WF: start(ChatRequest) [if no task running]
    else GENERATE / GENERATE_STREAM
      MH->>MH: extract last TextContent.text
      MH->>WF: start(prompt text) [if no task running]
    end
  end

  rect rgba(240,255,230,0.5)
    note right of Client: WebSocketUserInteractionResponseMessage
    Client->>MH: send(WebSocketUserInteractionResponseMessage)
    MH->>MH: _process_websocket_user_interaction_response_message(...) -> TextContent
    MH->>MH: convert TextContent -> HumanResponse
    MH-->>WF: resume workflow with HumanResponse
  end

  MH-->>Client: stream/progress/results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%. Resolution: You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title “Enable Chat History for WebSocket Messages” clearly summarizes the main change, uses an imperative verb to describe the action, remains under the 72-character limit, and accurately reflects the PR’s objective of enabling chat history support over WebSocket.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/nat/front_ends/fastapi/message_handler.py (3)

139-152: Consider using enum constant for role comparison.

Line 148 uses string literal "user" for role comparison. Consider using UserMessageContentRoleType.USER instead for type safety and consistency with the codebase's type system.

Apply this diff:

-        for user_message in messages[::-1]:
-            if user_message.role == "user":
+        for user_message in messages[::-1]:
+            if user_message.role == UserMessageContentRoleType.USER:

You'll need to add the import:

+from nat.data_models.api_server import UserMessageContentRoleType
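
To illustrate why the enum member is the safer comparison, here is a small sketch. The enum below is a hypothetical stand-in for `UserMessageContentRoleType`; whether the real class mixes in `str` is an assumption.

```python
from enum import Enum


class UserMessageContentRoleType(str, Enum):
    # Hypothetical stand-in; the real enum lives in nat.data_models.api_server.
    USER = "user"
    ASSISTANT = "assistant"


role_from_payload = UserMessageContentRoleType("user")

# A str-mixin enum still compares equal to the raw string...
matches_literal = role_from_payload == "user"
# ...but comparing against the member lets a type checker validate the name.
matches_member = role_from_payload == UserMessageContentRoleType.USER

# A typo in a string literal silently evaluates to False,
# whereas a typo in a member name raises AttributeError.
typo_matches = role_from_payload == "usr"
```

Under that assumption both comparisons behave the same at runtime, so the change is about catching mistakes earlier rather than changing behavior.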

Minor: Exception message length.

Ruff flags line 152 for a long exception message. While the message is clear, consider extracting it to a constant if this pattern repeats.


154-176: LGTM! Routing logic correctly handles different workflow types.

The method appropriately routes processing based on workflow schema type, returning the correct data structure for each case. The union return type ChatRequest | TextContent | str is intentional and handled correctly by the caller.


Minor: Exception message length.

Ruff flags line 176 for a long exception message, similar to line 152. This is a minor style issue.


178-211: LGTM! Workflow request processing correctly updated.

The method properly uses the new process_user_message_content to get the appropriate message format and passes it to _run_workflow. Error handling is appropriate.


Minor: Unnecessary parentheses around condition.

Line 192 has unnecessary parentheses around the condition if (self._running_workflow_task is None):. While not incorrect, removing them would be more pythonic.

Apply this diff:

-            if (self._running_workflow_task is None):
+            if self._running_workflow_task is None:
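
For context, the `None` check in this diff is what keeps a second workflow from starting while one is in flight. A minimal sketch of that guard follows; the `Handler` class, `submit`, and the done-callback that clears the slot are illustrative assumptions, not the handler's actual implementation.

```python
import asyncio


class Handler:
    """Hypothetical stand-in for the message handler's task bookkeeping."""

    def __init__(self) -> None:
        self._running_workflow_task: asyncio.Task | None = None
        self.started: list[str] = []

    async def _run_workflow(self, payload: str) -> None:
        self.started.append(payload)
        await asyncio.sleep(0.01)  # simulate workflow work

    def _clear_task(self, _task: asyncio.Task) -> None:
        # Assumption: the slot is freed when the task finishes.
        self._running_workflow_task = None

    def submit(self, payload: str) -> bool:
        """Start a workflow unless one is already running."""
        if self._running_workflow_task is None:
            task = asyncio.create_task(self._run_workflow(payload))
            task.add_done_callback(self._clear_task)
            self._running_workflow_task = task
            return True
        return False


async def demo() -> tuple[bool, bool, bool, list[str]]:
    handler = Handler()
    first = handler.submit("a")
    second = handler.submit("b")  # rejected: "a" is still running
    await handler._running_workflow_task
    await asyncio.sleep(0)  # give the done callback a chance to run
    third = handler.submit("c")  # accepted: the slot is free again
    await handler._running_workflow_task
    return first, second, third, handler.started


first, second, third, started = asyncio.run(demo())
```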
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac3ea63 and 52424c5.

📒 Files selected for processing (1)
  • src/nat/front_ends/fastapi/message_handler.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/front_ends/fastapi/message_handler.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/front_ends/fastapi/message_handler.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/front_ends/fastapi/message_handler.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/front_ends/fastapi/message_handler.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/fastapi/message_handler.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/front_ends/fastapi/message_handler.py
**/*

⚙️ CodeRabbit configuration file

**/*:

Code Review Instructions

  • Ensure the code follows best practices and coding standards.
  • For Python code, follow PEP 20 and PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues.
  • Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions

  • Verify that documentation and comments are clear and comprehensive.
  • Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
  • Verify that the documentation doesn't contain any offensive or outdated terms.
  • Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc.

  • All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
  • Confirm that copyright years are up to date whenever a file is changed.

Files:

  • src/nat/front_ends/fastapi/message_handler.py
🧬 Code graph analysis (1)
src/nat/front_ends/fastapi/message_handler.py (2)
src/nat/data_models/api_server.py (14)
  • ChatRequest (150-197)
  • ChatResponse (297-347)
  • ChatResponseChunk (350-438)
  • Error (535-540)
  • ErrorTypes (527-532)
  • Message (119-121)
  • ResponsePayloadOutput (456-465)
  • ResponseSerializable (274-282)
  • SystemResponseContent (605-608)
  • TextContent (102-106)
  • UserMessages (508-512)
  • WebSocketUserMessage (543-560)
  • WebSocketUserInteractionResponseMessage (563-576)
  • WorkflowSchemaType (490-497)
src/nat/front_ends/fastapi/message_validator.py (1)
  • convert_text_content_to_human_response (164-197)
🪛 Ruff (0.14.0)
src/nat/front_ends/fastapi/message_handler.py

152-152: Avoid specifying long messages outside the exception class

(TRY003)


176-176: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (3)
src/nat/front_ends/fastapi/message_handler.py (3)

71-71: LGTM! Type annotation correctly updated.

The change from Future[HumanResponse] to Future[TextContent] aligns with the new flow where the WebSocket receives raw text content first, then converts it to a HumanResponse in the callback where the original prompt context is available.


126-137: LGTM! Clean conversion logic.

The method correctly transforms a list of UserMessages into a ChatRequest with full message history, enabling conversation context for CHAT and CHAT_STREAM workflows.


284-319: LGTM! Human interaction flow correctly refactored.

The callback properly handles the new flow:

  1. Creates a Future[TextContent] to receive the raw user response
  2. Waits for the WebSocket to provide the TextContent
  3. Converts it to the appropriate HumanResponse type using the original prompt.content for context

This design correctly separates concerns: the Future receives raw text, and the conversion to typed response happens where the prompt type information is available.
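
The three steps above can be sketched with plain asyncio. This is an illustration under stated assumptions: `TextContent` and `HumanResponseBinary` are simplified stand-ins, the `prompt_kind` string replaces the real prompt object, and the converter only mimics the role of `convert_text_content_to_human_response`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TextContent:
    text: str


@dataclass
class HumanResponseBinary:
    # Hypothetical typed response for a yes/no prompt.
    approved: bool


def convert_text_content_to_human_response(content: TextContent, prompt_kind: str) -> HumanResponseBinary:
    """Toy converter: the prompt kind decides how the raw text is interpreted."""
    if prompt_kind == "binary":
        return HumanResponseBinary(approved=content.text.strip().lower() in {"yes", "y"})
    raise ValueError(f"Unsupported prompt kind: {prompt_kind}")


class Handler:
    def __init__(self) -> None:
        self._user_interaction_response: asyncio.Future | None = None

    async def human_interaction_callback(self, prompt_kind: str) -> HumanResponseBinary:
        # 1. Create a future that will receive the raw text.
        self._user_interaction_response = asyncio.get_running_loop().create_future()
        try:
            # 2. Wait for the WebSocket side to provide it.
            raw = await self._user_interaction_response
        finally:
            self._user_interaction_response = None
        # 3. Convert where the prompt type is known.
        return convert_text_content_to_human_response(raw, prompt_kind)

    def on_interaction_response(self, content: TextContent) -> None:
        self._user_interaction_response.set_result(content)


async def demo() -> HumanResponseBinary:
    handler = Handler()
    waiter = asyncio.create_task(handler.human_interaction_callback("binary"))
    await asyncio.sleep(0)  # let the callback create its future
    handler.on_interaction_response(TextContent("yes"))
    return await waiter


response = asyncio.run(demo())
```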

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/nat/front_ends/fastapi/message_handler.py (1)

119-122: Check for missing interaction future.

If a client sends a WebSocketUserInteractionResponseMessage out of band (no outstanding prompt), _user_interaction_response is still None, so set_result raises AttributeError and tears down the handler. Please guard against this (e.g., ignore with warning or send protocol error) before calling set_result.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52424c5 and fd0f4dd.

📒 Files selected for processing (1)
  • src/nat/front_ends/fastapi/message_handler.py (7 hunks)
🧬 Code graph analysis (1)
src/nat/front_ends/fastapi/message_handler.py (2)
src/nat/data_models/api_server.py (8)
  • ChatRequest (124-194)
  • Error (575-580)
  • Message (119-121)
  • TextContent (102-106)
  • UserMessages (548-552)
  • WebSocketUserMessage (583-600)
  • WebSocketUserInteractionResponseMessage (603-616)
  • WorkflowSchemaType (530-537)
src/nat/front_ends/fastapi/message_validator.py (1)
  • convert_text_content_to_human_response (164-197)
🪛 Ruff (0.14.0)
src/nat/front_ends/fastapi/message_handler.py

160-160: Avoid specifying long messages outside the exception class

(TRY003)


187-187: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check

coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/nat/front_ends/fastapi/message_handler.py (2)

154-164: Handle legacy generate payloads.

Line 162 calls user_content.content.messages without checking if the messages attribute exists. Legacy WebSocket generate requests may send minimal payloads like {"content": {"text": "..."}} without a messages array, which would raise AttributeError and break backward compatibility.

Apply this diff to handle both legacy and new payload formats:

     async def _process_websocket_user_message(self, user_content: WebSocketUserMessage) -> ChatRequest | str:
         """
         Processes a WebSocketUserMessage based on schema type.
         """
         if self._workflow_schema_type in [WorkflowSchemaType.CHAT, WorkflowSchemaType.CHAT_STREAM]:
             return ChatRequest(**user_content.content.model_dump(exclude_none=True))
 
         elif self._workflow_schema_type in [WorkflowSchemaType.GENERATE, WorkflowSchemaType.GENERATE_STREAM]:
+            # Handle legacy format: {"content": {"text": "..."}}
+            if hasattr(user_content.content, 'text') and user_content.content.text:
+                return user_content.content.text
+            # Handle new format: {"content": {"messages": [...]}}
+            if not hasattr(user_content.content, 'messages'):
+                raise ValueError("UserMessageContent must have either 'text' or 'messages' field")
             return self._extract_last_user_message_content(user_content.content.messages).text
 
         raise ValueError("Unsupported workflow schema type for WebSocketUserMessage")

Based on learnings from past review comments.


127-145: Guard against non-text user payloads.

The method only extracts TextContent from messages. If a user message contains only non-text content (e.g., images, tool calls), the method raises a generic ValueError even when the payload is legitimate. This will cause WebSocket requests with valid multimodal inputs to fail unnecessarily.

Consider one of these approaches:

  1. Accept the first content item regardless of type (if downstream can handle it):
 def _extract_last_user_message_content(self, messages: list[UserMessages]) -> TextContent:
     """
-    Extracts the last user's TextContent from a list of messages.
+    Extracts the last user's content from a list of messages.

     Args:
         messages: List of UserMessages.

     Returns:
-        TextContent object from the last user message.
+        First content object from the last user message.

     Raises:
-        ValueError: If no user text content is found.
+        ValueError: If no user content is found.
     """
     for user_message in messages[::-1]:
         if user_message.role == UserMessageContentRoleType.USER:
-            for attachment in user_message.content:
-                if isinstance(attachment, TextContent):
-                    return attachment
-    raise ValueError("No user text content found in messages.")
+            if user_message.content:
+                return user_message.content[0]
+    raise ValueError("No user content found in messages.")
  2. Reject non-text payloads earlier with a schema-aware error (if only text is supported):
 def _extract_last_user_message_content(self, messages: list[UserMessages]) -> TextContent:
     """
     Extracts the last user's TextContent from a list of messages.

     Args:
         messages: List of UserMessages.

     Returns:
         TextContent object from the last user message.

     Raises:
         ValueError: If no user text content is found.
     """
     for user_message in messages[::-1]:
         if user_message.role == UserMessageContentRoleType.USER:
             for attachment in user_message.content:
                 if isinstance(attachment, TextContent):
                     return attachment
+                else:
+                    raise ValueError(
+                        f"Non-text content type {type(attachment).__name__} is not supported. "
+                        "Only TextContent is accepted."
+                    )
     raise ValueError("No user text content found in messages.")

Based on learnings from past review comments.

🧹 Nitpick comments (3)
src/nat/front_ends/fastapi/message_validator.py (1)

243-243: Consider using None as default and initializing inside the function.

The linter flags calling SystemResponseContent() in the default argument. While Pydantic models are generally safe (each call creates a new instance), the pattern is still considered a code smell. As per coding guidelines, this is flagged by ruff check and should ideally be addressed.

Apply this diff to follow the recommended pattern:

     async def create_system_response_token_message(
         self,
         message_type: Literal[WebSocketMessageType.RESPONSE_MESSAGE,
                               WebSocketMessageType.ERROR_MESSAGE] = WebSocketMessageType.RESPONSE_MESSAGE,
         message_id: str | None = str(uuid.uuid4()),
         thread_id: str = "default",
         parent_id: str = "default",
         conversation_id: str | None = None,
-        content: SystemResponseContent | Error = SystemResponseContent(),
+        content: SystemResponseContent | Error | None = None,
         status: WebSocketMessageStatus = WebSocketMessageStatus.IN_PROGRESS,
         timestamp: str = str(datetime.datetime.now(datetime.UTC))
     ) -> WebSocketSystemResponseTokenMessage | None:
         """
         Creates a system response token message with default values.
 
         :param message_type: Type of WebSocket message.
         :param message_id: Unique identifier for the message (default: generated UUID).
         :param thread_id: ID of the thread the message belongs to (default: "default").
         :param parent_id: ID of the user message that spawned child messages.
         :param conversation_id: ID of the conversation this message belongs to (default: None).
         :param content: Message content.
         :param status: Status of the message (default: IN_PROGRESS).
         :param timestamp: Timestamp of the message (default: current UTC time).
         :return: A WebSocketSystemResponseTokenMessage instance.
         """
         try:
+            if content is None:
+                content = SystemResponseContent()
             return WebSocketSystemResponseTokenMessage(type=message_type,
                                                        id=message_id,
                                                        thread_id=thread_id,
                                                        parent_id=parent_id,
                                                        conversation_id=conversation_id,
                                                        content=content,
                                                        status=status,
                                                        timestamp=timestamp)
src/nat/front_ends/fastapi/message_handler.py (2)

145-145: Extract exception messages to constants.

Lines 145 and 164 specify long error messages directly in raise statements. Per coding guidelines and Ruff TRY003, consider extracting these to module-level constants for better maintainability.

Add constants at the module level:

# Near the top of the file, after imports
_ERR_NO_USER_TEXT_CONTENT = "No user text content found in messages."
_ERR_UNSUPPORTED_SCHEMA_TYPE = "Unsupported workflow schema type for WebSocketUserMessage"

Then use them:

-        raise ValueError("No user text content found in messages.")
+        raise ValueError(_ERR_NO_USER_TEXT_CONTENT)
-        raise ValueError("Unsupported workflow schema type for WebSocketUserMessage")
+        raise ValueError(_ERR_UNSUPPORTED_SCHEMA_TYPE)

As per coding guidelines.

Also applies to: 164-164


222-224: Simplify attribute access.

Using getattr with a constant attribute name 'id' is unnecessary since you've already verified the attribute exists with hasattr. Direct property access is clearer and equally safe here.

Apply this diff:

             if hasattr(data_model, 'id'):
-                message_id: str = str(getattr(data_model, 'id'))
+                message_id: str = str(data_model.id)
             else:
                 message_id = str(uuid.uuid4())
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd0f4dd and 8d58088.

📒 Files selected for processing (2)
  • src/nat/front_ends/fastapi/message_handler.py (9 hunks)
  • src/nat/front_ends/fastapi/message_validator.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards.
  • For Python code, follow PEP 20 and PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues.
  • Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions

  • Verify that documentation and comments are clear and comprehensive.
  • Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
  • Verify that the documentation doesn't contain any offensive or outdated terms.
  • Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
    words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
    spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc.

  • All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
    and should contain an Apache License 2.0 header comment at the top of each file.
  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/front_ends/fastapi/message_validator.py
  • src/nat/front_ends/fastapi/message_handler.py
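
The exception-handling rule quoted above distinguishes two cases: re-raising (bare raise plus logger.error()) and catching without re-raising (logger.exception()). A minimal sketch of both, using only the standard library, might look like the following. The function names `parse_payload` and `safe_parse_payload` are hypothetical and do not come from this PR.

```python
import json
import logging
from typing import Any

logger = logging.getLogger(__name__)


def parse_payload(raw: str) -> Any:
    """Re-raising case: log with logger.error() and use a bare raise."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # logger.error() records the message without a traceback; the bare
        # raise keeps the original stack trace intact for the caller.
        logger.error("Failed to parse payload: %s", exc)
        raise


def safe_parse_payload(raw: str) -> Any:
    """Swallowing case: log with logger.exception() to capture the traceback."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Nothing is re-raised, so this log call is the only record of the traceback.
        logger.exception("Failed to parse payload; returning None")
        return None
```

In the first function the caller is expected to log or handle the traceback once; in the second the traceback would otherwise be lost.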
🧬 Code graph analysis (2)
src/nat/front_ends/fastapi/message_validator.py (1)
src/nat/data_models/api_server.py (2)
  • SystemResponseContent (645-648)
  • Error (575-580)
src/nat/front_ends/fastapi/message_handler.py (3)
src/nat/data_models/api_server.py (8)
  • ChatRequest (124-194)
  • UserMessageContentRoleType (40-46)
  • UserMessages (548-552)
  • TextContent (102-106)
  • WebSocketUserInteractionResponseMessage (603-616)
  • WebSocketUserMessage (583-600)
  • WorkflowSchemaType (530-537)
  • WebSocketMessageStatus (540-545)
src/nat/front_ends/fastapi/message_validator.py (1)
  • convert_text_content_to_human_response (164-197)
src/nat/runtime/session.py (1)
  • session (93-128)
🪛 Ruff (0.14.0)
src/nat/front_ends/fastapi/message_validator.py

243-243: Do not perform function call SystemResponseContent in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

src/nat/front_ends/fastapi/message_handler.py

145-145: Avoid specifying long messages outside the exception class

(TRY003)


164-164: Avoid specifying long messages outside the exception class

(TRY003)


223-223: Do not call getattr with a constant attribute value. It is not any safer than normal property access.

Replace getattr with attribute access

(B009)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check
🔇 Additional comments (4)
src/nat/front_ends/fastapi/message_handler.py (4)

28-28: LGTM: Import and type annotation updates.

The new imports (ChatRequest, UserMessageContentRoleType, UserMessages) and updated type annotations (_workflow_schema_type, _user_interaction_response, _schema_output_mapping) correctly support the refactored message handling flow for chat history.

Also applies to: 37-39, 70-71, 75-80


147-152: LGTM: User interaction response processing.

The _process_websocket_user_interaction_response_message method correctly extracts TextContent from interaction responses using the helper method.


183-184: LGTM: Unused parameter naming convention.

Renaming the callback parameter from task to _task follows Python convention for intentionally unused parameters.
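
As an illustration of this convention (a sketch, not the handler's actual code), `asyncio.Task.add_done_callback` always passes the finished task to the callback, so a callback that ignores it can name the parameter `_task`. The `WorkflowRunner` class and its attribute names are hypothetical.

```python
import asyncio


class WorkflowRunner:
    """Hypothetical handler that tracks a single running task."""

    def __init__(self) -> None:
        self.running_task: asyncio.Task | None = None

    def start(self) -> asyncio.Task:
        self.running_task = asyncio.create_task(asyncio.sleep(0))

        def _done_callback(_task: asyncio.Task) -> None:
            # add_done_callback requires this argument, but it is unused here;
            # the leading underscore signals that to readers and linters.
            self.running_task = None

        self.running_task.add_done_callback(_done_callback)
        return self.running_task


async def demo() -> bool:
    runner = WorkflowRunner()
    await runner.start()
    return runner.running_task is None
```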


287-287: LGTM: TextContent-based interaction handling.

The updated interaction callback flow correctly uses TextContent as the intermediate type and converts it to HumanResponse using the validator's conversion method. This aligns with the new message handling architecture.

Also applies to: 303-306

Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 force-pushed the bugfix/websocket-chat-history branch from 8d58088 to 809f935 Compare October 15, 2025 00:29
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/nat/front_ends/fastapi/message_handler.py (1)

181-192: Handle concurrent message requests gracefully.

When a workflow is already running (_running_workflow_task is not None), new user messages are silently ignored without notification to the client. This can lead to a confusing user experience where messages appear to be sent but receive no response.

Consider one of these approaches:

  1. Queue messages for sequential processing:
if self._running_workflow_task is None:
    # Start workflow
    self._running_workflow_task = asyncio.create_task(...)
else:
    # Optionally queue or send error
    await self.create_websocket_message(
        data_model=Error(
            code=ErrorTypes.WORKFLOW_BUSY,
            message="A workflow is already in progress. Please wait for completion.",
            details=f"Message {user_message_as_validated_type.id} ignored."
        ),
        message_type=WebSocketMessageType.ERROR_MESSAGE,
        status=WebSocketMessageStatus.COMPLETE
    )
  2. Reject with clear error if concurrent requests aren't supported.
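
The reject-with-error approach can be sketched in a self-contained way as follows. This is an assumption-laden illustration: `BusyGuardHandler`, its `sent` list (a stand-in for messages pushed over the WebSocket), and the message dictionaries are all hypothetical and do not reflect the toolkit's real data models or error codes.

```python
import asyncio


class BusyGuardHandler:
    """Hypothetical handler that rejects new messages while a workflow runs."""

    def __init__(self) -> None:
        self._running_task: asyncio.Task | None = None
        self.sent: list[dict[str, str]] = []  # stand-in for messages sent to the client

    async def _run_workflow(self, message_id: str) -> None:
        await asyncio.sleep(0.01)  # placeholder for real workflow work
        self.sent.append({"type": "response", "id": message_id})

    async def handle_user_message(self, message_id: str) -> None:
        if self._running_task is not None:
            # Notify the client explicitly instead of dropping the message silently.
            self.sent.append({"type": "error", "id": message_id, "reason": "workflow busy"})
            return
        self._running_task = asyncio.create_task(self._run_workflow(message_id))
        self._running_task.add_done_callback(self._clear_task)

    def _clear_task(self, _task: asyncio.Task) -> None:
        # Allow the next message to start a workflow.
        self._running_task = None


async def demo() -> list[dict[str, str]]:
    handler = BusyGuardHandler()
    await handler.handle_user_message("m1")
    await handler.handle_user_message("m2")  # arrives while m1 is still running
    await asyncio.sleep(0.05)  # let the first workflow finish
    await handler.handle_user_message("m3")  # accepted again
    await asyncio.sleep(0.05)
    return handler.sent
```

Queueing instead of rejecting would replace the early return with an `asyncio.Queue` and a consumer loop; which behavior is preferable depends on what the client expects.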
🧹 Nitpick comments (2)
src/nat/front_ends/fastapi/message_handler.py (2)

127-146: Consider more descriptive error for unsupported content types.

The method raises ValueError when no TextContent is found, but this could occur for legitimate non-text payloads (e.g., multimodal content). The error message doesn't clarify whether this is a workflow limitation or a validation issue.

Consider this improvement:

         for user_message in messages[::-1]:
             if user_message.role == UserMessageContentRoleType.USER:
                 for attachment in user_message.content:
                     if isinstance(attachment, TextContent):
                         return attachment
-        raise ValueError("No user text content found in messages.")
+        raise ValueError(
+            "No text content found in user messages. This workflow only supports text-based interactions. "
+            "Multimodal content (images, files, etc.) is not supported."
+        )

222-223: Simplify attribute access pattern.

The combination of hasattr and getattr with a constant attribute name is safe but verbose. After checking with hasattr, you can use direct attribute access.

Consider this simplification:

-            if hasattr(data_model, 'id'):
-                message_id: str = str(getattr(data_model, 'id'))
+            if hasattr(data_model, 'id'):
+                message_id: str = str(data_model.id)
             else:
                 message_id = str(uuid.uuid4())
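
An alternative worth considering, sketched here with hypothetical types, is `getattr` with a default value, which folds the existence check and the access into one call. As far as I understand, B009 targets only the two-argument form of `getattr`, so this form should not trigger it. One behavioral difference: an `id` that is present but `None` falls through to a generated UUID here, whereas the original code would produce the string "None".

```python
import uuid
from dataclasses import dataclass


@dataclass
class WithId:
    id: str


@dataclass
class WithoutId:
    text: str


def resolve_message_id(data_model: object) -> str:
    """Use the model's id when present, otherwise generate a fresh UUID string."""
    model_id = getattr(data_model, "id", None)
    return str(model_id) if model_id is not None else str(uuid.uuid4())
```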
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d58088 and 809f935.

📒 Files selected for processing (2)
  • src/nat/front_ends/fastapi/message_handler.py (9 hunks)
  • src/nat/front_ends/fastapi/message_validator.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/nat/front_ends/fastapi/message_validator.py (1)
src/nat/data_models/api_server.py (2)
  • SystemResponseContent (645-648)
  • Error (575-580)
src/nat/front_ends/fastapi/message_handler.py (4)
src/nat/data_models/api_server.py (8)
  • ChatRequest (124-194)
  • UserMessageContentRoleType (40-46)
  • UserMessages (548-552)
  • TextContent (102-106)
  • WebSocketUserInteractionResponseMessage (603-616)
  • WebSocketUserMessage (583-600)
  • WorkflowSchemaType (530-537)
  • WebSocketMessageStatus (540-545)
src/nat/authentication/interfaces.py (1)
  • FlowHandlerBase (75-96)
src/nat/front_ends/fastapi/message_validator.py (1)
  • convert_text_content_to_human_response (164-197)
src/nat/runtime/session.py (1)
  • session (93-128)
🔇 Additional comments (2)
src/nat/front_ends/fastapi/message_handler.py (2)

274-312: LGTM! Clean refactoring of interaction handling.

The refactoring to handle user interactions as TextContent internally, then convert to the appropriate HumanResponse type is well-structured. The flow is clear:

  1. Create future for TextContent
  2. Send interaction prompt to client
  3. Await TextContent response
  4. Convert to typed HumanResponse

This separation of concerns improves maintainability.
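
The four steps above can be sketched with a plain `asyncio.Future`. Everything named here is hypothetical: `InteractionBridge`, `TextReply`, and `BinaryChoice` stand in for the handler, `TextContent`, and a typed `HumanResponse`, and the yes/no conversion is only an example of the final step.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TextReply:
    text: str


@dataclass
class BinaryChoice:
    approved: bool


class InteractionBridge:
    """Hypothetical bridge between a workflow prompt and a client reply."""

    def __init__(self) -> None:
        self._pending: asyncio.Future[TextReply] | None = None
        self.prompts: list[str] = []  # stand-in for prompts sent to the client

    async def ask(self, prompt: str) -> BinaryChoice:
        # 1. Create a future for the text reply.
        self._pending = asyncio.get_running_loop().create_future()
        # 2. Send the interaction prompt to the client (recorded here instead).
        self.prompts.append(prompt)
        # 3. Await the text reply.
        reply = await self._pending
        self._pending = None
        # 4. Convert the text into the typed response.
        return BinaryChoice(approved=reply.text.strip().lower() in {"yes", "y"})

    def on_client_reply(self, text: str) -> None:
        # Called when the client's response message arrives.
        if self._pending is not None and not self._pending.done():
            self._pending.set_result(TextReply(text))


async def demo() -> BinaryChoice:
    bridge = InteractionBridge()
    asker = asyncio.create_task(bridge.ask("Proceed?"))
    await asyncio.sleep(0)  # let ask() reach its await
    bridge.on_client_reply("Yes")
    return await asker
```

Keeping the future typed to the raw text reply means the conversion logic lives in one place and can vary by prompt type.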


314-344: LGTM! Enhanced workflow invocation with better typing.

The additions of result_type and output_type parameters provide better type information for workflow execution. The explicit parameter passing in the session context manager improves code clarity and makes the authentication callback handling more obvious.

@willkill07
Member

/merge

@rapids-bot rapids-bot bot merged commit 184114b into NVIDIA:release/1.3 Oct 15, 2025
17 checks passed
Labels

improvement Improvement to existing functionality non-breaking Non-breaking change

2 participants