-
Notifications
You must be signed in to change notification settings - Fork 485
Fix chat history support in tool_calling_agent #837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix chat history support in tool_calling_agent #837
Conversation
Signed-off-by: Greg Freeman <gfreeman@nvidia.com>
Signed-off-by: Greg Freeman <gfreeman@nvidia.com>
WalkthroughAdds a max_history field to ToolCallAgentWorkflowConfig, changes the tool-calling agent response function to accept a Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Graph as ToolCallAgentGraph
participant Resp as _response_fn
participant Trim as trim_messages
participant LLM as LLM/Tools
Client->>Graph: send ChatRequest (messages)
Graph->>Resp: call _response_fn(input_message: ChatRequest)
Resp->>Trim: trim_messages(input_message.messages, max_history)
Trim-->>Resp: BaseMessage[] (trimmed history)
Resp->>LLM: invoke agent with trimmed BaseMessage history
LLM-->>Resp: assistant response (may include tool calls)
Resp-->>Graph: return str(output_message.content)
Graph-->>Client: ChatResponse
note over Resp,Trim: History trimming applied before LLM invocation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/nat/agent/tool_calling_agent/register.py (1)
85-111: Return type mismatch: function annotated to return ChatResponse but returns str.This breaks type hints and API consistency with other agents (ReAct/ReWOO). Return a ChatResponse in both success and error paths.
Apply this diff:
@@ - async def _response_fn(input_message: ChatRequest) -> ChatResponse: + async def _response_fn(input_message: ChatRequest) -> ChatResponse: @@ - output_message = state.messages[-1] - return output_message.content + output_message = state.messages[-1] + return ChatResponse.from_string(str(output_message.content)) @@ - except Exception as ex: - logger.exception("%s Tool Calling Agent failed with exception: %s", AGENT_LOG_PREFIX, ex) - if config.verbose: - return str(ex) - return "I seem to be having a problem." + except Exception as ex: + logger.exception("%s Tool Calling Agent failed with exception: %s", AGENT_LOG_PREFIX, ex) + if config.verbose: + return ChatResponse.from_string(str(ex)) + return ChatResponse.from_string("I seem to be having a problem.")
🧹 Nitpick comments (3)
tests/nat/agent/test_tool_calling.py (1)
101-127: Strengthen assertions to verify history preservation, not just length.Assert that the first three messages are unchanged and that a new AI message is appended. This tightens the guarantee that history isn’t stripped.
Apply this diff:
@@ - state = await graph.ainvoke(state, config={'recursion_limit': 5}) - state = ToolCallAgentGraphState(**state) - assert len(state.messages) > 3 + state = await graph.ainvoke(state, config={'recursion_limit': 5}) + state = ToolCallAgentGraphState(**state) + # history preserved in order + assert [type(m) for m in state.messages[:3]] == [type(m) for m in messages] + assert [m.content for m in state.messages[:3]] == [m.content for m in messages] + # exactly one new AI message appended for this scenario + assert len(state.messages) == len(messages) + 1 + assert isinstance(state.messages[-1], AIMessage)src/nat/agent/tool_calling_agent/register.py (2)
43-44: Clarify max_history semantics (tokens vs. messages).You pass
max_historytotrim_messagesas a token budget (withtoken_counter=len), but the description says “number of messages.” Align the description to avoid confusion.Apply this diff:
- max_history: int = Field(default=15, description="Maximum number of messages to keep in the conversation history.") + max_history: int = Field(default=15, description="Maximum token budget for trimming conversation history.")
114-117: Avoid exception logging on normal GeneratorExit; fix stale cleanup message.Use debug level for normal shutdown and correct the agent name in the log.
Apply this diff:
except GeneratorExit: - logger.exception("%s Workflow exited early!", AGENT_LOG_PREFIX) + logger.debug("%s Workflow exited early!", AGENT_LOG_PREFIX) finally: - logger.debug("%s Cleaning up react_agent workflow.", AGENT_LOG_PREFIX) + logger.debug("%s Cleaning up tool_calling_agent workflow.", AGENT_LOG_PREFIX)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/nat/agent/tool_calling_agent/register.py(4 hunks)tests/nat/agent/test_tool_calling.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
src/**/*.py: All importable Python code must live under src/
All public APIs in src/ require Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise with a period; surround code entities with backticks
Files:
src/nat/agent/tool_calling_agent/register.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Core functionality under src/nat should prioritize backward compatibility when changed
Files:
src/nat/agent/tool_calling_agent/register.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
src/nat/agent/tool_calling_agent/register.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 8/20 style; format with yapf (column_limit=120) and use 4-space indentation; end files with a single newline
Run ruff (ruff check --fix) per pyproject.toml; fix warnings unless explicitly ignored; ruff is linter-only
Use snake_case for functions/variables, PascalCase for classes, and UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: preserve stack traces and avoid duplicate logging
When re-raising exceptions, use bareraiseand log with logger.error(), not logger.exception()
When catching and not re-raising, log with logger.exception() to capture stack trace
Validate and sanitize all user input; prefer httpx with SSL verification and follow OWASP Top‑10
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile/mprof; cache with functools.lru_cache or external cache; leverage NumPy vectorization when beneficial
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
Files:
src/nat/agent/tool_calling_agent/register.pytests/nat/agent/test_tool_calling.py
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks
Files:
src/nat/agent/tool_calling_agent/register.pytests/nat/agent/test_tool_calling.py
**/*.{py,md}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm
Files:
src/nat/agent/tool_calling_agent/register.pytests/nat/agent/test_tool_calling.py
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
src/nat/agent/tool_calling_agent/register.pytests/nat/agent/test_tool_calling.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
src/nat/agent/tool_calling_agent/register.pytests/nat/agent/test_tool_calling.py
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Unit tests must live under tests/ and use configured markers (e2e, integration, etc.)
Files:
tests/nat/agent/test_tool_calling.py
⚙️ CodeRabbit configuration file
tests/**/*.py: - Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code. - Test functions should be named using thetest_prefix, using snake_case. - Any frequently repeated code should be extracted into pytest fixtures. - Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
function being decorated should be named using thefixture_prefix, using snake_case. Example:
@pytest.fixture(name="my_fixture")
def fixture_my_fixture():
pass
Files:
tests/nat/agent/test_tool_calling.py
**/tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/tests/**/*.py: Test functions must use the test_ prefix and snake_case
Extract repeated test code into pytest fixtures; fixtures should set name=... in @pytest.fixture and functions named with fixture_ prefix
Mark expensive tests with @pytest.mark.slow or @pytest.mark.integration
Use pytest with pytest-asyncio for async code; mock external services with pytest_httpserver or unittest.mock
Files:
tests/nat/agent/test_tool_calling.py
🧬 Code graph analysis (2)
src/nat/agent/tool_calling_agent/register.py (4)
src/nat/data_models/api_server.py (2)
ChatRequest(114-185)ChatResponse(238-283)src/nat/agent/rewoo_agent/register.py (1)
_response_fn(127-153)src/nat/agent/react_agent/register.py (1)
_response_fn(120-148)src/nat/agent/tool_calling_agent/agent.py (1)
ToolCallAgentGraphState(44-46)
tests/nat/agent/test_tool_calling.py (2)
tests/conftest.py (2)
mock_llm(289-347)mock_tool(351-373)src/nat/agent/tool_calling_agent/agent.py (3)
ToolCallAgentGraph(49-221)ToolCallAgentGraphState(44-46)build_graph(214-221)
🔇 Additional comments (2)
src/nat/agent/tool_calling_agent/register.py (2)
25-27: ChatResponse import currently unused due to return-type mismatch below.Once you return ChatResponse in
_response_fn(see fix below), this import will be used; otherwise ruff will flag it.
52-56: Localize heavy imports — LGTM.Pulling these into the workflow scope is fine and keeps module import cost lower.
Signed-off-by: Greg Freeman <gfreeman@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/nat/agent/tool_calling_agent/register.py (1)
84-120: Add a string-API wrapper or document ChatRequest-only behaviortool_calling_agent_workflow registers _response_fn(input_message: ChatRequest) and yields it directly (src/nat/agent/tool_calling_agent/register.py — yield FunctionInfo.from_fn), but unlike react_agent/rewoo_agent it has no use_openai_api flag or _str_api_fn using GlobalTypeConverter; add a GlobalTypeConverter-based _str_api_fn or converters for string/OpenAI-compatible callers, or explicitly document that this workflow only accepts ChatRequest.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/nat/agent/react_agent/register.py(1 hunks)src/nat/agent/rewoo_agent/register.py(1 hunks)src/nat/agent/tool_calling_agent/register.py(5 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/nat/agent/rewoo_agent/register.py
🧰 Additional context used
📓 Path-based instructions (7)
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
src/**/*.py: All importable Python code must live under src/
All public APIs in src/ require Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise with a period; surround code entities with backticks
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Core functionality under src/nat should prioritize backward compatibility when changed
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 8/20 style; format with yapf (column_limit=120) and use 4-space indentation; end files with a single newline
Run ruff (ruff check --fix) per pyproject.toml; fix warnings unless explicitly ignored; ruff is linter-only
Use snake_case for functions/variables, PascalCase for classes, and UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: preserve stack traces and avoid duplicate logging
When re-raising exceptions, use bareraiseand log with logger.error(), not logger.exception()
When catching and not re-raising, log with logger.exception() to capture stack trace
Validate and sanitize all user input; prefer httpx with SSL verification and follow OWASP Top‑10
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile/mprof; cache with functools.lru_cache or external cache; leverage NumPy vectorization when beneficial
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
**/*.{py,md}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
src/nat/agent/react_agent/register.pysrc/nat/agent/tool_calling_agent/register.py
🧬 Code graph analysis (1)
src/nat/agent/tool_calling_agent/register.py (4)
src/nat/data_models/api_server.py (1)
ChatRequest(114-185)src/nat/agent/react_agent/register.py (1)
_response_fn(120-159)src/nat/agent/rewoo_agent/register.py (1)
_response_fn(127-164)src/nat/agent/tool_calling_agent/agent.py (1)
ToolCallAgentGraphState(44-46)
🔇 Additional comments (7)
src/nat/agent/react_agent/register.py (1)
121-131: LGTM! Excellent docstring addition.The detailed Google-style docstring follows the coding guidelines and provides clear documentation for the internal
_response_fn. The documentation is comprehensive, properly formatted, and uses backticks around code entities as required.src/nat/agent/tool_calling_agent/register.py (6)
25-25: Import addition looks correct.The
ChatRequestimport is properly added and aligns with the signature change in_response_fn.
42-43: LGTM! Configuration field added consistently.The
max_historyfield addition is consistent with the ReAct and ReWOO agents, providing proper history management configuration.
53-54: Import additions are appropriate.The new imports for
trim_messagesandBaseMessagesupport the message history handling functionality.
84-95: Function signature change correctly implements chat history support.The signature change from
strtoChatRequestenables proper message history handling, and the docstring follows Google-style formatting requirements.
98-104: Message history handling implementation is solid.The implementation uses
trim_messagesconsistently with other agents (ReAct and ReWOO) and properly converts the ChatRequest messages to BaseMessage format for the graph state.
115-115: Verify string conversion behavior for different message content types.ChatRequest.Message.content is typed as str | list[UserContent] and the agent messages come from an external langchain_core.BaseMessage (not defined in this repo); blind use of str(output_message.content) can produce unexpected serialization for Pydantic models, Image/AudioContent or lists — explicitly handle cases (isinstance str → return directly; Pydantic BaseModel → model_dump_json()/model_dump(); objects with .text/.content → use that; lists → serialize/flatten) instead of relying on str().
Location: src/nat/agent/tool_calling_agent/register.py (return str(output_message.content)); compare with existing conversions in src/nat/data_models/api_server.py.
Signed-off-by: Greg Freeman <gfreeman@nvidia.com>
|
/ok to test 250763e |
|
/merge |
Fixes chat message history handling in tool_calling_agent Handles agent input with type ChatRequest which allows multiple messages to be processed. Closes NVIDIA#836 ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. ## Summary by CodeRabbit * **New Features** * Added configurable max conversation history and automatic trimming to limit retained messages. * Chat flows now accept structured request objects and produce consistent string responses. * **Tests** * Added an async test verifying conversation history is preserved and processed through execution. * **Documentation** * Added docstrings clarifying internal response function behavior for maintainability. Authors: - https://github.com/gfreeman-nvidia Approvers: - Yuchen Zhang (https://github.com/yczhang-nv) URL: NVIDIA#837 Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Description
Fixes chat message history handling in tool_calling_agent
Handles agent input with type ChatRequest which allows multiple messages to be processed.
Closes #836
By Submitting this PR I confirm:
Summary by CodeRabbit
New Features
Tests
Documentation