fix: Improved model detection/rules for ThinkingMixin #733
rapids-bot[bot] merged 2 commits into NVIDIA:develop
Conversation
Signed-off-by: Will Killian <wkillian@nvidia.com>
Walkthrough
Consolidates Nemotron model detection into a single regex, updates ThinkingMixin.supported accordingly, refactors thinking_system_prompt to normalize model names and branch by standardized prefixes (NVIDIA Nemotron, Llama Nemotron v1.0/v1.1/v1.5), returns explicit prompts or None, and makes control flow robust to missing model keys.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant C as Caller
participant TM as ThinkingMixin
Note over C,TM: Build system prompt from model name and thinking flag
C->>TM: thinking_system_prompt(model, thinking: bool)
activate TM
TM->>TM: Normalize model string (lowercase, _/. → -)
alt matches NVIDIA Nemotron
TM-->>C: "/think" if thinking else "/no_think"
else matches Llama Nemotron v1.0/v1.1
TM-->>C: "detailed thinking on" if thinking else "detailed thinking off"
else matches Llama Nemotron v1.5
TM-->>C: "/think" if thinking else "/no_think"
else unknown or missing
TM-->>C: None
end
deactivate TM
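The branching in the diagram above can be sketched as a standalone function. This is a minimal illustration, not the code in `thinking_mixin.py`: the function name `select_thinking_prompt`, its signature, and the exact prefix checks are assumptions made for the example.

```python
from __future__ import annotations


def select_thinking_prompt(model: str | None, thinking: bool) -> str | None:
    """Hypothetical stand-in for the prompt selection shown in the diagram."""
    # Missing or non-string model names yield no prompt.
    if not isinstance(model, str) or not model:
        return None

    # Normalize: lowercase, then map "_" and "." to "-".
    name = model.lower().translate(str.maketrans("_.", "--"))

    # Assumed prefixes for the NVIDIA Nemotron family.
    if name.startswith(("nvidia/nemotron", "nvidia/nvidia-nemotron")):
        return "/think" if thinking else "/no_think"

    # Assumed check for the Llama Nemotron family.
    if name.startswith("nvidia/llama") and "nemotron" in name:
        if "v1-0" in name or "v1-1" in name:
            return f"detailed thinking {'on' if thinking else 'off'}"
        if "v1-5" in name:
            return "/think" if thinking else "/no_think"

    # Unknown model: no system prompt, as in the last branch of the diagram.
    return None
```

The sketch follows the diagram literally, so an unrecognized Llama Nemotron version returns `None`; the later review comments discuss defaulting such models to `/think` and `/no_think` instead.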
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 3
🧹 Nitpick comments (4)
src/nat/data_models/thinking_mixin.py (4)
23-24: Stale comment references “two separate regex patterns.”
The implementation now uses a single composite regex. Update the comment to avoid confusion.
-# The system prompt format for thinking is different for these, so we need to distinguish them here with two separate
-# regex patterns
+# The system prompt format differs across Nemotron families; a single composite regex captures both variants.
54-56: Docstring drift: include v1.1 and clarify fallback.
The code handles v1-0 and v1-1 similarly; docstring mentions only v1.0. Also note the default for newer Llama-Nemotron variants if you adopt the fallback.
- For Llama Nemotron v1.5, returns "/think" if enabled, else "/no_think".
- For Llama Nemotron v1.0, returns "detailed thinking on" if enabled, else "detailed thinking off".
+ For Llama Nemotron v1.5, returns "/think" if enabled, else "/no_think".
+ For Llama Nemotron v1.0–v1.1, returns "detailed thinking on" if enabled, else "detailed thinking off".
+ For newer Llama Nemotron variants, defaults to "/think" if enabled, else "/no_think".
38-46: Minor: attribute doc consistency.
Consider wrapping field names and literals in backticks per repo guidelines.
- thinking: Whether to enable thinking. Defaults to None when supported on the model.
+ `thinking`: Whether to enable thinking. Defaults to `None` when supported on the model.
49-83: Tests needed to lock behavior across model name variants.
Add parameterized tests for:
- NVIDIA Nemotron: nvidia/nemotron-4-340b-instruct
- Llama Nemotron v1.0/v1.1: detailed on/off
- Llama Nemotron v1.5: /think and /no_think
- Non-Nemotron Llama (e.g., nvidia/llama-3.1-70b-instruct): returns None
- Attributes present but None/non-str: safe skip (no crash)
I can push a test module under tests/nat/data_models/test_thinking_mixin.py with parametrized cases if you want me to draft it.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/nat/data_models/thinking_mixin.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
src/**/*.py: All importable Python code must live under src/
All public APIs in src/ require Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise with a period; surround code entities with backticks
Files:
src/nat/data_models/thinking_mixin.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Core functionality under src/nat should prioritize backward compatibility when changed
Files:
src/nat/data_models/thinking_mixin.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
src/nat/data_models/thinking_mixin.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 8/20 style; format with yapf (column_limit=120) and use 4-space indentation; end files with a single newline
Run ruff (ruff check --fix) per pyproject.toml; fix warnings unless explicitly ignored; ruff is linter-only
Use snake_case for functions/variables, PascalCase for classes, and UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: preserve stack traces and avoid duplicate logging
When re-raising exceptions, use bare raise and log with logger.error(), not logger.exception()
When catching and not re-raising, log with logger.exception() to capture stack trace
Validate and sanitize all user input; prefer httpx with SSL verification and follow OWASP Top‑10
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile/mprof; cache with functools.lru_cache or external cache; leverage NumPy vectorization when beneficial
Files:
src/nat/data_models/thinking_mixin.py
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks
Files:
src/nat/data_models/thinking_mixin.py
**/*.{py,md}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm
Files:
src/nat/data_models/thinking_mixin.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values. Example: def my_function(param1: int, param2: str) -> bool: pass
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.
# Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.
# Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to-date whenever a file is changed.
Files:
src/nat/data_models/thinking_mixin.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CI Pipeline / Check
🔇 Additional comments (1)
src/nat/data_models/thinking_mixin.py (1)
25-25: Regex now correctly matches both nvidia/nemotron-* and nvidia/llama-nemotron-* families
Verified against representative samples, including nvidia/nemotron-4-340b-instruct, nvidia/llama-nemotron-70b-instruct-v1.0, and exclusions like nvidia/llama-3.1-70b-instruct, and works as intended.
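The composite regex itself is not quoted in this review, so the pattern below is a hypothetical reconstruction that only aims to reproduce the matches and exclusions listed in the comment above. `NEMOTRON_PATTERN` and `is_nemotron` are illustrative names, not identifiers from the repository.

```python
import re

# Hypothetical pattern (an assumption, not the regex from thinking_mixin.py).
# It accepts "nvidia/nemotron...", "nvidia/nvidia-nemotron...", and
# "nvidia/llama<anything>-nemotron...", ignoring case.
NEMOTRON_PATTERN = re.compile(
    r"^nvidia/(?:nvidia-)?(?:llama[\w.\-]*-)?nemotron",
    re.IGNORECASE,
)


def is_nemotron(model: str) -> bool:
    """Return True when the model name looks like a Nemotron-family model."""
    return NEMOTRON_PATTERN.search(model) is not None


samples = [
    "nvidia/nemotron-4-340b-instruct",
    "nvidia/llama-nemotron-70b-instruct-v1.0",
    "nvidia/llama-3.1-70b-instruct",
]
for name in samples:
    print(name, is_nemotron(name))
```

A single anchored pattern with optional groups keeps the family check in one place, at the cost of being slightly harder to read than two separate patterns.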
Signed-off-by: Will Killian <wkillian@nvidia.com>
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/nat/data_models/thinking_mixin.py (1)
72-74: Handle “nvidia/nemotron-*” prefix in prompt selection.
Without this, plain Nemotron models resolve to None. Also avoid matching unrelated “nvidia/nvidia-*” models.
- if model.startswith("nvidia/nvidia"):
+ if model.startswith(("nvidia/nemotron", "nvidia/nvidia-nemotron")):
      return "/think" if self.thinking else "/no_think"
🧹 Nitpick comments (5)
tests/nat/data_models/test_thinking_mixin.py (2)
41-58: Good coverage; add dotted/alt-version variants to harden normalization.
You normalize '.'→'-', but tests only cover v1.0 dotted and v1-5 hyphenated. Please also assert:
- v1.1 with hyphen (v1-1)
- v1.5 with dot (v1.5)
      m_true = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1.1", thinking=True)
      assert m_true.thinking_system_prompt == "detailed thinking on"
      m_false = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1.1", thinking=False)
      assert m_false.thinking_system_prompt == "detailed thinking off"
+     # v1.1 with hyphen variant
+     m_true = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1-1", thinking=True)
+     assert m_true.thinking_system_prompt == "detailed thinking on"
+     m_false = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1-1", thinking=False)
+     assert m_false.thinking_system_prompt == "detailed thinking off"
+
      m_true = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1-5", thinking=True)
      assert m_true.thinking_system_prompt == "/think"
      m_false = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1-5", thinking=False)
      assert m_false.thinking_system_prompt == "/no_think"
+
+     # v1.5 with dotted variant
+     m_true = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1.5", thinking=True)
+     assert m_true.thinking_system_prompt == "/think"
+     m_false = Model(model_name="NVIDIA/LLaMa-3.1-Nemotron-v1.5", thinking=False)
+     assert m_false.thinking_system_prompt == "/no_think"
98-99: Azure deployment: also cover v1.5 to validate /think path.
Add a v1.5 azure case to assert the inline token behavior.
- m = Model(azure_deployment="nvidia/llama3-nemotron-v1-0", thinking=True)
- assert m.thinking_system_prompt == "detailed thinking on"
+ m = Model(azure_deployment="nvidia/llama3-nemotron-v1-0", thinking=True)
+ assert m.thinking_system_prompt == "detailed thinking on"
+ m = Model(azure_deployment="nvidia/llama3-nemotron-v1-5", thinking=False)
+ assert m.thinking_system_prompt == "/no_think"
src/nat/data_models/thinking_mixin.py (3)
54-56: Docstring: include v1.1 for “detailed thinking on/off”.
Behavior covers v1-0 and v1-1 identically; reflect that.
- For Llama Nemotron v1.5, returns "/think" if enabled, else "/no_think".
- For Llama Nemotron v1.0, returns "detailed thinking on" if enabled, else "detailed thinking off".
+ For Llama Nemotron v1.5, returns "/think" if enabled, else "/no_think".
+ For Llama Nemotron v1.0 and v1.1, returns "detailed thinking on" if enabled, else "detailed thinking off".
65-71: None/empty guard looks good; minor simplification optional.
The explicit “or model is None” is redundant after isinstance check; using “not model” also skips empty strings.
- if not isinstance(model, str) or model is None:
+ if not isinstance(model, str) or not model:
      continue
  # Normalize name to reduce checks
  model = model.lower().translate(str.maketrans("_.", "--"))
75-85: Optional: constrain Llama branch to Nemotron variants.
Gating already filters by regex, but a cheap guard makes intent explicit.
- if model.startswith("nvidia/llama"):
+ if model.startswith("nvidia/llama") and "nemotron" in model:
      if "v1-0" in model or "v1-1" in model:
          return f"detailed thinking {'on' if self.thinking else 'off'}"
      if "v1-5" in model:
          # v1.5 models are updated to use the /think and /no_think system prompts
          return "/think" if self.thinking else "/no_think"
      # Assume any other model is a newer model that uses the /think and /no_think system prompts
      return "/think" if self.thinking else "/no_think"
📒 Files selected for processing (2)
src/nat/data_models/thinking_mixin.py (3 hunks)
tests/nat/data_models/test_thinking_mixin.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Unit tests must live under tests/ and use configured markers (e2e, integration, etc.)
Files:
tests/nat/data_models/test_thinking_mixin.py
⚙️ CodeRabbit configuration file
tests/**/*.py:
- Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code.
- Test functions should be named using the test_ prefix, using snake_case.
- Any frequently repeated code should be extracted into pytest fixtures.
- Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture function being decorated should be named using the fixture_ prefix, using snake_case. Example:
@pytest.fixture(name="my_fixture")
def fixture_my_fixture():
    pass
Files:
tests/nat/data_models/test_thinking_mixin.py
**/tests/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/tests/**/*.py: Test functions must use the test_ prefix and snake_case
Extract repeated test code into pytest fixtures; fixtures should set name=... in @pytest.fixture and functions named with fixture_ prefix
Mark expensive tests with @pytest.mark.slow or @pytest.mark.integration
Use pytest with pytest-asyncio for async code; mock external services with pytest_httpserver or unittest.mock
Files:
tests/nat/data_models/test_thinking_mixin.py
🧬 Code graph analysis (1)
tests/nat/data_models/test_thinking_mixin.py (1)
src/nat/data_models/thinking_mixin.py (1)
thinking_system_prompt(50-87)
/merge
Description
Llama Nemotron v1.5 models use /think and /no_think. Let's generalize the logic a bit and set up to handle more models.
Closes
By Submitting this PR I confirm:
Summary by CodeRabbit
Bug Fixes
Refactor
Documentation
Tests