docs: fix docker run commands for local LLMs#1398
rapids-bot[bot] merged 1 commit into NVIDIA:release/1.4
Conversation
Signed-off-by: Will Killian <wkillian@nvidia.com>
Walkthrough

Updated GPU specification format in Docker run invocations within documentation from numeric index notation to explicit device string format, affecting how GPUs are referenced in containerized LLM deployment examples.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/source/build-workflows/llms/using-local-llms.md (2)
20-20: **Add "NVIDIA" prefix for first reference.**

Per coding guidelines, the first reference to the toolkit in body text should be "NVIDIA NeMo Agent toolkit" rather than "NeMo Agent toolkit".

📝 Suggested fix

```diff
-NeMo Agent toolkit has the ability to interact with locally hosted LLMs, in this guide we will demonstrate how to adapt the simple example (`examples/getting_started/simple_web_query`) to use locally hosted LLMs using two different approaches using [NVIDIA NIM](https://docs.nvidia.com/nim/) and [vLLM](https://docs.vllm.ai/), though any locally hosted LLM with an OpenAI-compatible API can be used.
+NVIDIA NeMo Agent toolkit has the ability to interact with locally hosted LLMs, in this guide we will demonstrate how to adapt the simple example (`examples/getting_started/simple_web_query`) to use locally hosted LLMs using two different approaches using [NVIDIA NIM](https://docs.nvidia.com/nim/) and [vLLM](https://docs.vllm.ai/), though any locally hosted LLM with an OpenAI-compatible API can be used.
```

As per coding guidelines for Markdown files.
26-26: **Avoid possessive with inanimate objects.**

Per coding guidelines, avoid using possessive 's with inanimate objects in Markdown documentation. "the model's container" should be rephrased as "the container for the model".

📝 Suggested fix

```diff
-Regardless of the model you choose, the process is the same for downloading the model's container from [`build.nvidia.com`](https://build.nvidia.com/). Navigate to the model you wish to run locally, if it is able to be downloaded it will be labeled with the `RUN ANYWHERE` tag, the exact commands will be specified on the `Deploy` tab for the model.
+Regardless of the model you choose, the process is the same for downloading the container for the model from [`build.nvidia.com`](https://build.nvidia.com/). Navigate to the model you wish to run locally, if it is able to be downloaded it will be labeled with the `RUN ANYWHERE` tag, the exact commands will be specified on the `Deploy` tab for the model.
```

As per coding guidelines for Markdown files.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/source/build-workflows/llms/using-local-llms.md
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{md,mdx}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{md,mdx}: Use 'NVIDIA NeMo Agent toolkit' for full name (first use), 'NeMo Agent toolkit' or 'the toolkit' for subsequent references, and 'Toolkit' (capital T) in titles/headings, 'toolkit' (lowercase t) in body text
Never use deprecated names: 'Agent Intelligence toolkit', 'aiqtoolkit', 'AgentIQ', 'AIQ', or 'aiq' in documentation; update any occurrences unless intentionally referring to deprecated versions or implementing compatibility layers
Files:
docs/source/build-workflows/llms/using-local-llms.md
**/*.{md,mdx,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{md,mdx,rst}: Documentation must be clear, comprehensive, and free of TODOs, FIXMEs, placeholder text, offensive or outdated terms, and spelling mistakes
Do not use words listed in 'ci/vale/styles/config/vocabularies/nat/reject.txt' in documentation
Words listed in 'ci/vale/styles/config/vocabularies/nat/accept.txt' are acceptable even if they appear to be spelling mistakes
Files:
docs/source/build-workflows/llms/using-local-llms.md
**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}: Every file must start with the standard SPDX Apache-2.0 header
Confirm that copyright years are up-to-date whenever a file is changed
All source files must include the SPDX Apache-2.0 header template
Files:
docs/source/build-workflows/llms/using-local-llms.md
**/*.{py,md,mdx,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by 'setuptools-scm'; never hard-code them in code or docs
Files:
docs/source/build-workflows/llms/using-local-llms.md
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions

- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values (except for return values of `None`; in that situation no return type hint is needed). Example: `def my_function(param1: int, param2: str) -> bool: pass`
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare `raise` statements to maintain the original stack trace, and use `logger.error()` (not `logger.exception()`) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use `logger.exception()` to capture the full stack trace information.

# Documentation Review Instructions

- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs, or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes; ensure the documentation doesn't contain any words listed in the `ci/vale/styles/config/vocabularies/nat/reject.txt` file. Words that might appear to be spelling mistakes but are listed in the `ci/vale/styles/config/vocabularies/nat/accept.txt` file are OK.
- Documentation in Markdown files should not contain usage of a possessive 's with inanimate objects (ex: "the system's performance" should be "the performance of the system").
- Documentation in Markdown files should not use NAT as an acronym; always spell out NeMo Agent Toolkit. The exception to this rule is when referring to package names or code identifiers that contain "nat", th...
Files:
docs/source/build-workflows/llms/using-local-llms.md
docs/source/**/*
⚙️ CodeRabbit configuration file
docs/source/**/*: This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the `docs/source/_static` directory.

Documentation Categories — ensure documentation is placed in the correct category:

- get-started/: Introductory documentation for new users
- get-started/tutorials/: Step-by-step learning guides
- build-workflows/: Workflow creation, configuration, adding remote MCP tools or A2A agents
- run-workflows/: Execution, observability, serving workflows via MCP and A2A protocols
- improve-workflows/: Evaluation and optimization guides
- components/: Specific component implementations (agents, tools, connectors)
- extend/: Custom component development and testing (not core library contributions)
- reference/: Python and REST API documentation only
- resources/: Project information (licensing, FAQs)
- resources/contributing/: Development environment and contribution guides

Placement rules:

1. Component implementations always belong in `components/`, not `build-workflows/`
2. API documentation belongs only in `reference/`
3. Using remote MCP tools or A2A agents should be placed in `build-workflows/`
4. Serving workflows via MCP/A2A should be placed in `run-workflows/`
Files:
docs/source/build-workflows/llms/using-local-llms.md
🔇 Additional comments (2)
docs/source/build-workflows/llms/using-local-llms.md (2)
88-88: **Excellent fix! Consistent with the GPU syntax correction.**

This change correctly applies the same Docker GPU device specification format as line 74, ensuring both container commands use valid syntax.
74-74: **Correct Docker GPU syntax fix.**

The change from `--gpus 0` to `--gpus '"device=0"'` is necessary. The `--gpus` flag with a bare numeric value (e.g., `--gpus 0`) allocates N GPUs by count rather than selecting a specific device. To specify a particular GPU by index, use the device syntax `--gpus '"device=0"'`. The nested quoting (outer single quotes, inner double quotes) is the documented format for proper shell parsing.
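One way to see the effect of the nested quoting without running a container is to print the argument exactly as the shell hands it to the program. This is a shell-quoting sketch only, not a `docker` invocation:

```shell
# Outer single quotes are consumed by the shell, so docker receives the
# literal string "device=0" (double quotes included), which the NVIDIA
# runtime parses as a request for the GPU at index 0.
printf 'docker receives: %s\n' '"device=0"'

# With only double quotes, the shell strips them as well, and docker
# receives a bare device=0 instead.
printf 'docker receives: %s\n' "device=0"
```

The first form is the one the documentation now uses; the second illustrates what is lost if the inner quotes are dropped.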
/merge
mnajafian-nv left a comment
✅ LGTM - Approved
Summary
Straightforward documentation fix for incorrect Docker `--gpus` flag syntax.
Technical Assessment
| Aspect | Status |
|---|---|
| Correctness | ✅ Fix is accurate |
| Docker Syntax | ✅ Proper NVIDIA runtime device specification |
| User Impact | ✅ Prevents GPU allocation failures |
| Scope | ✅ Minimal, focused change |
Details
The original syntax (`--gpus 0`) is ambiguous/invalid. The NVIDIA Container Toolkit requires the `device=<id>` format, and the nested quotes (`'"device=0"'`) ensure proper shell escaping and Docker parsing.
The chosen syntax is the most shell-portable option for specifying individual GPUs.
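Other GPU selections follow the same quoting pattern. The sketch below prints each value as the shell would deliver it to `docker`; the UUID shown is a placeholder, and `all` is Docker's keyword form for exposing every visible GPU:

```shell
# Two specific GPUs by index
printf '%s\n' '"device=0,1"'

# A GPU selected by UUID (placeholder value shown)
printf '%s\n' '"device=GPU-00000000-0000-0000-0000-000000000000"'

# Every visible GPU -- the keyword form needs no nested quoting
printf '%s\n' all
```

In each case the outer single quotes exist only to survive the shell; Docker sees the double-quoted `device=...` string and parses the selection itself.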
Ship it! 🚀
/merge
Closes ##

By Submitting this PR I confirm:

- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

- **Documentation**
  - Updated Docker run command examples in local LLM setup guide to use explicit GPU device specification format, improving compatibility and ensuring correct GPU selection.

Authors:
- Will Killian (https://github.com/willkill07)

Approvers:
- https://github.com/mnajafian-nv

URL: NVIDIA#1398