docs: fix docker run commands for local LLMs#1398

Merged
rapids-bot[bot] merged 1 commit into NVIDIA:release/1.4 from willkill07:wkk_fix-docker-run-for-local-llms
Jan 13, 2026

Conversation

@willkill07 willkill07 commented Jan 13, 2026

Description

Closes

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Documentation
    • Updated Docker run command examples in local LLM setup guide to use explicit GPU device specification format, improving compatibility and ensuring correct GPU selection.


Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 self-assigned this Jan 13, 2026
@willkill07 willkill07 added the doc Improvements or additions to documentation label Jan 13, 2026
@willkill07 willkill07 requested a review from a team as a code owner January 13, 2026 14:12
@willkill07 willkill07 added non-breaking Non-breaking change skip-ci Optionally Skip CI for this PR labels Jan 13, 2026

coderabbitai bot commented Jan 13, 2026

Walkthrough

Updated GPU specification format in Docker run invocations within documentation from numeric index notation to explicit device string format, affecting how GPUs are referenced in containerized LLM deployment examples.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation updates**<br>`docs/source/build-workflows/llms/using-local-llms.md` | Modified the GPU device specification in two `docker run` commands from `--gpus 0` and `--gpus 1` to `--gpus '"device=0"'` and `--gpus '"device=1"'` |
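The change described above can be sketched as a shell snippet. The image name below is a placeholder for whatever NIM container the guide pulls from `build.nvidia.com`, not the actual image used in the docs, and running it requires Docker with the NVIDIA Container Toolkit and at least two GPUs:

```shell
# Placeholder image name; substitute the container pulled from build.nvidia.com.
IMAGE="nvcr.io/nim/<org>/<model>:latest"

# Before (incorrect): a bare number after --gpus is parsed as a GPU *count*,
# so `--gpus 1` requests any one GPU rather than selecting GPU 1.
#   docker run --rm --gpus 1 "$IMAGE"

# After (correct): the device= syntax selects the specific GPU with index 1.
docker run --rm --gpus '"device=1"' "$IMAGE"
```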

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks: ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: fixing `docker run` commands for local LLMs. It follows imperative mood and is well within the 72-character limit. |
| Docstring coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check. |




@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/source/build-workflows/llms/using-local-llms.md (2)

20-20: Add "NVIDIA" prefix for first reference.

Per coding guidelines, the first reference to the toolkit in body text should be "NVIDIA NeMo Agent toolkit" rather than "NeMo Agent toolkit".

📝 Suggested fix

```diff
-NeMo Agent toolkit has the ability to interact with locally hosted LLMs, in this guide we will demonstrate how to adapt the simple example (`examples/getting_started/simple_web_query`) to use locally hosted LLMs using two different approaches using [NVIDIA NIM](https://docs.nvidia.com/nim/) and [vLLM](https://docs.vllm.ai/), though any locally hosted LLM with an OpenAI-compatible API can be used.
+NVIDIA NeMo Agent toolkit has the ability to interact with locally hosted LLMs, in this guide we will demonstrate how to adapt the simple example (`examples/getting_started/simple_web_query`) to use locally hosted LLMs using two different approaches using [NVIDIA NIM](https://docs.nvidia.com/nim/) and [vLLM](https://docs.vllm.ai/), though any locally hosted LLM with an OpenAI-compatible API can be used.
```

As per coding guidelines for Markdown files.


26-26: Avoid possessive with inanimate objects.

Per coding guidelines, avoid using possessive 's with inanimate objects in Markdown documentation. "the model's container" should be rephrased as "the container for the model".

📝 Suggested fix

```diff
-Regardless of the model you choose, the process is the same for downloading the model's container from [`build.nvidia.com`](https://build.nvidia.com/). Navigate to the model you wish to run locally, if it is able to be downloaded it will be labeled with the `RUN ANYWHERE` tag, the exact commands will be specified on the `Deploy` tab for the model.
+Regardless of the model you choose, the process is the same for downloading the container for the model from [`build.nvidia.com`](https://build.nvidia.com/). Navigate to the model you wish to run locally, if it is able to be downloaded it will be labeled with the `RUN ANYWHERE` tag, the exact commands will be specified on the `Deploy` tab for the model.
```

As per coding guidelines for Markdown files.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0342649 and 7bf8e27.

📒 Files selected for processing (1)
  • docs/source/build-workflows/llms/using-local-llms.md
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{md,mdx}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{md,mdx}: Use 'NVIDIA NeMo Agent toolkit' for full name (first use), 'NeMo Agent toolkit' or 'the toolkit' for subsequent references, and 'Toolkit' (capital T) in titles/headings, 'toolkit' (lowercase t) in body text
Never use deprecated names: 'Agent Intelligence toolkit', 'aiqtoolkit', 'AgentIQ', 'AIQ', or 'aiq' in documentation; update any occurrences unless intentionally referring to deprecated versions or implementing compatibility layers

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*.{md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{md,mdx,rst}: Documentation must be clear, comprehensive, and free of TODOs, FIXMEs, placeholder text, offensive or outdated terms, and spelling mistakes
Do not use words listed in 'ci/vale/styles/config/vocabularies/nat/reject.txt' in documentation
Words listed in 'ci/vale/styles/config/vocabularies/nat/accept.txt' are acceptable even if they appear to be spelling mistakes

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}: Every file must start with the standard SPDX Apache-2.0 header
Confirm that copyright years are up-to-date whenever a file is changed
All source files must include the SPDX Apache-2.0 header template

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*.{py,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Version numbers are derived automatically by 'setuptools-scm'; never hard-code them in code or docs

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards.
    • For Python code, follow PEP 20 and PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues.
    • Python methods should use type hints for all parameters and return values (except for return values of None; in that situation no return type hint is needed). Example:

          def my_function(param1: int, param2: str) -> bool:
              pass

  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.

Documentation Review Instructions

  • Verify that documentation and comments are clear and comprehensive.
  • Verify that the documentation doesn't contain any TODOs, FIXMEs, or placeholder text like "lorem ipsum".
  • Verify that the documentation doesn't contain any offensive or outdated terms.
  • Verify that documentation and comments are free of spelling mistakes; ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file. Words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

  • Documentation in Markdown files should not contain usage of a possessive 's with inanimate objects
    (ex: "the system's performance" should be "the performance of the system").
  • Documentation in Markdown files should not use NAT as an acronym, always spell out NeMo Agent Toolkit.
    The exception to this rule is when referring to package names or code identifiers that contain "nat", th...

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
docs/source/**/*

⚙️ CodeRabbit configuration file

docs/source/**/*: This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the docs/source/_static directory.

Documentation Categories

Ensure documentation is placed in the correct category:

  • get-started/: Introductory documentation for new users
    • get-started/tutorials/: Step-by-step learning guides
  • build-workflows/: Workflow creation, configuration, adding remote MCP tools or A2A agents
  • run-workflows/: Execution, observability, serving workflows via MCP and A2A protocols
  • improve-workflows/: Evaluation and optimization guides
  • components/: Specific component implementations (agents, tools, connectors)
  • extend/: Custom component development and testing (not core library contributions)
  • reference/: Python and REST API documentation only
  • resources/: Project information (licensing, FAQs)
    • resources/contributing/: Development environment and contribution guides

Placement rules:

  1. Component implementations always belong in components/, not build-workflows/
  2. API documentation belongs only in reference/
  3. Using remote MCP tools or A2A agents should be placed in build-workflows/
  4. Serving workflows via MCP/A2A should be placed in run-workflows/

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
🔇 Additional comments (2)
docs/source/build-workflows/llms/using-local-llms.md (2)

88-88: Excellent fix! Consistent with the GPU syntax correction.

This change correctly applies the same Docker GPU device specification format as line 74, ensuring both container commands use valid syntax.


74-74: Correct Docker GPU syntax fix.

The change from --gpus 0 to --gpus '"device=0"' is necessary. The --gpus flag with a bare numeric index (e.g., --gpus 0) allocates N GPUs by count rather than selecting a specific device. To specify a particular GPU by index, use the device syntax --gpus '"device=0"'. The nested quoting (outer single quotes, inner double quotes) is the documented format for proper shell parsing.

@mnajafian-nv mnajafian-nv (Contributor) left a comment

LGTM!

@willkill07 (Member, Author) commented:

/merge

@mnajafian-nv mnajafian-nv (Contributor) left a comment

✅ LGTM - Approved

Summary

Straightforward documentation fix for incorrect Docker --gpus flag syntax.

Technical Assessment

| Aspect | Status |
| --- | --- |
| Correctness | ✅ Fix is accurate |
| Docker syntax | ✅ Proper NVIDIA runtime device specification |
| User impact | ✅ Prevents GPU allocation failures |
| Scope | ✅ Minimal, focused change |

Details

The original syntax (--gpus 0) is ambiguous/invalid. The NVIDIA Container Toolkit requires the device=<id> format, and the nested quotes ('"device=0"') ensure proper shell escaping and Docker parsing.

The chosen syntax is the most shell-portable option for specifying individual GPUs.
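For context, a few other `--gpus` forms that Docker accepts when the NVIDIA Container Toolkit is installed (a sketch only; `IMAGE` is a placeholder, and the UUID form needs the value reported by `nvidia-smi -L`):

```shell
docker run --rm --gpus all IMAGE                     # expose every GPU
docker run --rm --gpus 2 IMAGE                       # any two GPUs (count, not index)
docker run --rm --gpus '"device=0,1"' IMAGE          # GPUs 0 and 1 by index
docker run --rm --gpus '"device=GPU-<uuid>"' IMAGE   # a specific GPU by UUID
```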

Ship it! 🚀

@willkill07 willkill07 removed the skip-ci Optionally Skip CI for this PR label Jan 13, 2026
@willkill07 (Member, Author) commented:

/merge

@rapids-bot rapids-bot bot merged commit c12456f into NVIDIA:release/1.4 Jan 13, 2026
32 of 34 checks passed
Jerryguan777 pushed a commit to Jerryguan777/NeMo-Agent-Toolkit that referenced this pull request Jan 28, 2026

Authors:
  - Will Killian (https://github.com/willkill07)

Approvers:
  - https://github.com/mnajafian-nv

URL: NVIDIA#1398
@willkill07 willkill07 deleted the wkk_fix-docker-run-for-local-llms branch February 25, 2026 12:35

Labels

  • doc: Improvements or additions to documentation
  • non-breaking: Non-breaking change
