docs: fix docker run commands for local LLMs#1398

Merged
rapids-bot[bot] merged 1 commit into NVIDIA:release/1.4 from willkill07:wkk_fix-docker-run-for-local-llms
Jan 13, 2026

Conversation

@willkill07 willkill07 commented Jan 13, 2026

Description

Closes

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Documentation
    • Updated Docker run command examples in local LLM setup guide to use explicit GPU device specification format, improving compatibility and ensuring correct GPU selection.


Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 self-assigned this Jan 13, 2026
@willkill07 willkill07 added the doc Improvements or additions to documentation label Jan 13, 2026
@willkill07 willkill07 requested a review from a team as a code owner January 13, 2026 14:12
@willkill07 willkill07 added non-breaking Non-breaking change skip-ci Optionally Skip CI for this PR labels Jan 13, 2026

coderabbitai bot commented Jan 13, 2026

Walkthrough

Updated GPU specification format in Docker run invocations within documentation from numeric index notation to explicit device string format, affecting how GPUs are referenced in containerized LLM deployment examples.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation updates**<br>`docs/source/build-workflows/llms/using-local-llms.md` | Modified the GPU device specification in two `docker run` commands from `--gpus 0` and `--gpus 1` to `--gpus '"device=0"'` and `--gpus '"device=1"'` |
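The change described above can be sketched as a shell snippet. The image name below is a placeholder for whatever NIM container the guide pulls from `build.nvidia.com`, not the actual image used in the docs, and running it requires Docker with the NVIDIA Container Toolkit and at least two GPUs:

```shell
# Placeholder image name; substitute the container pulled from build.nvidia.com.
IMAGE="nvcr.io/nim/<org>/<model>:latest"

# Before (incorrect): a bare number after --gpus is parsed as a GPU *count*,
# so `--gpus 1` requests any one GPU rather than selecting GPU 1.
#   docker run --rm --gpus 1 "$IMAGE"

# After (correct): the device= syntax selects the specific GPU with index 1.
docker run --rm --gpus '"device=1"' "$IMAGE"
```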

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks: ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: fixing `docker run` commands for local LLMs. It follows imperative mood and is well within the 72-character limit. |
| Docstring coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check. |




@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/source/build-workflows/llms/using-local-llms.md (2)

20-20: Add "NVIDIA" prefix for first reference.

Per coding guidelines, the first reference to the toolkit in body text should be "NVIDIA NeMo Agent toolkit" rather than "NeMo Agent toolkit".

📝 Suggested fix

```diff
-NeMo Agent toolkit has the ability to interact with locally hosted LLMs, in this guide we will demonstrate how to adapt the simple example (`examples/getting_started/simple_web_query`) to use locally hosted LLMs using two different approaches using [NVIDIA NIM](https://docs.nvidia.com/nim/) and [vLLM](https://docs.vllm.ai/), though any locally hosted LLM with an OpenAI-compatible API can be used.
+NVIDIA NeMo Agent toolkit has the ability to interact with locally hosted LLMs, in this guide we will demonstrate how to adapt the simple example (`examples/getting_started/simple_web_query`) to use locally hosted LLMs using two different approaches using [NVIDIA NIM](https://docs.nvidia.com/nim/) and [vLLM](https://docs.vllm.ai/), though any locally hosted LLM with an OpenAI-compatible API can be used.
```

As per coding guidelines for Markdown files.


26-26: Avoid possessive with inanimate objects.

Per coding guidelines, avoid using possessive 's with inanimate objects in Markdown documentation. "the model's container" should be rephrased as "the container for the model".

📝 Suggested fix

```diff
-Regardless of the model you choose, the process is the same for downloading the model's container from [`build.nvidia.com`](https://build.nvidia.com/). Navigate to the model you wish to run locally, if it is able to be downloaded it will be labeled with the `RUN ANYWHERE` tag, the exact commands will be specified on the `Deploy` tab for the model.
+Regardless of the model you choose, the process is the same for downloading the container for the model from [`build.nvidia.com`](https://build.nvidia.com/). Navigate to the model you wish to run locally, if it is able to be downloaded it will be labeled with the `RUN ANYWHERE` tag, the exact commands will be specified on the `Deploy` tab for the model.
```

As per coding guidelines for Markdown files.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0342649 and 7bf8e27.

📒 Files selected for processing (1)
  • docs/source/build-workflows/llms/using-local-llms.md
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{md,mdx}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{md,mdx}: Use 'NVIDIA NeMo Agent toolkit' for full name (first use), 'NeMo Agent toolkit' or 'the toolkit' for subsequent references, and 'Toolkit' (capital T) in titles/headings, 'toolkit' (lowercase t) in body text
Never use deprecated names: 'Agent Intelligence toolkit', 'aiqtoolkit', 'AgentIQ', 'AIQ', or 'aiq' in documentation; update any occurrences unless intentionally referring to deprecated versions or implementing compatibility layers

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*.{md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{md,mdx,rst}: Documentation must be clear, comprehensive, and free of TODOs, FIXMEs, placeholder text, offensive or outdated terms, and spelling mistakes
Do not use words listed in 'ci/vale/styles/config/vocabularies/nat/reject.txt' in documentation
Words listed in 'ci/vale/styles/config/vocabularies/nat/accept.txt' are acceptable even if they appear to be spelling mistakes

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}: Every file must start with the standard SPDX Apache-2.0 header
Confirm that copyright years are up-to-date whenever a file is changed
All source files must include the SPDX Apache-2.0 header template

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*.{py,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Version numbers are derived automatically by 'setuptools-scm'; never hard-code them in code or docs

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards.
    • For Python code, follow PEP 20 and PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues.
    • Python methods should use type hints for all parameters and return values (except for return values of None; in that situation no return type hint is needed). Example:

          def my_function(param1: int, param2: str) -> bool:
              pass

  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.

Documentation Review Instructions

  • Verify that documentation and comments are clear and comprehensive.
  • Verify that the documentation doesn't contain any TODOs, FIXMEs, or placeholder text like "lorem ipsum".
  • Verify that the documentation doesn't contain any offensive or outdated terms.
  • Verify that documentation and comments are free of spelling mistakes; ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file. Words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

  • Documentation in Markdown files should not contain usage of a possessive 's with inanimate objects
    (ex: "the system's performance" should be "the performance of the system").
  • Documentation in Markdown files should not use NAT as an acronym, always spell out NeMo Agent Toolkit.
    The exception to this rule is when referring to package names or code identifiers that contain "nat", th...

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
docs/source/**/*

⚙️ CodeRabbit configuration file

docs/source/**/*: This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the docs/source/_static directory.

Documentation Categories

Ensure documentation is placed in the correct category:

  • get-started/: Introductory documentation for new users
    • get-started/tutorials/: Step-by-step learning guides
  • build-workflows/: Workflow creation, configuration, adding remote MCP tools or A2A agents
  • run-workflows/: Execution, observability, serving workflows via MCP and A2A protocols
  • improve-workflows/: Evaluation and optimization guides
  • components/: Specific component implementations (agents, tools, connectors)
  • extend/: Custom component development and testing (not core library contributions)
  • reference/: Python and REST API documentation only
  • resources/: Project information (licensing, FAQs)
    • resources/contributing/: Development environment and contribution guides

Placement rules:

  1. Component implementations always belong in components/, not build-workflows/
  2. API documentation belongs only in reference/
  3. Using remote MCP tools or A2A agents should be placed in build-workflows/
  4. Serving workflows via MCP/A2A should be placed in run-workflows/

Files:

  • docs/source/build-workflows/llms/using-local-llms.md
🔇 Additional comments (2)
docs/source/build-workflows/llms/using-local-llms.md (2)

88-88: Excellent fix! Consistent with the GPU syntax correction.

This change correctly applies the same Docker GPU device specification format as line 74, ensuring both container commands use valid syntax.


74-74: Correct Docker GPU syntax fix.

The change from --gpus 0 to --gpus '"device=0"' is necessary. The --gpus flag with a bare numeric index (e.g., --gpus 0) allocates N GPUs by count rather than selecting a specific device. To specify a particular GPU by index, use the device syntax --gpus '"device=0"'. The nested quoting (outer single quotes, inner double quotes) is the documented format for proper shell parsing.

@mnajafian-nv mnajafian-nv (Contributor) left a comment

LGTM!

@willkill07 (Member, Author) commented:

/merge

@mnajafian-nv mnajafian-nv (Contributor) left a comment

✅ LGTM - Approved

Summary

Straightforward documentation fix for incorrect Docker --gpus flag syntax.

Technical Assessment

| Aspect | Status |
| --- | --- |
| Correctness | ✅ Fix is accurate |
| Docker syntax | ✅ Proper NVIDIA runtime device specification |
| User impact | ✅ Prevents GPU allocation failures |
| Scope | ✅ Minimal, focused change |

Details

The original syntax (--gpus 0) is ambiguous/invalid. The NVIDIA Container Toolkit requires the device=<id> format, and the nested quotes ('"device=0"') ensure proper shell escaping and Docker parsing.

The chosen syntax is the most shell-portable option for specifying individual GPUs.
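For context, a few other `--gpus` forms that Docker accepts when the NVIDIA Container Toolkit is installed (a sketch only; `IMAGE` is a placeholder, and the UUID form needs the value reported by `nvidia-smi -L`):

```shell
docker run --rm --gpus all IMAGE                     # expose every GPU
docker run --rm --gpus 2 IMAGE                       # any two GPUs (count, not index)
docker run --rm --gpus '"device=0,1"' IMAGE          # GPUs 0 and 1 by index
docker run --rm --gpus '"device=GPU-<uuid>"' IMAGE   # a specific GPU by UUID
```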

Ship it! 🚀

@willkill07 willkill07 removed the skip-ci Optionally Skip CI for this PR label Jan 13, 2026
@willkill07 (Member, Author) commented:

/merge

@rapids-bot rapids-bot bot merged commit c12456f into NVIDIA:release/1.4 Jan 13, 2026
32 of 34 checks passed
Jerryguan777 pushed a commit to Jerryguan777/NeMo-Agent-Toolkit that referenced this pull request Jan 28, 2026

Authors:
  - Will Killian (https://github.com/willkill07)

Approvers:
  - https://github.com/mnajafian-nv

URL: NVIDIA#1398
@willkill07 willkill07 deleted the wkk_fix-docker-run-for-local-llms branch February 25, 2026 12:35

Labels

  • doc: Improvements or additions to documentation
  • non-breaking: Non-breaking change
