docs(k8s): document evaluation-only patterns and production alternatives (#1442) #1676

Open
ColinM-sys wants to merge 4 commits into NVIDIA:main from ColinM-sys:fix/1442-k8s-security-doc

Conversation

@ColinM-sys
Contributor

@ColinM-sys ColinM-sys commented Apr 9, 2026

Summary

  • Add `k8s/SECURITY.md`, walking through every risky pattern in `k8s/nemoclaw-k8s.yaml` (privileged DinD, disabled docker TLS, `POLICY_MODE=skip`, `curl | bash` installer at pod start, placeholder API key, no NetworkPolicy, no resource limits) and what a production alternative looks like.
  • Cross-link from `k8s/README.md` so the warning is discoverable from the existing entry point.

Fixes #1442.

Why

`k8s/README.md` calls the manifest "experimental" but does not spell out which patterns are unsafe in production. A user deploying to a real cluster has no way to know that `privileged: true`, `DOCKER_TLS_CERTDIR=""`, `POLICY_MODE=skip`, `curl | bash` at pod start, the `dummy` placeholder API key, the absence of any NetworkPolicy, and the absence of `resources.limits` are all intentional tradeoffs for a kubectl-apply-and-try-it-out flow — and not a production blueprint. The IssueFinder report in #1442 explicitly asked for this page.
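
To make two of the missing safeguards concrete, here is a minimal sketch of what a default-deny NetworkPolicy and container resource limits generally look like in Kubernetes. It is not quoted from `nemoclaw-k8s.yaml` or `k8s/SECURITY.md`; the policy name, the namespace, and every CPU and memory figure are hypothetical placeholders that would need tuning for a real deployment.

```yaml
# Default-deny policy: selects every pod in the namespace and allows
# no ingress or egress until more specific policies are added.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all      # hypothetical name
  namespace: nemoclaw         # hypothetical namespace
spec:
  podSelector: {}             # empty selector matches all pods
  policyTypes:
    - Ingress
    - Egress
---
# Container-level fragment (belongs under spec.containers[] in a pod
# template). The numbers are illustrative, not measured requirements.
resources:
  requests:
    cpu: "500m"
    memory: 1Gi
  limits:
    cpu: "2"
    memory: 4Gi
```

A default-deny policy only takes effect on clusters whose network plugin enforces NetworkPolicy, and the workload would still need explicit allow rules for whatever traffic it legitimately requires.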

What changed

  • New file: `k8s/SECURITY.md` — section per risky pattern with the YAML excerpt, the threat model, and the production alternative.
  • Edit: `k8s/README.md` — one-line cross-link in the existing experimental warning callout.

No code changes. Pure documentation, fully additive (apart from the one-line README link).

Test plan

  • Manual review of every YAML excerpt in SECURITY.md against the live `nemoclaw-k8s.yaml` to make sure quoted lines match exactly.
  • Cross-link in README.md renders correctly under GitHub-flavored markdown.
  • Maintainer should sanity-check that the production alternatives (especially #1, runtime class instead of DinD) match how NVIDIA actually wants to run NemoClaw on production K8s. If a particular alternative is wrong, I am happy to revise.
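
For reviewers unfamiliar with the runtime-class approach mentioned in the last item, a rough sketch follows. It assumes a sandboxed runtime (gVisor's `runsc` handler is used as the example) is already installed on the nodes; the RuntimeClass name, pod name, and image reference are hypothetical, and whether NemoClaw actually runs under such a runtime is exactly what the maintainer check is meant to confirm.

```yaml
# Cluster-scoped object that maps a name to a node-level runtime handler.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: sandboxed                 # hypothetical name
handler: runsc                    # assumes gVisor is configured on the nodes
---
# Pod that opts into the sandboxed runtime instead of running a
# privileged Docker-in-Docker container.
apiVersion: v1
kind: Pod
metadata:
  name: nemoclaw-example          # hypothetical name
spec:
  runtimeClassName: sandboxed
  containers:
    - name: workload
      image: example.invalid/nemoclaw:placeholder   # placeholder image
      securityContext:
        privileged: false
        allowPrivilegeEscalation: false
```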

Summary by CodeRabbit

  • Documentation
    • Added a Kubernetes security guide that lists unsafe evaluation patterns, explains the production risks they introduce, and provides concrete production-ready alternatives plus a minimum production checklist.
    • Updated the deployment README to reference the new security guidance and emphasize that the included manifests are for evaluation only.
    • Clarified recommendations for credentials, network policies, resource limits, container runtime, and supply-chain practices.

Signed-off-by: ColinM-sys <cmcdonough@50words.com>

@coderabbitai
Contributor

coderabbitai bot commented Apr 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Added Kubernetes security guidance: k8s/README.md now includes a security callout linking to a new k8s/SECURITY.md that documents seven evaluation-only insecure patterns in nemoclaw-k8s.yaml and provides production hardening alternatives and a minimum checklist.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Kubernetes README: k8s/README.md | Inserted an experimental/security callout that points readers to k8s/SECURITY.md and warns that the manifest is for evaluation only. |
| Kubernetes Security Guide: k8s/SECURITY.md | Added a new security document describing the three-container evaluation manifest and enumerating seven insecure patterns (privileged DinD, Docker socket/TLS disabled, NEMOCLAW_POLICY_MODE=skip, `curl |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nudged a warning in the README bright,
A tiny flag to keep your clusters tight.
SECURITY.md hums what not to do,
Swap DinD, hide keys, and tighten the crew,
Hop safe, hop smart — then ship anew.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding documentation of evaluation-only patterns and production alternatives for Kubernetes deployment. |
| Linked Issues check | ✅ Passed | The PR fully addresses all coding requirements from issue #1442: creates k8s/SECURITY.md documenting all seven security gaps with production alternatives, adds cross-link warning in k8s/README.md, and labels manifest as evaluation-only. |
| Out of Scope Changes check | ✅ Passed | All changes are in scope: k8s/SECURITY.md and k8s/README.md modifications directly address issue #1442 objectives with no unrelated alterations. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
k8s/SECURITY.md (1)

100-100: Consider streamlining "at the moment".

For conciseness, "when the pod boots" reads more directly than "at the moment the pod boots".

✨ Suggested improvement
-`nvidia.com/nemoclaw.sh` at the moment the pod boots. There is no
+`nvidia.com/nemoclaw.sh` when the pod boots. There is no
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@k8s/SECURITY.md` at line 100, Replace the phrase "at the moment the pod
boots" with the more concise "when the pod boots" in the SECURITY.md sentence
that references `nvidia.com/nemoclaw.sh` so the line reads more directly and
improves clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@k8s/README.md`:
- Line 1: This README is missing the required SPDX license header; add an HTML
comment SPDX header as the very first lines of the file (e.g., an HTML comment
containing "SPDX-License-Identifier: <license-id>") so the Markdown begins with
the SPDX header comment before the "# NemoClaw on Kubernetes" title.

In `@k8s/SECURITY.md`:
- Line 80: In the SECURITY.md line that lists package/service policies (the text
containing "Policies (`pypi`, `npm`, `github`, `huggingface`, etc.)"), fix the
casing by replacing `github` with `GitHub` so the product name uses the correct
capitalization; update the inline code/backtick token accordingly to `GitHub`.

---

Nitpick comments:
In `@k8s/SECURITY.md`:
- Line 100: Replace the phrase "at the moment the pod boots" with the more
concise "when the pod boots" in the SECURITY.md sentence that references
`nvidia.com/nemoclaw.sh` so the line reads more directly and improves clarity.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7fac52a5-9c9d-423e-85ab-b54df7e091bb

📥 Commits

Reviewing files that changed from the base of the PR and between e8b30a2 and 5893671.

📒 Files selected for processing (2)
  • k8s/README.md
  • k8s/SECURITY.md

Comment thread k8s/README.md
@@ -1,6 +1,8 @@
# NemoClaw on Kubernetes
⚠️ Potential issue | 🔴 Critical

Missing required SPDX license header.

All Markdown files must include an SPDX license header. As per coding guidelines, the header should be in HTML comment format at the top of the file.

📋 Proposed fix

Add these lines at the very beginning of the file:

+<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
 # NemoClaw on Kubernetes
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# NemoClaw on Kubernetes
<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
<!-- SPDX-License-Identifier: Apache-2.0 -->
# NemoClaw on Kubernetes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@k8s/README.md` at line 1, This README is missing the required SPDX license
header; add an HTML comment SPDX header as the very first lines of the file
(e.g., an HTML comment containing "SPDX-License-Identifier: <license-id>") so
the Markdown begins with the SPDX header comment before the "# NemoClaw on
Kubernetes" title.

Comment thread k8s/SECURITY.md Outdated
@wscurran added the documentation (Improvements or additions to documentation) and K8s (Use this label to identify Kubernetes deployment issues with NemoClaw) labels Apr 9, 2026
@wscurran
Contributor

wscurran commented Apr 9, 2026

✨ Thanks for submitting this PR, which proposes an enhancement for documenting K8s deployment.


@ColinM-sys ColinM-sys force-pushed the fix/1442-k8s-security-doc branch from 5893671 to bfc10ed Compare April 10, 2026 01:19
Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
k8s/SECURITY.md (1)

80-80: ⚠️ Potential issue | 🟡 Minor

Fix service name casing: `github` → `GitHub`.

This appears to be the same issue already identified earlier; keep canonical product casing in the inline list.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@k8s/SECURITY.md` at line 80, Replace the lowercase service name "github" in
the inline list of services (the string containing `pypi`, `npm`, `github`,
`huggingface`, etc.) with the canonical casing "GitHub" so the documentation
uses consistent product capitalization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@k8s/SECURITY.md`:
- Line 80: Replace the lowercase service name "github" in the inline list of
services (the string containing `pypi`, `npm`, `github`, `huggingface`, etc.)
with the canonical casing "GitHub" so the documentation uses consistent product
capitalization.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a8aff78d-1e1d-4375-9084-1f7286b4ee63

📥 Commits

Reviewing files that changed from the base of the PR and between 5893671 and bfc10ed.

📒 Files selected for processing (2)
  • k8s/README.md
  • k8s/SECURITY.md
✅ Files skipped from review due to trivial changes (1)
  • k8s/README.md

Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
k8s/SECURITY.md (1)

80-80: ⚠️ Potential issue | 🟡 Minor

Fix casing: `github` → `GitHub`.

The official name is "GitHub" with a capital H. As per coding guidelines, product names must use correct casing.

📝 Proposed fix
-services. Policies (`pypi`, `npm`, `github`, `huggingface`, etc.)
+services. Policies (`pypi`, `npm`, `GitHub`, `huggingface`, etc.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@k8s/SECURITY.md` at line 80, Update the product casing in the services list:
replace the lowercase token "github" in the string "services. Policies (`pypi`,
`npm`, `github`, `huggingface`, etc.)" with the correct "GitHub" spelling so the
line reads with proper product casing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@k8s/SECURITY.md`:
- Line 80: Update the product casing in the services list: replace the lowercase
token "github" in the string "services. Policies (`pypi`, `npm`, `github`,
`huggingface`, etc.)" with the correct "GitHub" spelling so the line reads with
proper product casing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1f6421f2-5b3f-4a1a-a17d-2dab6d6ab06d

📥 Commits

Reviewing files that changed from the base of the PR and between bfc10ed and 883ff42.

📒 Files selected for processing (2)
  • k8s/README.md
  • k8s/SECURITY.md
✅ Files skipped from review due to trivial changes (1)
  • k8s/README.md

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@k8s/SECURITY.md`:
- Around line 83-85: The doc recommends setting NEMOCLAW_POLICY_MODE=enforce
which the app-side validation does not accept; update the recommendation to use
a supported mode (e.g., NEMOCLAW_POLICY_MODE=suggested or
NEMOCLAW_POLICY_MODE=skip, or the specific alias your runtime accepts) or add a
clear note saying "only use =enforce if your runtime has been updated to accept
it"; ensure mentions of NEMOCLAW_POLICY_MODE at the two places referenced are
changed to a supported value or annotated to avoid startup failures.
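
As a hedged illustration of the fix described above, the fragment below sets the variable to `suggested`, one of the values this comment names as accepted. It is a sketch of a container `env` entry, not the text of `k8s/SECURITY.md`; whether any other value (including `enforce`) passes validation depends on the runtime version and is not confirmed here.

```yaml
# Fragment for a container spec (spec.containers[].env).
env:
  - name: NEMOCLAW_POLICY_MODE
    # "suggested" is a value the review cites as supported; confirm it
    # against your runtime before relying on it in production.
    value: "suggested"
```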
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9e628701-a7d2-415d-8326-99cce9f1b4c5

📥 Commits

Reviewing files that changed from the base of the PR and between 883ff42 and 4cc36c2.

📒 Files selected for processing (1)
  • k8s/SECURITY.md

Comment thread k8s/SECURITY.md Outdated
@wscurran added the status: rebase (PR needs to be rebased against main before review can continue) label Apr 15, 2026
@ColinM-sys ColinM-sys force-pushed the fix/1442-k8s-security-doc branch from 4cc36c2 to 723e45b Compare April 15, 2026 14:57
@ColinM-sys
Contributor Author

Rebased onto current main.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
k8s/SECURITY.md (1)

83-85: ⚠️ Potential issue | 🟠 Major

Avoid recommending NEMOCLAW_POLICY_MODE=enforce unless runtime support is confirmed.

The production recommendation still uses enforce in two places; if that value is not accepted by current runtime validation, users can copy/paste into startup failures. Please switch to a supported value (or add an explicit version-gated note).

Also applies to: 177-178

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@k8s/SECURITY.md` around lines 83 - 85, The docs currently recommend setting
NEMOCLAW_POLICY_MODE=enforce which may be rejected by some runtimes; update both
occurrences (the block mentioning "Production alternative" and the later
instance at lines ~177-178) to either use a supported value (e.g., "warn" or
"audit") or add an explicit version-gated note stating "use `enforce` only if
your runtime validates and supports it (version X.Y+)" and show the fallback
value; reference the NEMOCLAW_POLICY_MODE env var and the two text locations so
the wording is replaced or annotated consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@k8s/SECURITY.md`:
- Around line 111-118: The documentation snippet incorrectly shows
COMPATIBLE_API_KEY set to "dummy"; update the SECURITY.md example and threat
model to reflect the actual implementation in k8s/nemoclaw-k8s.yaml which uses
valueFrom.secretKeyRef to pull the key from a Kubernetes Secret and only uses
the dummy value as a fallback in the startup shell logic; replace the hardcoded
YAML example and wording to show the secretKeyRef usage and mention the runtime
fallback behavior so readers aren’t misinformed.
- Around line 70-75: Update the SECURITY.md section that shows
NEMOCLAW_POLICY_MODE=skip so it matches the current manifest which sets
NEMOCLAW_POLICY_MODE="suggested" (change the YAML snippet value and any text
that claims POLICY_MODE=skip); also revise the downstream sentence that
currently states `POLICY_MODE=skip` disables enforcement to instead describe
that a permissive mode (e.g., "suggested") weakens NemoClaw's network guardrails
but does not fully disable them. Locate references to the NEMOCLAW_POLICY_MODE
setting and the evaluation manifest (nemoclaw-k8s.yaml) and replace the
incorrect literal and explanatory text accordingly.
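
For the first of the two comments above, a corrected excerpt would follow the general `secretKeyRef` shape sketched below. This only illustrates the Kubernetes pattern: the Secret name and key are hypothetical, and the exact names and the fallback logic should be copied from `k8s/nemoclaw-k8s.yaml` rather than from this sketch.

```yaml
# Fragment for a container spec (spec.containers[].env).
env:
  - name: COMPATIBLE_API_KEY
    valueFrom:
      secretKeyRef:
        name: nemoclaw-api-key    # hypothetical Secret name
        key: api-key              # hypothetical key inside the Secret
        # optional: true lets the pod start when the Secret is absent,
        # which is what a runtime fallback value would rely on.
        optional: true
```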

---

Duplicate comments:
In `@k8s/SECURITY.md`:
- Around line 83-85: The docs currently recommend setting
NEMOCLAW_POLICY_MODE=enforce which may be rejected by some runtimes; update both
occurrences (the block mentioning "Production alternative" and the later
instance at lines ~177-178) to either use a supported value (e.g., "warn" or
"audit") or add an explicit version-gated note stating "use `enforce` only if
your runtime validates and supports it (version X.Y+)" and show the fallback
value; reference the NEMOCLAW_POLICY_MODE env var and the two text locations so
the wording is replaced or annotated consistently.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a029a9ad-2faa-41ec-9171-5c3108cf7ca5

📥 Commits

Reviewing files that changed from the base of the PR and between 4cc36c2 and 723e45b.

📒 Files selected for processing (2)
  • k8s/README.md
  • k8s/SECURITY.md
✅ Files skipped from review due to trivial changes (1)
  • k8s/README.md

Comment thread k8s/SECURITY.md Outdated
Comment thread k8s/SECURITY.md Outdated
@wscurran removed the status: rebase (PR needs to be rebased against main before review can continue) label Apr 15, 2026
The k8s/README.md calls the manifest "experimental" but does not
spell out which specific patterns are unsafe in production. A user
deploying to a real cluster has no way to know that
`privileged: true`, `DOCKER_TLS_CERTDIR=""`, `POLICY_MODE=skip`,
`curl | bash` at pod start, the `dummy` placeholder API key, the
absence of any NetworkPolicy, and the absence of resource limits
are all *intentional* tradeoffs for a kubectl-apply-and-try-it-out
flow — and not a production blueprint.

Add k8s/SECURITY.md, walking through every risky pattern in the
manifest, why it is unsafe in production, and what a production
alternative would look like. Cross-link from k8s/README.md so the
warning is discoverable from the existing entry point.

Refs: NVIDIA#1442
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
…s per CodeRabbit

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
…ection

Signed-off-by: ColinM-sys <cmcdonough@50words.com>
@ColinM-sys ColinM-sys force-pushed the fix/1442-k8s-security-doc branch from b1647b6 to 01c2f93 Compare April 16, 2026 15:22
Labels

documentation Improvements or additions to documentation K8s Use this label to identify Kubernetes deployment issues with NemoClaw.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

K8s Deployment Manifest Lacks Security Documentation and Production Warnings - IssueFinder - SN 18

2 participants