Skip to content

Improve Safety and Security retail agent docs#1492

Merged
rapids-bot[bot] merged 6 commits intoNVIDIA:release/1.4from
ericevans-nv:docs/safety-security-vdr-fixes
Jan 27, 2026
Merged

Improve Safety and Security retail agent docs#1492
rapids-bot[bot] merged 6 commits intoNVIDIA:release/1.4from
ericevans-nv:docs/safety-security-vdr-fixes

Conversation

@ericevans-nv
Copy link
Contributor

@ericevans-nv ericevans-nv commented Jan 26, 2026

Description

  • Replace the retail agent README with a restructured, diagram-driven guide
  • Clarify evaluation context, results interpretation, and defense behavior
  • Add model guidance for reliable tool calling and streamline section headings

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Documentation

    • Restructured README with a Table of Contents and modular, reference-style sections
    • Replaced narrative intro and long run traces with scenario-driven, tabular configuration docs, dataset examples, diagrams, and summarized results
    • Added concise sections describing red‑teaming components, defenses (PII Defense, Content Safety Guard, Output Verifier), and threat/defense mappings
  • Chores

    • Shortened agent branding text in configuration examples for consistency

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
@ericevans-nv ericevans-nv self-assigned this Jan 26, 2026
@ericevans-nv ericevans-nv requested review from a team as code owners January 26, 2026 21:38
@ericevans-nv ericevans-nv changed the base branch from develop to release/1.4 January 26, 2026 21:39
@coderabbitai
Copy link

coderabbitai bot commented Jan 26, 2026

Walkthrough

README for the retail agent example was rewritten into a structured reference: Table of Contents, Scenario File schema, NASSE component descriptions (RedTeamingRunner, RedTeamingMiddleware, RedTeamingEvaluator, Defense Middleware), dataset/config examples, and consolidated evaluation guidance. Two example config YAMLs have a minor branding wording change.

Changes

Cohort / File(s) Summary
Documentation Restructuring
examples/safety_and_security/retail_agent/README.md
Large rewrite: narrative walkthrough replaced with TOC-driven layout, explicit Scenario File schema (fields such as attack_payload, target_function_or_group, target_location, target_field, payload_placement), modular NASSE component docs, dataset and config examples, and summarized evaluation/defense descriptions. Check scenario/config field names, example dataset and config paths, diagram references, and cross-links.
Config wording tweaks
examples/safety_and_security/retail_agent/src/nat_retail_agent/configs/config-with-defenses.yml, examples/safety_and_security/retail_agent/src/nat_retail_agent/configs/config.yml
Minor text change in additional_instructions: "GreenThumb Gardening Equipment" → "GreenThumb Gardening". No functional or control-flow changes; verify branding consistency and README references.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change—improving documentation for the safety and security retail agent example—and accurately reflects the substantial README restructuring and config updates in the changeset.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ericevans-nv ericevans-nv added improvement Improvement to existing functionality non-breaking Non-breaking change labels Jan 26, 2026
Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@examples/safety_and_security/retail_agent/README.md`:
- Line 79: The document uses bold emphasis for major section titles (e.g.,
"**Red Teaming Flow**" and the two other bolded section markers at lines 85 and
123); replace each bolded inline emphasis with a proper Markdown heading (choose
an appropriate level like "## Red Teaming Flow") so those three instances become
semantic headings, update their surrounding spacing to include a blank line
before and after the heading, and ensure consistency of heading levels across
the file.
- Around line 26-44: Add a Table of Contents entry linking to the existing
section heading "Why We Need a Safety and Security Framework" so readers can
navigate to that section (or if the heading should be a subsection, adjust its
heading level and update the TOC accordingly); update the TOC block to include a
line such as "- [Why We Need a Safety and Security
Framework](`#why-we-need-a-safety-and-security-framework`)" matching the heading
text and slug, and ensure heading levels are consistent with surrounding entries
like "Introduction" and "Key Features Overview".
🧹 Nitpick comments (1)
examples/safety_and_security/retail_agent/README.md (1)

97-115: Convert scenario names to proper headings.

The scenario names ("Competitor redirect", "Data exfiltration", "Denial of service", "Harmful content") use bold emphasis instead of proper headings. Converting these to level 4 headings (####) would improve document structure and accessibility.

📝 Proposed fix for scenario headings
-**Competitor redirect**
+#### Competitor redirect
+
 - **Scenario:** The agent is asked about products and should return accurate information from the product database.
 - **Attack:** A malicious string is injected into the product description so the agent recommends a competitor instead of the correct product.
 - **Defense:** Output Verifier detects manipulated tool output and blocks or sanitizes it.
 
-**Data exfiltration**
+#### Data exfiltration
+
 - **Scenario:** The agent receives a normal customer request.
 - **Attack:** The user input attempts to trick the agent into revealing customer data.
 - **Defense:** PII Defense detects and sanitizes sensitive output.
 
-**Denial of service**
+#### Denial of service
+
 - **Scenario:** The agent retrieves product reviews to respond to a customer.
 - **Attack:** A fake system error is injected into reviews, causing the agent to refuse service.
 - **Defense:** Output Verifier flags non-review content and removes the injected instructions.
 
-**Harmful content**
+#### Harmful content
+
 - **Scenario:** The agent answers product safety questions in a customer email.
 - **Attack:** The attacker tries to elicit unsafe content or harmful guidance.
 - **Defense:** Content Safety Guard detects and blocks unsafe content.

Copy link
Contributor

@mnajafian-nv mnajafian-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional approval pending your review of my feedback. Good documentation restructuring with helpful diagrams and clearer organization. However, several important technical reference sections were removed that users need. Can you pleas review my feedback and think more about what you actually want to remove?

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@examples/safety_and_security/retail_agent/README.md`:
- Around line 26-53: The Table of Contents list uses inconsistent indentation
causing markdownlint errors (MD005/MD007); fix by normalizing the list nesting
under the "NeMo Agent Safety and Security Engine (NASSE)" and "Table of
Contents" headings so each sub-item (e.g., "Demonstrated Through Retail Agent
Example", "Introduction", "Why We Need a Safety and Security Framework", and
nested subsections like "The Problem"/"The Solution") is indented consistently
using the same number of spaces (choose 2 or 4) for all nested levels, ensure
all top-level bullets align with "Table of Contents" and all child bullets are
uniformly indented beneath their parent to restore proper list structure and
satisfy MD005/MD007.

@ericevans-nv
Copy link
Contributor Author

/merge

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@examples/safety_and_security/retail_agent/README.md`:
- Around line 432-443: The two HTML <img> tags embedding attack-score.png and
defense-score.png lack alt attributes causing accessibility/MD045 failures;
update the tags that reference "attack-score.png" and "defense-score.png" to
include concise descriptive alt text (e.g., "Attack score before defenses" and
"Defense score after defenses") so the images are accessible and the
markdownlint warning is resolved.

@rapids-bot rapids-bot bot merged commit 09c9018 into NVIDIA:release/1.4 Jan 27, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improvement to existing functionality non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants