From 074d05762fc779f96f1531d6715910a846301e13 Mon Sep 17 00:00:00 2001
From: Johnny Greco <jogreco@nvidia.com>
Date: Tue, 7 Apr 2026 11:29:38 -0400
Subject: [PATCH 1/5] fix: prevent skill load failure when data-designer CLI is
 not installed

Append `|| true` to the shell command that resolves the data-designer
path so it always exits 0. Without this, the skill fails to load
entirely when the CLI is missing, and the "If blank, see
Troubleshooting" fallback is never reached.
---
 skills/data-designer/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
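
Note for reviewers: a minimal sketch of the exit-status behavior this patch relies on. The `resolve_cli` helper, its `$1` parameter, and the made-up CLI name are illustrative assumptions, not part of the skill. How the skill loader reacts to a non-zero exit is taken from the commit message above, not reproduced here.

```shell
# Hypothetical helper wrapping the lookup chain from SKILL.md.
# "$1" stands in for the CLI name so the sketch can probe a name that is absent.
resolve_cli() {
  command -v "$1" 2>/dev/null || (test -x ".venv/bin/$1" && realpath ".venv/bin/$1")
}

# Assumed not to exist on PATH or in .venv/bin.
missing="definitely-not-a-real-cli-$$"

# Lookup alone: both branches fail, so the whole list exits non-zero.
if resolve_cli "$missing"; then
  echo "lookup alone: exit 0"
else
  echo "lookup alone: non-zero exit"
fi

# With the fallback appended, the list exits 0 and prints nothing.
if resolve_cli "$missing" || true; then
  echo "with '|| true': exit 0"
fi
```

The output stays blank in the second case, which is what the "If blank, see Troubleshooting" instruction depends on.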

diff --git a/skills/data-designer/SKILL.md b/skills/data-designer/SKILL.md
index cbb05c1a..feda1d87 100644
--- a/skills/data-designer/SKILL.md
+++ b/skills/data-designer/SKILL.md
@@ -8,7 +8,7 @@ argument-hint: [describe the dataset you want to generate]
 
 Do not explore the workspace first. The workflow's Learn step gives you everything you need.
 
-`data-designer` command: !`command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer)`
+`data-designer` command: !`command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer) || true`
 Use this path for all `data-designer` commands throughout this skill. If blank, see Troubleshooting.
 
 # Goal

From e6a96f1bb4317baa30ce2abcf3bb813922cc8fab Mon Sep 17 00:00:00 2001
From: Johnny Greco <jogreco@nvidia.com>
Date: Tue, 7 Apr 2026 11:57:43 -0400
Subject: [PATCH 2/5] fix: use explicit NOT_FOUND sentinel when data-designer
 CLI is missing

Replace `|| true`, which leaves the output blank, with `|| echo NOT_FOUND`
so the agent sees an explicit signal. Make the follow-up instruction bold
and imperative so the agent actually follows it.
---
 skills/data-designer/SKILL.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/skills/data-designer/SKILL.md b/skills/data-designer/SKILL.md
index feda1d87..b4da1c7f 100644
--- a/skills/data-designer/SKILL.md
+++ b/skills/data-designer/SKILL.md
@@ -8,8 +8,8 @@ argument-hint: [describe the dataset you want to generate]
 
 Do not explore the workspace first. The workflow's Learn step gives you everything you need.
 
-`data-designer` command: !`command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer) || true`
-Use this path for all `data-designer` commands throughout this skill. If blank, see Troubleshooting.
+`data-designer` command: !`command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer) || echo NOT_FOUND`
+Use this path for all `data-designer` commands throughout this skill. **If the value is `NOT_FOUND`, STOP and follow the Troubleshooting section before doing anything else.**
 
 # Goal
 

From aa34af2b6676bb3152ebbb95c16f18be02a3743f Mon Sep 17 00:00:00 2001
From: Johnny Greco <jogreco@nvidia.com>
Date: Tue, 7 Apr 2026 14:14:09 -0400
Subject: [PATCH 3/5] fix: move CLI resolution into workflow steps instead of
 skill preamble

Remove the !`command` substitution from SKILL.md and add a "Resolve CLI
command" step to both workflows. The agent now runs the lookup itself
and uses the result as the data-designer executable for all subsequent
commands. If the command fails, the agent stops and follows
Troubleshooting.
---
 skills/data-designer/SKILL.md                 |  3 ---
 skills/data-designer/workflows/autopilot.md   | 19 +++++++++++--------
 skills/data-designer/workflows/interactive.md | 19 +++++++++++--------
 3 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/skills/data-designer/SKILL.md b/skills/data-designer/SKILL.md
index b4da1c7f..ddee328a 100644
--- a/skills/data-designer/SKILL.md
+++ b/skills/data-designer/SKILL.md
@@ -8,9 +8,6 @@ argument-hint: [describe the dataset you want to generate]
 
 Do not explore the workspace first. The workflow's Learn step gives you everything you need.
 
-`data-designer` command: !`command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer) || echo NOT_FOUND`
-Use this path for all `data-designer` commands throughout this skill. **If the value is `NOT_FOUND`, STOP and follow the Troubleshooting section before doing anything else.**
-
 # Goal
 
 Build a synthetic dataset using the Data Designer library that matches this description:
diff --git a/skills/data-designer/workflows/autopilot.md b/skills/data-designer/workflows/autopilot.md
index 2f13b7e7..c56c070f 100644
--- a/skills/data-designer/workflows/autopilot.md
+++ b/skills/data-designer/workflows/autopilot.md
@@ -2,25 +2,28 @@
 
 In this mode, make reasonable design decisions autonomously based on the dataset description. Do not ask clarifying questions — infer sensible defaults and move straight through to a working preview.
 
-1. **Learn** — Run `data-designer agent context`.
+1. **Resolve CLI command** — Run `command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer)`.
+  - If the command outputs a path, use it as the `data-designer` executable for all commands in this workflow.
+  - If it produces no output or fails, STOP and follow the Troubleshooting section in SKILL.md. Do not continue to the next step.
+2. **Learn** — Run `data-designer agent context`.
   - If no model aliases are configured, stop and tell the user to run `data-designer config` to set them up before proceeding.
   - Inspect schemas for every column, sampler type, validator, and processor you plan to use.
   - Never guess types or parameters — read the relevant config files first.
   - Always read `base.py` for inherited fields shared by all config objects.
-2. **Infer** — Based on the dataset description, make reasonable decisions for:
+3. **Infer** — Based on the dataset description, make reasonable decisions for:
   - Axes of diversity and what should be well represented.
   - Which variables to randomize.
   - The schema of the final dataset.
   - The structure of any structured output columns.
   - Briefly state the key decisions you made so the user can course-correct if needed.
-3. **Plan** — Determine columns, samplers, processors, validators, and other dataset features needed.
-4. **Build** — Write the Python script with `load_config_builder()` (see Output Template in SKILL.md).
-5. **Validate** — Run `data-designer validate <path>`. Address any warnings or errors and re-validate until it passes.
-6. **Preview** — Run `data-designer preview <path> --save-results` to generate sample records as HTML files.
+4. **Plan** — Determine columns, samplers, processors, validators, and other dataset features needed.
+5. **Build** — Write the Python script with `load_config_builder()` (see Output Template in SKILL.md).
+6. **Validate** — Run `data-designer validate <path>`. Address any warnings or errors and re-validate until it passes.
+7. **Preview** — Run `data-designer preview <path> --save-results` to generate sample records as HTML files.
   - Note the sample records directory printed by the `data-designer preview` command
   - Give the user a clickable link: `file://<sample-records-dir>/sample_records_browser.html`
-7. **Create** — If the user specified a record count:
+8. **Create** — If the user specified a record count:
   - Run `data-designer create <path> --num-records <N> --dataset-name <name>`.
   - Generation speed depends heavily on the dataset configuration and the user's inference setup. For larger datasets, warn the user and ask for confirmation before running.
   - If no record count was specified, skip this step.
-8. **Present** — Summarize what was built: columns, samplers used, key design choices. If the create command was run, share the results. Ask the user if they want any changes. If so, edit the script, re-validate, re-preview, and iterate.
+9. **Present** — Summarize what was built: columns, samplers used, key design choices. If the create command was run, share the results. Ask the user if they want any changes. If so, edit the script, re-validate, re-preview, and iterate.
diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index d4a4ab33..5e3d87f7 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -2,12 +2,15 @@
 
 This is an interactive, iterative design process. Do not disengage from the loop unless the user says they are satisfied.
 
-1. **Learn** — Run `data-designer agent context`.
+1. **Resolve CLI command** — Run `command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer)`.
+  - If the command outputs a path, use it as the `data-designer` executable for all commands in this workflow.
+  - If it produces no output or fails, STOP and follow the Troubleshooting section in SKILL.md. Do not continue to the next step.
+2. **Learn** — Run `data-designer agent context`.
   - If no model aliases are configured, stop and tell the user to run `data-designer config` to set them up before proceeding.
   - Inspect schemas for every column, sampler type, validator, and processor you plan to use.
   - Never guess types or parameters — read the relevant config files first.
   - Always read `base.py` for inherited fields shared by all config objects.
-2. **Clarify** — Ask the user clarifying questions to narrow down precisely what they want.
+3. **Clarify** — Ask the user clarifying questions to narrow down precisely what they want.
   - Optimize for a great user experience: prefer a structured question tool over plain text if one is available, batch related questions together, keep the set short, provide concrete options/examples/defaults where possible, and use structured inputs (single-select, multi-select, free text, etc.) when they make answering easier.
   - If multiple model aliases are available, ask which one(s) to use (or default to an alias with the appropriate `generation_type` for each column).
   - Common things to make precise:
@@ -17,17 +20,17 @@ This is an interactive, iterative design process. Do not disengage from the loop
     - The schema of the final dataset.
     - The structure of any required structured output columns.
     - What facets of the output dataset are important to capture.
-3. **Plan** — Determine columns, samplers, processors, validators, and other dataset features needed. Present the plan to the user and ask if they want any changes before generating a preview.
-4. **Build** — Write the Python script with `load_config_builder()` (see Output Template in SKILL.md).
-5. **Validate** — Run `data-designer validate <path>`. Address any warnings or errors and re-validate until it passes.
-6. **Preview** — Run `data-designer preview <path> --save-results` to generate sample records as HTML files.
+4. **Plan** — Determine columns, samplers, processors, validators, and other dataset features needed. Present the plan to the user and ask if they want any changes before generating a preview.
+5. **Build** — Write the Python script with `load_config_builder()` (see Output Template in SKILL.md).
+6. **Validate** — Run `data-designer validate <path>`. Address any warnings or errors and re-validate until it passes.
+7. **Preview** — Run `data-designer preview <path> --save-results` to generate sample records as HTML files.
   - Note the sample records directory printed by the `data-designer preview` command
   - Give the user a clickable link: `file://<sample-records-dir>/sample_records_browser.html`
-7. **Iterate**
+8. **Iterate**
    - Ask the user for feedback.
    - Offer to review the records yourself and suggest improvements. If the user accepts, read `references/preview-review.md` for guidance.
    - Apply changes, re-validate, and re-preview. Repeat until the user is satisfied.
-8. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
+9. **Finalize** — Once the user is happy, tell them they can run the following command to create the dataset:
   - `data-designer create <path> --num-records <N> --dataset-name <name>`.
   - Caution the user that generation speed depends heavily on the dataset configuration and their inference setup.
   - Do not run this command yourself — the user should control when it runs.

From ea1e2f5a42ba96fb86c1e6f403e48b6cab8c8e44 Mon Sep 17 00:00:00 2001
From: Johnny Greco <jogreco@nvidia.com>
Date: Tue, 7 Apr 2026 14:20:24 -0400
Subject: [PATCH 4/5] fix: use CLI_NOT_FOUND sentinel to avoid triggering agent
 error-fixing

The resolve command now always exits 0 and outputs CLI_NOT_FOUND when
the executable is missing, so the agent evaluates a value rather than
reacting to a shell error.
---
 skills/data-designer/workflows/autopilot.md   | 6 +++---
 skills/data-designer/workflows/interactive.md | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)
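
Note for reviewers: a rough sketch of how a caller can branch on the sentinel value instead of on an exit status. The `resolve_cli` helper, its `$1` parameter, and the echoed messages are hypothetical; in the skill, the agent reads the output and makes this decision itself.

```shell
# Hypothetical helper wrapping the resolve command from the workflows.
# It always exits 0; a missing executable is reported as CLI_NOT_FOUND.
resolve_cli() {
  command -v "$1" 2>/dev/null \
    || (test -x ".venv/bin/$1" && realpath ".venv/bin/$1") \
    || echo CLI_NOT_FOUND
}

# Probe a name assumed to be absent so the sentinel branch is taken.
cli=$(resolve_cli "definitely-not-a-real-cli-$$")

case "$cli" in
  CLI_NOT_FOUND) echo "stop: follow Troubleshooting in SKILL.md" ;;
  *)             echo "use: $cli" ;;
esac
```

Because the command substitution never fails, the only thing left to evaluate is the string in `cli`.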

diff --git a/skills/data-designer/workflows/autopilot.md b/skills/data-designer/workflows/autopilot.md
index c56c070f..e6c2a396 100644
--- a/skills/data-designer/workflows/autopilot.md
+++ b/skills/data-designer/workflows/autopilot.md
@@ -2,9 +2,9 @@
 
 In this mode, make reasonable design decisions autonomously based on the dataset description. Do not ask clarifying questions — infer sensible defaults and move straight through to a working preview.
 
-1. **Resolve CLI command** — Run `command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer)`.
-  - If the command outputs a path, use it as the `data-designer` executable for all commands in this workflow.
-  - If it produces no output or fails, STOP and follow the Troubleshooting section in SKILL.md. Do not continue to the next step.
+1. **Resolve CLI command** — Run `command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer) || echo CLI_NOT_FOUND`.
+  - If the output is a path, use it as the `data-designer` executable for all commands in this workflow.
+  - If the output is `CLI_NOT_FOUND`, STOP and follow the Troubleshooting section in SKILL.md. Do not continue to the next step.
 2. **Learn** — Run `data-designer agent context`.
   - If no model aliases are configured, stop and tell the user to run `data-designer config` to set them up before proceeding.
   - Inspect schemas for every column, sampler type, validator, and processor you plan to use.
diff --git a/skills/data-designer/workflows/interactive.md b/skills/data-designer/workflows/interactive.md
index 5e3d87f7..590447b6 100644
--- a/skills/data-designer/workflows/interactive.md
+++ b/skills/data-designer/workflows/interactive.md
@@ -2,9 +2,9 @@
 
 This is an interactive, iterative design process. Do not disengage from the loop unless the user says they are satisfied.
 
-1. **Resolve CLI command** — Run `command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer)`.
-  - If the command outputs a path, use it as the `data-designer` executable for all commands in this workflow.
-  - If it produces no output or fails, STOP and follow the Troubleshooting section in SKILL.md. Do not continue to the next step.
+1. **Resolve CLI command** — Run `command -v data-designer 2>/dev/null || (test -x .venv/bin/data-designer && realpath .venv/bin/data-designer) || echo CLI_NOT_FOUND`.
+  - If the output is a path, use it as the `data-designer` executable for all commands in this workflow.
+  - If the output is `CLI_NOT_FOUND`, STOP and follow the Troubleshooting section in SKILL.md. Do not continue to the next step.
 2. **Learn** — Run `data-designer agent context`.
   - If no model aliases are configured, stop and tell the user to run `data-designer config` to set them up before proceeding.
   - Inspect schemas for every column, sampler type, validator, and processor you plan to use.

From 8406bffd6785bda1efc361063af63c4a9a92e457 Mon Sep 17 00:00:00 2001
From: Johnny Greco <jogreco@nvidia.com>
Date: Tue, 7 Apr 2026 16:01:49 -0400
Subject: [PATCH 5/5] fix: require user permission before installing
 data-designer

Update Troubleshooting to ask the user before creating a venv or
installing packages, instead of attempting it automatically.
---
 skills/data-designer/SKILL.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/skills/data-designer/SKILL.md b/skills/data-designer/SKILL.md
index ddee328a..51cbdef3 100644
--- a/skills/data-designer/SKILL.md
+++ b/skills/data-designer/SKILL.md
@@ -39,7 +39,7 @@ Read **only** the workflow file that matches the selected mode, then follow it:
 
 # Troubleshooting
 
-- **`data-designer` command not found:** If no virtual environment exists, create one first (`python -m venv .venv && source .venv/bin/activate`), then install (`pip install data-designer`). If a virtual environment already exists, activate it and verify the package is installed.
+- **`data-designer` CLI not found:** Tell the user that `data-designer` is not installed in this environment (requires Python >= 3.10). Ask if they would like you to create a virtual environment and install it, or if they prefer to do it themselves. Do not install anything without the user's permission.
 - **Network errors during preview:** A sandbox environment may be blocking outbound requests. Ask the user for permission to retry the command with the sandbox disabled. Only as a last resort, if retrying outside the sandbox also fails, tell the user to run the command themselves.
 
 # Output Template