[DSPX-3302] (5/5) Claude plugin: bug-repro skills for OpenTDF #454
dmihalcik-virtru wants to merge 5 commits into
Code Review
This pull request introduces the opentdf-test-harness Claude plugin, which includes several skills for bug reproduction, environment management, and test execution. The review feedback highlights the need to clarify that the agent lacks git write permissions for branch creation and suggests using the instance-status skill to prevent port collisions when initializing new scenarios.
| - `metadata.id = <jira-key-lowercased>` — e.g. `DSPX-3302` → `dspx-3302`. | ||
| - Scenario file path: `xtest/scenarios/<jira-key-lowercased>.yaml`. | ||
| - If you need a new git branch, propose `<JIRA-KEY>-repro` (e.g. `DSPX-3302-repro`) and let the user confirm before switching. |
The current permissions defined in .claude/settings.json and .claude/plugin/plugin.json do not include git write access (e.g., git checkout -b or git branch). The instruction should be updated to direct the agent to ask the user to create and switch to the branch, rather than implying it can perform the switch itself after confirmation.
| - If you need a new git branch, propose `<JIRA-KEY>-repro` (e.g. `DSPX-3302-repro`) and let the user confirm before switching. | |
| - If you need a new git branch, propose `<JIRA-KEY>-repro` (e.g. `DSPX-3302-repro`) and instruct the user to create and switch to it (you do not have git write permissions). |
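The naming rules in this hunk (lowercased Jira key for `metadata.id`, `xtest/scenarios/<jira-key-lowercased>.yaml` for the file, `<JIRA-KEY>-repro` for the proposed branch) can be sketched as a small helper. This is a minimal illustration only: the function name `scenario_names` and the Jira-key regex are assumptions, not part of the skill or the plugin.

```python
import re
from pathlib import PurePosixPath

# Assumed shape of a Jira key (PROJECT-123); the skill text does not define one.
JIRA_KEY = re.compile(r"^[A-Z][A-Z0-9]*-\d+$")


def scenario_names(jira_key: str) -> dict[str, str]:
    """Derive the scenario id, file path, and proposed branch from a Jira key.

    Hypothetical helper: only the three naming rules come from the skill text.
    """
    if not JIRA_KEY.match(jira_key):
        raise ValueError(f"not a Jira key: {jira_key!r}")
    scenario_id = jira_key.lower()
    return {
        "id": scenario_id,
        "path": str(PurePosixPath("xtest/scenarios") / f"{scenario_id}.yaml"),
        # The branch is only proposed; creating it is left to the user.
        "branch": f"{jira_key}-repro",
    }


names = scenario_names("DSPX-3302")
```

The branch name is returned as a suggestion rather than acted on, which matches the review point that the agent has no git write permissions.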
| instance: | ||
| metadata: { name: <jira-key-lowercased> } | ||
| platform: { dist: <platform_version> } | ||
| ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> } |
To avoid port collisions when multiple scenarios are active, the agent should be explicitly instructed to use the instance-status skill to check for existing instances and their port bases before picking a new one for the YAML manifest.
| ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> } | |
| ports: { base: <check instance-status and pick a free base; 8080 if none, else highest + 1000> } |
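The suggested rule ("8080 if none, else highest + 1000") is simple enough to state as code. The sketch below assumes the port bases of existing instances have already been collected, for example from the output of the instance-status skill; how that output is parsed is not specified here, and `pick_port_base` is a hypothetical name.

```python
DEFAULT_BASE = 8080  # base for the first instance, per the suggestion
STEP = 1000          # spacing between concurrent scenarios
MAX_PORT = 65535     # highest valid TCP port


def pick_port_base(bases_in_use) -> int:
    """Return 8080 when no instance exists, otherwise highest base + 1000."""
    used = set(bases_in_use)
    candidate = DEFAULT_BASE if not used else max(used) + STEP
    if candidate > MAX_PORT:
        raise ValueError("no free port base left below 65536")
    return candidate


first = pick_port_base([])
third = pick_port_base([8080, 9080])
```

Taking the highest base rather than the first gap keeps the rule easy to predict, at the cost of never reusing a base freed by a torn-down instance.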
Code Review
This pull request introduces a Claude plugin for the OpenTDF test harness, featuring skills to automate bug reproduction from Jira reports, environment provisioning, and test execution. Feedback focuses on correcting missing permissions in the plugin manifest and settings for tools like git, grep, and glob. Additionally, improvements were suggested for the scenario generation skill, specifically regarding the use of current dates in metadata and utilizing the instance-status skill for more reliable port allocation.
| "allow": [ | ||
| "Bash(uv run otdf-local *)", | ||
| "Bash(uv run otdf-sdk-mgr *)", | ||
| "Bash(uv run pytest *)", | ||
| "Bash(acli jira workitem view *)", | ||
| "Bash(acli jira workitem search *)", | ||
| "Bash(acli jira workitem comment list *)", | ||
| "Bash(acli jira workitem comment create *)", | ||
| "Bash(acli jira workitem attachment list *)", | ||
| "Bash(acli jira workitem link list *)", | ||
| "Bash(acli jira project view *)", | ||
| "Write(xtest/scenarios/**)", | ||
| "Write(xtest/bug_*_test.py)", | ||
| "Write(tests/instances/**)" | ||
| ] |
The plugin manifest is missing several essential permissions required by the skills. All skills require Read access to examine files (like scenarios and test patterns), and scenario-from-bug-report specifically requires Grep, Glob, and git permissions to search for existing tests and manage reproduction branches. Additionally, Write access to .claude/tmp/** is often useful for intermediate operations.
"allow": [
"Read(**)",
"Grep(**)",
"Glob(**)",
"Bash(git *)",
"Bash(uv run otdf-local *)",
"Bash(uv run otdf-sdk-mgr *)",
"Bash(uv run pytest *)",
"Bash(acli jira workitem view *)",
"Bash(acli jira workitem search *)",
"Bash(acli jira workitem comment list *)",
"Bash(acli jira workitem comment create *)",
"Bash(acli jira workitem attachment list *)",
"Bash(acli jira workitem link list *)",
"Bash(acli jira project view *)",
"Write(xtest/scenarios/**)",
"Write(xtest/bug_*_test.py)",
"Write(tests/instances/**)",
"Write(.claude/tmp/**)"
]
| "allow": [ | ||
| "Bash(uv run otdf-local *)", | ||
| "Bash(uv run otdf-sdk-mgr *)", | ||
| "Bash(uv run pytest *)", | ||
| "Bash(uv sync *)", | ||
| "Bash(git status *)", | ||
| "Bash(git diff *)", | ||
| "Bash(git log *)", | ||
| "Bash(git show *)", | ||
| "Bash(gh api *)", | ||
| "Bash(gh issue view *)", | ||
| "Bash(gh pr view *)", | ||
| "Bash(gh run *)", | ||
| "Bash(acli jira workitem view *)", | ||
| "Bash(acli jira workitem search *)", | ||
| "Bash(acli jira workitem comment list *)", | ||
| "Bash(acli jira workitem comment create *)", | ||
| "Bash(acli jira workitem attachment list *)", | ||
| "Bash(acli jira workitem link list *)", | ||
| "Bash(acli jira workitem watcher list *)", | ||
| "Bash(acli jira project view *)", | ||
| "Bash(acli jira board view *)", | ||
| "Bash(acli jira sprint view *)", | ||
| "Write(xtest/scenarios/**)", | ||
| "Write(xtest/bug_*_test.py)", | ||
| "Write(tests/instances/**)", | ||
| "Write(.claude/tmp/**)" | ||
| ] |
The local settings are missing Read, Grep, and Glob permissions, which are listed as allowed-tools in the skills. Also, the git permissions should be expanded to allow branch management (e.g., checkout, branch) as required by the scenario-from-bug-report skill.
"allow": [
"Read(**)",
"Grep(**)",
"Glob(**)",
"Bash(uv run otdf-local *)",
"Bash(uv run otdf-sdk-mgr *)",
"Bash(uv run pytest *)",
"Bash(uv sync *)",
"Bash(git *)",
"Bash(gh *)",
"Bash(acli jira workitem view *)",
"Bash(acli jira workitem search *)",
"Bash(acli jira workitem comment list *)",
"Bash(acli jira workitem comment create *)",
"Bash(acli jira workitem attachment list *)",
"Bash(acli jira workitem link list *)",
"Bash(acli jira workitem watcher list *)",
"Bash(acli jira project view *)",
"Bash(acli jira board view *)",
"Bash(acli jira sprint view *)",
"Write(xtest/scenarios/**)",
"Write(xtest/bug_*_test.py)",
"Write(tests/instances/**)",
"Write(.claude/tmp/**)"
]
| metadata: | ||
| id: <jira-key-lowercased> | ||
| title: "<Jira summary>" | ||
| created: <YYYY-MM-DD> |
| instance: | ||
| metadata: { name: <jira-key-lowercased> } | ||
| platform: { dist: <platform_version> } | ||
| ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> } |
The port selection logic is a bit manual. Since the instance-status skill is specifically designed to identify running instances and port usage, it should be the primary method for determining a free port base to avoid collisions.
| ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> } | |
| ports: { base: <pick a free base port; use the instance-status skill to check for collisions> } |
X-Test Results: ✅ js-v0.15.0
Adds five Claude Code skills under tests/.claude/skills/ that together
turn a Jira bug ticket into a running reproduction, plus a downstream-
installable plugin manifest under .claude/plugin/.
Why
---
The end-to-end goal of DSPX-3302 is to make bug reproduction approachable
for QA, downstream-product engineers, and CI. PRs 1-4 build the plumbing
(shared schema, platform installer, multi-instance otdf-local, xtest
conftest hooks). This PR is the user-facing surface: Claude can pull
context from Jira, draft an xtest/scenarios/<jira-key>.yaml (and, when
needed, an xtest/bug_<jira_key>_test.py), bring the environment up at
the right version pins, run the scenario's pytest selection, and tear
down.
Skills
------
scenario-from-bug-report
Pulls the Jira issue and its comments via `acli jira workitem view
--fields '*all' --json` and `acli jira workitem comment list`,
extracts version pins / KAS topology / container type / feature
flags, then writes xtest/scenarios/<jira-key-lowercased>.yaml
validated against otdf_sdk_mgr.schema.Scenario. Drafts a new
xtest/bug_<id>_test.py only when no existing pytest covers the
case; never silently lands assertions.
scenario-up
Runs `otdf-sdk-mgr install scenario`, then `otdf-local instance
init --from-scenario`, then `otdf-local --instance <name> up`, and
polls status until healthy. Surfaces logs rather than retrying
blindly when something stays unhealthy.
scenario-run
Invokes `otdf-local scenario run <path>` and classifies the
result: "bug reproduced" / "not reproduced" / "unrelated failure".
Cites the evidence line and points at per-service logs.
scenario-tear-down
Stops the instance and optionally removes the directory after
explicit user confirmation.
instance-status
Lists known instances, their port bases, health, and flags port
collisions.
Jira-safety
-----------
Permissions in both .claude/settings.json and the plugin manifest
allow only read+comment via acli jira: workitem view, workitem search,
workitem comment list, workitem comment create, plus a handful of
read-only project/board/sprint queries. edit, delete, transition,
assign, archive, link create, watcher add are all denied. The
plugin.json carries a permission_notes block explaining the policy.
Plugin manifest
---------------
.claude/plugin/plugin.json declares the skill names, runtime
requirements (uv, go, git, docker, acli), and the canonical permission
allowlist, so downstream first/third-party integrators can install
this plugin into their own Claude Code setups.
Refs: https://virtru.atlassian.net/browse/DSPX-3302
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
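The Jira-safety policy above amounts to a default-deny check against a short list of command prefixes. A minimal sketch of that idea follows, using the acli patterns from the plugin manifest hunk and Python's `fnmatch` as a stand-in matcher. This is an approximation for illustration: it is not Claude Code's actual permission matcher, `jira_command_allowed` is a hypothetical name, and the sketch does not guard against shell chaining such as `;` or `&&`.

```python
from fnmatch import fnmatchcase

# Patterns copied from the plugin manifest's allow list (Bash(...) wrapper removed).
ALLOWED_JIRA = [
    "acli jira workitem view *",
    "acli jira workitem search *",
    "acli jira workitem comment list *",
    "acli jira workitem comment create *",
    "acli jira workitem attachment list *",
    "acli jira workitem link list *",
    "acli jira project view *",
]


def jira_command_allowed(command: str) -> bool:
    """Default-deny: a command passes only if it matches an allowed pattern."""
    normalized = " ".join(command.split())  # collapse runs of whitespace
    return any(fnmatchcase(normalized, pattern) for pattern in ALLOWED_JIRA)


view_ok = jira_command_allowed("acli jira workitem view DSPX-3302 --json")
edit_ok = jira_command_allowed("acli jira workitem edit DSPX-3302 --summary x")
```

Because nothing is allowed unless listed, verbs such as `edit`, `delete`, and `transition` are rejected without needing their own deny entries in this sketch.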
…m-ticket (DSPX-3302)
Headless dogfooding (run-1 on DSPX-2719) showed the bug-only framing was
too narrow: the common workflow is writing tests for new features first
(TDD), not reproducing version-pinned bugs.
- Rename and rewrite the skill to branch on Jira Issue Type. Bug follows
the old expected/actual flow; Story/Task uses ref pins (`main`, feature
branch, PR SHA via `gh pr view --json headRefOid`) for forward-looking
regression gates; Spike bails out rather than fabricating. Mandates
`acli jira workitem comment list` and steers away from cli.sh greps
(both were run-1 gaps).
- New `scenario-matrix` sibling skill: write N scenario files from a
base × N refs (PRs/branches/releases). Schema/installer support was
already there via `PlatformPin.source.ref` and
`install_platform_source(ref)`, so no other changes were needed.
- `scenario-run` output classification generalized from "bug reproduced /
not reproduced" to "expected / unexpected outcome", with explicit
branches for bug-repro vs TDD interpretations.
- `scenario-up` description and `plugin.json` (description, skills array,
requirements) updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
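The "base × N refs" expansion done by `scenario-matrix` can be sketched as a pure function over scenario dictionaries. Several things here are assumptions made for illustration: the `instance.platform.source.ref` layout is inferred from `PlatformPin.source.ref` and the scenario template, the `<base-id>-<ref-slug>` file naming is hypothetical, and refs are taken as already resolved (the skill resolves PR numbers to head SHAs before this step).

```python
import copy
import re


def slug(ref: str) -> str:
    """Make a ref safe for use in an id or file name (hypothetical rule)."""
    return re.sub(r"[^a-z0-9]+", "-", ref.lower()).strip("-")


def expand_matrix(base: dict, refs: list[str]) -> dict[str, dict]:
    """Return one scenario per ref, keyed by the path it would be written to."""
    cells = {}
    for ref in refs:
        scenario = copy.deepcopy(base)  # never mutate the base scenario
        cell_id = f"{base['metadata']['id']}-{slug(ref)}"
        scenario["metadata"]["id"] = cell_id
        scenario["instance"]["metadata"]["name"] = cell_id
        # Replace any dist pin with a source ref pin (assumed field layout).
        scenario["instance"]["platform"] = {"source": {"ref": ref}}
        cells[f"xtest/scenarios/{cell_id}.yaml"] = scenario
    return cells


base = {
    "metadata": {"id": "dspx-2719"},
    "instance": {"metadata": {"name": "dspx-2719"}, "platform": {"dist": "lts"}},
}
cells = expand_matrix(base, ["main", "feature/ecdsa"])
```

Giving each cell its own instance name keeps the generated scenarios from sharing an instance directory when they run side by side.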
For features (or bugs) that touch more than one OpenTDF repo — platform
plus the Go / Java / JS SDKs — feature-design captures the work as a
single spec at xtest/features/<name>.yaml plus the tests-side artifacts
that land first (feature_type entry in tdfs.py, scenario, draft test).
The model matches the team's existing pattern: tests-side artifacts
merge first, dormant under a `supports("<feature>")` gate, and each
per-repo PR activates the gate by adding `supports <feature>` to its
cli.sh. PRs land async, in any order; no cross-PR lockstep needed.
- `feature-design` SKILL: propose-then-iterate authoring from a Jira
ticket (or free-form description). Drafts a complete spec on the
first pass, asks one composite redirect question, then writes the
spec + patches tdfs.py + invokes scenario-from-ticket internally
to produce the dormant scenario and draft test. Bails on Spike or
unclear tickets rather than fabricating.
- `xtest/features/{README,CLAUDE}.md`: progressive-disclosure docs —
human-facing README and agent-facing CLAUDE.md.
- `xtest/README.md` gains a brief "Test artifact directories" section
pointing at scenarios/ and features/.
- `settings.json` + `plugin.json`: Write(xtest/features/**) allowlist,
feature-design added to plugin skills array.
The complementary feature-orchestrate skill (fanning out per-repo
subagents to draft impl PRs in each touched repo) is a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…PX-3302)
Headless dogfooding (runs 1 and 2 of scenario-from-ticket on DSPX-2719)
surfaced two real gaps:
- The `Skill` tool was denied on both runs because the allowlist didn't
cover it, so the body of SKILL.md wasn't injected on invocation; the
agent had to manually `Read` the skill file ~25 turns in, wasting time
and biasing exploration toward grepping unrelated files first. Add
`Skill(*)` to settings.json and per-skill `Skill(<name>)` entries to
plugin.json (the latter enumerates exactly what downstream installs get,
since they shouldn't inherit a wildcard).
- `acli jira workitem comment list` requires `--key <KEY>` (the
subcommand differs from `view`, which takes the key positionally). Both
scenario-from-ticket and feature-design had the wrong form; corrected,
with a one-line note about the asymmetry so the next agent doesn't
paraphrase.
Verified via run-3 on DSPX-2719: 41 turns / 5m16s / $1.07 (vs run-1's
48 turns / 6m44s / $1.27). Skill tool returned success on first call,
both acli commands ran cleanly, the Story/Task branch produced
`source.ref: main` pins correctly (no more incorrectly defaulting to
`dist: lts`), and the agent's `actual:` field correctly enumerated all
three test-infrastructure prerequisites including a `with_ecdsa_binding`
parameter that run-1's scenario missed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…emas (DSPX-3302)
Headless runs of scenario-from-ticket kept trying `python3 -c "from
otdf_sdk_mgr.schema import Scenario; ..."` to introspect Pydantic model
shape while authoring scenarios. That form isn't in the plugin's Bash
allowlist (deliberately — it's arbitrary code execution), so the agent
fell back to Reading schema.py source. Static, committed JSON Schemas
give the same information declaratively without needing a python verb
in the allowlist at all.
- `otdf-sdk-mgr schema dump [--out-dir]`: writes
`xtest/schema/{scenario,instance}.schema.json` from
`Model.model_json_schema()`, sorted-keys + trailing newline so output
is byte-stable. Add new models to `SCHEMAS` in cli_schema.py and they
get picked up automatically.
- `xtest/schema/` is committed with the generated files plus brief
README/CLAUDE.md (progressive-disclosure, mirroring xtest/features/).
- `test_schema_sync.py` parametrizes over `SCHEMAS` and fails if any
committed file drifts from the live model — the safety net for
"someone edited a Pydantic model without regenerating."
- `scenario-from-ticket` SKILL.md Step 5 now points at
`xtest/schema/scenario.schema.json` as the canonical field list.
- `xtest/README.md` lists the new directory alongside `scenarios/` and
`features/`.
No allowlist changes needed — `Bash(uv run otdf-sdk-mgr *)` already
covers the dump subcommand, and `Read` is unrestricted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
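The byte-stability claim (sorted keys plus a trailing newline) and the drift check can be illustrated with the standard library alone. In this sketch a plain dict stands in for the output of `Model.model_json_schema()`; the function names and the `indent=2` choice are assumptions, not the actual contents of cli_schema.py or test_schema_sync.py.

```python
import json


def render_schema(schema: dict) -> str:
    """Serialize deterministically: sorted keys, fixed indent, trailing newline."""
    return json.dumps(schema, indent=2, sort_keys=True) + "\n"


def is_in_sync(committed_text: str, live_schema: dict) -> bool:
    """True when the committed file matches what a fresh dump would write."""
    return committed_text == render_schema(live_schema)


# Two dicts with the same content but different key order (stand-ins for schemas).
schema_a = {"title": "Scenario", "type": "object", "properties": {"id": {"type": "string"}}}
schema_b = {"properties": {"id": {"type": "string"}}, "type": "object", "title": "Scenario"}

committed = render_schema(schema_a)
same = is_in_sync(committed, schema_b)
drifted = is_in_sync(committed, {**schema_b, "title": "Instance"})
```

Sorting keys is what makes the comparison a plain string equality: key order in the live model no longer affects the bytes on disk, so any diff reflects a real schema change.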
Summary
Final PR in the five-part stack. Adds Claude Code skills under `tests/.claude/skills/` that turn a Jira ticket into a runnable scenario across one or more OpenTDF repos, plus a downstream-installable plugin manifest under `.claude/plugin/plugin.json`. Also lands the supporting `otdf-sdk-mgr schema dump` CLI that emits canonical JSON Schemas the skills read.

Skills
- `scenario-from-ticket`: Pulls a Jira ticket of any type (Bug, Story, Task, Spike) via `acli jira workitem view --fields '*all' --json` + `acli jira workitem comment list --key`, then writes `xtest/scenarios/<jira-key-lowercased>.yaml`. Branches on Issue Type: Bugs fill `expected:`/`actual:` from reproduction prose and pin to released versions (`dist:`); Stories/Tasks use ref pins (`source.ref: main`, a feature branch, or a PR head SHA) for forward-looking regression gates. Drafts an `xtest/bug_<id>_test.py` only when no existing pytest covers the case; never silently lands assertions; bails on Spike/unclear tickets rather than fabricating.
- `scenario-matrix`: Given a base scenario plus a list of refs (PR numbers, branches, released versions), writes one scenario file per cell so the same suite runs across all of them. Resolves PR numbers to head SHAs via `gh pr view` for reproducibility.
- `feature-design`: For features (or bugs) that touch more than one OpenTDF repo (platform + Go/Java/JS SDKs), captures the work as a single declaration under `xtest/features/<name>.yaml` plus the tests-side artifacts that have to land first (a `feature_type` entry in `xtest/tdfs.py`, the scenario, and a draft pytest gated on `supports("<feature>")`). Propose-then-iterate authoring: drafts a complete spec from the Jira ticket on the first pass, then asks one composite redirect question.
- `scenario-up`: `otdf-sdk-mgr install scenario` → `otdf-local instance init --from-scenario` → `otdf-local --instance <name> up`, then polls status.
- `scenario-run`: `otdf-local scenario run` and classifies the result against the scenario's `expected:`/`actual:` (expected outcome / unexpected outcome / unrelated failure). Works for bug-repros (expected = test fails matching `actual:`) and TDD scenarios (expected = test skipped via `supports()` gate until the implementing SDK PR lands).
- `scenario-tear-down`: Stops the instance and optionally removes its directory after explicit confirmation.
- `instance-status`: Lists known instances, port bases, health, and port collisions.

Schema introspection support
The skills `Read` `xtest/schema/scenario.schema.json` (and `instance.schema.json`) as the canonical reference for what fields the scenario YAML accepts. These files are committed and kept in sync by `otdf-sdk-mgr schema dump` plus a parametrized pytest in `otdf-sdk-mgr/tests/test_schema_sync.py`. The point: agents (and humans) introspect the on-disk format from a declarative file rather than running `python -c "from otdf_sdk_mgr.schema import ..."`, which keeps the Bash allowlist narrow.

Jira safety
Permissions in both `.claude/settings.json` and `plugin.json` allow only Jira read + comment-create via acli (`workitem view`, `workitem search`, `workitem comment list`, `workitem comment create`, plus a few read-only project/board/sprint queries). `edit`, `delete`, `transition`, `assign`, `archive`, `link create`, `watcher add`, etc. are explicitly denied. The plugin manifest carries a `permission_notes` block explaining the policy.

Distribution
`.claude/plugin/plugin.json` declares the skill names, runtime requirements (uv, go, git, docker, acli, gh), and the canonical permission allowlist so downstream first/third-party integrators can install the plugin into their own Claude Code setups without dragging the whole `tests/` repo along.

Stack

Iteration evidence
Skills were dogfooded across four headless `claude -p --model sonnet` runs against Jira issue DSPX-2719 (a Story-typed coverage-gap ticket). Each run surfaced real gaps that landed as commits on this branch:
- `Skill` tool denied; fell back to manually reading `SKILL.md`; used `dist: lts` (wrong branch)
- `scenario-from-ticket`
- `Skill(*)` allow + acli `--key` fix
- `source.ref: main` pins; one `python -c` denial
- `xtest/schema/` JSON Schemas
- `python -c` attempt

Run 4 produced a Story-shape scenario whose `actual:` correctly enumerated all three test-infrastructure prerequisites (`feature_type` entry, `cli.sh supports` cases, and a `with_ecdsa_binding` parameter on `tdfs.SDK.encrypt()`), with content quality at parity with a hand-authored scenario, in less than half the original turn count.

Test plan
- `scenario-from-ticket` against a Bug ticket and verify the produced YAML validates with `uv run otdf-sdk-mgr schema dump` regen-check and uses `dist:` pins
- `scenario-from-ticket` against a Story/Task ticket and verify it produces `source.ref:` pins (not `dist:`) and a `supports()`-gated draft test
- `feature-design` against a multi-repo ticket produces both an `xtest/features/<name>.yaml` spec and a tests-side scenario + test + `tdfs.py` patch
- `scenario-up` end-to-end on a chosen scenario (requires Go toolchain to build the platform binary the first time)
- `scenario-run` classifies a known-failing bug-repro scenario as "expected outcome" and a known-passing one as "unexpected outcome"
- `uv run otdf-sdk-mgr schema dump` regenerates committed schemas without diff (sync test passes)
- `acli jira workitem edit` attempt

Jira: https://virtru.atlassian.net/browse/DSPX-3302
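The three-way classification that `scenario-run` reports (expected outcome / unexpected outcome / unrelated failure) can be sketched as a small decision function. Much of this is assumed for illustration: the `kind` values, the substring match against the scenario's `actual:` text, and the choice to treat any non-skipped TDD result as unexpected are not taken from the skill itself.

```python
def classify(kind: str, outcome: str, output: str, actual_marker: str = "") -> str:
    """Classify one pytest result against a scenario's expectation.

    kind:          "bug-repro" or "tdd" (hypothetical labels)
    outcome:       "passed", "failed", or "skipped"
    output:        captured test output to search for evidence
    actual_marker: text from the scenario's actual: field (bug-repro only)
    """
    if kind == "bug-repro":
        if outcome == "failed":
            # A failure only counts as a reproduction if it matches actual:.
            if actual_marker and actual_marker in output:
                return "expected outcome"
            return "unrelated failure"
        return "unexpected outcome"  # passed or skipped: bug not observed
    if kind == "tdd":
        # Expected while the supports() gate is still closed: the test skips.
        return "expected outcome" if outcome == "skipped" else "unexpected outcome"
    raise ValueError(f"unknown scenario kind: {kind!r}")


repro = classify("bug-repro", "failed", "AssertionError: binding mismatch", "binding mismatch")
fixed = classify("bug-repro", "passed", "", "binding mismatch")
gated = classify("tdd", "skipped", "")
```

Separating "unrelated failure" from "expected outcome" is what keeps an environment problem, such as a service that never became healthy, from being reported as a successful reproduction.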
🤖 Generated with Claude Code