[DSPX-3302] (5/5) Claude plugin: bug-repro skills for OpenTDF#454

Draft
dmihalcik-virtru wants to merge 5 commits into DSPX-3302-04-xtest-conftest from DSPX-3302-05-claude-plugin

Conversation

@dmihalcik-virtru
Member

@dmihalcik-virtru dmihalcik-virtru commented May 15, 2026

Summary

Final PR in the five-part stack. Adds Claude Code skills under tests/.claude/skills/ that turn a Jira ticket into a runnable scenario across one or more OpenTDF repos, plus a downstream-installable plugin manifest under .claude/plugin/plugin.json. Also lands the supporting otdf-sdk-mgr schema dump CLI that emits canonical JSON Schemas the skills read.

Skills

  • scenario-from-ticket — Pulls a Jira ticket of any type (Bug, Story, Task, Spike) via acli jira workitem view --fields '*all' --json + acli jira workitem comment list --key, then writes xtest/scenarios/<jira-key-lowercased>.yaml. Branches on Issue Type: Bugs fill expected: / actual: from reproduction prose and pin to released versions (dist:); Stories/Tasks use ref pins (source.ref: main, a feature branch, or a PR head SHA) for forward-looking regression gates. Drafts an xtest/bug_<id>_test.py only when no existing pytest covers the case; never silently lands assertions; bails on Spike/unclear tickets rather than fabricating.
  • scenario-matrix — Given a base scenario plus a list of refs (PR numbers, branches, released versions), writes one scenario file per cell so the same suite runs across all of them. Resolves PR numbers to head SHAs via gh pr view for reproducibility.
  • feature-design — For features (or bugs) that touch more than one OpenTDF repo (platform + Go/Java/JS SDKs), captures the work as a single declaration under xtest/features/<name>.yaml plus the tests-side artifacts that have to land first (a feature_type entry in xtest/tdfs.py, the scenario, and a draft pytest gated on supports("<feature>")). Propose-then-iterate authoring: drafts a complete spec from the Jira ticket on the first pass, then asks one composite redirect question.
  • scenario-up — Runs otdf-sdk-mgr install scenario → otdf-local instance init --from-scenario → otdf-local --instance <name> up, then polls status.
  • scenario-run — Invokes otdf-local scenario run and classifies the result against the scenario's expected: / actual: (expected outcome / unexpected outcome / unrelated failure). Works for bug-repros (expected = test fails matching actual:) and TDD scenarios (expected = test skipped via supports() gate until the implementing SDK PR lands).
  • scenario-tear-down — Stops the instance and optionally removes its directory after explicit confirmation.
  • instance-status — Lists known instances, port bases, health, and port collisions.
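
The scenario-matrix expansion can be pictured as a small pure function. The sketch below is illustrative only: the dict layout, the `slug` helper, and the `<base-id>-<ref-slug>.yaml` naming are assumptions made for the example, not the skill's actual file-naming rule. The authoritative field list is xtest/schema/scenario.schema.json.

```python
import copy
import re


def slug(ref: str) -> str:
    """Turn a ref such as 'feature/ecdsa' into a filename-safe token."""
    return re.sub(r"[^a-z0-9]+", "-", ref.lower()).strip("-")


def expand_matrix(base: dict, refs: list[str]) -> dict[str, dict]:
    """Return {relative_path: scenario}, one entry per ref (hypothetical layout)."""
    cells = {}
    for ref in refs:
        cell = copy.deepcopy(base)  # never mutate the base scenario
        cell_id = f"{base['metadata']['id']}-{slug(ref)}"
        cell["metadata"]["id"] = cell_id
        cell["platform"] = {"source": {"ref": ref}}
        cells[f"xtest/scenarios/{cell_id}.yaml"] = cell
    return cells


base = {"metadata": {"id": "dspx-2719"}, "platform": {"source": {"ref": "main"}}}
cells = expand_matrix(base, ["main", "feature/ecdsa"])
for path in cells:
    print(path)
```

In a real run, PR numbers would first be resolved to head SHAs (the PR description names gh pr view for this), and released versions would presumably use dist: pins rather than source.ref.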

Schema introspection support

The skills Read xtest/schema/scenario.schema.json (and instance.schema.json) as the canonical reference for what fields the scenario YAML accepts. These files are committed and kept in sync by otdf-sdk-mgr schema dump plus a parametrized pytest in otdf-sdk-mgr/tests/test_schema_sync.py. The point: agents (and humans) introspect the on-disk format from a declarative file rather than running python -c "from otdf_sdk_mgr.schema import ...", which keeps the Bash allowlist narrow.
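
A minimal sketch of the dump-and-sync idea, assuming a plain dict stands in for the output of a Pydantic model's `model_json_schema()`. The function names `render_schema` and `is_in_sync` and the `indent=2` setting are hypothetical; only the sorted keys and trailing newline come from this PR's commit notes.

```python
import json
import tempfile
from pathlib import Path


def render_schema(schema: dict) -> str:
    """Serialize with sorted keys and a trailing newline so output is byte-stable."""
    return json.dumps(schema, indent=2, sort_keys=True) + "\n"


def is_in_sync(committed: Path, live_schema: dict) -> bool:
    """True when the committed file matches what a fresh dump would write."""
    return committed.read_text(encoding="utf-8") == render_schema(live_schema)


# Stand-in for the live model schema; the real models live in otdf_sdk_mgr.schema.
live = {"title": "Scenario", "type": "object", "properties": {"metadata": {"type": "object"}}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scenario.schema.json"
    path.write_text(render_schema(live), encoding="utf-8")
    fresh = is_in_sync(path, live)

    # Simulate someone editing the model without regenerating the file.
    live["properties"]["ports"] = {"type": "object"}
    drifted = not is_in_sync(path, live)

print(fresh, drifted)
```

Because the rendering is deterministic, a plain string comparison is enough for the sync test; no JSON-aware diff is needed.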

Jira safety

Permissions in both .claude/settings.json and plugin.json allow only Jira read + comment-create via acli (workitem view, workitem search, workitem comment list, workitem comment create, plus a few read-only project/board/sprint queries). edit, delete, transition, assign, archive, link create, watcher add, etc. are explicitly denied. The plugin manifest carries a permission_notes block explaining the policy.
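
To make the read-plus-comment policy concrete, here is a rough sketch of deny-over-allow matching using glob patterns. It is not how Claude Code evaluates permissions; the `is_permitted` helper, the exact deny patterns, and the sample command strings are assumptions for illustration.

```python
from fnmatch import fnmatchcase

ALLOW = [
    "acli jira workitem view *",
    "acli jira workitem search *",
    "acli jira workitem comment list *",
    "acli jira workitem comment create *",
]
# Hypothetical pattern spellings for some of the verbs the PR lists as denied.
DENY = [
    "acli jira workitem edit *",
    "acli jira workitem delete *",
    "acli jira workitem transition *",
    "acli jira workitem assign *",
]


def is_permitted(command: str) -> bool:
    """Deny wins over allow; anything unmatched is refused (default-deny)."""
    if any(fnmatchcase(command, pattern) for pattern in DENY):
        return False
    return any(fnmatchcase(command, pattern) for pattern in ALLOW)


print(is_permitted("acli jira workitem view DSPX-2719 --json"))  # allowed read
print(is_permitted("acli jira workitem edit DSPX-2719"))         # explicitly denied
print(is_permitted("acli jira workitem archive DSPX-2719"))      # unmatched, refused
```

The default-deny fallthrough is what keeps the allowlist narrow: a verb that nobody thought to list is still refused.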

Distribution

.claude/plugin/plugin.json declares the skill names, runtime requirements (uv, go, git, docker, acli, gh), and the canonical permission allowlist so downstream first/third-party integrators can install the plugin into their own Claude Code setups without dragging the whole tests/ repo along.

Stack

  1. (base) Shared schema — chore(xtest): Shared Scenario/Instance Pydantic schema in otdf-sdk-mgr #450
  2. (base) Platform installer + install scenario — [DSPX-3302] (2/5) Manage platform service + install scenario in otdf-sdk-mgr #451
  3. (base) otdf-local multi-instance refactor — [DSPX-3302] (3/5) otdf-local multi-instance refactor #452
  4. (base) xtest conftest integration — [DSPX-3302] (4/5) xtest conftest: --scenario and --instance flags #453
  5. This PR — Claude plugin + schema dump CLI

Iteration evidence

Skills were dogfooded across four headless claude -p --model sonnet runs against Jira issue DSPX-2719 (a Story-typed coverage-gap ticket). Each run surfaced real gaps that landed as commits on this branch:

| Run | After commit | Turns | Cost | Permission denials | Notable behavior |
|---|---|---|---|---|---|
| 1 | original | 48 | $1.27 | 2 | Skill tool denied; fell back to manually reading SKILL.md; used dist: lts (wrong branch) |
| 2 | renamed → scenario-from-ticket | | | | Anthropic API stall (6h hang); killed |
| 3 | Skill(*) allow + acli --key fix | 41 | $1.07 | 1 | First clean Skill invocation; correct source.ref: main pins; one python -c denial |
| 4 | xtest/schema/ JSON Schemas | 23 | $0.76 | 0 | Agent reads the JSON Schema directly; no python -c attempt |

Run 4 produced a Story-shape scenario whose actual: correctly enumerated all three test-infrastructure prerequisites (feature_type entry, cli.sh supports cases, and a with_ecdsa_binding parameter on tdfs.SDK.encrypt()) — content quality at parity with a hand-authored scenario, in less than half the original turn count.
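
The "less than half" claim can be checked from the run figures reported above. The snippet is plain arithmetic on those numbers; run 2 is omitted because it was killed before producing any.

```python
runs = {
    1: {"turns": 48, "cost": 1.27, "denials": 2},
    3: {"turns": 41, "cost": 1.07, "denials": 1},
    4: {"turns": 23, "cost": 0.76, "denials": 0},
}

base = runs[1]
ratios = {}
for n in (3, 4):
    r = runs[n]
    ratios[n] = (r["turns"] / base["turns"], r["cost"] / base["cost"])
    print(f"run {n}: {ratios[n][0]:.0%} of run-1 turns, {ratios[n][1]:.0%} of run-1 cost")
```

Run 4 comes out at roughly 48% of run 1's turns and roughly 60% of its cost, so the turn count halved while the cost fell by about two fifths.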

Test plan

  • Invoke scenario-from-ticket against a Bug ticket and verify the produced YAML validates with uv run otdf-sdk-mgr schema dump regen-check and uses dist: pins
  • Invoke scenario-from-ticket against a Story/Task ticket and verify it produces source.ref: pins (not dist:) and a supports()-gated draft test
  • feature-design against a multi-repo ticket produces both an xtest/features/<name>.yaml spec and a tests-side scenario + test + tdfs.py patch
  • scenario-up end-to-end on a chosen scenario (requires Go toolchain to build the platform binary the first time)
  • scenario-run classifies a known-failing bug-repro scenario as "expected outcome" and a known-passing one as "unexpected outcome"
  • uv run otdf-sdk-mgr schema dump regenerates committed schemas without diff (sync test passes)
  • Verify the acli allowlist denies a deliberate acli jira workitem edit attempt
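
The classification the scenario-run items above expect can be sketched as a small decision function. This is a simplification under stated assumptions: the `kind` labels, the substring match against actual:, and the handling of non-skipped TDD results are hypothetical, and the skill itself makes this judgment from the scenario text rather than from fixed rules.

```python
def classify(kind: str, outcome: str, output: str, actual: str) -> str:
    """Classify a pytest result for a scenario.

    kind:    "bug-repro" or "tdd" (hypothetical labels for this sketch)
    outcome: "passed", "failed", or "skipped"
    output:  captured test output
    actual:  the scenario's actual: text, matched here as a plain substring
    """
    if kind == "bug-repro":
        if outcome == "failed":
            # A failure only counts as a reproduction if it matches actual:.
            return "expected outcome" if actual in output else "unrelated failure"
        return "unexpected outcome"  # passed or skipped: the bug did not reproduce
    if kind == "tdd":
        # Dormant behind the supports() gate until the implementing SDK PR lands.
        return "expected outcome" if outcome == "skipped" else "unexpected outcome"
    raise ValueError(f"unknown scenario kind: {kind!r}")


print(classify("bug-repro", "failed", "AssertionError: policy binding mismatch", "policy binding mismatch"))
print(classify("bug-repro", "passed", "", "policy binding mismatch"))
print(classify("tdd", "skipped", "", ""))
```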

Jira: https://virtru.atlassian.net/browse/DSPX-3302

🤖 Generated with Claude Code

@coderabbitai

coderabbitai Bot commented May 15, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the opentdf-test-harness Claude plugin, which includes several skills for bug reproduction, environment management, and test execution. The review feedback highlights the need to clarify that the agent lacks git write permissions for branch creation and suggests using the instance-status skill to prevent port collisions when initializing new scenarios.


- `metadata.id = <jira-key-lowercased>` — e.g. `DSPX-3302` → `dspx-3302`.
- Scenario file path: `xtest/scenarios/<jira-key-lowercased>.yaml`.
- If you need a new git branch, propose `<JIRA-KEY>-repro` (e.g. `DSPX-3302-repro`) and let the user confirm before switching.

medium

The current permissions defined in .claude/settings.json and .claude/plugin/plugin.json do not include git write access (e.g., git checkout -b or git branch). The instruction should be updated to direct the agent to ask the user to create and switch to the branch, rather than implying it can perform the switch itself after confirmation.

Suggested change
- If you need a new git branch, propose `<JIRA-KEY>-repro` (e.g. `DSPX-3302-repro`) and let the user confirm before switching.
- If you need a new git branch, propose <JIRA-KEY>-repro (e.g. DSPX-3302-repro) and instruct the user to create and switch to it (you do not have git write permissions).
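
The naming rules quoted in this thread (lowercased key for metadata.id and the scenario path, `<JIRA-KEY>-repro` for the proposed branch) can be sketched as one helper. The function name `scenario_names` and the key-format regex are assumptions for illustration; the helper only proposes a branch name and does not touch git.

```python
import re

# Assumed shape of a Jira key: uppercase project prefix, hyphen, number.
JIRA_KEY = re.compile(r"^[A-Z][A-Z0-9]*-\d+$")


def scenario_names(jira_key: str) -> dict[str, str]:
    """Derive the scenario id, file path, and proposed branch from a Jira key."""
    if not JIRA_KEY.match(jira_key):
        raise ValueError(f"not a Jira key: {jira_key!r}")
    lowered = jira_key.lower()
    return {
        "metadata_id": lowered,
        "scenario_path": f"xtest/scenarios/{lowered}.yaml",
        "branch": f"{jira_key}-repro",  # proposed only; the user creates it
    }


names = scenario_names("DSPX-3302")
for field, value in names.items():
    print(f"{field}: {value}")
```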

instance:
  metadata: { name: <jira-key-lowercased> }
  platform: { dist: <platform_version> }
  ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> }

medium

To avoid port collisions when multiple scenarios are active, the agent should be explicitly instructed to use the instance-status skill to check for existing instances and their port bases before picking a new one for the YAML manifest.

Suggested change
ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> }
ports: { base: <check instance-status and pick a free base; 8080 if none, else highest + 1000> }
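
The suggested rule ("8080 if none, else highest + 1000") is simple enough to state as code. In this sketch the list of in-use bases is passed in directly; in practice it would come from whatever the instance-status skill reports, and the `next_port_base` name and the upper-bound check are additions for the example.

```python
def next_port_base(in_use: list[int], first: int = 8080, step: int = 1000) -> int:
    """Return `first` when no instance exists, otherwise the highest base plus `step`."""
    if not in_use:
        return first
    candidate = max(in_use) + step
    if candidate > 65535:  # highest valid TCP port
        raise ValueError("no free port base left below 65536")
    return candidate


print(next_port_base([]))             # 8080
print(next_port_base([8080]))         # 9080
print(next_port_base([8080, 10080]))  # 11080
```

Taking the highest base rather than the first gap keeps the rule predictable, at the cost of not reusing bases freed by torn-down instances.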

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a Claude plugin for the OpenTDF test harness, featuring skills to automate bug reproduction from Jira reports, environment provisioning, and test execution. Feedback focuses on correcting missing permissions in the plugin manifest and settings for tools like git, grep, and glob. Additionally, improvements were suggested for the scenario generation skill, specifically regarding the use of current dates in metadata and utilizing the instance-status skill for more reliable port allocation.

Comment on lines +21 to +35
"allow": [
"Bash(uv run otdf-local *)",
"Bash(uv run otdf-sdk-mgr *)",
"Bash(uv run pytest *)",
"Bash(acli jira workitem view *)",
"Bash(acli jira workitem search *)",
"Bash(acli jira workitem comment list *)",
"Bash(acli jira workitem comment create *)",
"Bash(acli jira workitem attachment list *)",
"Bash(acli jira workitem link list *)",
"Bash(acli jira project view *)",
"Write(xtest/scenarios/**)",
"Write(xtest/bug_*_test.py)",
"Write(tests/instances/**)"
]

high

The plugin manifest is missing several essential permissions required by the skills. All skills require Read access to examine files (like scenarios and test patterns), and scenario-from-bug-report specifically requires Grep, Glob, and git permissions to search for existing tests and manage reproduction branches. Additionally, Write access to .claude/tmp/** is often useful for intermediate operations.

    "allow": [
      "Read(**)",
      "Grep(**)",
      "Glob(**)",
      "Bash(git *)",
      "Bash(uv run otdf-local *)",
      "Bash(uv run otdf-sdk-mgr *)",
      "Bash(uv run pytest *)",
      "Bash(acli jira workitem view *)",
      "Bash(acli jira workitem search *)",
      "Bash(acli jira workitem comment list *)",
      "Bash(acli jira workitem comment create *)",
      "Bash(acli jira workitem attachment list *)",
      "Bash(acli jira workitem link list *)",
      "Bash(acli jira project view *)",
      "Write(xtest/scenarios/**)",
      "Write(xtest/bug_*_test.py)",
      "Write(tests/instances/**)",
      "Write(.claude/tmp/**)"
    ]

Comment thread .claude/settings.json
Comment on lines +3 to +30
"allow": [
"Bash(uv run otdf-local *)",
"Bash(uv run otdf-sdk-mgr *)",
"Bash(uv run pytest *)",
"Bash(uv sync *)",
"Bash(git status *)",
"Bash(git diff *)",
"Bash(git log *)",
"Bash(git show *)",
"Bash(gh api *)",
"Bash(gh issue view *)",
"Bash(gh pr view *)",
"Bash(gh run *)",
"Bash(acli jira workitem view *)",
"Bash(acli jira workitem search *)",
"Bash(acli jira workitem comment list *)",
"Bash(acli jira workitem comment create *)",
"Bash(acli jira workitem attachment list *)",
"Bash(acli jira workitem link list *)",
"Bash(acli jira workitem watcher list *)",
"Bash(acli jira project view *)",
"Bash(acli jira board view *)",
"Bash(acli jira sprint view *)",
"Write(xtest/scenarios/**)",
"Write(xtest/bug_*_test.py)",
"Write(tests/instances/**)",
"Write(.claude/tmp/**)"
]

high

The local settings are missing Read, Grep, and Glob permissions, which are listed as allowed-tools in the skills. Also, the git permissions should be expanded to allow branch management (e.g., checkout, branch) as required by the scenario-from-bug-report skill.

    "allow": [
      "Read(**)",
      "Grep(**)",
      "Glob(**)",
      "Bash(uv run otdf-local *)",
      "Bash(uv run otdf-sdk-mgr *)",
      "Bash(uv run pytest *)",
      "Bash(uv sync *)",
      "Bash(git *)",
      "Bash(gh *)",
      "Bash(acli jira workitem view *)",
      "Bash(acli jira workitem search *)",
      "Bash(acli jira workitem comment list *)",
      "Bash(acli jira workitem comment create *)",
      "Bash(acli jira workitem attachment list *)",
      "Bash(acli jira workitem link list *)",
      "Bash(acli jira workitem watcher list *)",
      "Bash(acli jira project view *)",
      "Bash(acli jira board view *)",
      "Bash(acli jira sprint view *)",
      "Write(xtest/scenarios/**)",
      "Write(xtest/bug_*_test.py)",
      "Write(tests/instances/**)",
      "Write(.claude/tmp/**)"
    ]

metadata:
  id: <jira-key-lowercased>
  title: "<Jira summary>"
  created: <YYYY-MM-DD>

medium

Using a placeholder like <YYYY-MM-DD> might result in the LLM leaving it literal or guessing. It is better to explicitly instruct it to use the current date.

Suggested change
created: <YYYY-MM-DD>
created: YYYY-MM-DD # Use current date

instance:
  metadata: { name: <jira-key-lowercased> }
  platform: { dist: <platform_version> }
  ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> }

medium

The port selection logic is a bit manual. Since the instance-status skill is specifically designed to identify running instances and port usage, it should be the primary method for determining a free port base to avoid collisions.

Suggested change
ports: { base: <pick free base; 8080 if first, +1000 per concurrent scenario> }
ports: { base: <pick a free base port; use the instance-status skill to check for collisions> }

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-04-xtest-conftest branch from 6d2e83c to f23ccce Compare May 15, 2026 16:37
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-05-claude-plugin branch from 21535d2 to a5502f6 Compare May 15, 2026 16:38
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-04-xtest-conftest branch from f23ccce to 6498765 Compare May 15, 2026 17:02
Adds five Claude Code skills under tests/.claude/skills/ that together
turn a Jira bug ticket into a running reproduction, plus a downstream-
installable plugin manifest under .claude/plugin/.

Why
---

The end-to-end goal of DSPX-3302 is to make bug reproduction approachable
for QA, downstream-product engineers, and CI. PRs 1-4 build the plumbing
(shared schema, platform installer, multi-instance otdf-local, xtest
conftest hooks). This PR is the user-facing surface: Claude can pull
context from Jira, draft an xtest/scenarios/<jira-key>.yaml (and, when
needed, an xtest/bug_<jira_key>_test.py), bring the environment up at
the right version pins, run the scenario's pytest selection, and tear
down.

Skills
------

  scenario-from-bug-report
    Pulls the Jira issue and its comments via `acli jira workitem view
    --fields '*all' --json` and `acli jira workitem comment list`,
    extracts version pins / KAS topology / container type / feature
    flags, then writes xtest/scenarios/<jira-key-lowercased>.yaml
    validated against otdf_sdk_mgr.schema.Scenario. Drafts a new
    xtest/bug_<id>_test.py only when no existing pytest covers the
    case; never silently lands assertions.

  scenario-up
    Runs `otdf-sdk-mgr install scenario`, then `otdf-local instance
    init --from-scenario`, then `otdf-local --instance <name> up`, and
    polls status until healthy. Surfaces logs rather than retrying
    blindly when something stays unhealthy.

  scenario-run
    Invokes `otdf-local scenario run <path>` and classifies the
    result: "bug reproduced" / "not reproduced" / "unrelated failure".
    Cites the evidence line and points at per-service logs.

  scenario-tear-down
    Stops the instance and optionally removes the directory after
    explicit user confirmation.

  instance-status
    Lists known instances, their port bases, health, and flags port
    collisions.

Jira-safety
-----------

Permissions in both .claude/settings.json and the plugin manifest
allow only read+comment via acli jira: workitem view, workitem search,
workitem comment list, workitem comment create, plus a handful of
read-only project/board/sprint queries. edit, delete, transition,
assign, archive, link create, watcher add are all denied. The
plugin.json carries a permission_notes block explaining the policy.

Plugin manifest
---------------

.claude/plugin/plugin.json declares the skill names, runtime
requirements (uv, go, git, docker, acli), and the canonical permission
allowlist, so downstream first/third-party integrators can install
this plugin into their own Claude Code setups.

Refs: https://virtru.atlassian.net/browse/DSPX-3302

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-05-claude-plugin branch from a5502f6 to 943eee9 Compare May 15, 2026 17:04
…m-ticket (DSPX-3302)

Headless dogfooding (run-1 on DSPX-2719) showed the bug-only framing
was too narrow — the common workflow is writing tests for new features
first (TDD), not reproducing version-pinned bugs.

- Rename and rewrite the skill to branch on Jira Issue Type. Bug
  follows the old expected/actual flow; Story/Task uses ref pins
  (`main`, feature branch, PR SHA via `gh pr view --json headRefOid`)
  for forward-looking regression gates; Spike bails out rather than
  fabricating. Mandates `acli workitem comment list` and steers away
  from cli.sh greps (both were run-1 gaps).
- New `scenario-matrix` sibling skill: write N scenario files from a
  base × N refs (PRs/branches/releases). Schema/installer support was
  already there via `PlatformPin.source.ref` and
  `install_platform_source(ref)` — no other changes needed.
- `scenario-run` output classification generalized from "bug
  reproduced / not reproduced" to "expected / unexpected outcome",
  with explicit branches for bug-repro vs TDD interpretations.
- `scenario-up` description and `plugin.json` (description, skills
  array, requirements) updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dmihalcik-virtru and others added 3 commits May 15, 2026 21:24
For features (or bugs) that touch more than one OpenTDF repo — platform
plus the Go / Java / JS SDKs — feature-design captures the work as a
single spec at xtest/features/<name>.yaml plus the tests-side artifacts
that land first (feature_type entry in tdfs.py, scenario, draft test).

The model matches the team's existing pattern: tests-side artifacts
merge first, dormant under a `supports("<feature>")` gate, and each
per-repo PR activates the gate by adding `supports <feature>` to its
cli.sh. PRs land async, in any order; no cross-PR lockstep needed.

- `feature-design` SKILL: propose-then-iterate authoring from a Jira
  ticket (or free-form description). Drafts a complete spec on the
  first pass, asks one composite redirect question, then writes the
  spec + patches tdfs.py + invokes scenario-from-ticket internally
  to produce the dormant scenario and draft test. Bails on Spike or
  unclear tickets rather than fabricating.
- `xtest/features/{README,CLAUDE}.md`: progressive-disclosure docs —
  human-facing README and agent-facing CLAUDE.md.
- `xtest/README.md` gains a brief "Test artifact directories" section
  pointing at scenarios/ and features/.
- `settings.json` + `plugin.json`: Write(xtest/features/**) allowlist,
  feature-design added to plugin skills array.

The complementary feature-orchestrate skill (fanning out per-repo
subagents to draft impl PRs in each touched repo) is a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…PX-3302)

Headless dogfooding (runs 1 and 2 of scenario-from-ticket on DSPX-2719)
surfaced two real gaps:

- The `Skill` tool was denied on both runs because the allowlist didn't
  cover it, so the body of SKILL.md wasn't injected on invocation; the
  agent had to manually `Read` the skill file ~25 turns in, wasting time
  and biasing exploration toward grepping unrelated files first.
  Add `Skill(*)` to settings.json and per-skill `Skill(<name>)` entries
  to plugin.json (the latter enumerates exactly what downstream installs
  get, since they shouldn't inherit a wildcard).
- `acli jira workitem comment list` requires `--key <KEY>` (the
  subcommand differs from `view`, which takes the key positionally).
  Both scenario-from-ticket and feature-design had the wrong form;
  corrected, with a one-line note about the asymmetry so the next
  agent doesn't paraphrase.

Verified via run-3 on DSPX-2719: 41 turns / 5m16s / $1.07 (vs run-1's
48 turns / 6m44s / $1.27). Skill tool returned success on first call,
both acli commands ran cleanly, the Story/Task branch produced
`source.ref: main` pins correctly (no more incorrectly defaulting to
`dist: lts`), and the agent's `actual:` field correctly enumerated all
three test-infrastructure prerequisites including a `with_ecdsa_binding`
parameter that run-1's scenario missed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…emas (DSPX-3302)

Headless runs of scenario-from-ticket kept trying `python3 -c "from
otdf_sdk_mgr.schema import Scenario; ..."` to introspect Pydantic model
shape while authoring scenarios. That form isn't in the plugin's Bash
allowlist (deliberately — it's arbitrary code execution), so the agent
fell back to Reading schema.py source. Static, committed JSON Schemas
give the same information declaratively without needing a python verb
in the allowlist at all.

- `otdf-sdk-mgr schema dump [--out-dir]`: writes
  `xtest/schema/{scenario,instance}.schema.json` from
  `Model.model_json_schema()`, sorted-keys + trailing newline so output
  is byte-stable. Add new models to `SCHEMAS` in cli_schema.py and they
  get picked up automatically.
- `xtest/schema/` is committed with the generated files plus brief
  README/CLAUDE.md (progressive-disclosure, mirroring xtest/features/).
- `test_schema_sync.py` parametrizes over `SCHEMAS` and fails if any
  committed file drifts from the live model — the safety net for
  "someone edited a Pydantic model without regenerating."
- `scenario-from-ticket` SKILL.md Step 5 now points at
  `xtest/schema/scenario.schema.json` as the canonical field list.
- `xtest/README.md` lists the new directory alongside `scenarios/` and
  `features/`.

No allowlist changes needed — `Bash(uv run otdf-sdk-mgr *)` already
covers the dump subcommand, and `Read` is unrestricted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>