
Add development discipline plugin #18

Open
jwilger wants to merge 2 commits into main from development-discipline

Conversation

@jwilger jwilger commented Jul 5, 2026

Owner

Summary

  • add the dual-harness development-discipline plugin with tuned personal workflow skills for TDD, verification, debugging, review handling, and skill authoring
  • register the plugin in Claude Code and Codex marketplace manifests and README catalog tables
  • add behavior fixtures and Bats coverage for the new plugin plus full-marketplace canary expectations

Validation

  • just ci
  • scripts/evals/run.sh --dry-run
  • scripts/evals/run.sh --suite canary --dry-run
  • plugin-eval analyze on all five new skills: 100/100, grade A

Provider-backed live evals and real plugin-eval benchmark runs were not run; no tailored benchmark verifier was specified.

Summary by CodeRabbit

  • New Features

    • Added a new development-discipline plugin to the marketplace, with support across the available tool catalogs.
    • Included several new guidance topics for testing, debugging, review responses, verification, and writing skills.
    • Expanded the documented plugin catalog to show the new entry.
  • Tests

    • Added checks to confirm the plugin is registered correctly and its expected guidance materials are present.
    • Updated marketplace validation coverage to recognize the new skill names.

coderabbitai Bot commented Jul 5, 2026


Warning

Review limit reached

@jwilger, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 53 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer to the docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8582f5af-246d-4896-97db-e2fa7e91ceff

📥 Commits

Reviewing files that changed from the base of the PR and between 7911000 and 45bc6f2.

📒 Files selected for processing (1)
  • scripts/tests/development-discipline-plugin.bats
📝 Walkthrough

Walkthrough

Adds a new development-discipline plugin registered in both Claude and Codex marketplace manifests, with a README, five skill documents (TDD, verification-before-completion, systematic-debugging, receiving-code-review, writing-skills), catalog entries in README.md, five new behavior fixture cases, and validation tests including a new Bats script and updates to an existing canary test.

Changes

Development-discipline plugin addition

| Layer / File(s) | Summary |
|---|---|
| Plugin manifests and README: plugins/development-discipline/.claude-plugin/plugin.json, plugins/development-discipline/.codex-plugin/plugin.json, plugins/development-discipline/README.md | Defines plugin metadata (name, version, author, license) for both harnesses and documents purpose, packaged skills, and harness-agnostic sharing of skills/. |
| Skill documents: plugins/development-discipline/skills/*/SKILL.md | Adds five SKILL.md guides covering TDD, verification-before-completion, systematic-debugging, receiving-code-review, and writing-skills, each with frontmatter, rules, and checklists/tables. |
| Marketplace catalog registration: .agents/plugins/marketplace.json, .claude-plugin/marketplace.json, README.md | Registers the plugin entry (source, description, version, category, keywords) in both marketplace manifests and adds catalog rows in README.md for Claude Code and Codex sections. |
| Behavior fixtures for skills: evals/fixtures/behavior/full-marketplace/cases.json | Adds five new evaluation cases with rubric, calibration, and forbidden-intent assertions for TDD, verification claim scope, review feedback skepticism, skill authoring, and systematic debugging. |
| Plugin validation tests: scripts/tests/development-discipline-plugin.bats, scripts/tests/evals-config.bats | Adds a Bats test validating manifests, skill files, marketplace registration, README rows, and fixture completeness; updates canary skill-name lists to include the new plugin. |

Estimated code review effort: 2 (Simple) | ~15 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely states the main change: adding the development-discipline plugin. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@jwilger jwilger enabled auto-merge (squash) July 5, 2026 03:59
jwilger commented Jul 5, 2026

Owner Author

@coderabbitai re-review

coderabbitai Bot commented Jul 5, 2026


Rate Limit Exceeded

@jwilger, you have exceeded the limit for the number of chat messages per hour. Please wait 0 minutes and 57 seconds before sending another message.

jwilger commented Jul 5, 2026

Owner Author

@coderabbitai review

coderabbitai Bot commented Jul 5, 2026


Rate Limit Exceeded

@jwilger, you have exceeded the limit for the number of chat messages per hour. Please wait 0 minutes and 36 seconds before sending another message.

jwilger commented Jul 5, 2026

Owner Author

@coderabbitai re-review

coderabbitai Bot commented Jul 5, 2026


@jwilger Sounds good, I'll re-review the PR now.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
evals/fixtures/behavior/full-marketplace/cases.json (1)

349-375: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Safety-critical case lacks a deterministic hard guard.

development-discipline-verification-claim-scope uses valueGate.mode: "safety-critical" with minPassRate: 1, same as development-discipline-tdd-one-test-first, but unlike that case it ships with hardAssertions: []. Relying purely on semanticRubric for a zero-tolerance gate reintroduces the LLM-rubric flakiness that the hard-guard pattern is meant to prevent for safety-critical cases.

Consider adding a forbiddenIntent (or similar) hard assertion targeting phrases like "the repo is ready" / "whole repo is ready" combined with no re-verification language, mirroring the pattern used for the TDD case.
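
A deterministic guard of the kind suggested here could look roughly like the sketch below. It is only an illustration in plain JavaScript: the function name `violatesClaimScope`, the phrase lists, and the choice to treat re-verification wording as an escape hatch are all assumptions. It does not reproduce the repository's actual `hardAssertions` or `forbiddenIntent` schema, which this comment does not show.

```javascript
// Hypothetical phrase lists; the real fixture may use different wording.
const READINESS_CLAIMS = [/\b(?:whole\s+)?repo(?:sitory)?\s+is\s+ready\b/i];
const REVERIFICATION_HINTS = [/\bre-?ran\b/i, /\bre-?verif/i, /\bfresh\s+run\b/i];

// Returns true when a response claims whole-repo readiness
// without any wording that signals fresh re-verification.
function violatesClaimScope(response) {
  const claimsReadiness = READINESS_CLAIMS.some((re) => re.test(response));
  const mentionsReverification = REVERIFICATION_HINTS.some((re) => re.test(response));
  return claimsReadiness && !mentionsReverification;
}

console.log(violatesClaimScope("The repo is ready to merge."));
console.log(violatesClaimScope("I re-ran just ci, so the whole repo is ready."));
```

Because the check is a pure string match, it gives the same verdict on every run, which is the property a `minPassRate: 1` gate needs; the semantic rubric can still judge the nuance that a phrase list cannot.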

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@evals/fixtures/behavior/full-marketplace/cases.json` around lines 349 - 375,
The safety-critical case `development-discipline-verification-claim-scope`
currently depends only on the semantic rubric and has no deterministic hard
guard. Add a hard assertion in `hardAssertions` (for example a
`forbiddenIntent`-style rule) that rejects whole-repo readiness claims like “the
repo is ready” or “whole repo is ready” unless paired with fresh re-verification
language, using the same hard-guard pattern as the TDD case.
scripts/tests/development-discipline-plugin.bats (1)

40-45: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Forward reference: fail is defined before failures exists.

fail() (lines 40-42) closes over failures, but failures isn't declared until line 44. This works only because no call to fail() occurs before line 44 executes — functionally safe here, but fragile to future edits that might call fail() earlier in the script.

♻️ Proposed reordering
-function fail(message) {
-  failures.push(message);
-}
-
-const failures = [];
+const failures = [];
+function fail(message) {
+  failures.push(message);
+}
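
The fragility described here comes from JavaScript's temporal dead zone for `const` bindings. The standalone sketch below is not the test file itself, and `earlyError` is a name introduced only for illustration; it shows what happens if `fail()` is called before the `const failures` line has executed.

```javascript
// Function declarations are hoisted, so fail() is callable from the top of the script.
function fail(message) {
  failures.push(message);
}

let earlyError = null;
try {
  // failures is still in its temporal dead zone here, so this call throws.
  fail("called too early");
} catch (err) {
  earlyError = err;
}

const failures = [];

// Once the declaration has executed, the same call works.
fail("called after declaration");

console.log(earlyError instanceof ReferenceError); // true
console.log(failures.length); // 1
```

Declaring `failures` first, as in the proposed reordering, removes the window in which an early call could throw.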
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/tests/development-discipline-plugin.bats` around lines 40 - 45, Move
the shared state declaration so it exists before the helper uses it: in the test
script, declare `failures` before `fail(message)` or otherwise ensure `fail()`
does not close over an as-yet-undeclared variable. Keep `fail()` and `failures`
together near the top of the file so future calls to `fail()` cannot hit a
forward reference.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/tests/development-discipline-plugin.bats`:
- Around line 29-34: The requiredCases list in
development-discipline-plugin.bats is missing the systematic-debugging behavior
fixture, so the test does not validate all added cases. Update the requiredCases
array to include development-discipline-systematic-debugging-root-cause so it
matches the full set of skills checked by requiredSkills and covers every
fixture used in the behavior cases JSON.

---

Nitpick comments:
In `@evals/fixtures/behavior/full-marketplace/cases.json`:
- Around line 349-375: The safety-critical case
`development-discipline-verification-claim-scope` currently depends only on the
semantic rubric and has no deterministic hard guard. Add a hard assertion in
`hardAssertions` (for example a `forbiddenIntent`-style rule) that rejects
whole-repo readiness claims like “the repo is ready” or “whole repo is ready”
unless paired with fresh re-verification language, using the same hard-guard
pattern as the TDD case.

In `@scripts/tests/development-discipline-plugin.bats`:
- Around line 40-45: Move the shared state declaration so it exists before the
helper uses it: in the test script, declare `failures` before `fail(message)` or
otherwise ensure `fail()` does not close over an as-yet-undeclared variable.
Keep `fail()` and `failures` together near the top of the file so future calls
to `fail()` cannot hit a forward reference.
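
The inline comment about `requiredCases` describes drift between a hand-maintained list and the fixture file. One way to catch that class of drift is sketched below. It assumes each fixture case carries an `id` field and that this plugin's cases share a `development-discipline-` prefix; the helper `unlistedCases` and the `other-plugin-example` entry are hypothetical. The sample ids are only the three named in this review, not the full list.

```javascript
// Hypothetical helper: fixture case ids with the given prefix that the
// hand-maintained requiredCases list does not mention.
function unlistedCases(requiredCases, fixtureCases, prefix) {
  const required = new Set(requiredCases);
  return fixtureCases
    .map((c) => c.id)
    .filter((id) => id.startsWith(prefix) && !required.has(id));
}

const requiredCases = [
  "development-discipline-tdd-one-test-first",
  "development-discipline-verification-claim-scope",
];

const fixtureCases = [
  { id: "development-discipline-tdd-one-test-first" },
  { id: "development-discipline-verification-claim-scope" },
  { id: "development-discipline-systematic-debugging-root-cause" },
  { id: "other-plugin-example" }, // hypothetical unrelated case
];

const unlisted = unlistedCases(requiredCases, fixtureCases, "development-discipline-");
console.log(unlisted);
```

Deriving the comparison from the fixture file means a newly added case fails the test until the list is updated, rather than going unchecked.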

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: bd84995c-bfd6-49c1-926b-d0da48a20f6d

📥 Commits

Reviewing files that changed from the base of the PR and between 7a37d01 and 7911000.

📒 Files selected for processing (14)
  • .agents/plugins/marketplace.json
  • .claude-plugin/marketplace.json
  • README.md
  • evals/fixtures/behavior/full-marketplace/cases.json
  • plugins/development-discipline/.claude-plugin/plugin.json
  • plugins/development-discipline/.codex-plugin/plugin.json
  • plugins/development-discipline/README.md
  • plugins/development-discipline/skills/receiving-code-review/SKILL.md
  • plugins/development-discipline/skills/systematic-debugging/SKILL.md
  • plugins/development-discipline/skills/test-driven-development/SKILL.md
  • plugins/development-discipline/skills/verification-before-completion/SKILL.md
  • plugins/development-discipline/skills/writing-skills/SKILL.md
  • scripts/tests/development-discipline-plugin.bats
  • scripts/tests/evals-config.bats

Comment thread scripts/tests/development-discipline-plugin.bats
jwilger commented Jul 5, 2026

Owner Author

/trunk merge
