Skip to content

feat(unic-spec-review): dedup + cancellable Approval Loop#237

Merged
orioltf merged 8 commits into
developfrom
archon/task-feature-unic-spec-review-208-dedup-approval-loop
Jun 9, 2026
Merged

feat(unic-spec-review): dedup + cancellable Approval Loop#237
orioltf merged 8 commits into
developfrom
archon/task-feature-unic-spec-review-208-dedup-approval-loop

Conversation

@orioltf

@orioltf orioltf commented Jun 9, 2026

Copy link
Copy Markdown
Member

What

Closes the unic-spec-review posting path (S8) with content-similarity deduplication and a fully cancellable multi-Finding Approval Loop. Resolves #208.

Why

The --post path (S5) had no deduplication and could post only one Finding per run, so reruns — by the same reviewer or anyone else — silently piled up duplicate comments on a page. There was no safeguard against re-raising an issue someone already commented on.

How

  • scripts/lib/dedup-matcher.mjs (new pure library): matchDedup(finding, existingComments) computes Jaccard word-token similarity between a Finding (title + body) and every existing page comment, returning a post / skip / flag decision plus near-duplicate candidates sorted by similarity. No hidden marker and no local state file — it reads the shared page, so it is multi-user and multi-run safe by construction (ADR-0002). Exports tokenize/jaccard helpers and SKIP_THRESHOLD/FLAG_THRESHOLD constants, plus a --findings-file/--comments-file CLI mode for command integration.
  • scripts/atlassian-fetch.mjs: adds a --comments <url> CLI mode wrapping fetchConfluenceComments in the same never-throws collectComments pattern as --child-pages.
  • commands/review-spec.md Step 10: replaced the single-Finding picker with a multi-Finding Approval Loop — fetch existing comments → run dedup-matcher → present a ranked list annotated with [~near-dup] / [~likely-dup] badges → comma-separated selection → explicit human tiebreak on flagged/skip matches (never silently dropped or re-raised) → post approved Findings via the S7 write path with anchors + attribution footers. Cancellable at every step, including a post-none exit after selection. Bare /review-spec stays strictly read-only.
  • tests/dedup-matcher.test.mjs: 22 node:test cases covering tokenize, jaccard, and matchDedup with injected comment sets; no live services. The Approval Loop driver (markdown instructions) is not unit-tested by design.

Validation

  • pnpm --filter unic-spec-review typecheck — ✅
  • pnpm --filter unic-spec-review test — ✅ 292 passed (22 new)
  • pnpm --filter unic-spec-review verify:changelog — ✅ → 0.1.7
  • pnpm ci:check — ✅
  • No em dash in authored text; no LICENSE files touched; no cross-plugin imports.

🤖 Generated with Claude Code

orioltf and others added 5 commits June 9, 2026 08:56
The --post path had no deduplication and could only post one Finding,
so reruns and other reviewers' runs silently piled up duplicate comments.

Changes:
- Add scripts/lib/dedup-matcher.mjs: pure Jaccard word-token similarity
  matcher comparing a Finding against all existing page comments; returns
  a post/skip/flag decision plus near-duplicates. No hidden marker, no
  local state (reads the shared page, so multi-user/multi-run safe).
- Add --comments <url> CLI mode to atlassian-fetch (collectComments,
  parseCommentsArg) mirroring the never-throws collectChildPages pattern.
- Replace review-spec.md Step 10 with a multi-Finding Approval Loop:
  annotated ranked list, comma-separated selection, human tiebreak on
  flagged/skip matches, cancellable at every step incl. post-none exit.
- Add tests/dedup-matcher.test.mjs covering tokenize/jaccard/matchDedup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Promote [Unreleased] dedup + Approval Loop entries to 0.1.7 and sync
the version into marketplace.json and package.json.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The PR-merge CI build fails Prettier on this docs file (added to develop
in 4674275): it uses *asterisk* emphasis where Prettier expects
_underscore_. Normalise it so root checks pass. Docs-only, no semantic
change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fixed:
- README.md/CONTEXT.md/AGENTS.md: update S5→S8 status blocks (multi-Finding loop + dedup now implemented, not deferred)
- atlassian-fetch.mjs: CommentsOutput.truncated JSDoc missing trailing clause; parseCommentsArg JSDoc missing empty-value null case
- dedup-matcher.mjs: remove unreachable `union===0` guard; split ambiguous try/catch to name failing file
- tests/dedup-matcher.test.mjs: fix Jaccard comment notation (single-letter→two-letter to match code); add null/undefined tests for tokenize defensive guard; add null-array test for matchDedup
- tests/atlassian-fetch.test.mjs: add collectComments (3 cases) and parseCommentsArg (3 cases) tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@orioltf

orioltf commented Jun 9, 2026

Copy link
Copy Markdown
Member Author

⚡ Self-Fix Report (Aggressive)

Status: COMPLETE
Pushed: ✅ Changes pushed to archon/task-feature-unic-spec-review-208-dedup-approval-loop
Commit: 695073f
Philosophy: Fix everything unless clearly a new concern


Fixes Applied (12 total)

Severity Count
🔴 CRITICAL 0
🟠 HIGH 3
🟡 MEDIUM 1
🟢 LOW 8
View all fixes
  • README.md S5→S8 callout (README.md:9) — Updated to describe multi-Finding loop, near-dup flags, Jaccard similarity; removed "planned for S8" sentence
  • CONTEXT.md status S1–S5→S1–S8 (CONTEXT.md:5) — Bumped status; added collectComments --comments and dedup-matcher to feature list; removed dedup/multi-Finding from deferred
  • AGENTS.md status S1–S5→S1–S8 (AGENTS.md:9) — Bumped status; single-Finding→multi-Finding write path; removed dedup from deferred slices
  • collectComments tests (tests/atlassian-fetch.test.mjs) — 3 cases: success, null-creds auth-error shape, per-page FetchError; mirrors collectChildPages exactly
  • CommentsOutput.truncated JSDoc (atlassian-fetch.mjs:940) — Appended "and the comment set is incomplete" to match ChildPagesOutput wording
  • Dead union===0 guard (dedup-matcher.mjs:77) — Removed unreachable guard after early-return already handles both-empty case
  • tokenize null/undefined tests (tests/dedup-matcher.test.mjs) — Added null and undefined defensive-guard test cases
  • matchDedup null-array guard test (tests/dedup-matcher.test.mjs) — Added it('returns post when existingComments is null')
  • Ambiguous file error in dedup-matcher CLI (dedup-matcher.mjs:142–159) — Split into two try/catch blocks naming findings file vs comments file
  • Jaccard comment notation (tests/dedup-matcher.test.mjs:87,92) — Updated {a,b}{aa,bb} to match actual two-letter token values in code
  • parseCommentsArg JSDoc (atlassian-fetch.mjs:993) — Added "or empty": "or null when absent or empty"
  • parseCommentsArg tests (tests/atlassian-fetch.test.mjs) — 3 cases mirroring parseChildPagesArg (present/absent/no-value)

Tests Added

  • tokenize — returns an empty set for null input
  • tokenize — returns an empty set for undefined input
  • matchDedup — returns post when existingComments is null
  • parseCommentsArg — returns the URL following --comments
  • parseCommentsArg — returns null when the flag is absent
  • parseCommentsArg — returns null when the flag has no value
  • collectComments — returns comments and no errors on success
  • collectComments — reports an auth-error with url "" when no credentials are configured
  • collectComments — collects a per-page FetchError instead of throwing

Skipped (1)

Finding Reason
Non-FetchError tagged as parse-error (atlassian-fetch.mjs:982–997) Pre-existing intentional pattern matching collectChildPages; review explicitly recommended no action; changing requires coordinated update to both functions

Suggested Follow-up Issues

(none — all actionable findings addressed)


Validation

✅ Type check | ✅ Lint | ✅ Tests (301 passed)


Self-fix by Archon · aggressive mode · fixes pushed to archon/task-feature-unic-spec-review-208-dedup-approval-loop

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
orioltf and others added 2 commits June 9, 2026 10:11
Issue #208 forbids em dashes in authored text (the CHANGELOG version
header is the only sanctioned exception). The S8 status-line rewrites in
README.md and AGENTS.md introduced em dashes; replace them with a colon
separator so the authored prose complies.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR #237 review follow-ups:

- Guard matchDedup against a Finding whose title/body is null/undefined so
  candidate text never tokenizes the literal "undefined".
- Add deterministic threshold-boundary tests pinning the exact Jaccard
  scores at and just below FLAG_THRESHOLD and SKIP_THRESHOLD (the prior
  prose fixtures sat far from the boundaries, so a >= to > drift or a
  constant change went undetected), plus null-field and credential-load
  exception coverage for collectComments.
- Correct the CommentsOutput.truncated JSDoc to name the comment-list
  pagination cap (MAX_PAGES) rather than a page-count cap, and document
  that the dedup-matcher CLI accepts a bare ConfluenceComment[] array.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@orioltf orioltf merged commit b665c89 into develop Jun 9, 2026
15 checks passed
@orioltf orioltf deleted the archon/task-feature-unic-spec-review-208-dedup-approval-loop branch June 9, 2026 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

unic-spec-review S8: dedup + Approval Loop — content-similarity dedup-matcher + cancellable loop with post-none exit

1 participant