feat: sensei scoring parity — WHEN: triggers, spec-security, Invalid level, advisory checks 16-18#79

Open
spboyer wants to merge 3 commits into main from squad/sensei-parity

Conversation

@spboyer
Member

@spboyer spboyer commented Mar 4, 2026

Summary

Brings waza's scoring engine in line with spboyer/sensei v1.3.0. Adds 7 features across scoring and checks:

Changes

| Issue | Feature | Type |
| --- | --- | --- |
| Closes #72 | WHEN: trigger pattern recognition | Scoring |
| Closes #73 | spec-security check (XML tags, reserved name prefixes) | Spec compliance |
| Closes #74 | Invalid score level for >1024 char descriptions | Scoring |
| Closes #75 | Cross-model description density check (advisory 16) | Advisory |
| Closes #76 | Body structure quality check (advisory 17) | Advisory |
| Closes #77 | Progressive disclosure check (advisory 18) | Advisory |
| Closes #78 | Context-dependent anti-trigger risk assessment | Scoring |
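
For readers unfamiliar with the spec-security check (#73), here is a rough sketch of what rejecting XML tags and reserved name prefixes could look like. It is not the code in this PR: `checkSpecSecurity`, `xmlTagRE`, and the contents of `reservedPrefixes` are hypothetical names and values chosen for illustration, and the real `SpecSecurityChecker` may inspect different fields.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// xmlTagRE matches anything shaped like an opening or closing XML tag.
// The exact pattern used by the real checker is not shown in this PR.
var xmlTagRE = regexp.MustCompile(`</?[A-Za-z][^<>]*>`)

// reservedPrefixes is a hypothetical list for illustration only.
var reservedPrefixes = []string{"claude-", "anthropic-"}

// checkSpecSecurity returns one message per problem found in the
// skill name and description.
func checkSpecSecurity(name, description string) []string {
	var issues []string
	if xmlTagRE.MatchString(name) || xmlTagRE.MatchString(description) {
		issues = append(issues, "frontmatter contains XML-like tags")
	}
	lower := strings.ToLower(name)
	for _, p := range reservedPrefixes {
		if strings.HasPrefix(lower, p) {
			issues = append(issues, fmt.Sprintf("name uses reserved prefix %q", p))
			break
		}
	}
	return issues
}

func main() {
	fmt.Println(checkSpecSecurity("claude-helper", "Converts <input> files"))
	fmt.Println(checkSpecSecurity("pdf-tools", "Extracts text from PDF files"))
}
```

The tag pattern requires a letter right after `<`, so comparisons such as `a < b` in a description do not trip the check.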

Files Changed (777 additions, 12 deletions)

  • internal/checks/advisory_checks.go — New: CrossModelDensityChecker, BodyStructureChecker, ProgressiveDisclosureChecker
  • internal/checks/advisory_checks_test.go — Tests for all 3 advisory checkers
  • internal/checks/score_checkers.go — Register new checkers in pipeline
  • internal/checks/spec_checks.go — New: SpecSecurityChecker
  • internal/checks/spec_checks_test.go — Tests for security checker
  • internal/scoring/scoring.go — WHEN: trigger, Invalid level, context-dependent anti-triggers
  • internal/scoring/scoring_test.go — Tests for all scoring changes

Review History

  • Linus (implementation) → Rusty rejected (dead code + panic risk)
  • Turk (fixes) → Rusty approved ✅

All tests pass. go vet clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 4, 2026 23:38
@spboyer spboyer added the sensei-parity Parity with spboyer/sensei scoring label Mar 4, 2026
@github-actions github-actions bot enabled auto-merge (squash) March 4, 2026 23:38
Contributor

Copilot AI left a comment

Pull request overview

Aligns waza’s scoring and compliance checks with sensei v1.3.0 by extending the heuristic scoring model and adding new spec/advisory checkers.

Changes:

  • Adds AdherenceInvalid and short-circuits scoring when description length exceeds 1024 characters; adds WHEN: to trigger detection and introduces catalog-size-based anti-trigger risk warnings.
  • Introduces SpecSecurityChecker and registers it in the spec checker pipeline.
  • Adds and registers advisory checkers for cross-model description density, body structure quality, and progressive disclosure, with corresponding tests.
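
As a rough illustration of the first bullet, the sketch below short-circuits to an Invalid level when the description exceeds 1024 characters and treats `WHEN:` as a trigger marker. Only `AdherenceInvalid`, the 1024 limit, and the `WHEN:` marker come from this PR. The other levels, the `USE FOR:` marker, the function names, and counting characters as runes rather than bytes are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// Adherence is a hypothetical stand-in for the scoring level type.
type Adherence string

const (
	AdherenceInvalid Adherence = "Invalid"
	AdherenceLow     Adherence = "Low"    // assumed level name
	AdherenceMedium  Adherence = "Medium" // assumed level name
)

// maxDescriptionChars is the limit named in the PR summary.
const maxDescriptionChars = 1024

// triggerMarkers is an assumed list; only "WHEN:" is confirmed by this PR.
var triggerMarkers = []string{"USE FOR:", "WHEN:"}

// hasTrigger reports whether the description contains any trigger marker,
// ignoring case.
func hasTrigger(description string) bool {
	upper := strings.ToUpper(description)
	for _, m := range triggerMarkers {
		if strings.Contains(upper, m) {
			return true
		}
	}
	return false
}

// score returns AdherenceInvalid before any other heuristic runs when the
// description is too long (length measured in runes here, an assumption).
func score(description string) Adherence {
	if utf8.RuneCountInString(description) > maxDescriptionChars {
		return AdherenceInvalid
	}
	if hasTrigger(description) {
		return AdherenceMedium
	}
	return AdherenceLow
}

func main() {
	fmt.Println(score("Formats tables. WHEN: the user asks for a report."))
	fmt.Println(score(strings.Repeat("a", maxDescriptionChars+1)))
}
```

Checking the length first means an over-long description can never earn credit for the triggers it happens to contain.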

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| internal/scoring/scoring.go | Adds Invalid adherence, WHEN: trigger detection, and context-dependent anti-trigger risk logic. |
| internal/scoring/scoring_test.go | Adds/updates tests for Invalid, WHEN:, and anti-trigger risk behavior. |
| internal/checks/spec_checks.go | Adds SpecSecurityChecker implementation. |
| internal/checks/spec_checks_test.go | Adds tests for SpecSecurityChecker. |
| internal/checks/score_checkers.go | Registers the new spec-security and advisory checkers in the pipeline. |
| internal/checks/advisory_checks.go | Adds advisory checkers (density, body structure, progressive disclosure). |
| internal/checks/advisory_checks_test.go | Adds tests for the three new advisory checkers. |

Comments suppressed due to low confidence (3)

internal/checks/advisory_checks.go:476

  • regexp.MustCompile is called inside BodyStructureChecker.Check, meaning the regex is recompiled on every check run. Since this runs per skill, consider hoisting this regex to a package-level var (similar to other patterns in this file) to avoid repeated compilation and keep the checker cheaper to run.

This issue also appears on line 544 of the same file.

	hasCodeBlocks := strings.Contains(content, "```")
	hasNumberedSteps := regexp.MustCompile(`(?m)^\s*\d+\.\s+`).MatchString(content)
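
The hoisting suggested above could look roughly like this. The pattern is the one quoted in the snippet; `numberedStepsRE`, `fence`, and `bodyStructure` are illustrative names rather than identifiers confirmed to exist in the file.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// numberedStepsRE is compiled once at package initialization instead of
// on every Check call.
var numberedStepsRE = regexp.MustCompile(`(?m)^\s*\d+\.\s+`)

// fence is a Markdown code fence (three backticks), built with Repeat so
// that this example itself stays easy to embed in Markdown.
var fence = strings.Repeat("`", 3)

// bodyStructure reports whether content has fenced code and numbered steps.
func bodyStructure(content string) (hasCodeBlocks, hasNumberedSteps bool) {
	hasCodeBlocks = strings.Contains(content, fence)
	hasNumberedSteps = numberedStepsRE.MatchString(content)
	return hasCodeBlocks, hasNumberedSteps
}

func main() {
	code, steps := bodyStructure("1. Install\n2. Run\n")
	fmt.Println(code, steps)
}
```

A `*regexp.Regexp` is safe for concurrent use, so sharing one package-level value across checkers does not need extra locking.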

internal/checks/advisory_checks.go:546

  • codeBlockPattern := regexp.MustCompile(...) is created inside ProgressiveDisclosureChecker.Check, so it recompiles for every skill. Consider making it a package-level var so the regex is compiled once and reused across checks.
	// Count large code blocks (>50 lines)
	codeBlockPattern := regexp.MustCompile("(?s)```[^`]*```")
	blocks := codeBlockPattern.FindAllString(sk.RawContent, -1)
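
A minimal sketch of the same idea for the large-code-block count, assuming "lines" means the lines between the opening and closing fences. The pattern is the one quoted above; `countLargeBlocks` and `fence` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fence is a Markdown code fence (three backticks).
var fence = strings.Repeat("`", 3)

// codeBlockPattern is compiled once for the whole package. It is the
// pattern quoted in the review comment, assembled from the fence string.
var codeBlockPattern = regexp.MustCompile("(?s)" + fence + "[^`]*" + fence)

// countLargeBlocks returns how many fenced blocks contain more than
// threshold lines between their fences.
func countLargeBlocks(content string, threshold int) int {
	large := 0
	for _, block := range codeBlockPattern.FindAllString(content, -1) {
		// Newlines in the match, minus the one ending the opening fence line.
		inner := strings.Count(block, "\n") - 1
		if inner > threshold {
			large++
		}
	}
	return large
}

func main() {
	doc := fence + "go\n" + strings.Repeat("x := 1\n", 60) + fence + "\n"
	fmt.Println(countLargeBlocks(doc, 50))
}
```

Because the pattern uses `[^`]*`, a block whose body contains a backtick is not matched at all, which is a limitation of the quoted regex rather than of the hoisting.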

internal/checks/advisory_checks.go:543

  • ProgressiveDisclosureChecker counts lines and code blocks over sk.RawContent, which includes YAML frontmatter. The advisory definition is about the SKILL.md body, so this can over-count and incorrectly trigger warnings. Consider running these checks over sk.Body (or otherwise excluding the frontmatter block) so the thresholds apply to the body content only.
func (*ProgressiveDisclosureChecker) Check(sk skill.Skill) (*CheckResult, error) {
	lines := strings.Split(sk.RawContent, "\n")
	bodyLines := len(lines)
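
If `sk.Body` is not populated, one fallback is to strip the frontmatter block before counting. The sketch below assumes frontmatter is delimited by `---` lines at the very top of the file; `stripFrontmatter` and `bodyLineCount` are hypothetical helpers, not functions from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// stripFrontmatter removes a leading YAML frontmatter block delimited by
// "---" lines. Content without a complete block is returned unchanged.
func stripFrontmatter(raw string) string {
	const delim = "---"
	lines := strings.Split(raw, "\n")
	if strings.TrimSpace(lines[0]) != delim {
		return raw
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == delim {
			return strings.Join(lines[i+1:], "\n")
		}
	}
	return raw // unterminated frontmatter: leave the content alone
}

// bodyLineCount counts lines in the body only, ignoring surrounding
// blank lines.
func bodyLineCount(raw string) int {
	body := strings.TrimSpace(stripFrontmatter(raw))
	if body == "" {
		return 0
	}
	return len(strings.Split(body, "\n"))
}

func main() {
	raw := "---\nname: demo\ndescription: example skill\n---\n# Demo\nBody text.\n"
	fmt.Println(bodyLineCount(raw))
}
```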

spboyer added a commit that referenced this pull request Mar 4, 2026
…rsing, slice traversal, WHEN: count, body field, summary format

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
spboyer added a commit that referenced this pull request Mar 4, 2026
Orchestration logs:
- Linus: Implementation of 7 sensei scoring parity features
- Rusty (review): Initial code review — 2 must-fix issues identified
- Turk: Fixes applied — checker registration, panic guard
- Rusty (re-review): Approval — all issues resolved

Session log:
- /Users/shboyer/github/waza/.squad/log/2026-03-04T2320-sensei-parity.md
- Gap analysis, 7 issues created (#72-#78)
- Implementation, rejection, fix, and approval workflow
- PR #79 ready for merge

Decision merges from inbox:
- User directive: Code in GPT-5.3-Codex, reviews in Opus 4.6
- Sensei parity code review (initial + re-review)
- Inbox files cleaned

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
spboyer added a commit that referenced this pull request Mar 4, 2026
- Orchestration log: Linus fixed all 6 PR review comments
- Session log: All threads resolved, tests passing
- Model: gpt-5.3-codex on sync execution
@spboyer spboyer force-pushed the squad/sensei-parity branch from fd138c1 to e80ddc7 Compare March 4, 2026 23:59
Copilot AI review requested due to automatic review settings March 5, 2026 00:18
@spboyer spboyer force-pushed the squad/sensei-parity branch from e80ddc7 to c30d412 Compare March 5, 2026 00:18
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

internal/checks/advisory_checks.go:477

  • BodyStructureChecker recompiles the numbered-steps regexp on every Check() call via regexp.MustCompile(...). Since this checker can run across many skills, consider moving this to a package-level precompiled regexp (e.g., var numberedStepsRE = regexp.MustCompile(...)) to avoid repeated compilation overhead.
	hasCodeBlocks := strings.Contains(content, "```")
	hasNumberedSteps := regexp.MustCompile(`(?m)^\s*\d+\.\s+`).MatchString(content)

Member Author

@spboyer spboyer left a comment

✅ Reviewed by Rusty — LGTM (third review). All 7 sensei parity features (#72-#78) implemented correctly. Checkers registered, panic guard in place, Invalid adherence level, WHEN: triggers, context-dependent risk. 777 additions with comprehensive test coverage. Turk's fixes resolved both must-fix issues. CI green across all checks. Ship it. Ready to merge (can't self-approve since you authored this).

@spboyer spboyer force-pushed the squad/sensei-parity branch from c30d412 to e8a1573 Compare March 5, 2026 16:00
spboyer added a commit that referenced this pull request Mar 5, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 5, 2026 17:32
spboyer and others added 2 commits March 5, 2026 12:46
…evel, advisory checks 16-18

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@spboyer spboyer force-pushed the squad/sensei-parity branch from da4aa80 to ad8b723 Compare March 5, 2026 17:46
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

chlowell pushed a commit to chlowell/waza that referenced this pull request Mar 5, 2026
Co-authored-by: Richard Park <ripark@microsoft.com>
wbreza previously approved these changes Mar 5, 2026
Collaborator

@wbreza wbreza left a comment

Code Review: PR #79 - feat: sensei scoring parity

What Looks Good

  • All 8 prior Copilot review comments addressed - thorough iteration
  • Comprehensive test coverage - 487 lines of new tests with boundary cases
  • Clean architecture - all checkers follow ComplianceChecker interface
  • sectionForHeader() isolates sections correctly - solves trigger/anti-trigger overlap
  • Invalid adherence level short-circuit - well-designed
  • Context-dependent anti-trigger risk fires only when anti-triggers are present
  • skillBodyContent() prefers sk.Body over raw parsing with test verification

Suggestions (non-blocking)

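  1. sectionForHeader() calls strings.ToUpper() repeatedly - Consider computing the uppercased value once and reusing it. Minor performance gain.
  2. errorHandlingPatterns includes the bare word error - This matches broadly, including ordinary prose. Consider restricting it to heading-level patterns.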
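
One way to read the second suggestion: match error-handling terms only when they appear in a Markdown heading. This is a sketch under that assumption. `errorHeadingRE`, `hasErrorHandlingSection`, and the list of terms are hypothetical and are not taken from `errorHandlingPatterns`.

```go
package main

import (
	"fmt"
	"regexp"
)

// errorHeadingRE matches a Markdown heading (levels 1-6) whose text mentions
// an error-handling term. The term list is an assumption for illustration.
var errorHeadingRE = regexp.MustCompile(
	`(?mi)^#{1,6}\s+.*\b(error handling|troubleshooting|errors?)\b`)

// hasErrorHandlingSection reports whether the body has a heading devoted to
// error handling, ignoring the word "error" in ordinary prose.
func hasErrorHandlingSection(body string) bool {
	return errorHeadingRE.MatchString(body)
}

func main() {
	fmt.Println(hasErrorHandlingSection("## Error Handling\nRetry on timeout.\n"))
	fmt.Println(hasErrorHandlingSection("The call may return an error.\n"))
}
```

Anchoring on the heading marker keeps a passing mention of "error" in a sentence from satisfying the check.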

Summary

| Priority | Count |
| --- | --- |
| Critical | 0 |
| High | 0 |
| Medium | 2 |
| Low | 2 |

Overall Assessment: Approve - solid feature PR with comprehensive testing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Labels

sensei-parity Parity with spboyer/sensei scoring

3 participants