feat: Add trigger heuristic grader by spboyer · Pull Request #90 · microsoft/waza

spboyer · 2026-03-05T01:35:29Z

Closes #80

Copilot

Pull request overview

Adds a new trigger heuristic grader type to validate whether a prompt should (or should not) activate a skill, closing #80.

Changes:

- Introduces trigger grader implementation + unit tests.
- Wires trigger into grader type registries, creation factory, and JSON schemas.
- Adds end-user documentation for configuring and interpreting trigger grading output.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

File	Description
schemas/task.schema.json	Adds `trigger` grader type and `triggerGraderConfig` definition for task schema validation.
schemas/eval.schema.json	Adds `trigger` grader type and documented `triggerGraderConfig` definition for eval schema validation.
internal/suggest/suggest.go	Includes `trigger` in the set of grader types surfaced by suggest.
internal/suggest/grader_docs.go	Adds a short description for the `trigger` grader in suggest docs.
internal/models/outcome.go	Defines `GraderKindTrigger` constant for the new grader kind.
internal/graders/trigger_grader.go	Implements the trigger heuristic grader (keyword + phrase scoring).
internal/graders/trigger_grader_test.go	Adds tests for constructor validation, mode behavior, thresholds, and factory creation.
internal/graders/grader.go	Extends `Create` factory to construct the trigger grader from config.
docs/graders/trigger.md	Documents `trigger` grader configuration, scoring behavior, and output details.
docs/graders/README.md	Links the new `trigger` grader docs from the graders index.
.squad/log/2026-03-05T00-36-issue-assignment-pipeline.md	Adds session log entry (process documentation).
.squad/log/2026-03-05T00-26-rusty-token-diff-design.md	Adds session log entry (process documentation).
.squad/decisions.md	Records decision notes (process documentation).
.squad/agents/linus/history.md	Records learnings summary for #80 (process documentation).

Comments suppressed due to low confidence (4)

internal/graders/trigger_grader.go:163

bestPhraseScore re-tokenizes the prompt even though scorePrompt already computed promptTokens. Consider passing promptTokens (and/or a precomputed token set + lowercased prompt) into bestPhraseScore to avoid duplicate tokenization and allocation on every grade call.

func (g *triggerHeuristicGrader) scorePrompt(prompt string) (score float64, phraseScore float64, matched []string) {
	promptTokens := tokenize(prompt)
	if len(promptTokens) == 0 {
		return 0, 0, nil
	}

	seen := make(map[string]bool)
	for _, token := range promptTokens {
		if _, ok := g.keywords[token]; ok && !seen[token] {
			seen[token] = true
			matched = append(matched, token)
		}
	}

	tokenScore := float64(len(matched)) / float64(len(promptTokens))
	phraseScore = g.bestPhraseScore(prompt)
	if phraseScore > tokenScore {
		return phraseScore, phraseScore, matched
	}
	return tokenScore, phraseScore, matched
}

func (g *triggerHeuristicGrader) bestPhraseScore(prompt string) float64 {
	if len(g.triggerPhrases) == 0 {
		return 0
	}

	promptLower := strings.ToLower(prompt)
	promptTokenSet := make(map[string]struct{})
	for _, token := range tokenize(prompt) {
		promptTokenSet[token] = struct{}{}
	}
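One way to act on the suggestion above is to derive everything from the prompt once and hand that to both scoring steps. The following is a minimal sketch, not the PR's code: `promptView`, `newPromptView`, and the `grader` type are hypothetical names, `tokenize` is a stand-in for the repository's tokenizer, and the phrase rule (1.0 for a substring match, otherwise the fraction of phrase tokens present in the prompt) is an assumption, since the quoted `bestPhraseScore` body is cut off.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize is a stand-in for the PR's tokenizer (assumption): lowercase the
// input and split on anything that is not a letter or digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// promptView (hypothetical) holds everything derived from the prompt so it is
// computed exactly once per grade call.
type promptView struct {
	lower  string
	tokens []string
	set    map[string]struct{}
}

func newPromptView(prompt string) promptView {
	tokens := tokenize(prompt)
	set := make(map[string]struct{}, len(tokens))
	for _, t := range tokens {
		set[t] = struct{}{}
	}
	return promptView{lower: strings.ToLower(prompt), tokens: tokens, set: set}
}

// grader is a hypothetical stand-in for triggerHeuristicGrader.
type grader struct {
	keywords       map[string]struct{}
	triggerPhrases []string
}

// scorePrompt keeps the logic quoted above but builds the promptView once.
func (g *grader) scorePrompt(prompt string) (score, phraseScore float64, matched []string) {
	pv := newPromptView(prompt)
	if len(pv.tokens) == 0 {
		return 0, 0, nil
	}
	seen := make(map[string]bool)
	for _, token := range pv.tokens {
		if _, ok := g.keywords[token]; ok && !seen[token] {
			seen[token] = true
			matched = append(matched, token)
		}
	}
	tokenScore := float64(len(matched)) / float64(len(pv.tokens))
	phraseScore = g.bestPhraseScore(pv)
	if phraseScore > tokenScore {
		return phraseScore, phraseScore, matched
	}
	return tokenScore, phraseScore, matched
}

// bestPhraseScore reuses the precomputed view. The scoring rule here is an
// assumption: 1.0 on a substring match, else the share of phrase tokens found.
func (g *grader) bestPhraseScore(pv promptView) float64 {
	best := 0.0
	for _, phrase := range g.triggerPhrases {
		phraseTokens := tokenize(phrase)
		if len(phraseTokens) == 0 {
			continue
		}
		if strings.Contains(pv.lower, strings.ToLower(phrase)) {
			return 1.0
		}
		hits := 0
		for _, t := range phraseTokens {
			if _, ok := pv.set[t]; ok {
				hits++
			}
		}
		if s := float64(hits) / float64(len(phraseTokens)); s > best {
			best = s
		}
	}
	return best
}

func main() {
	g := &grader{
		keywords:       map[string]struct{}{"deploy": {}, "azure": {}},
		triggerPhrases: []string{"deploy to azure"},
	}
	score, phrase, matched := g.scorePrompt("Please deploy my app to Azure")
	fmt.Println(score, phrase, matched)
}
```

With this shape the prompt is tokenized and lowercased once per call, and `bestPhraseScore` no longer needs the raw string at all.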

internal/graders/trigger_grader.go:203

The error messages hardcode SKILL.md, but skillPath may be any filename/path (even though docs recommend SKILL.md). Consider changing these messages to reference “skill file” (or include the provided path without assuming the basename) to avoid misleading users during debugging.

func loadTriggerHeuristicData(skillPath string) (map[string]struct{}, []string, error) {
	data, err := os.ReadFile(skillPath)
	if err != nil {
		return nil, nil, fmt.Errorf("reading SKILL.md %s: %w", skillPath, err)
	}

	var sk skill.Skill
	if err := sk.UnmarshalText(data); err != nil {
		return nil, nil, fmt.Errorf("parsing SKILL.md %s: %w", skillPath, err)
	}
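A possible shape for the wording change suggested above, shown only for the read step. This is a sketch under assumptions: `readSkillFile` is a hypothetical helper, not a function in the PR, and the parse step is left out because `skill.Skill` is internal to the repository.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readSkillFile (hypothetical) reads the skill definition and reports
// failures without assuming the file is named SKILL.md.
func readSkillFile(skillPath string) ([]byte, error) {
	data, err := os.ReadFile(skillPath)
	if err != nil {
		// %q quotes the path as given; %w keeps the cause for errors.Is/As.
		return nil, fmt.Errorf("reading skill file %q: %w", skillPath, err)
	}
	return data, nil
}

func main() {
	_, err := readSkillFile("skills/custom-name.md")
	if err != nil {
		fmt.Println(err)
		fmt.Println("missing file:", errors.Is(err, fs.ErrNotExist))
	}
}
```

Because the error is wrapped with `%w`, callers can still test for `fs.ErrNotExist` while the message names whatever path the user actually configured.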

schemas/task.schema.json:868

triggerGraderConfig in task.schema.json is missing the description fields that were added in eval.schema.json (for the object and its properties). Adding the same descriptions here would keep the schemas consistent and improve generated docs / editor intellisense for task configs.

    "triggerGraderConfig": {
      "type": "object",
      "required": [
        "skill_path",
        "mode"
      ],
      "additionalProperties": false,
      "properties": {
        "skill_path": {
          "type": "string",
          "minLength": 1
        },
        "mode": {
          "type": "string",
          "enum": [
            "positive",
            "negative"
          ]
        },
        "threshold": {
          "type": "number",
          "minimum": 0,
          "maximum": 1,
          "default": 0.6
        }
      }
    },
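For readers who want the schema constraints in Go terms, here is a minimal sketch of the same rules: `skill_path` and `mode` required, `mode` limited to `positive` or `negative`, `threshold` in [0, 1] with a default of 0.6. The `triggerConfig` type and both functions are hypothetical, not the PR's code. The positive-mode rule (`score >= threshold`) matches the test assertion quoted later in this review; the negative-mode rule (pass when the score stays below the threshold) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultThreshold mirrors the "default": 0.6 in the schema.
const defaultThreshold = 0.6

// triggerConfig (hypothetical) mirrors triggerGraderConfig.
type triggerConfig struct {
	SkillPath string
	Mode      string
	Threshold *float64 // nil means "use the schema default"
}

// validate applies the schema constraints and returns the effective threshold.
func (c triggerConfig) validate() (float64, error) {
	if c.SkillPath == "" {
		return 0, errors.New("skill_path is required")
	}
	if c.Mode != "positive" && c.Mode != "negative" {
		return 0, fmt.Errorf("mode must be \"positive\" or \"negative\", got %q", c.Mode)
	}
	threshold := defaultThreshold
	if c.Threshold != nil {
		threshold = *c.Threshold
	}
	// Written this way so NaN is rejected too.
	if !(threshold >= 0 && threshold <= 1) {
		return 0, fmt.Errorf("threshold must be between 0 and 1, got %v", threshold)
	}
	return threshold, nil
}

// passed applies the mode. The negative branch is an assumption: the prompt
// should NOT trigger the skill, so a low score is a pass.
func passed(mode string, score, threshold float64) bool {
	if mode == "negative" {
		return score < threshold
	}
	return score >= threshold
}

func main() {
	cfg := triggerConfig{SkillPath: "SKILL.md", Mode: "positive"}
	threshold, err := cfg.validate()
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(threshold, passed(cfg.Mode, 0.75, threshold))
}
```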

internal/graders/trigger_grader_test.go:141

Prefer require.GreaterOrEqual(t, result.Score, threshold) for clearer failure output and consistency with other assertions in this test file.

	require.True(t, result.Score >= threshold)
	require.True(t, result.Passed)

chlowell · 2026-03-05T02:21:21Z

I like the consistency of having everything be a grader instead of having graders and trigger tests defined separately as we do today (for example). Do you want to remove the existing trigger test feature in this PR?

spboyer

Rusty (Opus 4.6) — ⚠️ Blocked: Merge Conflict

The trigger heuristic grader implementation is solid architecturally: clean grader pattern, constructor validation, 4 test functions, `docs/graders/trigger.md`, schema updates. I want to approve this.

Blocker: Merge conflict. GitHub reports `mergeable: CONFLICTING`. Only 1 CI check (CLA) ran; the full Go CI matrix didn't execute.

Action needed:

1. Rebase onto main to resolve the merge conflict (likely in `internal/models/outcome.go` where fields overlap with PR #91)
2. Verify all 7 CI checks pass
3. Push the rebased branch

Once CI is green, I'll approve immediately.

codecov-commenter · 2026-03-05T15:21:56Z

Codecov Report

❌ Patch coverage is 77.70701% with 35 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@a75477e). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
internal/graders/trigger_grader.go	78.14%	22 Missing and 11 partials ⚠️
internal/graders/grader.go	60.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #90   +/-   ##
=======================================
  Coverage        ?   72.27%           
=======================================
  Files           ?      129           
  Lines           ?    14409           
  Branches        ?        0           
=======================================
  Hits            ?    10414           
  Misses          ?     3219           
  Partials        ?      776

Flag	Coverage Δ
go-implementation	`72.27% <77.70%> (?)`
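The totals in the report can be sanity-checked with a few lines of arithmetic. This is a small sketch using only the numbers shown above; the `coverageReport` type is hypothetical, and it treats partial lines as not covered, so the percentage is hits over all tracked lines.

```go
package main

import "fmt"

// coverageReport (hypothetical) holds the line counts from the summary above.
type coverageReport struct {
	Hits, Misses, Partials, Lines int
}

// consistent reports whether the three buckets add up to the line total.
func (r coverageReport) consistent() bool {
	return r.Hits+r.Misses+r.Partials == r.Lines
}

// percent returns hits as a percentage of all tracked lines.
func (r coverageReport) percent() float64 {
	if r.Lines == 0 {
		return 0
	}
	return 100 * float64(r.Hits) / float64(r.Lines)
}

func main() {
	r := coverageReport{Hits: 10414, Misses: 3219, Partials: 776, Lines: 14409}
	// Prints: consistent=true coverage=72.27%
	fmt.Printf("consistent=%v coverage=%.2f%%\n", r.consistent(), r.percent())
}
```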

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

spboyer requested a review from chlowell as a code owner March 5, 2026 01:35

Copilot AI review requested due to automatic review settings March 5, 2026 01:35

spboyer requested a review from richardpark-msft as a code owner March 5, 2026 01:35

spboyer self-assigned this Mar 5, 2026

github-actions bot enabled auto-merge (squash) March 5, 2026 01:35

Copilot AI reviewed Mar 5, 2026

spboyer commented Mar 5, 2026

spboyer force-pushed the squad/80-trigger-heuristic-grader branch from 7d59e0e to 671e8a9 Compare March 5, 2026 15:18

spboyer force-pushed the squad/80-trigger-heuristic-grader branch from 671e8a9 to 9fec0b1 Compare March 5, 2026 17:12

Copilot AI review requested due to automatic review settings March 5, 2026 17:12

Copilot AI reviewed Mar 5, 2026

feat: add trigger heuristic grader microsoft#80

3f422a9

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

spboyer force-pushed the squad/80-trigger-heuristic-grader branch from 9fec0b1 to 3f422a9 Compare March 5, 2026 17:46

chlowell pushed a commit to chlowell/waza that referenced this pull request Mar 5, 2026

feat: create waza SKILL.md (microsoft#52) (microsoft#90)

1e995e3

spboyer wants to merge 1 commit into microsoft:main from spboyer:squad/80-trigger-heuristic-grader
