refactor(skill/verifier): replace regex heuristic with structured toolCalls for coverage check by hijzy · Pull Request #1552 · MemTensor/MemOS

hijzy · 2026-04-27T12:45:41Z

Summary

- Replace the regex-based "command token guessing" in `verifier.ts` with a structured tool-name comparison that uses `trace.toolCalls` as ground truth.
- Add an `extractToolNames()` utility that reads `tc.name` plus the first token of any string `tc.input` from the evidence traces.
- Crystallize prompt v3: inject an `EVIDENCE_TOOLS` whitelist into the prompt and require the LLM to output an explicit `tools: string[]` field constrained to that whitelist.
- The verifier coverage check is now a plain set-containment test, `draft.tools ⊆ evidenceTools`: no regex, no `STOPWORDS`, no `actionBlob` substring search.
- Add `tools: string[]` to `SkillCrystallizationDraft` and `SkillProcedure`; the packager persists the field and renders a "Tools used" section.
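
The extraction and containment steps listed above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `ToolCall` and `Trace` shapes, the whitespace tokenization, and the `uncoveredTools` helper are assumptions made for the example. Only `extractToolNames`, `tc.name`, `tc.input`, `trace.toolCalls`, and `draft.tools` are names taken from the description.

```typescript
// Hypothetical shapes for illustration; the real types in the repo may differ.
interface ToolCall {
  name: string;
  input?: unknown;
}

interface Trace {
  toolCalls?: ToolCall[];
}

// Collect every tool name, plus the first whitespace-delimited token of any
// string input (e.g. "npm" from "npm install left-pad").
function extractToolNames(traces: Trace[]): Set<string> {
  const names = new Set<string>();
  for (const trace of traces) {
    for (const tc of trace.toolCalls ?? []) {
      if (tc.name) names.add(tc.name);
      if (typeof tc.input === "string") {
        const first = tc.input.trim().split(/\s+/)[0];
        if (first) names.add(first);
      }
    }
  }
  return names;
}

// Hypothetical helper: returns the draft tools that have no backing evidence.
// An empty result means draft.tools ⊆ evidenceTools.
function uncoveredTools(draftTools: string[], evidenceTools: Set<string>): string[] {
  return draftTools.filter((tool) => !evidenceTools.has(tool));
}
```

Returning the uncovered names instead of a bare boolean is a choice made for this sketch; it would let a verifier report which declared tools lack evidence.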

Motivation

The third branch of the old regex, `[a-z_]{3,}`, pulled English verbs (install, verify, retry, ...) into the coverage denominator as false positives, while substring matching missed synonyms and CJK text. The result was systematically low coverage scores despite high resonance, which blocked valid skills from passing verification.
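
The false-positive problem can be reproduced in isolation. The snippet below applies only the third branch quoted above to a made-up sentence; the other branches of the old regex and the `STOPWORDS` filter are not reproduced, and `thirdBranchTokens` is a hypothetical name used for this illustration.

```typescript
// Only the third branch named in the PR description is reproduced here.
const THIRD_BRANCH = /[a-z_]{3,}/g;

// Hypothetical helper: every run of 3+ lowercase letters/underscores is a "token".
function thirdBranchTokens(text: string): string[] {
  return text.match(THIRD_BRANCH) ?? [];
}

const tokens = thirdBranchTokens("install the package with npm, then verify and retry");
console.log(tokens);
```

Of the tokens matched in this sentence, only `npm` names an actual command. Words such as `install`, `verify`, and `retry` are ordinary English verbs, yet each would count toward the coverage denominator unless a stopword list happened to filter it out.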

Test plan

- All 41 skill unit tests pass (verifier, crystallize, packager, lifecycle, eligibility, events, evidence, subscriber, integration)
- Verifier tests rewritten with structured `toolCalls` evidence and the `tools` draft field
- Crystallize test verifies `tools` field parsing from the LLM response
- Zero lint errors
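
The kind of `tools` parsing that the crystallize test exercises might look like the sketch below. It assumes the LLM replies with a JSON object carrying a `tools` array; the actual response format and parser in the repo are not shown in this PR description, and `parseDraftTools` is a hypothetical name.

```typescript
// Hypothetical helper: pull a string[] `tools` field out of a JSON LLM response.
// Assumes the response body is a single JSON object (an assumption, not from the PR).
function parseDraftTools(rawResponse: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawResponse);
  } catch {
    return []; // unparseable response: treat as no declared tools
  }
  if (typeof parsed !== "object" || parsed === null) return [];

  const tools = (parsed as { tools?: unknown }).tools;
  if (!Array.isArray(tools)) return [];

  // Keep only non-empty strings, trimmed and de-duplicated.
  const cleaned = tools
    .filter((t): t is string => typeof t === "string")
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
  return Array.from(new Set(cleaned));
}
```

Falling back to an empty list on malformed input is a choice made for this sketch; a real implementation could just as reasonably reject the draft outright.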

- Fix box-drawing alignment for emoji display width
- Fix "Terminated" noise and ANSI escape code leaks
- Suppress npm postinstall noise, add step numbering
- Viewer readiness spinner with actual HTTP check
- Handle launchctl KeepAlive conflict gracefully
- Improve interactive picker with emoji and alignment

…lCalls for coverage check

The old verifier used a regex to guess "command-like tokens" from the draft's natural language text, then searched for them in the evidence text via substring matching. The third regex branch `[a-z_]{3,}` pulled in English verbs (install, verify, retry...) as false positives, while substring search missed synonyms and CJK text, causing systematically low coverage scores despite high resonance.

Replace the entire coverage pipeline with structured tool name comparison:

- New `extractToolNames()` reads `trace.toolCalls[].name` + first token of string `tc.input` (for shell-like tools) as ground truth
- Crystallize prompt v3 injects `EVIDENCE_TOOLS` whitelist and requires LLM to output explicit `tools: string[]` field
- Verifier checks `draft.tools ⊆ evidenceTools` via Set comparison
- Delete `collectCommandTokens`, `STOPWORDS`, `actionBlob` regex logic
- Add `tools: string[]` to `SkillCrystallizationDraft` and `SkillProcedure`
- Packager persists tools and renders "Tools used" in invocation guide

hijzy added 2 commits April 27, 2026 20:31

hijzy merged commit 5143d0c into MemTensor:mem-agent-0424 Apr 27, 2026

hijzy merged 2 commits into MemTensor:mem-agent-0424 from hijzy:feat/verifier-structured-tool-coverage
