feat(cli): add skill quality evaluator (#119) by luongnv89 · Pull Request #154 · luongnv89/asm

luongnv89 · 2026-04-18T07:33:50Z

Summary

Adds asm eval <skill-path> which scores a skill's SKILL.md against seven best-practice categories (structure, description, prompt engineering, context efficiency, safety, testability, naming & conventions) and emits a structured report with an overall 0–100 score plus the top three actionable suggestions.
--fix applies deterministic frontmatter fixes (add missing version, infer effort from body size, canonical key ordering, CRLF normalisation, trailing-whitespace stripping, creator from git config user.name) and creates a SKILL.md.bak backup before writing. --fix --dry-run previews a unified diff without touching disk.
Supports --json and --machine (v1 envelope) outputs for programmatic consumers, matching the shape used by doctor/publish.
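As a rough illustration of how seven category results could roll up into one 0–100 score and three suggestions, here is a minimal sketch. The `CategoryResult` and `Report` shapes, the max-weighted average, and the "weakest category first" ordering are all assumptions made for this example; the actual aggregator in src/evaluator.ts may weight and rank differently.

```typescript
// Hypothetical shapes: not the real types from src/evaluator.ts.
interface CategoryResult {
  name: string;
  score: number; // points earned in this category
  max: number; // points available in this category
  suggestions: string[];
}

interface Report {
  overall: number; // 0-100
  topSuggestions: string[]; // at most three entries
}

// Fraction of available points earned; empty categories count as perfect
// so they never outrank a genuinely weak category.
function ratio(c: CategoryResult): number {
  return c.max > 0 ? c.score / c.max : 1;
}

// Assumption: each category is weighted by its `max`, and suggestions are
// taken from the lowest-scoring categories first.
function aggregate(categories: CategoryResult[]): Report {
  let earned = 0;
  let available = 0;
  for (const c of categories) {
    earned += c.score;
    available += c.max;
  }
  const overall = available > 0 ? Math.round((earned / available) * 100) : 0;

  const ranked = [...categories].sort((a, b) => ratio(a) - ratio(b));
  const topSuggestions: string[] = [];
  for (const c of ranked) {
    for (const s of c.suggestions) {
      if (topSuggestions.length < 3) topSuggestions.push(s);
    }
  }
  return { overall, topSuggestions };
}
```

Ranking by ratio rather than raw score keeps a small category with a poor result from being hidden behind a large category with a mediocre one.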

Changes

New module src/evaluator.ts: category scorers, report aggregator, auto-fix planner, unified diff helper, formatters, machine-envelope helper.
New test file src/evaluator.test.ts: 37 unit tests covering every category scorer, each auto-fixable item in isolation, dry-run, backup creation, idempotency, format helpers, and end-to-end evaluateSkill.
CLI wiring in src/cli.ts: new --fix flag in ParsedArgs, help text, cmdEval dispatcher, switch case, and eval added to the commands array in isCLIMode. doctor is also added to the array; it previously relied on the fallback branch.
CLI integration tests in src/cli.test.ts: eval --help, missing-path error, --json, --machine, --fix --dry-run non-writing behaviour, and --fix backup creation, plus isCLIMode tests for eval and doctor.
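The idempotency property the tests check can be illustrated with two of the text-level fixes (CRLF normalisation and trailing-whitespace stripping). This is a minimal sketch only: `normaliseText`, `planFix`, and `FixPlan` are hypothetical names, and the frontmatter-level fixes, the backup write, and the diff output are left out.

```typescript
// Hypothetical helpers: not the actual API of src/evaluator.ts.
interface FixPlan {
  changed: boolean; // true when the output differs from the input
  output: string; // the text that would be written to SKILL.md
}

// Converts CRLF line endings to LF and strips trailing spaces and tabs
// from every line. Lone "\r" characters are left untouched.
function normaliseText(input: string): string {
  return input
    .replace(/\r\n/g, "\n")
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .join("\n");
}

// Computes the plan without touching disk, so the same function can back
// both a dry run (show the plan) and a real run (write `output`).
function planFix(input: string): FixPlan {
  const output = normaliseText(input);
  return { changed: output !== input, output };
}
```

Running `planFix` on its own output should report `changed: false`, which is the behaviour an idempotency test would assert.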

Schema alignment (intentional)

The issue uses author / type / XS/S/M/L/XL, which don't match the existing SKILL.md schema described in the README and parsed by src/utils/frontmatter.ts. Rather than silently introduce a new schema, the evaluator maps issue terminology to existing conventions:

| Issue wording | Codebase convention |
| --- | --- |
| `author` | `creator` (or `metadata.creator`) |
| top-level `version` | `metadata.version` preferred, `version` fallback (via `resolveVersion`) |
| `XS/S/M/L/XL` | `low/medium/high/max` (README table) |
| `type` | no existing field (deferred; can be added in a follow-up) |

The decision is documented in the module docstring at the top of src/evaluator.ts.
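The lookup rules in the table can be sketched as follows. The `Frontmatter` shape and the `*Sketch` function names are hypothetical; the real `resolveVersion` may have a different signature, and the precedence between `creator` and `metadata.creator` is an assumption, since the table does not state it.

```typescript
// Hypothetical frontmatter shape for illustration only.
interface Frontmatter {
  creator?: string;
  version?: string;
  metadata?: { creator?: string; version?: string };
}

// `metadata.version` is preferred, with top-level `version` as fallback.
function resolveVersionSketch(fm: Frontmatter): string | undefined {
  return fm.metadata?.version ?? fm.version;
}

// Assumption: top-level `creator` wins over `metadata.creator`.
function resolveCreatorSketch(fm: Frontmatter): string | undefined {
  return fm.creator ?? fm.metadata?.creator;
}

// Effort levels follow the README convention, not the issue's XS..XL sizes.
const EFFORT_LEVELS = ["low", "medium", "high", "max"] as const;
type Effort = (typeof EFFORT_LEVELS)[number];

function isEffort(value: string): value is Effort {
  return (EFFORT_LEVELS as readonly string[]).indexOf(value) !== -1;
}
```

With this shape, a value such as `"XL"` is rejected as an effort level rather than silently accepted as a second vocabulary.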

Scope note

The optional "Can be integrated into asm publish as pre-publish quality gate (medium)" item from the issue is deferred. Hooking into the existing publish pipeline would widen the blast radius of this PR (existing publish tests, a behavior change for an already-shipped command). This PR ships eval standalone; publish integration is a clean follow-up.

Testing

bun test src/evaluator.test.ts — 37 pass, 0 fail, 70 expect() calls.
bun test src/cli.test.ts --test-name-pattern "eval" — 7 new CLI integration tests pass.
bun run typecheck — clean.
bunx prettier --check src/evaluator.ts src/evaluator.test.ts src/cli.ts src/cli.test.ts — clean.
bun run build — succeeds (run by the pre-push hook).
bun test tests/e2e/bun-e2e.test.ts — passes (run by the pre-push hook).
Manual smoke tests: ran asm eval against ./skills/hello-world (scored 40/F) and ./skills/skill-index-updater (scored higher) to confirm scoring differentiates real skills rather than being degenerate.

Note on pre-existing test failures

Five unit tests fail locally on both main and this branch because of local environment state: 4 publishSkill > ... tests that depend on git / gh CLI state, and 1 test (CLI integration: import > import existing skills are skipped) that collides with the user's globally installed skills. These failures exist on main prior to this PR, as verified via git stash + bun test. CI on main is green, so the CI sandbox is not affected. This PR does not add or touch any of those tests.

Test plan

CI passes on the branch (unit, typecheck, e2e, build)
asm eval ./skills/hello-world produces a scored report
asm eval ./skills/hello-world --json emits parseable JSON with 7 categories
asm eval <tempdir>/skill --fix --dry-run prints a diff and does not modify SKILL.md
asm eval <tempdir>/skill --fix creates SKILL.md.bak and rewrites the original
asm eval <bogus> exits with code 1 and a helpful error
asm eval with no path prints the usage error and exits with code 2
asm eval --help prints help
asm eval ./skills/hello-world --machine emits a v1 envelope with command: "eval"
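The exit-code expectations in the plan (2 for a missing path argument, 1 for a path that does not resolve) can be sketched as a small pure function. `checkEvalArgs`, the `Outcome` shape, the message strings, and exit code 0 for the success case are all assumptions for illustration; the real cmdEval dispatcher in src/cli.ts is not shown in this PR description.

```typescript
// Hypothetical result shape: not the actual type used in src/cli.ts.
interface Outcome {
  exitCode: number;
  message: string;
}

// `exists` is injected so the check can be exercised without a filesystem.
function checkEvalArgs(
  path: string | undefined,
  exists: (p: string) => boolean,
): Outcome {
  if (path === undefined || path === "") {
    // Usage error: no skill path was given.
    return { exitCode: 2, message: "usage: asm eval <skill-path>" };
  }
  if (!exists(path)) {
    // Runtime error: a path was given but nothing is there.
    return { exitCode: 1, message: `skill path not found: ${path}` };
  }
  // Assumption: a successful evaluation exits with 0.
  return { exitCode: 0, message: "ok" };
}
```

Keeping usage errors and runtime errors on separate codes lets a calling script tell "invoked incorrectly" apart from "invoked correctly, but the skill is missing".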

Closes #119

Adds `asm eval <skill-path>` which scores a skill's SKILL.md against seven best-practice categories (structure, description, prompt engineering, context efficiency, safety, testability, naming) and emits a structured report with an overall 0-100 score plus the top three actionable suggestions.

`--fix` applies deterministic frontmatter fixes (missing version, inferred effort, canonical key ordering, CRLF normalisation, trailing-whitespace stripping, creator from git). `--fix --dry-run` previews a unified diff without writing, and `--fix` on a real run creates a `SKILL.md.bak` before modifying. Supports `--json` and `--machine` for programmatic consumers.

Scope choices documented inline in `src/evaluator.ts`: the issue uses `author`/`type`/`XS/S/M/L/XL` terminology which does not match the existing SKILL.md schema described in the README and `utils/frontmatter.ts` (`creator`, `metadata.version`, `low/medium/high/max`). The evaluator maps to existing conventions instead of silently introducing a schema change, and defers `type` since no codebase field uses it. The optional "integrate with asm publish" bullet from the issue is deferred for a follow-up; this PR ships eval standalone.

Five pre-existing test failures (4 publishSkill gh-CLI flows, 1 import-integration) are environment-specific on `main` and unrelated to this change; CI on main is green.

Closes #119

luongnv89 merged commit d106bc6 into main Apr 18, 2026
10 checks passed

luongnv89 deleted the feat/119-skill-quality-evaluator branch April 18, 2026 08:24

luongnv89 merged 1 commit into main from feat/119-skill-quality-evaluator
