feat(core): switch skills to progressive-disclosure model routing #101
Conversation
Findings
- [Major] Disabled/provider-gated skills are no longer filtered before system-prompt injection — this can inject skills that explicitly set `disable_model_invocation: true` or provider restrictions, changing behavior and violating skill metadata contracts. Evidence: `packages/core/src/index.ts:389`, `packages/providers/src/skill-injector.ts:138`.
Suggested fix:

```ts
// packages/core/src/index.ts
import { filterActive, formatSkillsForPrompt } from '@open-codesign/providers';

async function collectAllSkillBlobs(
  providerId: string,
  log: CoreLogger,
): Promise<{ blobs: string[]; warnings: string[] }> {
  // ...load skills...
  const active = filterActive(skills, providerId);
  const blobs = formatSkillsForPrompt(active);
  // ...
}

// in generate()
const skillResult = input.systemPrompt
  ? { blobs: [], warnings: [] }
  : await collectAllSkillBlobs(input.model.provider, log);
```
Summary
- Review mode: initial
- 1 major issue found in the latest diff.
- `docs/VISION.md` and `docs/PRINCIPLES.md` were required by the workflow but are not present in this checkout (not found in `repo/docs`).
Testing
- Not run (automation): `pnpm` is not available in this runner, so Vitest suites could not be executed.
open-codesign Bot
```diff
  }
- const matched = matchSkillsToPrompt(skills, userPrompt);
- return { blobs: matched.map(formatSkillBlob), warnings: [] };
+ const blobs = formatSkillsForPrompt(skills);
```
[Major] `formatSkillsForPrompt(skills)` bypasses skill activation gating. Please filter with `filterActive(skills, input.model.provider)` before formatting, otherwise `disable_model_invocation` / provider-restricted skills can still be injected into the system prompt.
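For illustration, a minimal sketch of the gating being requested — the `Skill` shape and both helper signatures here are assumptions, not the actual `@open-codesign/providers` API:

```typescript
// Assumed skill shape — the real frontmatter schema may differ.
interface Skill {
  name: string;
  body: string;
  disable_model_invocation?: boolean;
  trigger?: { providers?: string[] }; // '*' means any provider
}

// Keep only skills that may be model-invoked on this provider.
function filterActive(skills: Skill[], providerId: string): Skill[] {
  return skills.filter((skill) => {
    if (skill.disable_model_invocation) return false;
    const providers = skill.trigger?.providers ?? ['*'];
    return providers.includes('*') || providers.includes(providerId);
  });
}

// Format surviving skills for system-prompt injection, in canonical order.
function formatSkillsForPrompt(skills: Skill[]): string[] {
  return [...skills]
    .sort((a, b) => a.name.localeCompare(b.name))
    .map((skill) => `## ${skill.name}\n${skill.body}`);
}

const skills: Skill[] = [
  { name: 'a', body: 'A.', disable_model_invocation: true },
  { name: 'b', body: 'B.', trigger: { providers: ['openai'] } },
  { name: 'c', body: 'C.', trigger: { providers: ['*'] } },
];
const blobs = formatSkillsForPrompt(filterActive(skills, 'anthropic'));
```

With this ordering, gating happens once at collection time, so every downstream formatter sees only injectable skills.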
Force-pushed `d8b1e65` to `8676d36`
Addressed Codex [Major]: Two new vitest cases in
Findings
- [Minor] Regression-prone test coverage drop for `injectSkillsIntoMessages` behavior — this PR removes most invariants (prefix scope routing, wildcard provider matching, insertion when no system message, immutability, multi-permutation determinism) while claiming behavior is unchanged; future refactors can now silently break these contracts. Evidence: `packages/providers/src/skill-injector.test.ts:32`, `packages/providers/src/skill-injector.test.ts:35`.
Suggested fix:

```ts
// packages/providers/src/skill-injector.test.ts
it('uses prefix scope to prepend to first user message', () => {
  const prefixSkill = makeSkill('prefix-skill', {
    frontmatter: {
      schemaVersion: 1,
      name: 'prefix-skill',
      description: 'Prefix skill.',
      trigger: { providers: ['*'], scope: 'prefix' },
      disable_model_invocation: false,
      user_invocable: true,
    },
  });
  const result = injectSkillsIntoMessages(BASE_MESSAGES, [prefixSkill], 'anthropic');
  expect(result[0]?.content).toBe('You are a helpful assistant.');
  expect(result[1]?.role).toBe('user');
  expect(result[1]?.content).toContain('Body of prefix-skill.');
});

it('inserts a system message when no system message exists', () => {
  const messages: ChatMessage[] = [{ role: 'user', content: 'Design a landing page.' }];
  const result = injectSkillsIntoMessages(messages, [makeSkill('test-skill')], 'anthropic');
  expect(result[0]?.role).toBe('system');
});
```
Summary
- Review mode: follow-up after new commits
- 1 minor issue found in the latest diff.
- `docs/VISION.md` and `docs/PRINCIPLES.md` were required context but are unavailable in this checkout (not found in `repo/docs`).
Testing
- Not run (automation)
open-codesign Bot
```diff
  // ---------------------------------------------------------------------------
  // Tests
+ // injectSkillsIntoMessages — unchanged behaviour, retained as smoke test
```
`injectSkillsIntoMessages()` is marked as "unchanged behaviour" here, but this diff drops most of its contract tests (prefix scope, no-system insertion, wildcard matching, immutability, permutation stability). Please restore a minimal set of those invariants so future changes can't regress this path silently.
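A minimal sketch of two of those invariants (insertion when no system message exists, input immutability) against an assumed message shape — not the real `injectSkillsIntoMessages` implementation:

```typescript
type Role = 'system' | 'user' | 'assistant';
interface ChatMessage { role: Role; content: string }

// Sketch: append skill blobs to the system message, or insert one if absent,
// always returning a fresh array so callers' inputs are never mutated.
function injectSkillsIntoMessages(messages: ChatMessage[], blobs: string[]): ChatMessage[] {
  if (blobs.length === 0) return messages.slice();
  const skillText = blobs.join('\n\n');
  const sysIndex = messages.findIndex((m) => m.role === 'system');
  if (sysIndex === -1) {
    // No system message: insert one at the front.
    return [{ role: 'system', content: skillText }, ...messages];
  }
  return messages.map((m, i) =>
    i === sysIndex ? { ...m, content: `${m.content}\n\n${skillText}` } : m,
  );
}

const original: ChatMessage[] = [{ role: 'user', content: 'Design a landing page.' }];
const snapshot = JSON.stringify(original);
const result = injectSkillsIntoMessages(original, ['## test-skill\nBody.']);
```

Asserting on `snapshot` after the call is what catches accidental in-place mutation in a future refactor.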
Force-pushed `8676d36` to `1345268`
Addressed Codex [Minor]: restored explicit
Findings
- [Major] Breaking public export classified as minor release — `@open-codesign/providers` removes `matchSkillsToPrompt` from its public surface, which is a source-breaking change for consumers importing that symbol; the changeset still marks this package as `minor`, so downstream updates can break unexpectedly. Evidence: `packages/providers/src/index.ts:235`, `.changeset/skills-progressive-disclosure.md:3`.
Suggested fix:

```md
---
'@open-codesign/core': minor
'@open-codesign/providers': major
---
```
Summary
- Review mode: follow-up after new commits
- 1 major issue found in the latest diff.
- `docs/VISION.md` and `docs/PRINCIPLES.md`: not found in `repo/docs`.
Testing
- Not run (automation)
open-codesign Bot
```diff
  export type { CompleteWithRetryOptions, RetryReason } from './retry';

- export { injectSkillsIntoMessages, matchSkillsToPrompt } from './skill-injector';
+ export { injectSkillsIntoMessages, formatSkillsForPrompt, filterActive } from './skill-injector';
```
`matchSkillsToPrompt` was removed from the public exports here, but the changeset marks `@open-codesign/providers` as `minor`. This is a source-breaking API change for external imports; please either keep a compatibility export for one cycle or bump this package to major in the changeset.
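One hedged option for the one-cycle compatibility export: a thin deprecated alias that preserves the old symbol while routing through the new always-load path. The signatures below are assumptions, not the package's actual API:

```typescript
interface Skill { name: string; body: string }

// New public surface: format every skill, no prompt-dependent matching.
function formatSkillsForPrompt(skills: Skill[]): string[] {
  return skills.map((s) => `## ${s.name}\n${s.body}`);
}

/**
 * @deprecated Keyword matching was replaced by progressive disclosure; this
 * alias ignores the prompt and formats every skill. Remove in the next major.
 */
function matchSkillsToPrompt(skills: Skill[], _userPrompt: string): string[] {
  return formatSkillsForPrompt(skills);
}

// Old call sites keep compiling, now language-independent.
const out = matchSkillsToPrompt([{ name: 'demo', body: 'Body.' }], '设计一个落地页');
```

The trade-off: the alias changes semantics (it no longer filters by prompt), so the deprecation note has to say so explicitly; if silent behavior change is unacceptable, the major bump is the honest route.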
The previous keyword-intersection matcher computed an overlap between trigger keywords extracted from the user prompt and from each skill's description. Because all four builtin skills ship English-only descriptions, no Chinese prompt could ever produce an intersection — every non-English request landed with `skills: 0` in the `build_request` log. PR #88 extended the keyword table with Chinese aliases, but the intersection still required matching tokens on both sides, so the bug persisted.

Replace the matcher with progressive disclosure (level 1+2): every active skill body is unconditionally formatted into the system prompt under an `# Available Skills` header, and the model decides which one applies. This mirrors the pattern Claude Design / Claude Code use and removes the language gate entirely. Level-3 disclosure (lazy-load via SkillTool) is out of scope for v0.x.

- Drop `SKILL_TRIGGER_GROUPS`, `matchSkillsToPrompt`, `extractGroupIds`.
- Export `formatSkillsForPrompt(skills)` — no `userPrompt` argument.
- Rename `collectMatchedSkillBlobs` -> `collectAllSkillBlobs` and add a `step=load_skills.ok` success log alongside the existing fail log.
- `composeSystemPrompt` now wraps skill blobs with a model-friendly header.
- Tests rewritten to assert language-independent loading and the new always-load contract.
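The keyword-intersection failure described above can be reproduced in a few lines — a toy tokenizer and matcher for illustration, not the actual code that was removed:

```typescript
// Split text into word tokens (letters/digits); CJK text without spaces
// collapses into one long token that no English word can ever equal.
function tokens(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []);
}

// The old matcher's core: do the two token sets share any element?
function intersects(a: Set<string>, b: Set<string>): boolean {
  for (const t of a) if (b.has(t)) return true;
  return false;
}

const description = tokens('Generate a landing page layout'); // English-only
const englishPrompt = tokens('Design a landing page for my app');
const chinesePrompt = tokens('为我的应用设计一个落地页');
```

However many Chinese aliases the keyword table gains, `chinesePrompt` still shares no token with an English-only `description`, so the intersection stays empty — which is exactly why always-load sidesteps the problem.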
…ction

Reuse `filterActive()` from `packages/providers` when collecting builtin skill blobs. Without this, skills with `disable_model_invocation: true` and skills restricted to other providers via `trigger.providers` were injected into the system prompt for every `generate()` call, violating the skill contract. Tests cover both gates against the anthropic default model.
Force-pushed `1345268` to `57b61ae`
Addressed Codex [Major]: existing
Findings
- None.
Summary
- Review mode: follow-up after new commits
- No high-confidence issues found in added/modified lines of the latest PR diff.
- `docs/VISION.md` and `docs/PRINCIPLES.md`: not found in `repo/docs`.
- Residual risk: no E2E coverage in this diff proving the Chinese-prompt path in desktop wiring; current coverage is unit-level.
Testing
- Not run (automation)
open-codesign Bot
The bug

`packages/providers/src/skill-injector.ts`'s `matchSkillsToPrompt` computed an intersection between trigger keywords extracted from the user prompt and from each skill's `description`. All four builtin skills (`packages/core/src/skills/builtin/*.md`) ship English-only descriptions, so a Chinese prompt could never produce an intersection, no matter how many Chinese aliases #88 added to the keyword table — both sides of the intersection had to share the same token. Result in production: `skills: 0` in `step=build_request.ok` for every Chinese prompt, verified in `~/Library/Logs/@open-codesign/desktop/main.log`.

The fix — progressive disclosure (level 1+2)
Same pattern Claude Design / Claude Code use. Three levels:
For v0.x scope we collapse levels 1 and 2 into "always load all bodies". 4 builtin skills × ~1k tokens ≈ 4k tokens of overhead per generation — a trivial cost on Claude 4's 200k context, trivial implementation, no per-language gating, no keyword tables.
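Sketched, the always-load composition might look like this — the header wording and the `composeSystemPrompt` signature here are assumptions, not the real API:

```typescript
// Wrap every skill blob under one header that tells the model how to pick;
// with no skills, the base prompt passes through untouched.
function composeSystemPrompt(basePrompt: string, skillBlobs: string[]): string {
  if (skillBlobs.length === 0) return basePrompt;
  return [
    basePrompt,
    '# Available Skills',
    'Each skill below is optional context. Apply the one that matches the request; ignore the rest.',
    ...skillBlobs,
  ].join('\n\n');
}

const prompt = composeSystemPrompt('You are a helpful assistant.', [
  '## landing-page\nBody of landing-page skill.',
]);
```

Because the selection instruction lives in the header rather than in per-language keyword tables, a Chinese prompt and an English prompt hit the identical code path.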
What changed
- Drop `SKILL_TRIGGER_GROUPS`, `matchSkillsToPrompt`, `extractGroupIds` from `skill-injector.ts`.
- Export `formatSkillsForPrompt(skills)` — no `userPrompt` argument, just canonical-sorted blobs.
- Rename `collectMatchedSkillBlobs` -> `collectAllSkillBlobs`. Add the success log `step=load_skills.ok` (only failure was logged before).
- `composeSystemPrompt` now wraps blobs under an explicit `# Available Skills` header that tells the model how to pick.
- Update `generate.test.ts` for the new log.

Token-budget flag (stop-condition triggered)
Measured the new `composeSystemPrompt({ mode: 'create', skills: blobs })` with all four builtin skills loaded: slightly over the ~10k flag in the brief. Not silently truncating — flagging here as the brief instructed. Two paths if it becomes a problem:

- …(`iosStarterTemplate`).

v0.2 follow-up
Implement level-3 progressive disclosure: keep only `name` + `description` in the always-loaded section, expose a `loadSkill(name)` tool, and let the model pull the body only when it commits to using a skill. That recovers the ~10k token budget and scales to many more skills.

PRINCIPLES checks
- … `composeSystemPrompt`.
- -405 lines in this PR (deletes the keyword tables, simplifies the injector).

Test plan
- `pnpm typecheck` clean
- `pnpm lint` clean (existing complexity warnings only)
- `pnpm test` — all 132 vitest tests pass in `core`, 36 in `providers`
- `pnpm --filter @open-codesign/desktop dev`, send Chinese prompt, verify `step=load_skills.ok` and `skills: 4` in `~/Library/Logs/@open-codesign/desktop/main.log`