feat(core): progressive disclosure for system prompt by user-prompt keywords #109

Merged
hqhq1025 merged 2 commits into main from wt/progressive-prompt
Apr 19, 2026

@hqhq1025
Collaborator

Summary

The full create-mode system prompt was ~41 KB / ~10k tokens. That hard-fails small-context models (e.g. minimax-m2.5:free at 8k ctx) and dilutes attention on every other model so most directives get ignored. This PR splits the prompt into layers and emits only the layers that match the user's prompt keywords.

composeSystemPrompt() gains an optional userPrompt field. When provided in create mode, the composer assembles:

  • Layer 1 (always, ~12 KB): identity, workflow, output-rules, design-methodology, pre-flight, editmode-protocol, safety, plus a NEW condensed antiSlopDigest (forbidden-list bullets only, ~1.5 KB).
  • Layer 2 (keyword-matched):
    • dashboard / chart / 看板 ("dashboard") → +chartRendering + craftDirectives "Dashboard ambient signals"
    • mobile / iOS / 移动端 ("mobile") → +iosStarterTemplate
    • landing / marketing / 案例 ("case study") → +craftDirectives single-page-ladder, big-numbers, customer-quotes
    • logo / 品牌 ("brand") → +craftDirectives "Logos and brand marks"
    • no match → fall back to FULL craftDirectives (better safe than sorry)
  • Layer 3 (deferred): retry-on-quality-fail injection of full ANTI_SLOP + ARTIFACT_TYPES. TODO comment in code.

Back-compat: userPrompt undefined OR mode tweak/revise → byte-identical to the pre-PR prompt.
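
The layering and back-compat rules above can be sketched as follows. This is a minimal illustration, not the code in packages/core/src/prompts/: it reuses the function names mentioned in this PR (composeSystemPrompt, composeFull, composeCreateProgressive, planKeywordMatches), but the bodies, the ROUTES table, the simplified patterns, the section order, and the bracketed placeholder section text are all assumptions.

```typescript
type Mode = 'create' | 'tweak' | 'revise';

interface ComposeInput {
  mode: Mode;
  userPrompt?: string;
}

// Section names stand in for the real section bodies, which are not shown here.
const CORE_SECTIONS: readonly string[] = [
  'identity', 'workflow', 'output-rules', 'design-methodology',
  'pre-flight', 'editmode-protocol', 'safety',
];
const FULL_CRAFT = 'craft-directives (full)';

// Hypothetical routing table, simplified from the keyword list in the summary.
const ROUTES: ReadonlyArray<{ pattern: RegExp; sections: readonly string[] }> = [
  { pattern: /\b(dashboard|chart)s?\b|看板/i,
    sections: ['chart-rendering', 'craft: dashboard ambient signals'] },
  { pattern: /\b(mobile|ios)\b|移动端/i,
    sections: ['ios-starter-template'] },
  { pattern: /\b(landing|marketing)\b|案例/i,
    sections: ['craft: single-page-ladder', 'craft: big-numbers', 'craft: customer-quotes'] },
  { pattern: /\blogos?\b|品牌/i,
    sections: ['craft: logos and brand marks'] },
];

function render(sections: readonly string[]): string {
  return sections.map((name) => `[${name}]`).join('\n\n');
}

// Layer 2 plan: sections for every matching bucket, or the full craft directives.
function planKeywordMatches(userPrompt: string): string[] {
  const planned: string[] = [];
  for (const route of ROUTES) {
    if (route.pattern.test(userPrompt)) planned.push(...route.sections);
  }
  return planned.length > 0 ? planned : [FULL_CRAFT];
}

// Pre-PR behaviour: every section, unconditionally (order is an assumption).
function composeFull(): string {
  return render([...CORE_SECTIONS, 'anti-slop', 'artifact-types', FULL_CRAFT,
    'ios-starter-template', 'chart-rendering']);
}

// Layer 1 (core sections + digest) followed by the keyword-matched Layer 2.
function composeCreateProgressive(userPrompt: string): string {
  return render([...CORE_SECTIONS, 'anti-slop-digest', ...planKeywordMatches(userPrompt)]);
}

function composeSystemPrompt(input: ComposeInput): string {
  // Back-compat: tweak/revise, or a missing userPrompt, yields the full prompt.
  if (input.mode !== 'create' || input.userPrompt === undefined) return composeFull();
  return composeCreateProgressive(input.userPrompt);
}
```

In this sketch, composeSystemPrompt({ mode: 'tweak', userPrompt: 'dashboard' }) returns exactly what composeFull() returns, which mirrors the byte-identity guarantee described above.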

Section sizes addressed (the bloat)

| Section | KB | % of full | Notes |
|---|---|---|---|
| craftDirectives | 6.5 | 15% | Only "dashboard ambient signals" is dashboard-only; rest is broadly useful |
| antiSlop | 6.4 | 14% | Long examples; condensed into antiSlopDigest in Layer 1 |
| artifactTypes | 5.1 | 12% | 8-row table; only one row matters per request |
| iosStarterTemplate | 3.8 | 9% | Only mobile prompts need it |
| chartRendering | 3.3 | 8% | Only dashboard / chart prompts need it |

Measured prompt size

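| Prompt | Chars | % of full |
|---|---|---|
| full (no userPrompt) | 41,327 | 100% |
| 做个数据看板 ("make a data dashboard") | 22,614 | 55% |
| iOS 移动端 onboarding ("iOS mobile onboarding") | 21,744 | 53% |
| indie marketing landing page | 19,815 | 48% |
| 随便做点东西 ("just make something", no keyword) | 24,464 | 59% |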

A regression-guard test asserts that a matched dashboard prompt stays under 25 KB.
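
As a quick check of the arithmetic, the "% of full" column and the size budget can be recomputed from the measured character counts. The sketch below is illustrative only: the helper names are hypothetical, and it assumes "25 KB" can be read as 25,000 characters, which may differ from how the actual test measures size (characters and bytes diverge for CJK text).

```typescript
// Character counts copied from the measurement table in this PR.
const FULL_CHARS = 41327;
const MEASURED: ReadonlyArray<{ prompt: string; chars: number }> = [
  { prompt: '做个数据看板', chars: 22614 },
  { prompt: 'iOS 移动端 onboarding', chars: 21744 },
  { prompt: 'indie marketing landing page', chars: 19815 },
  { prompt: '随便做点东西', chars: 24464 },
];

// Assumption: the 25 KB guard is treated here as 25,000 characters.
const BUDGET_CHARS = 25000;

// Size relative to the full prompt, rounded to a whole percent.
function percentOfFull(chars: number): number {
  return Math.round((chars / FULL_CHARS) * 100);
}

function isWithinBudget(chars: number): boolean {
  return chars < BUDGET_CHARS;
}

for (const row of MEASURED) {
  console.log(row.prompt, `${percentOfFull(row.chars)}%`, isWithinBudget(row.chars));
}
```

Under that assumption, all four progressive prompts fit the budget, while the full 41,327-character prompt does not.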

Tests

Adds 9 new vitest cases (composeSystemPrompt() — progressive disclosure):

  • back-compat byte-identity when userPrompt is omitted
  • Layer 1 always-present invariant across 4 input shapes
  • dashboard prompt → chart-rendering present, iOS absent
  • mobile prompt → iOS present, chart-rendering absent
  • marketing prompt → single-page-ladder + customer-quotes subsections present
  • no-keyword prompt → full craft-directives fallback (verified by 5 distinct subsection headings)
  • 25 KB regression guard for dashboard prompt
  • mode tweak ignores userPrompt
  • mode revise ignores userPrompt

Total @open-codesign/core tests: 158 passing (was 149).

The drift contract for PROMPT_SECTION_FILES is preserved — the new antiSlopDigest constant ships its own anti-slop-digest.v1.txt.

apps/desktop consumer is wired through automatically: generate() now passes input.prompt as userPrompt to the composer.

PRINCIPLES checklist (§5b)

  • Compatibility: green — back-compat preserved when userPrompt is undefined; existing 88 generate.test.ts cases unchanged.
  • Upgradeability: green — pure refactor inside packages/core/src/prompts/; new section follows the same .v1.txt + TS-constant pattern as the others.
  • No bloat: green — no new dependencies; the change removes bytes from the wire on every create call.
  • Elegance: green — pure functions, two-layer composer split (composeFull / composeCreateProgressive), keyword routing extracted into planKeywordMatches.

Test plan

  • pnpm --filter @open-codesign/core test — 158/158 pass
  • pnpm --filter @open-codesign/core typecheck — clean
  • pnpm exec biome check on changed files — clean (cognitive complexity within budget)
  • Manual smoke: run a dashboard prompt through a small-ctx provider and confirm it no longer 400s

…eywords

The full create-mode system prompt is ~41 KB / ~10k tokens. Small-context
models (e.g. minimax-m2.5:free at 8k ctx) hard-fail on it; even
large-context models dilute their attention and ignore most of the
instructions.

composeSystemPrompt() now accepts an optional userPrompt. When provided
in create mode, the composer builds:

- Layer 1 (always, ~12 KB): identity, workflow, output-rules,
  design-methodology, pre-flight, editmode-protocol, safety, plus a new
  condensed antiSlopDigest section (forbidden-list bullets only).
- Layer 2 (keyword-matched): chart-rendering + dashboard-ambient-signals
  for dashboard cues; iOS starter template for mobile cues;
  single-page / big-numbers / customer-quotes craft subsections for
  marketing cues; logos subsection for brand cues. No keyword match →
  fall back to the full craft directives (better safe than sorry).
- Layer 3 (deferred TODO): retry-on-quality-fail injection of full
  ANTI_SLOP + ARTIFACT_TYPES.

Measurements for sample prompts:
  full (no userPrompt): 41,327 chars
  "做个数据看板" ("make a data dashboard"): 22,614 chars (55%)
  "iOS 移动端 onboarding" ("iOS mobile onboarding"): 21,744 chars (53%)
  "indie marketing landing page": 19,815 chars (48%)
  "随便做点东西" ("just make something", no kw): 24,464 chars (59%)

When userPrompt is omitted, or mode is tweak / revise, the output is
byte-identical to before — full back-compat. The drift contract for
PROMPT_SECTION_FILES is preserved (new antiSlopDigest section gets its
own .v1.txt).

Adds 9 vitest cases covering back-compat, layer 1 invariants, each
keyword bucket, no-keyword fallback, the 25 KB regression guard, and
mode tweak/revise pass-through. All 158 core tests pass.

@github-actions github-actions bot left a comment

Findings

  • [Major] Dashboard keyword regex has substring false positives — graph matches paragraph and metric matches asymmetric/biometric, so unrelated prompts can be mis-routed into dashboard/chart instructions. This changes prompt composition unexpectedly and can degrade output quality. Evidence packages/core/src/prompts/index.ts:847
    Suggested fix:

    const KEYWORDS_DASHBOARD =
      /\b(dashboard|chart|graph|plot|visualization|analytics|metric|kpi)s?\b|数据|看板|图表/i;
  • [Minor] Tests cover positive keyword matches but miss false-positive guards for substring collisions introduced by regex routing. Evidence packages/core/src/generate.test.ts:1419
    Suggested fix:

    it('does not trigger dashboard routing on substring collisions', () => {
      const p = composeSystemPrompt({ mode: 'create', userPrompt: 'improve paragraph rhythm and asymmetric spacing' });
      expect(p).not.toContain('Chart rendering contract');
      expect(p).not.toContain('Dashboard ambient signals');
    });

Summary

  • Review mode: initial
  • 2 issues found (1 Major, 1 Minor), both introduced by the regex-based progressive routing change.
  • docs/VISION.md and docs/PRINCIPLES.md: Not found in repo/docs in this checkout.

Testing

  • Not run (automation)

Comment thread packages/core/src/prompts/index.ts Outdated
// ---------------------------------------------------------------------------

const KEYWORDS_DASHBOARD =
/(dashboard|chart|graph|plot|visualization|数据|看板|图表|analytics|metric|KPI)/i;

KEYWORDS_DASHBOARD currently matches substrings (graph in paragraph, metric in asymmetric), which can route unrelated prompts into dashboard mode. Consider word boundaries for Latin terms while keeping CJK terms as-is:

const KEYWORDS_DASHBOARD =
  /\b(dashboard|chart|graph|plot|visualization|analytics|metric|kpi)s?\b|数据|看板|图表/i;
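
To make the substring problem concrete, the sketch below compares the pattern from the original diff with a word-bounded variant. It assumes the suggested pattern keeps the three CJK alternatives from the original (数据, 看板, 图表); the constant names, the sample prompts, and the compare helper are hypothetical. Without the `u` and `v` flags, JavaScript's `\b` treats only ASCII letters, digits, and underscore as word characters, so the CJK terms stay outside the `\b(...)\b` group: a `\b` placed between two CJK characters would never match.

```typescript
// Pattern from the original diff: bare alternation, so substrings match.
const LOOSE_DASHBOARD =
  /(dashboard|chart|graph|plot|visualization|数据|看板|图表|analytics|metric|KPI)/i;

// Word-bounded variant: \b around the Latin terms, CJK terms left unanchored.
const STRICT_DASHBOARD =
  /\b(dashboard|chart|graph|plot|visualization|analytics|metric|kpi)s?\b|数据|看板|图表/i;

// Hypothetical sample prompts: two collisions, two genuine dashboard requests.
const samples: readonly string[] = [
  'improve paragraph rhythm and asymmetric spacing',
  'biometric login screen',
  'show KPIs by region',
  '做个数据看板',
];

function compare(prompt: string): { prompt: string; loose: boolean; strict: boolean } {
  return {
    prompt,
    loose: LOOSE_DASHBOARD.test(prompt),
    strict: STRICT_DASHBOARD.test(prompt),
  };
}

for (const s of samples) {
  console.log(compare(s));
}
```

The first two samples match only the loose pattern (via "graph" in "paragraph" and "metric" in "asymmetric" and "biometric"), while the last two match both.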

}
});

it('dashboard prompt: includes chart rendering, excludes iOS starter', () => {

Please add a negative routing test for substring collisions to prevent regressions (e.g., paragraph/asymmetric should not trigger dashboard routing):

it('does not trigger dashboard routing on substring collisions', () => {
  const p = composeSystemPrompt({ mode: 'create', userPrompt: 'improve paragraph rhythm and asymmetric spacing' });
  expect(p).not.toContain('Chart rendering contract');
  expect(p).not.toContain('Dashboard ambient signals');
});

@github-actions github-actions bot left a comment

Findings

  • None.

Summary

  • Review mode: initial
  • No blocking/major/minor/nit issues found in added/modified lines of this diff.
  • docs/VISION.md and docs/PRINCIPLES.md: Not found in repo/docs.
  • Residual risk: progressive-routing behavior is covered by unit tests in composeSystemPrompt() but this PR does not include provider-integrated smoke coverage.

Testing

  • Not run (automation)

open-codesign Bot

@hqhq1025 hqhq1025 merged commit bb59ee6 into main Apr 19, 2026
5 of 6 checks passed
@hqhq1025 hqhq1025 deleted the wt/progressive-prompt branch April 19, 2026 13:40