feat(challenge): governance articles for E0008 challenge refactor #99

Merged
klappy merged 1 commit into main from feat/challenge-governance-articles on Apr 17, 2026
klappy commented on Apr 17, 2026

Summary

Eleven governance articles plus an evidence note that set up the oddkit_challenge refactor to mirror the PR #96 encode pattern: governance-driven extraction replacing hardcoded source logic. Governance lands in canon first; the workers/src/orchestrate.ts refactor that extracts against these articles is a follow-up PR. No runtime behavior change here.

The honest framing: the extraction contract is domain-agnostic, but the shipped defaults are not. They are software-engineering-flavored on purpose, labeled as such, with a Domain Adaptation section showing three worked patterns — software engineering (the defaults), thought leadership from books (Tim's pattern), and comparative architectural writing (klappy.dev's pattern). Every other domain is expected to override with its own taxonomy. Custom types are markdown edits, not code changes.

Articles added

Meta governance (odd/challenge-types/)

  • how-to-write-challenge-types.md — extraction contract, Domain Adaptation with three worked patterns, six-step procedure for KB stewards to build their own taxonomy

Software-engineering default challenge types

  • strong-claim.md — definitive statements; maximum pressure
  • proposal.md — future-oriented plans; pressure scales with irreversibility
  • assumption.md — implicit premises; make explicit before they compound
  • observation.md — reports; lightest pressure; declared fallback type

Architectural-writing overlay challenge types (for this repo's writing work)

  • pattern-coinage.md — naming novel patterns; prior-art and precision discipline
  • comparative-positioning.md — positioning against a landscape; freshness and fair characterization
  • principle-extraction.md — elevating heuristics to principles; sample size and scope

Supporting articles (odd/challenge/, coexist across both domains)

  • base-prerequisites.md — three universal checks (evidence, source, confidence)
  • normative-vocabulary.md — RFC 2119 plus architectural load-bearing terms
  • stakes-calibration.md — nine modes spanning software-dev and writing lifecycles; voice-dump mode suppresses all challenge output as invariant

Evidence

  • docs/oddkit/evidence/challenge-governance-articles-commit.md — gauntlet record

Gauntlet run

  • Preflight — surfaced three constraint docs: ai-voice-cliches, author-identity-language, definition-of-done
  • Writing Canon gate per-article — all 11 articles have Summary sections (caught as missing during validate, remediated before commit), blockquotes contain compressed argument, headers pass scan test, metadata references full file paths
  • AI voice clichés audit — clean. Zero hits on formulaic transitions, puffing, overclarification, summary clichés, bold-then-explain. Em-dash density 0.13/line vs precedent how-to-write-encoding-types.md at 0.11/line — same neighborhood
  • Author identity — no translator claims about Klappy in any article
  • Derives-from path audit — every referenced path verified against repo. One broken reference caught and fixed: canon/epistemic-modes.md → canon/definitions/epistemic-modes.md in stakes-calibration.md
  • Session capture (OLDC+H) — 7 artifacts encoded (1D, 4L, 1C, 1H); see evidence note for references

Open risks (flagged during challenge, addressable in follow-ups)

  • Extraction-contract incompleteness — sections I didn't anticipate needing may emerge during implementation; remediated by amending governance articles
  • Detection-pattern overlap noise — operator-mode types and writing-mode types may both fire on ambiguous inputs even with stakes-calibration trimming; tunable via pattern narrowing
  • Domain Adaptation discoverability — Tim and other KB stewards may need more worked examples than the three patterns provide; more patterns can be added as domains are stood up
  • Voice-dump suppression aggressiveness — suppressing all challenge in voice-dump mode is an invariant here; may need revisiting if a critical check should still fire

Meta observation

oddkit_gate returned NOT_READY during the gauntlet because its hardcoded generic prereqs (problem-statement-defined, constraints-reviewed) could not see session state. This is the same failure mode the challenge refactor addresses — gate has its own version of the governance-disconnect problem. Noted honestly rather than papered over. A future PR can extend the governance-driven pattern to gate too.

Next step

workers/src/orchestrate.ts refactor. Mirror PR #96 encode pattern: discoverChallengeTypes() with per-canonUrl cache, fetch-and-parse for the three supporting articles, runChallengeAction() refactored to use extracted governance, runCleanupStorage() extended to clear four new caches, graceful degradation when articles are missing.
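
The caching and degradation behavior described above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the `ChallengeType` shape, the injected `TypeLoader`, the `DEGRADED_DEFAULT` value, and `clearChallengeCaches` are all hypothetical names chosen for the example, and the body of `discoverChallengeTypes` here is an assumption about how a per-canonUrl cache might behave.

```typescript
// Hypothetical shape; the real type definition in orchestrate.ts may differ.
interface ChallengeType {
  slug: string;
  fallback: boolean;
}

// The loader is injected so the sketch does not depend on any fetch API.
type TypeLoader = (canonUrl: string) => Promise<ChallengeType[]>;

// One cache entry per canon URL, so separate knowledge bases never share state.
const typeCache = new Map<string, ChallengeType[]>();

// Assumed built-in default used when governance articles cannot be loaded.
const DEGRADED_DEFAULT: ChallengeType[] = [{ slug: "observation", fallback: true }];

async function discoverChallengeTypes(
  canonUrl: string,
  load: TypeLoader,
): Promise<ChallengeType[]> {
  const cached = typeCache.get(canonUrl);
  if (cached) return cached;
  try {
    const types = await load(canonUrl);
    if (types.length > 0) {
      typeCache.set(canonUrl, types);
      return types;
    }
  } catch {
    // Missing or unparseable articles fall through to the degraded default.
  }
  // The degraded result is not cached, so a later call can retry the load.
  return DEGRADED_DEFAULT;
}

// Stand-in for the cache clearing that a cleanup action would perform.
function clearChallengeCaches(): void {
  typeCache.clear();
}
```

Leaving the degraded result out of the cache is a design choice made for this sketch: a transient fetch failure then does not pin a canon URL to the defaults until the next cleanup.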


Note

Low Risk
Low risk because this PR only adds documentation/canon governance articles and an evidence note, with no runtime or UI changes. Risk is limited to future work that will start extracting behavior from these documents.

Overview
Introduces a new governance set for oddkit_challenge (11 new canon articles) that specifies a markdown extraction contract for challenge types, including multi-match semantics and governance-driven fallback routing.

Adds default challenge type articles (e.g. strong-claim, proposal, assumption, observation), writing-oriented overlay types (e.g. pattern-coinage, comparative-positioning, principle-extraction), and three supporting governance docs for universal prerequisites, normative vocabulary, and mode-based stakes calibration.

Includes a docs/oddkit/evidence/challenge-governance-articles-commit.md gauntlet/evidence record for the governance-only commit.

Reviewed by Cursor Bugbot for commit 5d982ae.

Eleven governance articles plus an evidence note, mirroring the PR
#96 encode pattern. Governance lands in canon first; the future
workers/src/orchestrate.ts refactor extracts against live articles.
No runtime behavior change in this PR.

Meta governance
- odd/challenge-types/how-to-write-challenge-types.md — extraction
  contract, Domain Adaptation with three worked patterns (software
  engineering, thought leadership from books, comparative
  architectural writing), six-step procedure for KB stewards

Software-engineering default challenge types (shipped defaults,
labeled as such on purpose)
- odd/challenge-types/strong-claim.md
- odd/challenge-types/proposal.md
- odd/challenge-types/assumption.md
- odd/challenge-types/observation.md (fallback: true)

Architectural-writing overlay challenge types (klappy.dev additions
for comparative-analysis and principle-extraction writing work)
- odd/challenge-types/pattern-coinage.md
- odd/challenge-types/comparative-positioning.md
- odd/challenge-types/principle-extraction.md

Supporting articles (coexist across both domains in klappy.dev canon)
- odd/challenge/base-prerequisites.md
- odd/challenge/normative-vocabulary.md
- odd/challenge/stakes-calibration.md (includes voice-dump mode
  which suppresses all challenge output as invariant)

Evidence
- docs/oddkit/evidence/challenge-governance-articles-commit.md
  captures the gauntlet run: preflight, AI voice cliches audit,
  author-identity check, derives-from path audit, Writing Canon
  gate per-article, session capture reference, open risks.

Gauntlet notes
- Writing Canon gate passed per-article (title, blockquote,
  summary section, headers, no buried claims) after remediation
  of missing Summary sections caught during the validate pass
- AI voice cliches audit clean
- One broken derives_from path caught and fixed
  (canon/epistemic-modes.md -> canon/definitions/epistemic-modes.md
  in stakes-calibration.md)
- oddkit_gate returned NOT_READY due to the same hardcoded-logic
  problem this refactor solves — the gate's generic prereqs cannot
  see session state. Noted honestly; proceeding because the
  prerequisites are materially met and documented.

Co-authored-by: Claude <noreply@anthropic.com>
klappy merged commit a6d45a3 into main on Apr 17, 2026
1 check passed
klappy deleted the feat/challenge-governance-articles branch on April 17, 2026 05:06
klappy added a commit to klappy/oddkit that referenced this pull request Apr 17, 2026
Refactor runChallengeAction in workers/src/orchestrate.ts to extract
challenge-type behavior from canon governance articles at runtime rather
than hardcoding claim-type detection, questions, prerequisites, and
tension rules in source. Structural mirror of PR #96 (encode).

Detection upgraded mid-implementation from regex-OR to BM25 + stemming
after the gauntlet revealed that regex-based matching was morphologically
brittle ("coin" doesn't match trigger "coining"). The pivot removed an
entire class of bug and seeded a reusable pattern for future
governance-driven tools.
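
The coin/coining failure can be reproduced with a small sketch. It assumes the old detection compiled each trigger into a case-insensitive word-boundary regex, and it uses a deliberately naive suffix stripper (`naiveStem`) as a stand-in for whatever stemmer the real BM25 code uses. All function names here are hypothetical.

```typescript
// Escape regex metacharacters so a trigger is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Old-style detection: the trigger must appear verbatim as a whole word.
function regexMatches(trigger: string, input: string): boolean {
  return new RegExp(`\\b${escapeRegExp(trigger)}\\b`, "i").test(input);
}

// Naive stand-in stemmer: strips one common suffix, then a trailing "e".
// A real stemmer (for example Porter) handles far more cases.
function naiveStem(word: string): string {
  let w = word.toLowerCase();
  for (const suffix of ["ing", "ed", "es", "s"]) {
    if (w.length > suffix.length + 2 && w.endsWith(suffix)) {
      w = w.slice(0, -suffix.length);
      break;
    }
  }
  if (w.length > 3 && w.endsWith("e")) w = w.slice(0, -1);
  return w;
}

// Stem-based detection: trigger and input tokens are compared by stem.
function stemMatches(trigger: string, input: string): boolean {
  const target = naiveStem(trigger);
  const tokens = input.toLowerCase().match(/[a-z]+/g) ?? [];
  return tokens.some((token) => naiveStem(token) === target);
}

const input = "I want to coin a term for this";
console.log(regexMatches("coining", input)); // false: "coining" is not in the input
console.log(stemMatches("coining", input)); // true: both reduce to "coin"
```

With stems on both sides, the three regression pairs named in this commit (coin/coining, propose/proposed, principle/principles) collapse to the same token in this sketch.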

Changes in workers/src/orchestrate.ts:
- New: ChallengeTypeDef, BasePrerequisite, NormativeVocabulary,
  StakesModeConfig, StakesCalibration
- New: discoverChallengeTypes (builds per-canonUrl BM25 index over
  detection text), fetchBasePrerequisites, fetchNormativeVocabulary,
  fetchStakesCalibration — each with per-canonUrl cache and graceful
  degradation on missing articles
- New: evaluatePrerequisiteCheck — interprets natural-language check
  strings from prerequisite overlay tables
- Refactored runChallengeAction: multi-match via BM25 score > 0, base
  + overlay prerequisite aggregation, stakes calibration filtering,
  voice-dump suppression invariant, governance-driven tension detection
- Extended runCleanupStorage with five new cache clears (types,
  type-index, base prerequisites, vocabulary, calibration)
- Removed dead detectClaimType (legacy src/tasks/challenge.js retains
  its copy for CLI backward-compat)
- Added CHALLENGE_STOP_WORDS set preserving modal verbs as signal
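
Multi-match via "BM25 score > 0" can be illustrated with a compact scorer. This is a sketch under stated assumptions: the `TypeDoc` shape, the sample detection tokens, and `matchTypes` are hypothetical, and the scorer uses the common BM25 formula with a smoothed, always-positive IDF rather than the exact code in workers/src/bm25.ts.

```typescript
// Hypothetical document: one entry per challenge type, holding detection tokens.
interface TypeDoc {
  id: string;
  tokens: string[];
}

// Standard BM25 with a smoothed IDF that is always positive, so a document
// scores above zero exactly when it shares at least one term with the query.
function bm25Scores(
  docs: TypeDoc[],
  query: string[],
  k1 = 1.2,
  b = 0.75,
): Map<string, number> {
  const n = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.tokens.length, 0) / Math.max(n, 1);
  const scores = new Map<string, number>();
  for (const doc of docs) {
    let score = 0;
    for (const term of Array.from(new Set(query))) {
      const tf = doc.tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docs.filter((d) => d.tokens.includes(term)).length;
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      const norm = tf + k1 * (1 - b + (b * doc.tokens.length) / avgLen);
      score += (idf * tf * (k1 + 1)) / norm;
    }
    scores.set(doc.id, score);
  }
  return scores;
}

// Every type with a positive score matches; the fallback applies only when none do.
function matchTypes(docs: TypeDoc[], query: string[], fallbackId: string): string[] {
  const matched = Array.from(bm25Scores(docs, query).entries())
    .filter(([, score]) => score > 0)
    .sort((x, y) => y[1] - x[1])
    .map(([id]) => id);
  return matched.length > 0 ? matched : [fallbackId];
}

// Sample detection tokens, invented for illustration only.
const docs: TypeDoc[] = [
  { id: "strong-claim", tokens: ["must", "always", "never"] },
  { id: "proposal", tokens: ["should", "propos", "plan"] },
  { id: "observation", tokens: ["notic", "saw"] },
];
console.log(matchTypes(docs, ["we", "must", "plan"], "observation"));
console.log(matchTypes(docs, ["hello"], "observation"));
```

The first call matches both strong-claim and proposal; the second matches nothing and routes to the fallback type.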

Changes in workers/src/bm25.ts (backward-compatible extension):
- tokenize(), buildBM25Index() accept optional stopWords: Set<string>
- BM25Index gains optional stopWords field so searchBM25 tokenizes
  queries consistently with the index
- Default behavior unchanged — existing callers unaffected
- Motivation: default STOP_WORDS filters modals (must, should, shall,
  may, not) which are signal for challenge-type detection
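
A minimal sketch of the optional stop-word parameter, assuming a simple lowercase word tokenizer. The two word sets below are abbreviated, hypothetical stand-ins; only the five modals named above are taken from this commit message.

```typescript
// Abbreviated stand-in for the default list, which filters the modals.
const DEFAULT_STOP_WORDS = new Set([
  "the", "a", "an", "is", "to",
  "must", "should", "shall", "may", "not",
]);

// Stand-in for a challenge-specific list that keeps modal verbs as signal.
const CHALLENGE_STOP_WORDS = new Set(["the", "a", "an", "is", "to"]);

// Callers that omit stopWords get the default behavior unchanged.
function tokenize(text: string, stopWords: Set<string> = DEFAULT_STOP_WORDS): string[] {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return words.filter((word) => !stopWords.has(word));
}

console.log(tokenize("The API must not change"));
// ["api", "change"]
console.log(tokenize("The API must not change", CHALLENGE_STOP_WORDS));
// ["api", "must", "not", "change"]
```

Index and query have to be tokenized with the same set, which is why the index carries its stop words along in the change described above.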

New tests: workers/test/governance-parser.test.mjs — 94 assertions
against live governance articles fetched from klappy.dev raw. Covers
type parsing, fallback resolution, BM25 detection, stemming regression
cases (coin/coining, propose/proposed, principle/principles), multi-
match, and the voice-dump suppression invariant. 94/94 pass.

Bugs the gauntlet caught on this PR:
1. Voice-dump suppression invariant would have shipped broken — the
   calibration cell reads "none (suppress all challenge)" not bare
   "none". A strict-equality parser would have produced a single-element
   array, and voice-dump mode would have surfaced all challenges in prod.
2. Morphological brittleness in regex detection (coin vs coining) —
   triggered the pivot to BM25 + stemming.
3. Default BM25 STOP_WORDS silently breaks strong-claim and proposal
   detection by filtering modal verbs. Fixed via custom stop word set.
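
Bug 1 is easy to reproduce in isolation. The sketch below assumes a calibration cell is either "none" or a comma-separated list; both parser functions are hypothetical illustrations, not the code that shipped.

```typescript
// Split a cell into trimmed, non-empty entries.
function splitCell(value: string): string[] {
  return value
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Strict version: only the bare word "none" means "suppress everything".
function parseCellStrict(cell: string): string[] {
  const value = cell.trim();
  if (value === "none") return [];
  return splitCell(value);
}

// Tolerant version: drop a trailing parenthetical note before comparing.
function parseCellTolerant(cell: string): string[] {
  const value = cell.trim().replace(/\s*\(.*\)\s*$/, "");
  if (value.toLowerCase() === "none") return [];
  return splitCell(value);
}

const cell = "none (suppress all challenge)";
console.log(parseCellStrict(cell)); // ["none (suppress all challenge)"]
console.log(parseCellTolerant(cell)); // []
```

The strict parser returns one bogus entry instead of an empty list, so a caller that checks for "nothing allowed" would not see suppression.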

Verification:
- npm run typecheck: clean
- tests/smoke.sh: 6/6 pass (legacy CLI path — backward compat preserved)
- workers/test/governance-parser.test.mjs: 94/94 pass
- AI voice clichés audit on new comments: clean
- oddkit_preflight, challenge, gate, validate: all run; gate NOT_READY
  due to same hardcoded-logic gap as challenge pre-refactor (flagged as
  follow-up)

Response shape change: adds mode, matched_types, type_definitions,
block_until_addressed; removes claim_type. Consumed programmatically,
not rendered.
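
For programmatic consumers, a presence check is one way to tell the new shape from the old. This sketch only relies on the field names listed above; the field value types are not stated in this commit message, so the guard deliberately does not assume them, and the function name and sample values are hypothetical.

```typescript
// Field names added by the refactor, as listed in the commit message.
const NEW_FIELDS = [
  "mode",
  "matched_types",
  "type_definitions",
  "block_until_addressed",
] as const;

// True when a value carries all new fields and no longer carries claim_type.
function isRefactoredChallengeResponse(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return NEW_FIELDS.every((key) => key in record) && !("claim_type" in record);
}

// Sample payloads with placeholder values, invented for illustration.
const legacy = { claim_type: "strong-claim" };
const current = {
  mode: "placeholder",
  matched_types: [],
  type_definitions: [],
  block_until_addressed: [],
};
console.log(isRefactoredChallengeResponse(legacy)); // false
console.log(isRefactoredChallengeResponse(current)); // true
```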

Follow-ups flagged:
- Encode parity PR — same regex-OR brittleness in runEncodeAction;
  pattern proven here, port will be near-mechanical
- klappy.dev meta governance PR — "compiles into a case-insensitive
  word-boundary regex" is now stale language
- Gate refactor candidate — same hardcoded-logic shape as challenge pre-refactor

Refs:
- Depends on: klappy/klappy.dev#99 (governance articles this code reads)
- Structural mirror: #96 (governance-driven encode)
- Evidence: docs/oddkit/evidence/challenge-governance-code-refactor.md