PAIDEIA-codex 26.04.24
26.04.24 update
Codex edition reaches study-graph parity
PAIDEIA-codex was born as a sibling of the Claude Code edition, not a fork. The user interface is different (Codex CLI, AGENTS.md instead of CLAUDE.md, a stdio MCP server instead of inline scripts), but the on-disk artifact is supposed to be byte-for-byte the same: course-index/patterns.md, errors/log.md, weakmap/weakmap_<ts>.md, cheatsheet/final.md. You should be able to fork a course folder out of the Claude edition into the Codex edition — or vice versa — and have the new runner pick up without friction.
For that to actually hold, the code that reads those artifacts has to agree on what they mean. In 0.1.x the Codex edition quietly disagreed on three points. paideia-mcp.course_phase treated quizzes/*.md as evidence of drilling even when no problem had been solved. $paideia-blind wrote a parallel YAML schema (pattern_missed_initial: / strategy_error_type:) that every downstream reader silently skipped. And the answer-PDF lifecycle ended at "OCR succeeded": the original scan kept living in answers/ forever, so the "most recently modified in answers/" heuristic inside $paideia-grade re-picked yesterday's file when the student uploaded a newer one today.
This release closes those disagreements. The phase ladder now advances only when the student has actually graded something. The error log has one shape across every writer — grade, blind, and any future drill. Graded scans are archived after OCR, so the next run sees the next file.
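To make the schema disagreement concrete, here is a minimal sketch of a reader that keys on `pattern:`, using the regex quoted in the technical notes. The function name `top_misses` and the two sample entries are hypothetical; only the key names come from these notes, and the real log layout may differ.

```python
import re
from collections import Counter

# Regex taken from the technical notes, with a capture group added
# (an assumption) so the pattern ID can be counted.
PATTERN_RX = re.compile(r"^\s*pattern:\s*(P\d+)", re.MULTILINE)


def top_misses(log_text: str) -> Counter:
    """Count pattern IDs the way a consumer keyed on `pattern:` would."""
    return Counter(PATTERN_RX.findall(log_text))


# Illustrative entries only; the layout around the keys is assumed.
canonical = """\
pattern: P3
error_type: pattern-missed
source: blind/hw3-p2
"""
legacy = """\
pattern_missed_initial: P3
strategy_error_type: pattern-missed
"""

print(top_misses(canonical))  # Counter({'P3': 1})
print(top_misses(legacy))     # Counter()
```

The legacy entry is well-formed YAML, but it never matches `pattern:`, so a counter built this way reports nothing for it. That is the "present on disk but invisible" failure in miniature.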
What this fixes
The richest cases are the ones where a course folder would look correct on disk but the Codex edition interpreted it differently from the Claude one:
- Audit a seeded patterns file. Dropping a `course-index/patterns.md` from last semester's fork no longer flips the phase to `drill`. `paideia-mcp.course_phase` now gates `drill` on at least one graded `pattern:` entry in `errors/log.md`. An artifact that was never acted on is not the same signal as one the student produced, so it doesn't move the phase forward.
- Audit a `$paideia-blind` error. Running `$paideia-blind hw3-p2` and failing on the pattern axis now writes the same `pattern:` / `error_type:` / `source:` keys that `$paideia-grade` writes. The blind entry appears in the next `$paideia-weakmap` without manual edits, and `paideia-mcp.course_phase`'s top-miss counter picks it up immediately. In 0.1.x those entries were present on disk but invisible to every consumer, because every consumer pattern-matched on `pattern:` and blind wrote `pattern_missed_initial:`.
- Audit a mock-exam phase transition. `mock` fires only when an `errors/log.md` entry has a `source:` containing `mock`, i.e., a mock was actually graded. Seeding an empty `mock/<ts>.md` no longer advances the phase, which used to let the display race ahead of the student's real progress.
- Audit a second scan of the same assignment. After `$paideia-grade answers/hw3.pdf` succeeds, the MCP moves `answers/hw3.pdf` into `answers/_archive/hw3_<ts>.pdf`. Re-running `$paideia-grade` with no positional argument no longer re-picks the stale `hw3.pdf`; it picks the genuinely most recent file. The converted `answers/converted/hw3.md` stays put and is version-controlled; only the bulky scan is archived.
- Audit a cross-course Qwen3-VL prompt. The `qwen3-vl` engine used to inject the phrase "math / physics course" into every page prompt regardless of course. It now reads `COURSE_NAME` from `.course-meta` and substitutes it, so a Complex Analysis folder no longer transcribes under a generic framing. The default fallback is unchanged ("math / physics") for folders without a `.course-meta`.
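The course-name substitution in the last item can be sketched as follows. The names `PROMPT_TEMPLATE` and `build_prompt`, the `{course}` placeholder, and the "math / physics" fallback come from these notes; the template wording and the `DEFAULT_COURSE` constant are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical wording; the real prompt text is not quoted in these notes.
PROMPT_TEMPLATE = (
    "Transcribe this handwritten page from a {course} course into Markdown."
)
DEFAULT_COURSE = "math / physics"  # fallback named in the notes


def build_prompt(course_name: Optional[str] = None) -> str:
    """Fill the course placeholder, falling back when no name is available."""
    course = (course_name or "").strip() or DEFAULT_COURSE
    return PROMPT_TEMPLATE.format(course=course)


print(build_prompt("Complex Analysis"))
print(build_prompt(None))
```

Treating a blank or whitespace-only name the same as a missing one is a design choice of this sketch, not something the notes specify.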
How it is used
Existing Codex users update by pulling main. Course folders require no migration: the phase detector reads errors/log.md the same way it always did, and nothing in 0.1.x could have written the legacy blind schema into a folder that also had graded entries (the legacy keys only appeared in blind-only folders). A fresh $paideia-init-course in a new course folder seeds the canonical schema directly via bootstrap.py's updated ERRORS_LOG_SEED comment.
In a fresh session:
- `$paideia-phase` computes `setup → diag → drill → mock → cram → cool` from three orthogonal signals: artifact-exists, has-graded-entry, mock-was-graded. Create a `patterns.md` but never quiz: `diag`. Grade one problem: `drill`. Grade a mock: `mock`. Drop a `cheatsheet/final.md`: `cram`. Delete them back: the phase regresses. Unlike 0.1.x, the display tracks what the student did, not what the filesystem declares.
- `$paideia-grade` returns an extra `archived_to` field in its `paideia-mcp.grade_pdf` response. The skill can surface the archive path in its closing line if the user cares; skipping the line is fine too, since the archive happens regardless.
- `$paideia-blind` writes one schema, and that schema is literally the canonical one from `paideia-grade/SKILL.md` §6. A student migrating from 0.1.x sees their new blind entries participate in the weakmap immediately, with no manual edits and no schema translation pass.
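The activity-based ladder described in the first item can be approximated as below. This is a sketch under stated assumptions, not the shipped `phase.py`: the helper names and the `pattern:` regex follow the technical notes, but the `_SOURCE_RX` definition, the `_read_log` helper, and the precedence order of the checks are guesses, and the `cool` rung is left out because these notes do not say what triggers it.

```python
import re
from pathlib import Path

_PATTERN_RX = re.compile(r"^\s*pattern:\s*P\d+", re.MULTILINE)
# Assumed shape: capture the value of each `source:` line.
_SOURCE_RX = re.compile(r"^\s*source:[ \t]*(.*)$", re.MULTILINE)


def _read_log(root: Path) -> str:
    """Hypothetical helper: return errors/log.md, or "" when it is absent."""
    log = root / "errors" / "log.md"
    return log.read_text(encoding="utf-8") if log.is_file() else ""


def _has_error_entries(root: Path) -> bool:
    return bool(_PATTERN_RX.search(_read_log(root)))


def _mock_was_graded(root: Path) -> bool:
    return any("mock" in value for value in _SOURCE_RX.findall(_read_log(root)))


def detect_phase(root: Path) -> str:
    """Check the highest rung first (assumed order); `cool` is not modeled."""
    if (root / "cheatsheet" / "final.md").is_file():
        return "cram"
    if _mock_was_graded(root):
        return "mock"
    if _has_error_entries(root):
        return "drill"
    if (root / "course-index" / "patterns.md").is_file():
        return "diag"
    return "setup"
```

Because every call re-reads the disk, deleting a graded entry or the cheatsheet makes the next call return a lower rung, which matches the regression behavior described above. An empty `mock/<ts>.md` has no effect here, since nothing in the sketch looks at the `mock/` directory.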
The phase tool owns correctness, not freshness
Claude Code's statusline.py re-renders the phase display on every prompt, which is why that edition introduced an mtime-indexed disk cache: the display has to be cheap enough to recompute hundreds of times per session, and the cache is the only way to keep that bounded on mature course folders. Codex CLI has no persistent statusline slot. The phase is surfaced on demand via $paideia-phase — which calls the MCP tool exactly once per invocation — or by other skills that want to know where the student is in the cycle (also once per call).
Because of that, paideia-mcp.course_phase has no cache layer and does not need one. Freshness is guaranteed by the fact that the tool re-reads errors/log.md, cheatsheet/, mock/, and .course-meta every time it runs. This is the right trade-off under Codex's semantics: the caller decided this was the moment to ask, so we read the disk. Adding a cache would save wall time only if the same skill called the tool repeatedly within a session without the course folder changing — which is not a pattern any existing skill exhibits.
This is also why the "mtime-indexed cache" patch from the Claude edition's 0.6.0 release is explicitly not ported: there is nothing here it would make faster, and introducing it would add an invalidation surface with no compensating win.
What this does not change
The three OCR engines (codex-native / qwen3-vl / tesseract), the ingest_pdfs / grade_pdf / build_course_index / course_phase MCP tool boundaries, the markdown contents of answers/converted/, the 15 skills, the coloring of $paideia-phase's output, and the .mcp.json wiring all behave identically to 0.1.x. A course folder set up under the old version continues to work without migration.
The codex-native engine is also unaffected: its prompt lives in paideia-grade/SKILL.md Step 2a, not in paideia_mcp/ocr/qwen3vl.py, so the course-name parameterization is specific to the Ollama path. Users who stay on the default engine see no prompt-level difference.
The Claude-edition SessionStart hook is intentionally not ported. Codex has no session-start hook mechanism: AGENTS.md is read every turn but it is static markdown, so it cannot inject live phase or top-miss state the way a Python script can. The user-invoked $paideia-phase is the Codex-native equivalent, and it has been the recommended first-turn command since the phase tool shipped.
Technical notes
- `plugins/paideia/paideia-mcp/paideia_mcp/phase.py` gains `_has_error_entries` (regex-searches `errors/log.md` for `^\s*pattern:\s*P\d+`) and `_mock_was_graded` (iterates `source:` lines and returns true if any contains `mock`). `detect_phase` is split into single-responsibility branches on those helpers. Module docstring is updated to describe the activity-based ladder so future readers don't re-introduce file-existence gating by accident.
- `plugins/paideia/paideia-mcp/paideia_mcp/ocr/qwen3vl.py` renames `PROMPT` to `PROMPT_TEMPLATE` with a `{course}` placeholder, introduces `build_prompt(course_name)`, and has `transcribe_page` / `transcribe_pages` thread the resolved course name through the Ollama call. `_is_noise_sentence` is factored out of `_dedupe_loops` with a Korean + English hedge-prefix tuple. `_strip_ngram_tail(text, n=5, min_repeats=3)` trims trailing n-gram loops that survive sentence-level dedup. `_WARMUP_TIMEOUT = 60.0` is separated from `_PER_PAGE_TIMEOUT = 1800.0` so a hung Ollama daemon at startup surfaces quickly instead of hiding under the long per-page ceiling.
- `plugins/paideia/paideia-mcp/paideia_mcp/ocr/__init__.py` adds `_resolve_course_name(project_root)`, which reads `.course-meta::COURSE_NAME` via `paideia_mcp.phase.parse_meta`. `run_ocr` plumbs the resolved course name through to `qwen3vl.transcribe_pages`; `tesseract` ignores it (no prompt to parameterize).
- `plugins/paideia/paideia-mcp/paideia_mcp/grade.py` adds `_archive_if_under_answers(pdf, root)`. Fires only when the resolved PDF path is exactly two components deep under `<root>/` and the first component is `answers`, so absolute-path targets outside the course root are left alone and already-archived files under `answers/_archive/` aren't re-archived. Timestamp is UTC `YYYYMMDD-HHMMSSZ`. Idempotent on missing files (returns `None` and leaves the source alone). Both response dicts (`ocr-complete` and `rasterize-only`) gain an `archived_to` field.
- `plugins/paideia/skills/paideia-grade/SKILL.md` §6 is marked canonical (the single source of truth for the `errors/log.md` YAML schema) with a note that downstream readers (`paideia-mcp.course_phase`, `$paideia-phase`, `$paideia-weakmap`) pattern-match on `pattern:` and `source:`, and any drift silently hides entries.
- `paideia-blind/SKILL.md` §7 references this canonical schema and documents the strategy-axis → `error_type` mapping (pattern axis → `pattern-missed`, variable axis → `wrong-variable`, end-form axis → `wrong-end-form`), rather than defining a parallel schema.
- `plugins/paideia/skills/paideia-init-course/scripts/bootstrap.py`'s `ERRORS_LOG_SEED` comment now spells out the canonical six-key schema including `source:` and points at `paideia-grade/SKILL.md` §6. The `GITIGNORE` block is unchanged: `answers/**/*.pdf` already covers `answers/_archive/*.pdf`, so the archive is gitignored without an extra rule.
- `plugins/paideia/.codex-plugin/plugin.json` and `.claude-plugin/marketplace.json` bump from `0.1.0` to `0.2.0`. No new runtime dependencies. No changes to any skill's argument contract or any MCP tool's input shape.
Notes
- The `_archive_if_under_answers` depth check intentionally requires exactly two path components (`answers/<name>.pdf`), not merely "starts with `answers/`". That excludes `answers/_archive/…` (already-archived files should never be re-archived) and `answers/converted/…` (which shouldn't be PDFs anyway, but the check is belt-and-suspenders). Users who manage scans in a deeper subtree, e.g. `answers/scans/hw3.pdf`, are left alone on purpose; the caller chose that layout, so archiving would surprise them.
- The canonical `source:` values are loose on purpose. `answers/converted/<stem>.md`, `blind/<id>`, and `mock/<ts>.md` are the conventions, but `paideia-mcp.course_phase` only substring-matches on the word `mock`. That means any future drill that wants to be treated as a mock for phase purposes simply has to include `mock` in its `source:` value; no phase-detector change is required. If multiple drills want finer-grained phase rules later, the single `_SOURCE_RX` helper in `phase.py` is the right place to extend.
- The Claude-edition `SessionStart` hook is listed as "not ported" rather than "cannot be ported." If Codex CLI ever adds a session-start hook mechanism, the same two-line reminder (phase + top miss with a suggested next command) could plug in trivially, since `paideia-mcp.course_phase` returns everything it would need. Until then, the manual workflow is: open Codex CLI in a course folder, run `$paideia-phase`, then read the suggestion. Three keystrokes instead of zero, but with no statusline slot and no hook to inject, there is no cheaper way.
- Qwen3-VL's Korean hedge list was drawn from observed failure modes on real scans: 잠깐 / 음 / 아 / 근데 / 사실 are the five that showed up most often. The prefix match is literal (not tokenized), which means a sentence starting with a Korean hedge followed by `,` or `.` is dropped, but a hedge buried mid-sentence is left alone, because the surrounding context is usually load-bearing content. Adjustments to the list belong in `_HEDGE_PREFIXES` inside `paideia_mcp/ocr/qwen3vl.py`; the rest of the dedup pipeline is prefix-agnostic.
- Cross-edition migration is now symmetric: a course folder written by the Claude edition reads correctly in the Codex edition, and vice versa. The only artifact that differs between editions is the hook / statusline wiring in `.claude/settings.json`, which the Claude edition's `/paideia:init-course` writes and the Codex edition ignores. A student can switch runners mid-semester without losing history.