History / Data Model Data Sources

Revisions

  • docs: update ORACC coverage, scale numbers, and quality notes
    - Data-Sources: rewrite ORACC section to reflect ~115 integrated projects, correct download URL format, update all row counts (416k lemmatizations, 834k norms, 220k glossary entries, 35k credits, 346k lexical lemmas), note ETCSL as wired-but-empty, remove stale 13-project table
    - Engineer-Onboarding: update Current Scale table with post-ingest numbers, correct lexical-glossaries connector description to ~115 projects/14 langs
    - Data-Quality: replace stale '309k tokens / 86k lemmatizations' issue row with accurate dead-letter note; update lemmatization coverage stat to 416k

    @wittkensis committed May 19, 2026
  • port docs/ from main repo to wiki
    Ports 54 markdown files from the main repo's docs/ tree to flat wiki pages. Adds Home, _Sidebar, _Footer for navigation. YAML frontmatter stripped on ingest; internal cross-doc links rewritten to bare wiki page names.

    @wittkensis committed May 18, 2026