Releases: kotoshu/kotoshu
v0.3.0 — two-stage resource model
[0.3.0] — 2026-06-27
The two-stage release. Resources are now downloaded explicitly via
Kotoshu.setup(:en); the hot path (correct?, suggest, check) reads
only from cache and raises a typed error when a language is missing instead
of triggering a network download. The CLI adds setup, status, language
auto-detection, SARIF/JSON output, and an interactive auto-setup prompt.
onnxruntime is now a soft dependency, so gem install kotoshu succeeds on
hosts that can't load the native ONNX runtime.
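The two-stage idea can be pictured with a toy, in-memory stand-in. This is only a sketch under assumptions: ToyResourceStore and its block-based setup are hypothetical names invented for illustration, the error class here is a local stand-in, and none of this is Kotoshu's actual implementation.

```ruby
# Illustrative sketch only; every name here is hypothetical.
class ResourceNotSetupError < StandardError
  attr_reader :language

  def initialize(language)
    @language = language
    super("language #{language.inspect} is not set up; run setup first")
  end
end

class ToyResourceStore
  def initialize
    @cache = {} # stands in for the on-disk cache
  end

  # Stage 1: explicit and possibly slow. The block stands in for a download
  # and is skipped when the language is already cached.
  def setup(language)
    @cache[language] ||= yield(language)
    self
  end

  # Predicate: is this language already cached?
  def setup?(language)
    @cache.key?(language)
  end

  # Stage 2: cache-only lookup. It never fetches; a miss raises a typed error.
  def resolve(language)
    @cache.fetch(language) { raise ResourceNotSetupError.new(language) }
  end
end

store = ToyResourceStore.new
store.setup(:en) { |_lang| { words: %w[hello world] } }

store.setup?(:en) # => true
store.setup?(:fr) # => false
```

Calling store.resolve(:fr) at this point raises ResourceNotSetupError rather than fetching anything, which is the behaviour the hot path is described as having.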
Added
- Two-stage resource model (Kotoshu::ResourceManager):
Kotoshu.setup(:en, want: %i[spelling frequency model]) writes into the
cache; Kotoshu::ResourceManager.resolve(language:, want:) is instant and
cache-only, raising ResourceNotSetupError on miss. Kotoshu.setup? is
the predicate for "is this language already cached?". The library never
triggers a surprise download; the CLI prompts the user via AutoSetup.
- SourceRegistry: single source of truth for the three content repos'
URLs and per-repo pins. kotoshu/dictionaries is pinned to the v1
branch; frequency-list-kelly and models-fasttext-onnx are on main.
Override at runtime via KOTOSHU_REPOS_BASE_URL, KOTOSHU_DICTIONARIES_PIN,
KOTOSHU_FREQUENCY_PIN, KOTOSHU_MODELS_PIN.
- XDG Base Directory layout (Kotoshu::Paths): dictionaries, frequency
lists, ONNX models under $XDG_CACHE_HOME/kotoshu/; personal dictionary
and kotoshu.cfg under $XDG_CONFIG_HOME/kotoshu/; audit log under
$XDG_DATA_HOME/kotoshu/audit.log. Override per-axis with
KOTOSHU_CACHE_PATH, KOTOSHU_CONFIG_PATH, KOTOSHU_DATA_PATH.
- Integrity verification:
Kotoshu::Integrity::Manifest (SHA-256) is
fetched per content repo and matched against every download. Mismatches
raise Kotoshu::IntegrityError. Outcomes (verified / unverified / mismatch)
are written to the audit log. Missing manifests degrade gracefully.
- CLI setup command: kotoshu setup LANG [--force] [--no-frequency] [--no-model]
writes the requested resources into the cache with progress reporting.
- CLI status command: kotoshu status [--json] summarises installed
resources, sizes, mtimes, and ONNX runtime availability.
- CLI check --language auto: auto-detects document language via
FastText LID; falls back to the configured default language when detection
is unavailable or the detected language is not set up.
- CLI check --format json|sarif: machine-readable output. SARIF
follows v2.1.0 with kotoshu/spelling rule id, JSON exposes
success/wordCount/errorCount/uniqueErrorCount/errors/source.
- CLI auto-setup prompt: when the hot path raises
ResourceNotSetupError in an interactive session, the user is prompted to
run setup now and the original command is retried on success. Non-TTY,
offline (--offline), and --no-prompt invocations skip the prompt and
surface the error as before.
- Download progress reporting (Kotoshu::Cli::ProgressReporter): TTY
mode renders a determinate/indeterminate progress bar; non-TTY mode prints
a periodic line every 10 MiB. Kotoshu.configuration.download_reporter=
exposes the reporter for programmatic use.
- End-to-end smoke spec (spec/integration/end_to_end_spec.rb) covers
install → setup → correct? → suggest.to_words → check →
setup? predicate → ResourceNotSetupError → idempotent re-setup.
Tagged :network, opted into via NETWORK_TESTS=1.
- CLI format spec (spec/kotoshu/cli/check_format_spec.rb) shells out to
the real kotoshu CLI and asserts JSON / SARIF structure and exit codes.
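The SHA-256 integrity check listed above can be sketched with Ruby's standard Digest library. The manifest shape (a Hash of file name to hex digest), the verify_download helper, the en.dic file name, and the local IntegrityError class are all assumptions made for illustration. How Kotoshu actually stores and matches its manifest is not stated in these notes.

```ruby
require "digest"

# Local stand-in for the error type named in the notes.
class IntegrityError < StandardError; end

# manifest: Hash of file name => expected hex SHA-256, or nil when no
# manifest could be fetched (assumed shape, not Kotoshu's real format).
# Returns :verified or :unverified; raises IntegrityError on mismatch.
def verify_download(name, bytes, manifest)
  expected = manifest && manifest[name]
  # Assumption: a missing manifest or missing entry degrades to :unverified.
  return :unverified if expected.nil?

  actual = Digest::SHA256.hexdigest(bytes)
  return :verified if actual == expected

  raise IntegrityError, "#{name}: expected #{expected}, got #{actual}"
end

payload  = "hello\n"
manifest = { "en.dic" => Digest::SHA256.hexdigest(payload) }

verify_download("en.dic", payload, manifest) # => :verified
verify_download("en.dic", payload, nil)      # => :unverified
```

Passing altered bytes for en.dic raises IntegrityError in this sketch. A real implementation would presumably record the outcome in the audit log before raising.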
Changed
- onnxruntime is a soft dependency. Removed from kotoshu.gemspec.
Kotoshu::Models::OnnxModel soft-requires it at load time and exposes
ONNX_LOADED. When false, semantic methods raise
Kotoshu::Models::OnnxModel::OnnxUnavailable with a caller-friendly
message. KOTOSHU_NO_ONNX=1 forces semantic off even when the gem is
present. The traditional spell-checking path never touches onnxruntime.
- Loading strategy: lib/kotoshu.rb eagerly loads only the facade
dependencies; heavier or optional pieces (ONNX models, interactive CLI,
caches, language detection) are wired through Ruby autoload registered
in their immediate parent namespace.
- Public API: suggest returns a SuggestionSet; call .to_words for
an Array<String>. Kotoshu.check returns a DocumentResult; iterate
errors for WordResult instances with word, position, line,
column, suggestions.
- README quickstart: reflects the two-stage API; documents XDG paths;
marks onnxruntime as optional.
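A soft dependency of this kind is commonly written as a guarded require. The sketch below shows that general pattern, not Kotoshu's code: soft_require and with_onnx are hypothetical helpers, and the top-level ONNX_LOADED and OnnxUnavailable are simplified stand-ins for the namespaced names mentioned above.

```ruby
# Returns true when the library loads, false when it cannot be loaded.
def soft_require(name)
  require name
  true
rescue LoadError
  false
end

# The environment switch wins over gem availability (stand-in constant).
ONNX_LOADED = ENV["KOTOSHU_NO_ONNX"] != "1" && soft_require("onnxruntime")

# Stand-in for the typed error raised by semantic methods.
class OnnxUnavailable < StandardError; end

# Hypothetical guard: run the block only when the optional gem is loaded.
def with_onnx
  unless ONNX_LOADED
    raise OnnxUnavailable, "semantic features need the optional onnxruntime gem"
  end

  yield
end
```

Code paths that never call with_onnx are unaffected by a missing gem, which matches the note that traditional spell checking does not touch onnxruntime.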
Fixed
- gem install kotoshu no longer requires onnxruntime or its native
toolchain.
- Resource resolution no longer triggers downloads from inside the hot path.
- Per-repo pins are honoured: the v1 branch of kotoshu/dictionaries is
fetched instead of main.
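Per-repo pin resolution with environment overrides might look roughly like this. The default pins and environment variable names come from these notes; the pin_for helper, the symbol keys, the "v2" value, and the choice to treat an empty override as unset are assumptions for illustration.

```ruby
# Default pins as described in the notes; the keys are hypothetical labels.
DEFAULT_PINS = {
  dictionaries: "v1",
  frequency: "main",
  models: "main"
}.freeze

PIN_ENV = {
  dictionaries: "KOTOSHU_DICTIONARIES_PIN",
  frequency: "KOTOSHU_FREQUENCY_PIN",
  models: "KOTOSHU_MODELS_PIN"
}.freeze

# env is injectable so the lookup can be exercised without touching ENV.
def pin_for(repo, env = ENV)
  override = env[PIN_ENV.fetch(repo)]
  # Assumption: an empty override counts as "not set".
  (override.nil? || override.empty?) ? DEFAULT_PINS.fetch(repo) : override
end

pin_for(:dictionaries, {})                                     # => "v1"
pin_for(:dictionaries, { "KOTOSHU_DICTIONARIES_PIN" => "v2" }) # => "v2"
```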
Known limitations (carried from 0.1.0, scope reduced)
- Hunspell correctness: compound rules, circumfix, ICONV/OCONV, German ß,
Turkish dotless-i remain partial. See TODO.impl/01-hunspell-correctness.md.
- CJK and RTL: tokenizer, normalizer, and keyboard layouts exist for
supported languages; full CJK/RTL support deferred past 0.3.
See TODO.impl/06-cjk-support.md and TODO.impl/07-rtl-support.md.
- Grammar rules: the rule engine exists; no rule packs are shipped.
See TODO.impl/08-grammar-engine.md.
- Audit log rotation, cache eviction policy, and shell completion are
deferred past 0.3 (T3 TODOs).
Internal
- 9 logical commits on release-0.3 cover the T1 (architectural) and T2
(user-facing) work for this release.
- SourceRegistry, Paths, ResourceManager, ResourceBundle,
SetupResult, Integrity::Manifest, Integrity::AuditLog,
Cli::AutoSetup, Cli::StatusReport, Cli::LanguageResolver,
Cli::ProgressReporter are new model-driven types.
- 73 new specs added (source_registry, end_to_end, check_format,
progress_reporter, language_resolver, status_report, auto_setup).
Contributors
- Ribose Inc.