Kotobase Rewrite
A full rewrite of the package, its data and its tooling
0.2.7 shipped a flatter database covering JMdict, JMnedict, KanjiDic2, Japanese-only Tatoeba
sentences and the Tanos JLPT lists, distributed through Google Drive.
Added
- New Data Sources → KRADFILE/RADKFILE for kanji to radical decomposition and
  radical search, JmdictFurigana for per-form furigana, KanjiVG for
  stroke-order SVG, and an optional Kanji Alive pronunciation audio pack
- Tatoeba → Now imports the links and English exports as well, aligning
  Japanese sentences with their English translations
- Full JMdict + JMnedict Tag Extraction → part of speech, register
  (slang, colloquial, ...), field, dialect and priority tags that the previous
  subset discarded
- New API Methods On Kotobase → search_kanji, kanji_by_skip, stroke_svg,
  radicals, by_radicals, jlpt_list, names, furigana, audio, audio_bytes,
  save_audio, search_meaning and expand_tags
- New CLI Commands Grouped Into lookup, db and cache → lookup all,
  kanji-find, radicals, jlpt-list, names, meaning, sentences, furigana,
  kanji-svg, audio, cache path/size/clear
- dev + docs Optional-Dependency Extras → ruff / mypy / pre-commit tooling,
  and a shipped py.typed marker
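
To give a feel for what radical search over KRADFILE-style data involves, here is a minimal sketch. It is not Kotobase's implementation: the table below is hand-written toy data rather than a copy of KRADFILE, and `build_radical_index` and `find_by_radicals` are hypothetical helper names whose signatures are assumptions, not the package's `by_radicals` API.

```python
from collections import defaultdict

# Toy kanji -> component table in the spirit of KRADFILE.
# Hand-written for illustration; not copied from the real file.
TOY_DECOMPOSITION = {
    "明": ["日", "月"],
    "朝": ["十", "日", "月"],
    "時": ["日", "土", "寸"],
    "林": ["木"],
}


def build_radical_index(decomposition: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert kanji -> radicals into radical -> set of kanji (RADKFILE-style)."""
    index: dict[str, set[str]] = defaultdict(set)
    for kanji, radicals in decomposition.items():
        for radical in radicals:
            index[radical].add(kanji)
    return dict(index)


def find_by_radicals(index: dict[str, set[str]], radicals: list[str]) -> list[str]:
    """Return the kanji that contain every requested radical, sorted by code point."""
    wanted = list(radicals)
    if not wanted:
        return []
    # Start from the first radical's kanji and intersect with each further one.
    candidates = set(index.get(wanted[0], set()))
    for radical in wanted[1:]:
        candidates &= index.get(radical, set())
    return sorted(candidates)
```

Building the inverted index once and intersecting sets keeps each query proportional to the number of matching kanji; a database-backed version would presumably express the same intersection in SQL.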
Changed
- The CLI is rebuilt on Typer and Rich, with panelled output and --json on
  query commands. The entry point moved from kotobase.cli:main to
  kotobase.cli:app
- The database is distributed through GitHub Releases as zstandard-compressed
  assets, rebuilt weekly, replacing the Google Drive distribution
- The schema is normalized (child tables and a JSON column for read-only tag
  blobs) instead of the previous flat tables with delimited-string columns, and
  the build streams the raw EDRDG and Tatoeba sources straight into SQLite
- Reads go through a thread-safe, read-only engine and return immutable,
  serializable data objects built with from_orm classmethods
- The package is consolidated under db/ (connection, dtos, repos, uow,
  models, builder). The old core/, repos/ and db_builder/ packages and
  db/database.py were restructured into it
- The minimum Python version is raised to 3.10, with a modernized
  pyproject.toml (full metadata and classifiers, including Typing :: Typed)
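
The combination of a normalized schema, read-only access and immutable result objects can be sketched with the standard library alone. This is an illustration under assumptions, not the package's code: the `entry` and `gloss` tables, the `EntryDTO` class and every function name are hypothetical, plain `sqlite3` stands in for whatever engine the package uses, `from_rows` stands in for the `from_orm` classmethods, and thread-safety is not addressed here.

```python
import sqlite3
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class EntryDTO:
    """Immutable result object (hypothetical shape)."""
    entry_id: int
    headword: str
    glosses: tuple[str, ...]

    @classmethod
    def from_rows(cls, entry_row, gloss_rows) -> "EntryDTO":
        # Build the DTO from one parent row plus its ordered child rows.
        return cls(entry_row[0], entry_row[1], tuple(g[0] for g in gloss_rows))


def build_demo_db(path: Path) -> None:
    """Create a tiny normalized database: glosses live in a child table."""
    con = sqlite3.connect(path)
    con.executescript("""
        CREATE TABLE entry (id INTEGER PRIMARY KEY, headword TEXT NOT NULL);
        CREATE TABLE gloss (
            entry_id INTEGER NOT NULL REFERENCES entry(id),
            position INTEGER NOT NULL,
            text     TEXT NOT NULL
        );
    """)
    con.execute("INSERT INTO entry VALUES (1, '猫')")
    con.executemany("INSERT INTO gloss VALUES (1, ?, ?)", [(0, "cat"), (1, "feline")])
    con.commit()
    con.close()


def open_read_only(path: Path) -> sqlite3.Connection:
    """Open the file through an SQLite URI with mode=ro so writes are rejected."""
    return sqlite3.connect(f"{path.as_uri()}?mode=ro", uri=True)


def lookup(con: sqlite3.Connection, headword: str) -> Optional[EntryDTO]:
    row = con.execute(
        "SELECT id, headword FROM entry WHERE headword = ?", (headword,)
    ).fetchone()
    if row is None:
        return None
    glosses = con.execute(
        "SELECT text FROM gloss WHERE entry_id = ? ORDER BY position", (row[0],)
    ).fetchall()
    return EntryDTO.from_rows(row, glosses)


def demo():
    """Return the looked-up entry and whether a write attempt was blocked."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo.db"
        build_demo_db(path)
        con = open_read_only(path)
        try:
            entry = lookup(con, "猫")
            try:
                con.execute("INSERT INTO entry VALUES (2, '犬')")
                write_blocked = False
            except sqlite3.OperationalError:
                write_blocked = True
        finally:
            con.close()
    return entry, write_blocked
```

Compared with a delimited-string column, the child table lets each gloss be queried and ordered on its own, and the frozen dataclass means a caller cannot mutate a result after it has been returned.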
Removed
- The Google Drive distribution and the gdown dependency
- The alembic dependency. The compiled database now records its format in a
  db_meta schema version instead of migrations
- The click dependency, replaced by Typer
- MANIFEST.in, replaced by declarative package data
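
A schema-version stamp can replace migrations when the database is a rebuilt, read-only artifact: the client only needs to check that the file matches the format it understands. The sketch below shows one way this might look. The key/value layout of `db_meta`, the `schema_version` key, the expected version number and all function names are assumptions made for illustration, not the package's actual layout.

```python
import sqlite3

EXPECTED_SCHEMA_VERSION = 3  # hypothetical value for this sketch


class SchemaMismatch(RuntimeError):
    """Raised when the database format differs from what the client expects."""


def read_schema_version(con: sqlite3.Connection) -> int:
    row = con.execute(
        "SELECT value FROM db_meta WHERE key = 'schema_version'"
    ).fetchone()
    if row is None:
        raise SchemaMismatch("db_meta has no schema_version row")
    return int(row[0])


def check_compatible(con: sqlite3.Connection,
                     expected: int = EXPECTED_SCHEMA_VERSION) -> None:
    """Fail fast instead of migrating: a mismatched file should be re-downloaded."""
    found = read_schema_version(con)
    if found != expected:
        raise SchemaMismatch(
            f"database schema version {found}, client expects {expected}"
        )


def make_demo_db(version: int) -> sqlite3.Connection:
    """In-memory stand-in for a compiled database stamped with a version."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE db_meta (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
    con.execute("INSERT INTO db_meta VALUES ('schema_version', ?)", (str(version),))
    return con
```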