book-to-skill v1.2.0 turns the project into an installable Python package and makes chapter detection genuinely multilingual.
Highlights
📦 Installable package + CLI
book-to-skill is now a real book_to_skill package: pip install it, run the book-to-skill console script or python -m book_to_skill, and pull only the extractors you need via extras (epub, pdf, docx, rtf, technical, all). The base install stays dependency-free with stdlib fallbacks, and python3 scripts/extract.py still works unchanged, so existing skill flows keep running.
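The "dependency-free base install with stdlib fallbacks" idea can be illustrated with a small sketch. This is not the package's actual code: `html_to_text` and `_TextCollector` are hypothetical names, and BeautifulSoup is assumed here only as an example of an optional dependency that an extra might pull in.

```python
from html.parser import HTMLParser

try:
    # Optional third-party parser; assumed purely for illustration.
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None


class _TextCollector(HTMLParser):
    """Stdlib fallback: collect the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    """Return the visible text of `markup`, using the optional parser if present."""
    if BeautifulSoup is not None:
        raw = BeautifulSoup(markup, "html.parser").get_text()
    else:
        collector = _TextCollector()
        collector.feed(markup)
        collector.close()
        raw = "".join(collector.parts)
    # Collapse whitespace so both code paths yield the same string.
    return " ".join(raw.split())
```

Either path returns the same normalized text, so callers do not need to know whether the optional dependency is installed.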
🌍 Multilingual chapter detection
- Markdown / AsciiDoc ATX headings (#, ==) detected when no numeric "Chapter N" is present.
- setext / reStructuredText underline headings (Title over === / ---), guarded against thematic breaks, table borders, and YAML front matter.
- French, German, Italian, Dutch chapter words (Chapitre, Kapitel, Capitolo, Hoofdstuk) and umlaut titles (Überblick).
- Full-width Arabic digits in CJK headings (第１章), common in Japanese typesetting.
- Multilingual table-of-contents detection (CN, JP, FR, DE, IT, NL).
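As a rough illustration of the heading rules listed above, here is a minimal sketch. It is an assumption-laden approximation, not the project's detector: `find_headings` and the regex names are hypothetical, the patterns are deliberately simplified, and table-of-contents detection is left out.

```python
import re
import unicodedata

# Chapter words named in the notes above, plus English "Chapter".
CHAPTER_WORD = re.compile(
    r"^\s*(?:Chapter|Chapitre|Kapitel|Capitolo|Hoofdstuk)\s+\d+\b",
    re.IGNORECASE,
)
CJK_CHAPTER = re.compile(r"^\s*第\s*\d+\s*章")
ATX = re.compile(r"^(?:#{1,6}|={1,6})\s+\S")       # "# Title" or "== Title"
UNDERLINE = re.compile(r"^(?:={3,}|-{3,})\s*$")    # setext / reST underline


def find_headings(text):
    """Return (line_index, title) pairs for lines that look like headings."""
    # NFKC folds full-width digits (U+FF10..U+FF19) to ASCII digits.
    lines = [unicodedata.normalize("NFKC", ln) for ln in text.splitlines()]

    # Skip a leading YAML front matter block so its closing "---"
    # is not mistaken for a setext underline.
    start = 0
    if lines and lines[0].strip() == "---":
        for j in range(1, len(lines)):
            if lines[j].strip() == "---":
                start = j + 1
                break

    found = []
    for i in range(start, len(lines)):
        stripped = lines[i].strip()
        if not stripped:
            continue
        if (CHAPTER_WORD.match(lines[i]) or CJK_CHAPTER.match(lines[i])
                or ATX.match(lines[i])):
            found.append((i, stripped))
            continue
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A "---" after a blank line is a thematic break (the blank line is
        # skipped above); rows containing "|" are treated as table content.
        if (UNDERLINE.match(nxt) and not UNDERLINE.match(stripped)
                and "|" not in stripped):
            found.append((i, stripped))
    return found
```

Normalizing before matching keeps the patterns themselves ASCII-only for digits, which is one simple way to accept both half-width and full-width numerals.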
🔎 Diagnosable extraction
Unexpected parser errors are now logged to stderr (extractor name + exception type) instead of vanishing, while the fallback chain still continues. Corrupt files and encoding issues are finally visible.
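The behavior described here, reporting the extractor name and exception type on stderr while still moving on to the next extractor, might look roughly like the following sketch. The function name `extract_with_fallbacks`, the `(name, callable)` shape of the chain, and the log format are all assumptions made for illustration.

```python
import sys


def extract_with_fallbacks(path, extractors):
    """Try each (name, func) extractor in order; return the first non-empty text.

    Unexpected errors are reported on stderr instead of being swallowed,
    and the chain continues with the next extractor.
    """
    for name, func in extractors:
        try:
            text = func(path)
        except Exception as exc:
            # Hypothetical log format: extractor name + exception type.
            print(f"[extract] {name} failed: {type(exc).__name__}", file=sys.stderr)
            continue
        if text:
            return text
    return None
```

A chain such as `[("pdf", broken_parser), ("plain", read_as_text)]` would then print one diagnostic line for the failing parser and still return the plain-text result.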
🔒 Security & CI
CI now runs CodeQL, Bandit (HIGH gate), a Zizmor workflow audit, and grouped Dependabot updates. The test matrix now spans Python 3.9–3.13.
Thanks
Community contributions from @Marcelluxx, @dex0shubham, @RandMelville, @addy790, @yukaina, @wuji-labs, and everyone filing multilingual edge cases.
💖 Sponsor the project
Full changelog: https://github.com/virgiliojr94/book-to-skill/blob/master/CHANGELOG.md