Skip to content

v0.15.0 — Full Audit Closure: 9 New Modules

Choose a tag to compare

@ksanyok ksanyok released this 26 Feb 20:31
· 118 commits to main since this release

What's New

9 New Core Modules

  • ai_backend — Three-tier AI backend: OpenAI API → OSS Gradio model (rate-limited) → built-in rules. New humanize_ai() function.
  • pos_tagger — Rule-based POS tagger for EN (500+ exceptions), RU/UK (200+), DE (300+). Universal tagset.
  • cjk_segmenter — Chinese BiMM (2504 entries), Japanese character-type, Korean space+particle segmentation.
  • syntax_rewriter — 8 sentence-level transforms (active↔passive, clause inversion, enumeration reorder, adverb migration). 150+ irregular verbs.
  • statistical_detector — 35-feature ML classifier for AI text detection. Integrated into detect_ai() with 60/40 weighted merge.
  • word_lm — Word-level unigram/bigram language model for 14 languages. Perplexity, burstiness, naturalness scoring.
  • collocation_engine — PMI-based collocation scoring for context-aware synonym selection. EN ~130, RU ~30, DE ~20 collocations.
  • fingerprint_randomizer — Anti-fingerprint diversification for output variety.
  • benchmark_suite — 6-dimension automated quality benchmarking.

Pipeline & Detection

  • Pipeline expanded to 17 stages (added syntax rewriting + anti-fingerprint diversification)
  • detect_ai() now returns combined_score (statistical + heuristic)
  • Fixed NO-OP _reduce_adjacent_repeats() — now actually removes repetitions

Tests

  • 1,696 tests — 92 new, all passing (100% pass rate)