v0.3.4 - detection precision

@san64777 san64777 released this 08 Jun 20:53
· 13 commits to main since this release
fa17421

Best-effort detection precision improvement, driven by real-corpus QA.

The geometry detector's underline finder was firing on decorative full-width separator rules, header/footer margin rules, and near-duplicate lines, which inflated false positives in detect() / make_fillable(). It now rejects near-full-width and page-margin rules, merges near-coincident duplicate lines, and applies a tolerant cross-source dedup of table cells against underlines.
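
To illustrate the kind of filtering described above, here is a minimal sketch. It is not acroforge's actual code: the `Rule` type, the `filter_underlines` function, and the thresholds (`max_width_ratio`, `margin`, `merge_tol`) are all hypothetical names and values chosen for the example, and the real detector may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A horizontal line segment in page coordinates (hypothetical type)."""
    x0: float
    x1: float
    y: float

    @property
    def width(self) -> float:
        return abs(self.x1 - self.x0)


def filter_underlines(rules, page_width, page_height,
                      max_width_ratio=0.9, margin=36.0, merge_tol=2.0):
    """Keep plausible fill-in underlines; thresholds are illustrative only."""
    kept = []
    for r in sorted(rules, key=lambda r: (r.y, r.x0)):
        # Reject decorative separators spanning nearly the full page width.
        if r.width >= max_width_ratio * page_width:
            continue
        # Reject header/footer rules inside the top or bottom page margin.
        if r.y < margin or r.y > page_height - margin:
            continue
        # Merge near-coincident duplicates: skip a line whose endpoints and
        # height all match an already-kept line within merge_tol.
        is_duplicate = any(
            abs(k.y - r.y) <= merge_tol
            and abs(k.x0 - r.x0) <= merge_tol
            and abs(k.x1 - r.x1) <= merge_tol
            for k in kept
        )
        if not is_duplicate:
            kept.append(r)
    return kept


# US Letter page (612 x 792 pt) with one real underline, one near-duplicate,
# one full-width separator, and one rule each in the header and footer margin.
candidates = [
    Rule(72, 300, 400),
    Rule(72.5, 300.4, 400.8),
    Rule(20, 592, 500),
    Rule(72, 200, 10),
    Rule(72, 200, 780),
]
underlines = filter_underlines(candidates, page_width=612, page_height=792)
```

In this sketch only the first candidate survives; the other four are dropped as a duplicate, a separator, and two margin rules respectively.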

On a 125-form real corpus: detection precision rose from 0.58 to 0.61, false positives fell 12%, and recall held. Detection remains best-effort: treat its output as a draft manifest to review.
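
As a quick consistency check on these figures, the sketch below recomputes precision from the false-positive reduction. It assumes "recall held" means the true-positive count was unchanged; the function name `precision_after_fp_drop` is made up for this example.

```python
def precision_after_fp_drop(precision_before: float, fp_reduction: float) -> float:
    """Precision after cutting false positives, with true positives fixed."""
    # precision = TP / (TP + FP), so FP / TP = (1 - p) / p.
    fp_per_tp = (1.0 - precision_before) / precision_before
    # Scale false positives down by the stated reduction.
    fp_per_tp_after = fp_per_tp * (1.0 - fp_reduction)
    return 1.0 / (1.0 + fp_per_tp_after)


new_precision = precision_after_fp_drop(0.58, 0.12)
print(round(new_precision, 2))  # prints 0.61
```

Under that assumption, a 12% drop in false positives from a 0.58 baseline gives about 0.61, which matches the reported numbers.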

pip install acroforge==0.3.4