Release v0.0.7 · cisocrgroup/ocrd_cis

Fixed:

recognize: regression from changed network initialization
recognize: also load uncompressed models (Python 3 port)
re/segment: avoid creating invalid polygon coordinates
ocrolib scale estimation: make DPI-dependent, add fallback for empty/noise pages
ocrolib morphology: avoid rounding artifacts and asymmetry

Changed:

ocrolib / all ocropy processors: boost performance via OpenCV/PIL instead of SciPy
binarize: expose threshold parameter
re/segment: require images to be binarized already instead of ad-hoc binarization
re/segment: much faster line segmentation, better separation of neighbouring lines
segment: much more robust fg h/v-line and bg column detection, new image detection
segment: add AlternativeImage with h/v-line or image non-text clipped to background
segment: rewrite of region aggregation via hybrid recursive X-Y cut
segment: also annotate detected lines (at detected regions) after page segmentation
segment: expose many new parameters
segment: add all new lines/regions in proper (but only top-down left-right) reading order
segment: add table level (like page segmentation, but horizontal-first split strategy)
segment: also recurse into table cells for region level
segment: incremental annotation (ignore and re-order existing text/image/separator regions)
profile: remove this CLI, but keep integrated via API in new pure OCR-D CLI postcorrect
wer, training: remove these CLIs
postcorrect, align: update to latest JAR
remove non-OCR-D scripts from installation
add uninstall target
update documentation (esp. training, testing and postcorrection)
improve/extend automatic tests
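
These notes do not show how the segment processor's region aggregation is implemented. As a rough illustration of the underlying idea only, a plain (non-hybrid) recursive X-Y cut might look like the sketch below. The function name `xy_cut`, the `min_gap` parameter and the list-of-rows mask format are assumptions made for this example and are not part of ocrd_cis.

```python
def xy_cut(mask, min_gap=2):
    """Split a binary mask (list of rows of 0/1) into foreground boxes.

    Hypothetical helper for illustration; returns (top, left, bottom, right)
    tuples with exclusive bottom/right bounds.
    """
    boxes = []

    def tighten(t, l, b, r):
        # Shrink the box to the foreground it contains; None if it is empty.
        rows = [y for y in range(t, b) if any(mask[y][l:r])]
        cols = [x for x in range(l, r) if any(mask[y][x] for y in range(t, b))]
        if not rows or not cols:
            return None
        return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

    def widest_gap(profile):
        # Longest run of empty positions as (length, start, end).
        best = (0, 0, 0)
        start = None
        for i, filled in enumerate(profile + [True]):  # sentinel closes a run
            if not filled and start is None:
                start = i
            elif filled and start is not None:
                if i - start > best[0]:
                    best = (i - start, start, i)
                start = None
        return best

    def recurse(t, l, b, r):
        box = tighten(t, l, b, r)
        if box is None:
            return
        t, l, b, r = box
        # Projection profiles: which rows / columns contain any foreground.
        row_profile = [any(mask[y][l:r]) for y in range(t, b)]
        col_profile = [any(mask[y][x] for y in range(t, b)) for x in range(l, r)]
        h_len, h_start, h_end = widest_gap(row_profile)
        v_len, v_start, v_end = widest_gap(col_profile)
        if max(h_len, v_len) < min_gap:
            boxes.append(box)  # no gap wide enough: this is a leaf region
        elif h_len >= v_len:
            # Cut along the widest empty band of rows: top part, then bottom.
            recurse(t, l, t + h_start, r)
            recurse(t + h_end, l, b, r)
        else:
            # Cut along the widest empty band of columns: left, then right.
            recurse(t, l, b, l + v_start)
            recurse(t, l + v_end, b, r)

    if mask and mask[0]:
        recurse(0, 0, len(mask), len(mask[0]))
    return boxes


page = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
regions = xy_cut(page)
```

Because each cut recurses into the top (or left) part first, the boxes come out in a simple top-down, left-right order. A horizontal-first variant, as mentioned for the table level, would presumably differ in which cut direction it prefers; that detail is not specified here.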
