v0.4.0
What's Changed
OCR models that work on a single line of text are now supported.
Previously, the pipeline passed the entire merged box to the OCR model, which made models trained on single lines produce nonsensical results.
A model can now be created with `ocr_mode` set to `merged` [default] or `single`.
If set to `single`, the non-merged bounding boxes are passed to the model.
The text results are afterward stitched together by ordering the boxes into line/column chunks.
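To illustrate the stitching step in `single` mode, here is a minimal sketch of one way to order per-box results before joining them. It is not the project's actual implementation: the names `group_into_lines` and `stitch` are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples with `y` growing downward, and only horizontal, left-to-right lines are handled (no column grouping).

```python
# Hypothetical sketch: names and coordinate convention are assumptions,
# not the ocr_translate API.
Box = tuple[int, int, int, int]  # assumed (x_min, y_min, x_max, y_max), y grows downward


def group_into_lines(boxes: list[Box]) -> list[list[Box]]:
    """Group boxes into text lines, then sort each line left to right."""
    lines: list[list[Box]] = []
    # Visit boxes top to bottom so each line is seeded by its topmost box.
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        y_center = (box[1] + box[3]) / 2
        for line in lines:
            ref = line[0]
            # A box joins a line if its vertical center falls inside
            # the vertical span of the box that started that line.
            if ref[1] <= y_center <= ref[3]:
                line.append(box)
                break
        else:
            lines.append([box])
    for line in lines:
        line.sort(key=lambda b: b[0])
    return lines


def stitch(results: dict[Box, str]) -> str:
    """Join the per-box OCR texts in reading order, separated by spaces."""
    lines = group_into_lines(list(results))
    return " ".join(results[box] for line in lines for box in line)


ocr_results = {
    (0, 20, 40, 30): "again",
    (60, 1, 100, 11): "world",
    (0, 0, 50, 10): "Hello",
}
stitched = stitch(ocr_results)
```

Vertical scripts or multi-column layouts would need a different grouping rule (for example, grouping by horizontal overlap first), which this sketch does not attempt.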
- Modified the API for the `OCRBoxModel`: `_box_detection` should now return a list of dictionaries containing `'merged'`: `tuple[int, int, int, int]`, the merged bounding box, and `'single'`: `list[tuple[int, int, int, int]]`, a list of the single bounding boxes that have been merged into `merged`.
- Modified the database models:
  - `OCRModel`: Added `ocr_mode` field with possible values `merged` [default] and `single`.
  - `BBox`: Foreign key `from_ocr` renamed to `from_ocr_merged`.
  - `BBox`: Added foreign key `from_ocr_single`.
  - `BBox`: Added foreign key `to_merged` (points to the merged `BBox` generated by merging this box with other boxes).
  - `OCRRun`: Foreign key `result` renamed to `result_merged` (denotes that the output came from a merged real/mock run).
  - `OCRRun`: Added foreign key `result_single` (denotes that the output came from a single run).
- Fixed a bug related to Issue #11 where the `%userprofile%/.ocr_translate` folder was not properly created by the EXE release if it did not exist.
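
As an illustration of the new `_box_detection` return format described above, the following sketch builds one list entry from a group of single boxes. The helper name `to_detection_entry` is hypothetical and not part of the project, and the coordinate order `(x_min, y_min, x_max, y_max)` is an assumption; only the `'merged'` and `'single'` keys and their types come from the notes above.

```python
# Hypothetical helper: shows the shape of one entry in the list that
# `_box_detection` is expected to return. Coordinate order is assumed.
Box = tuple[int, int, int, int]  # assumed (x_min, y_min, x_max, y_max)


def to_detection_entry(singles: list[Box]) -> dict:
    """Build a {'merged': ..., 'single': [...]} entry from single boxes."""
    if not singles:
        raise ValueError("at least one single box is required")
    # The merged box is taken here as the smallest box enclosing all singles.
    merged: Box = (
        min(b[0] for b in singles),
        min(b[1] for b in singles),
        max(b[2] for b in singles),
        max(b[3] for b in singles),
    )
    return {"merged": merged, "single": list(singles)}


# A detection result is then a list of such entries, one per merged box.
detections = [
    to_detection_entry([(0, 0, 50, 10), (60, 1, 100, 11)]),
    to_detection_entry([(0, 40, 30, 55)]),
]
```

Whether the merged box is exactly the enclosing rectangle depends on the merging strategy of the concrete model; the enclosing rectangle is used here only to keep the example short.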