v0.4.0

@Crivella Crivella released this 29 Oct 00:58
· 30 commits to master since this release
b3f8ae4

What's Changed

OCR models that work on a single line of text are now supported.
Previously, the pipeline passed the entire merged box to the OCR model, which made models trained on single lines produce nonsensical results.
Models can now be created with ocr_mode set to merged (default) or single.
If set to single, the non-merged bounding boxes are passed to the model.
The text results are afterwards stitched together by ordering the boxes into reasonable line/column chunks.
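
As a rough illustration of the stitching step, here is a minimal sketch of one way to order single-line boxes and join their text. It is not the project's actual implementation: the names `group_into_lines` and `stitch` are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` with y growing downwards, and only horizontal left-to-right text is handled.

```python
# Hypothetical sketch, not the ocr_translate implementation.
# Assumption: a box is (x_min, y_min, x_max, y_max), y grows downwards.
Box = tuple[int, int, int, int]


def group_into_lines(boxes: list[Box]) -> list[list[Box]]:
    """Group boxes into rows, top to bottom, each row sorted left to right.

    A box joins the current row when its vertical center falls inside
    the vertical span already covered by that row.
    """
    rows: list[list[Box]] = []
    for box in sorted(boxes, key=lambda b: (b[1] + b[3]) / 2):
        center = (box[1] + box[3]) / 2
        if rows:
            top = min(b[1] for b in rows[-1])
            bottom = max(b[3] for b in rows[-1])
            if top <= center <= bottom:
                rows[-1].append(box)
                continue
        rows.append([box])
    return [sorted(row, key=lambda b: b[0]) for row in rows]


def stitch(texts: dict[Box, str]) -> str:
    """Join the per-box OCR text in reading order."""
    rows = group_into_lines(list(texts))
    return " ".join(texts[box] for row in rows for box in row)
```

For example, `stitch({(60, 1, 100, 11): "world", (0, 0, 50, 10): "Hello"})` places the left-hand box first regardless of the order in which the boxes were detected. Vertical text (column chunks) would need the same idea applied along the other axis.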

  • Modified the API of the OCRBoxModel: _box_detection should now return a list of dictionaries, each containing 'merged': tuple[int, int, int, int], the merged bounding box, and 'single': list[tuple[int, int, int, int]], the list of single bounding boxes that were merged into merged.
  • Modified the database models:
    • OCRModel: Added ocr_mode field with possible values merged (default) and single.
    • BBox: Foreign key from_ocr renamed to from_ocr_merged
    • BBox: Added foreign key from_ocr_single
    • BBox: Added foreign key to_merged (points to the merged BBox generated by merging this box with others)
    • OCRRun: Foreign key result renamed to result_merged (denotes that the output came from a merged real/mock run)
    • OCRRun: Added foreign key result_single (denotes that the output came from a single run)
  • Fixed a bug related to Issue #11 where the %userprofile%/.ocr_translate folder was not being properly created by the EXE release if it did not exist.
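
For plugin authors adapting to the new _box_detection return shape listed above, the following sketch shows what one entry of the returned list might look like. The `DetectionResult` type and the `build_result` helper are hypothetical names introduced only for illustration, and the coordinate order `(x_min, y_min, x_max, y_max)` is an assumption; check the project's documentation for the actual convention.

```python
from typing import TypedDict

# Assumption: a box is (x_min, y_min, x_max, y_max).
Box = tuple[int, int, int, int]


class DetectionResult(TypedDict):
    """One entry of the list returned by _box_detection (illustrative type)."""
    merged: Box
    single: list[Box]


def build_result(singles: list[Box]) -> DetectionResult:
    """Wrap a group of single boxes together with their enclosing box."""
    merged = (
        min(b[0] for b in singles),
        min(b[1] for b in singles),
        max(b[2] for b in singles),
        max(b[3] for b in singles),
    )
    return {"merged": merged, "single": list(singles)}


# A _box_detection implementation would return a list of such entries,
# one per merged region found in the image.
results = [build_result([(0, 0, 50, 10), (5, 12, 60, 22)])]
```

Here the enclosing box is simply the min/max over the grouped boxes; how boxes are grouped in the first place is left to the box model.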