Releases: knaw-huc/loghi-tooling

v2.0.0

08 Apr 13:23

Release Notes for Loghi-tooling Version 2.0.0

Date: 2024-04-03

Overview

Version 2.0.0 of Loghi-tooling contains minor changes and changes needed to integrate with version 2.0.0 of Loghi-htr.

Major Updates

  • MinionConvertPageToTxt: MinionConvertPageToTxtDescription was added, which can convert PageXML to plain text.
  • Improved tag support: Support for Unicode-style and HTML-style text inputs for MinionLoghiHTRMergePageXML, and support for unclosed tags.

Additional Improvements

  • fix invalid page production 1: Some pagexml was produced that was invalid. This has been fixed.
  • ignore null coords when loading pagexml: Null Coords are ignored when loading (broken) pageXML.
  • refactoring: Textline polygon calculation now runs in the extractbaselines phase instead of the textline cutting phase.
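
The null-Coords handling noted above can be illustrated with a minimal Java sketch. It is not Loghi-tooling's actual reader: the class `NullCoordsSketch` and its methods are hypothetical, and it assumes that "ignoring" means a TextLine without a usable Coords element loads with an empty point list instead of causing an error.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical loader for illustration only; not Loghi-tooling code.
final class NullCoordsSketch {

    /** Maps each TextLine id to its points; lines without usable Coords get an empty list. */
    static Map<String, List<int[]>> loadLinePoints(String pageXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pageXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        Map<String, List<int[]>> result = new LinkedHashMap<>();
        NodeList lines = doc.getElementsByTagNameNS("*", "TextLine");
        for (int i = 0; i < lines.getLength(); i++) {
            Element line = (Element) lines.item(i);
            result.put(line.getAttribute("id"), parsePoints(findCoords(line)));
        }
        return result;
    }

    /** Returns the direct Coords child of the element, or null when it is missing. */
    static Element findCoords(Element parent) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element && "Coords".equals(child.getLocalName())) {
                return (Element) child;
            }
        }
        return null;
    }

    /** Parses "x1,y1 x2,y2 ..."; a null element or blank attribute yields no points. */
    static List<int[]> parsePoints(Element coords) {
        List<int[]> points = new ArrayList<>();
        if (coords == null) {
            return points;
        }
        String raw = coords.getAttribute("points").trim();
        if (raw.isEmpty()) {
            return points;
        }
        for (String pair : raw.split("\\s+")) {
            String[] xy = pair.split(",");
            points.add(new int[] {Integer.parseInt(xy[0]), Integer.parseInt(xy[1])});
        }
        return points;
    }
}
```

Returning an empty list keeps the line in the document, so a later step can still recompute its coordinates.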

Full Changelog: 1.3.12...2.0.0

v1.3.12

22 Mar 09:01
33b2efd

1.3.12
remove TranskribusMetadata when saving PageXML, as it is not valid PageXML
fix ExtractBaselinesResource to recalculate textline contours
refactor/clean
add image file for testing extractbaselines via api
add image so textline polygons can be calculated
disable broken test

BREAKING: pagexml contours are now calculated in MinionExtractBaselines instead of MinionCutFromImageBasedOnPageXMLNew

1.3.11
allow empty points to be ignored as they might get fixed later
bump postgres to 42.7.2
bump opencv version to 4.9.0
pdf converter: better support for JPEGs
fix minionshrinktextlines to use adaptive thresholding and avoid creating single pixel baselines.
pdf support
WIP: read v2 style format loghi-htr output
fix bug in setting correct namespace

1.3.10
fix bug in setting correct namespace
update log4j
fix pdfconverter (WIP)
add pdf converter (WIP)

1.3.9
update jackson
if it's 2013 use 2013
update libraries
add vulnerability scanner

v1.3.7

22 Dec 08:40
6c1012d

includes changes from 1.3.6

  • fix nullpointer exception when using older models without config
  • Add optional security to loghiwebservice
  • Fix recalculate reading order test
  • improvements in generic reading order detection
  • don't include textstyle for strings that are empty
  • avoid nullpointer exception when reading htr config

v1.3.5

08 Dec 14:12
4e27707

Fix inclusion of uuid & githash of loghi-htr
Improve logging:

  • better messages
  • avoid unnecessary info
  • write .error-files when errors occur

add quickfix for earlier invalid scriptdetection

Enable page validation

Word splitting:
Ignore lines without enough space for words
Make errors warnings when points get fixed via fixPoints
Remove the floor of charWidth, as it gives errors
Throw an exception when mask width is 0

v1.3.4

24 Nov 08:46
253c36d

fix bug in wordsplitting
added option -addLaypaMetadata to API call, if not specified do not add metadata
Replace images / page xml for latest version

v1.3.3

22 Nov 13:11
cb75e12

fixed bug in languagedetection which resulted in a fatal error when running the minion

1.3.2

16 Nov 14:48
  • Fix bug when splitting TextLines into Words. This bug occurred when the last Word consisted of one character: the Word would not receive any coordinates. The cause was rounding when calculating the character width. Now the last Word gets the remaining space.
  • Fix bug where TextLine would receive a "null"-value as part of the value of the "custom"-attribute.
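
The word-splitting fix above can be illustrated with a minimal Java sketch. It is not the Loghi-tooling implementation: the class `WordSplitSketch`, the method `splitLine`, and the rule that each word gets horizontal space in proportion to its character count are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names and layout rule are assumptions, not Loghi-tooling code.
final class WordSplitSketch {

    /** Returns one {startX, endX} span per word for a line that is lineWidth pixels wide. */
    static List<int[]> splitLine(String text, int lineWidth) {
        List<int[]> spans = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || lineWidth <= 0) {
            return spans; // nothing to split
        }
        String[] words = trimmed.split("\\s+");
        int totalChars = words.length - 1; // one space between consecutive words
        for (String word : words) {
            totalChars += word.length();
        }
        double charWidth = (double) lineWidth / totalChars; // kept fractional, not floored
        int offset = 0; // characters consumed so far, including spaces
        for (int i = 0; i < words.length; i++) {
            int start = (int) Math.round(offset * charWidth);
            boolean last = i == words.length - 1;
            // The last word takes whatever space remains up to the end of the line,
            // instead of a width derived from the rounded character width.
            int end = last
                    ? lineWidth
                    : (int) Math.round((offset + words[i].length()) * charWidth);
            spans.add(new int[] {start, end});
            offset += words[i].length() + 1;
        }
        return spans;
    }
}
```

For example, `WordSplitSketch.splitLine("ab cd e", 20)` returns three spans, and the span for the one-character word `e` ends at x = 20, the end of the line.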

1.3.0

16 Nov 08:22
85d66c1
  • Added validation for PAGE XML namespace for web api calls. Only 2013 is supported at the moment.
  • The language detection uses the right names for the PAGE XML 2013 version.
  • TextRegions will no longer be created with empty points.
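
The namespace validation described above could look roughly like the following Java sketch. The class `PageNamespaceSketch` and the method `isSupportedNamespace` are hypothetical, and treating unparsable input as unsupported is an assumption; the actual check in the web API may differ. The namespace URI is the one published for the PAGE XML 2013 schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical check for illustration only; not the Loghi-tooling web API code.
final class PageNamespaceSketch {

    // Namespace URI of the PAGE XML 2013 schema.
    static final String PAGE_2013 =
            "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15";

    /** Returns true only when the root element is in the PAGE XML 2013 namespace. */
    static boolean isSupportedNamespace(String pageXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getNamespaceURI() to be populated
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pageXml.getBytes(StandardCharsets.UTF_8)));
            return PAGE_2013.equals(doc.getDocumentElement().getNamespaceURI());
        } catch (Exception e) {
            return false; // assumption: input that cannot be parsed is treated as unsupported
        }
    }
}
```

A caller could reject a request early when `isSupportedNamespace` returns false, before any further processing of the uploaded PageXML.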

Known bugs

  • Validation is currently disabled because it triggers on non-existent problems.