v0.3.x

Pre-release

@conjuncts conjuncts released this 21 Oct 03:50
· 78 commits to main since this release
ddcf402

v0.3.2

Changes:

  • Raised the default thresholds of the heuristic that rejects tables with high overlap, which makes ValueErrors rarer.
    • (total_overlap_reject_threshold) ValueError is now raised when overlap > 90%, up from 20%
    • (total_overlap_warn_threshold) a warning is now issued when overlap > 10%, up from 5%
  • Python 3.9 compatibility.
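
To illustrate how the two thresholds interact, here is a minimal sketch. The function `check_total_overlap` and its parameter names are hypothetical and are not gmft's actual code; only the default values (10% warn, 90% reject) come from the notes above.

```python
import warnings


# Hypothetical helper for illustration only; not part of gmft.
def check_total_overlap(overlap, warn_threshold=0.10, reject_threshold=0.90):
    """Warn or raise depending on how much of the table overlaps.

    `overlap` is assumed to be a fraction between 0 and 1.
    """
    if overlap > reject_threshold:
        # Above the reject threshold, the table is rejected outright.
        raise ValueError(f"total overlap {overlap:.0%} exceeds reject threshold")
    if overlap > warn_threshold:
        # Between the two thresholds, the table is kept but a warning is issued.
        warnings.warn(f"total overlap {overlap:.0%} exceeds warn threshold")
    return overlap
```

Under these assumed semantics, an overlap of 50% now only produces a warning, whereas with the previous 20% reject threshold it would have raised a ValueError.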

v0.3.1

Bugfix:

  • Fix division by zero when taking the median of an empty list in the row height estimate
  • Fix broken build in v0.3.0 (missing formatters)

Changes:

  • Added Img2TableDetector.
  • Refactored the code into organizational modules, detectors and formatters.
  • Importing from gmft is no longer encouraged. Please import from gmft.auto instead.
  • Tentative rich_text module and FormattedPage for direct RAG embedding usage
  • Configs are now dataclasses. However, a possibly breaking change is that passing config_overrides will now completely replace the config, rather than updating it.
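
The difference between replacing and updating a config can be sketched with plain dataclasses. `ExampleConfig` and its `verbosity` field are hypothetical stand-ins, not gmft classes; the sketch only assumes that an override is a full config object whose unspecified fields take their defaults.

```python
from dataclasses import dataclass, replace


# Hypothetical config for illustration only; not a gmft class.
@dataclass
class ExampleConfig:
    verbosity: int = 1
    total_overlap_reject_threshold: float = 0.9


# A base config with one customized field.
base = ExampleConfig(verbosity=3)

# An override that only sets the reject threshold.
override = ExampleConfig(total_overlap_reject_threshold=0.5)

# Replacement semantics: the override is used as-is, so the customized
# verbosity from `base` is lost and falls back to the default.
effective_replaced = override

# Update semantics: copy `base` and change only the named field.
effective_updated = replace(base, total_overlap_reject_threshold=0.5)
```

If the customized fields of an existing config need to survive, building the override from the existing config (for example with `dataclasses.replace`) is one way to carry them over.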