Eliot Jones edited this page Apr 14, 2022 · 1 revision

Version 0.1.6 contains some public API changes to the Document Layout Analysis methods and classes.

The changes to Document Layout Analysis were added in this PR https://github.com/UglyToad/PdfPig/pull/432

In summary, methods such as GetWords and GetBlocks that previously took an options argument now take the options in the class constructor instead. In addition, DlaOptions has been replaced by the IDlaOptions interface, which has analyzer-specific implementations.

For example:

NearestNeighbourWordExtractor.Instance.GetWords(page.Letters, new DlaOptions());

Becomes:

new NearestNeighbourWordExtractor(new NearestNeighbourWordExtractorOptions()).GetWords(page.Letters);

This change lays the groundwork for future improvements to the Document Layout Analysis (DLA) functionality.

Changes affect:

  • DefaultPageSegmenter
  • DocstrumBoundingBoxes
  • NearestNeighbourWordExtractor
  • RecursiveXYCut