Compact PDF417: spurious NotFoundException #1624

Closed
gredler opened this issue May 17, 2023 · 0 comments · Fixed by #1628

gredler commented May 17, 2023

A Compact PDF417 is a PDF417 symbol with no right row indicator column and with the stop pattern reduced to a single one-module-wide bar.

Usually, when PDF417Reader is asked to decode a Compact PDF417 symbol, PDF417ScanningDecoder.merge(...) and PDF417ScanningDecoder.getBarcodeMetadata(...) are eventually asked to merge a correctly-identified left column with a null right column (since the right column doesn't exist), and the decoding moves forward based only on the metadata in the left column.

However, sometimes Detector.findVertices(...) (with help from Detector.findRowsWithPattern(...), Detector.findGuardPattern(...) and Detector.patternMatchVariance(...)) incorrectly identifies the full PDF417 stop pattern within the symbol data, and "finds" the end of the symbol where it doesn't exist. Eventually this creates spurious right column data which makes its way into PDF417ScanningDecoder.getBarcodeMetadata(...), where we realize that we have different information in the left column and the pseudo-right column, and we give up and return a null value which ultimately becomes a NotFoundException.

In the example below, the highlighted section (a pattern of 6 1 1 2 1 1 1 1 1) is mistaken for a stop pattern (a pattern of 7 1 1 3 1 1 1 2 1). Three of the nine counters are off by a full module width each, but the Detector methods mentioned above think it is good enough, based on the very high MAX_INDIVIDUAL_VARIANCE (80%) and MAX_AVG_VARIANCE (42%).
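
To make the arithmetic concrete, here is a minimal sketch of a run-length variance check of the kind described above. It is a simplified model, not code copied from ZXing: the normalisation (dividing the total run width by the pattern's module count to get a unit bar width) is an assumption about how `Detector.patternMatchVariance(...)` works, and the pixel widths assume a hypothetical module width of 3 px. Only the stop pattern and the two thresholds come from this issue.

```java
/** Simplified, hypothetical model of the variance check; not the ZXing source. */
final class VarianceSketch {
  // Thresholds quoted in this issue.
  static final float MAX_AVG_VARIANCE = 0.42f;
  static final float MAX_INDIVIDUAL_VARIANCE = 0.8f;
  // Full PDF417 stop pattern, in module widths.
  static final int[] STOP_PATTERN = {7, 1, 1, 3, 1, 1, 1, 2, 1};

  /** Returns the average per-pixel deviation, or +infinity if any single run deviates too much. */
  static float patternMatchVariance(int[] counters, int[] pattern, float maxIndividualVariance) {
    int total = 0;
    int patternLength = 0;
    for (int i = 0; i < counters.length; i++) {
      total += counters[i];
      patternLength += pattern[i];
    }
    if (total < patternLength) {
      return Float.POSITIVE_INFINITY; // fewer pixels than modules: cannot match
    }
    // Assumed normalisation: estimate the module width from the observed total.
    float unitBarWidth = (float) total / patternLength;
    float maxDeviation = maxIndividualVariance * unitBarWidth;
    float totalVariance = 0.0f;
    for (int i = 0; i < counters.length; i++) {
      float expected = pattern[i] * unitBarWidth;
      float deviation = Math.abs(counters[i] - expected);
      if (deviation > maxDeviation) {
        return Float.POSITIVE_INFINITY; // one run is too far off
      }
      totalVariance += deviation;
    }
    return totalVariance / total;
  }

  public static void main(String[] args) {
    // The data run 6 1 1 2 1 1 1 1 1 from this issue, at a hypothetical 3 px per module.
    int[] observed = {18, 3, 3, 6, 3, 3, 3, 3, 3};
    float variance = patternMatchVariance(observed, STOP_PATTERN, MAX_INDIVIDUAL_VARIANCE);
    System.out.println("variance = " + variance
        + ", accepted = " + (variance < MAX_AVG_VARIANCE));
  }
}
```

Under these assumptions the unit bar width is 45 / 18 = 2.5 px, the average variance is 7 / 45 (about 0.16, well under the 0.42 limit), and the largest single deviation (2 px, on the run that should be two modules wide) does not exceed the per-run allowance of 0.8 × 2.5 = 2 px. The data run is therefore accepted as a stop pattern even though three of its nine counters are a full module off.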

I can imagine a few possible fixes:

  • reduce the hardcoded variance thresholds (high risk of regressions elsewhere?)
  • make the variance thresholds configurable (but more knobs make it harder for users to get correct results)
  • make the compact option explicit, and the compact code path slightly different, rather than implicitly relying on the search for the stop pattern failing (another knob, but a simpler one)
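
The third option could look roughly like the sketch below. Everything in it is hypothetical: `Metadata`, `merge` and the `compact` flag are illustrative stand-ins, not existing ZXing types or hints. It only models the behaviour described in this issue, where disagreeing left and right column metadata produce a null result that later becomes a NotFoundException.

```java
/** Hypothetical model of an explicit compact option; none of these names exist in ZXing. */
final class CompactOptionSketch {

  /** Illustrative stand-in for the metadata read from a row indicator column. */
  static final class Metadata {
    final int columnCount;
    final int rowCount;
    final int errorCorrectionLevel;

    Metadata(int columnCount, int rowCount, int errorCorrectionLevel) {
      this.columnCount = columnCount;
      this.rowCount = rowCount;
      this.errorCorrectionLevel = errorCorrectionLevel;
    }

    boolean agreesWith(Metadata other) {
      return columnCount == other.columnCount
          && rowCount == other.rowCount
          && errorCorrectionLevel == other.errorCorrectionLevel;
    }
  }

  /**
   * Merges left and right column metadata.
   * When compact is true the right column is never consulted, so a falsely
   * detected stop pattern cannot poison the result.
   */
  static Metadata merge(Metadata left, Metadata right, boolean compact) {
    if (compact) {
      return left; // explicit compact mode: trust the left column only
    }
    if (left == null) {
      return right;
    }
    if (right == null) {
      return left; // implicit compact path: no right column was found
    }
    // Conflicting metadata: give up (null), as the issue describes.
    return left.agreesWith(right) ? left : null;
  }

  public static void main(String[] args) {
    Metadata left = new Metadata(12, 74, 8);     // values from the sample in this issue
    Metadata spurious = new Metadata(3, 20, 2);  // made-up garbage from a false stop pattern
    System.out.println("implicit: " + (merge(left, spurious, false) != null));
    System.out.println("explicit: " + (merge(left, spurious, true) != null));
  }
}
```

The trade-off is the one noted in the list: the caller has to know in advance that the symbol is compact, but in exchange the decode no longer depends on the stop pattern search failing.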

Happy to hear any other ideas!

The sample below encodes +A+CDMEABCDEFGHIJABCDFEFGH in a Compact PDF417 symbol with 12 data columns and 74 rows, using ECC level 8.

[Image: sample with incorrect stop pattern highlighted]

[Image: sample without any highlighting]
