Fix duplicate textract result #2943

Merged
jelveh merged 1 commit into HeyPuter:main from reynaldichernando:fix-duplicate-textract-result
May 7, 2026


@reynaldichernando
Member

Fixes an issue where img2txt via Textract returns duplicated OCR results.

current: (screenshot)

fix: (screenshot)

What's happening: by default, Textract returns both LINE items and WORD items for the same text.
https://docs.aws.amazon.com/textract/latest/dg/how-it-works-lines-words.html
We currently output both, so every piece of text appears twice in the result.
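As a rough illustration of the duplication described above, here is a minimal sketch. The `OcrBlock` interface is a hand-written subset of the Textract block shape, and `dropWordBlocks` and the sample data are hypothetical names made up for this example; this is not the actual code in `OCRDriver.ts`.

```typescript
// Hand-written subset of the Textract block fields used in this sketch
// (an assumption for illustration, not the full SDK type).
interface OcrBlock {
  BlockType: "PAGE" | "LINE" | "WORD";
  Text?: string;
}

// Hypothetical helper: drop WORD blocks so each piece of text is emitted once.
function dropWordBlocks(blocks: OcrBlock[]): OcrBlock[] {
  return blocks.filter((block) => block.BlockType !== "WORD");
}

// Collect the text of every block that carries any.
function collectText(blocks: OcrBlock[]): string {
  return blocks
    .filter((block) => block.Text !== undefined)
    .map((block) => block.Text as string)
    .join("\n");
}

// Made-up response for an image that contains the words "Hello world".
const sampleBlocks: OcrBlock[] = [
  { BlockType: "PAGE" },
  { BlockType: "LINE", Text: "Hello world" },
  { BlockType: "WORD", Text: "Hello" },
  { BlockType: "WORD", Text: "world" },
];

// Emitting LINE and WORD together repeats the content.
const duplicatedText = collectText(sampleBlocks);
// Filtering WORD first leaves only the LINE text.
const deduplicatedText = collectText(dropWordBlocks(sampleBlocks));

console.log(duplicatedText); // "Hello world", then "Hello", then "world"
console.log(deduplicatedText); // "Hello world"
```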

notes:

  • LINE is still returned even if we exclude WORD, because the blocks are hierarchical: LINE is the parent of WORD
  • the trade-off of omitting WORD is that users have to split the text themselves if they want it word by word
    • but then again, we return a plain string for this Textract result
    • in the future, if we ever do a v3, or maybe a v2 with custom options, we might revisit returning the full Textract output
    • because it contains a lot more valuable data, such as the individual words and their bounding boxes, in case users want to visualize the OCR
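The notes above can be sketched as follows, assuming the documented Textract layout in which a LINE block lists its WORD children through a `CHILD` relationship. The `HierBlock` type, the helper names, and the sample IDs are all hypothetical; the second helper shows the manual split a caller would do when only the plain string is returned.

```typescript
// Hand-written subset of the Textract block shape (assumption for illustration).
interface HierRelationship {
  Type: string;
  Ids: string[];
}

interface HierBlock {
  Id: string;
  BlockType: "LINE" | "WORD";
  Text?: string;
  Relationships?: HierRelationship[];
}

// Hypothetical helper: resolve the WORD children of a LINE via its CHILD relationship.
function wordsOfLine(line: HierBlock, allBlocks: HierBlock[]): string[] {
  const byId = new Map<string, HierBlock>();
  for (const block of allBlocks) {
    byId.set(block.Id, block);
  }
  const words: string[] = [];
  for (const relationship of line.Relationships ?? []) {
    if (relationship.Type !== "CHILD") continue;
    for (const id of relationship.Ids) {
      const child = byId.get(id);
      if (child !== undefined && child.BlockType === "WORD" && child.Text !== undefined) {
        words.push(child.Text);
      }
    }
  }
  return words;
}

// Hypothetical helper: what a caller does when only the LINE string is available.
function splitLineText(text: string): string[] {
  return text.trim().split(/\s+/).filter((word) => word.length > 0);
}

// Made-up blocks for one line of text.
const hierBlocks: HierBlock[] = [
  {
    Id: "line-1",
    BlockType: "LINE",
    Text: "Hello world",
    Relationships: [{ Type: "CHILD", Ids: ["word-1", "word-2"] }],
  },
  { Id: "word-1", BlockType: "WORD", Text: "Hello" },
  { Id: "word-2", BlockType: "WORD", Text: "world" },
];

const wordsFromHierarchy = wordsOfLine(hierBlocks[0], hierBlocks);
const wordsFromSplit = splitLineText("Hello world");
```

For simple whitespace-separated text both approaches give the same words, which is why dropping WORD costs little when the result is a plain string. The hierarchy only becomes necessary once per-word data such as bounding boxes is exposed.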

In any case, this is sufficient for our use case right now:
the user inputs an image and wants a string result, and we provide it without duplication.

Contributor

Copilot AI left a comment


Pull request overview

Fixes duplicated img2txt output from AWS Textract by preventing both LINE and WORD blocks from being emitted as text blocks, so downstream consumers that concatenate all text/* blocks don’t double-count OCR content.

Changes:

  • Filters out Textract WORD blocks during response normalization to avoid duplicate text output.


Comment thread src/backend/drivers/ai-ocr/OCRDriver.ts
@jelveh jelveh merged commit 5e01c13 into HeyPuter:main May 7, 2026
6 checks passed
3 participants