System.AccessViolationException: Attempted to read or write protected memory when performing OCR Extraction on pdf files #11

Open
divyangashah opened this issue Oct 1, 2019 · 0 comments

divyangashah commented Oct 1, 2019

Hi team,

When we call:

string getContent = currentPage.GetText();

The call throws the error shown below for some PDF documents during OCR text extraction.
Please let us know if a fix is available.

Environment

Tesseract Version: 4.0.0.0-beta3
Platform: Windows 64-bit

Current Behavior:
The following error is thrown for some documents:

[ERROR] 2019-09-30 14:14:51.43,Attempted to read or write protected memory. This is often an indication that other memory is corrupt.,(:0)
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at Imaging.OCR.TExtraction.RecognizeandExtractText(String inputFile, String outputFile, Boolean isReduceToBitonal)
at Imaging.IG.ConsumerTest.FormMain.OCRExtraction()

Expected Behavior:
OCR text extraction should return the text content of the PDF file.

Suggested Fix:
The same issue was present in Tesseract 3.0 and was resolved in 3.0.2-alpha1.
