We are using ExtractText() and, from time to time, we get an index out of range error.
Stacktrace:
panic: runtime error: index out of range [0] with length 0 [recovered]
	panic: runtime error: index out of range [0] with length 0

goroutine 21 [running]:
testing.tRunner.func1.2({0x1009ba340, 0x140001f5d28})
	/opt/homebrew/Cellar/go/1.18.1/libexec/src/testing/testing.go:1389 +0x1c8
testing.tRunner.func1()
	/opt/homebrew/Cellar/go/1.18.1/libexec/src/testing/testing.go:1392 +0x384
panic({0x1009ba340, 0x140001f5d28})
	/opt/homebrew/Cellar/go/1.18.1/libexec/src/runtime/panic.go:838 +0x204
github.com/unidoc/unipdf/v3/internal/textencoding.CMapEncoder.CharcodeToRune(...)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/internal/textencoding/textencoding.go:552
github.com/unidoc/unipdf/v3/extractor.(*textObject).renderText(0x14000ab02c0, {0x14000759328, 0x1, 0x8})
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:762 +0xab0
github.com/unidoc/unipdf/v3/extractor.(*textObject).showTextAdjusted(0x14000ab02c0, 0x1400000fea8)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:132 +0x178
github.com/unidoc/unipdf/v3/extractor.(*Extractor).extractPageText.func1(0x1400034fdd0, {{0x1009f2d78, 0x100f63dc8}, {0x1009f2e80, 0x14000084360}, {0x1009801a0, 0x140006021c8}, {0x10099ad00, 0x140001f5cf8}, {0x3ff0000000000000, ...}}, ...)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:797 +0x2348
github.com/unidoc/unipdf/v3/contentstream.(*ContentStreamProcessor).Process(0x14000765aa0, 0x100f63dc8?)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/contentstream/contentstream.go:314 +0xa94
github.com/unidoc/unipdf/v3/extractor.(*Extractor).extractPageText(0x14000136060, {0x14000644000, 0x9a44e}, 0x14000418060?, {0x3ff0000000000000, 0x0, 0x0, 0x0, 0x3ff0000000000000, 0x0, ...}, ...)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:828 +0x754
github.com/unidoc/unipdf/v3/extractor.(*Extractor).ExtractPageText(0x14000136060)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:243 +0x74
github.com/unidoc/unipdf/v3/extractor.(*Extractor).ExtractTextWithStats(0x14000214380?)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:508 +0x20
github.com/unidoc/unipdf/v3/extractor.(*Extractor).ExtractText(...)
	/Projects/go/workspace/pkg/mod/github.com/unidoc/unipdf/v3@v3.34.0/extractor/extractor.go:526
Currently, the obfuscated code of CMapEncoder.CharcodeToRune looks like:
func (_agg CMapEncoder) CharcodeToRune(code CharCode) (rune, bool) {
	_egf, _ceg := _agg.charcodeToString(code)
	return ([]rune(_egf))[0], _ceg
}
The error happens because, for these files, charcodeToString sometimes returns an empty string. []rune("") has length 0, so indexing element [0] panics.
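The failure mode can be reproduced without the library. The following is a minimal sketch, not unipdf code: firstRuneUnsafe and panics are hypothetical helpers that only mirror the indexing pattern shown above.

```go
package main

import "fmt"

// firstRuneUnsafe mirrors the pattern from the report: it indexes the
// converted slice without checking its length first.
// (Hypothetical helper, not part of unipdf.)
func firstRuneUnsafe(s string) rune {
	return []rune(s)[0]
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	// The conversion of an empty string yields a slice of length 0.
	fmt.Println("len:", len([]rune("")))
	// Indexing that slice panics; a non-empty string does not.
	fmt.Println("empty panics:", panics(func() { firstRuneUnsafe("") }))
	fmt.Println("non-empty panics:", panics(func() { firstRuneUnsafe("a") }))
}
```

Any input that makes the lookup return an empty string is enough to trigger the panic, which would explain why only some files are affected.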
So a potential fix would be:
func (_agg CMapEncoder) CharcodeToRune(code CharCode) (rune, bool) {
	_egf, _ceg := _agg.charcodeToString(code)
	if _egf == "" {
		return MissingCodeRune, false
	}
	return ([]rune(_egf))[0], _ceg
}
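To check the guard in isolation, here is a self-contained sketch of the same idea. The encoder type, its map-backed charcodeToString, and the value chosen for MissingCodeRune are assumptions made for illustration; they stand in for the library's internals, which are not visible in this report.

```go
package main

import "fmt"

// CharCode and MissingCodeRune are stand-ins for the library's own
// definitions; the replacement-character value is an assumption.
type CharCode uint32

const MissingCodeRune = '\ufffd'

// encoder is a hypothetical stand-in for CMapEncoder, backed by a map.
type encoder struct {
	table map[CharCode]string
}

// charcodeToString may return an empty string, either because the code
// is unknown or because the table maps it to "".
func (e encoder) charcodeToString(code CharCode) (string, bool) {
	s, ok := e.table[code]
	return s, ok
}

// CharcodeToRune applies the proposed guard: an empty string yields
// MissingCodeRune and false instead of indexing an empty slice.
func (e encoder) CharcodeToRune(code CharCode) (rune, bool) {
	s, ok := e.charcodeToString(code)
	if s == "" {
		return MissingCodeRune, false
	}
	return []rune(s)[0], ok
}

func main() {
	e := encoder{table: map[CharCode]string{1: "A", 2: ""}}
	for _, code := range []CharCode{1, 2, 9} {
		r, ok := e.CharcodeToRune(code)
		fmt.Printf("code %d -> %q, %v\n", code, r, ok)
	}
}
```

Returning false together with MissingCodeRune lets the caller treat an empty mapping the same way as a missing one, rather than silently emitting a placeholder as if it were valid text.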
Expected Behavior
No panics when extracting text.
Actual Behavior
Triggers a panic: runtime error: index out of range [0] with length 0 in certain cases.
Attachments
Sadly enough, I can't share a file due to GDPR reasons.
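As a stopgap on the caller's side, the panic can be turned into an ordinary error by recovering around the extraction call. This is only a sketch: safeExtract is a hypothetical helper, not part of unipdf, and it assumes the extraction call is wrapped in a closure returning a string and an error.

```go
package main

import "fmt"

// safeExtract runs extract and converts any panic into an error, so a
// single problematic file cannot take down the whole process.
// (Hypothetical helper, not part of unipdf.)
func safeExtract(extract func() (string, error)) (text string, err error) {
	defer func() {
		if r := recover(); r != nil {
			text = ""
			err = fmt.Errorf("text extraction panicked: %v", r)
		}
	}()
	return extract()
}

func main() {
	// The closure simulates the reported failure by indexing an empty slice.
	_, err := safeExtract(func() (string, error) {
		var rs []rune
		return string(rs[0]), nil
	})
	fmt.Println("got error:", err != nil)
}
```

This hides the symptom rather than fixing the cause, so the affected pages still yield no text.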
Hi @becoded,
Thank you for reporting this issue and for the potential fix. We have released a new version, v3.35.0: https://github.com/unidoc/unipdf-src/releases/tag/v3.35.0