[CONJ-805] >1Byte UTF8 character issue if maxFieldSize is set #156
I ran into this at one of my clients.
They use Polish special characters such as ść.
The length in TextRowProtocol is given in bytes, so it is 2 for each of those characters and 4 in our example.
With this, the current code

    return new String(buf, pos, Math.min(maxFieldSize * 3, length), StandardCharsets.UTF_8)
        .substring(0, Math.min(maxFieldSize, length));

results in a StringIndexOutOfBoundsException at the substring call, because the end index is derived from the byte length and can exceed the number of characters in the decoded string.
I added a test case and a quick fix for it.
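For reference, here is a minimal standalone sketch of the failure described above. The class name `MaxFieldSizeDemo`, both method names, and the value `maxFieldSize = 3` are assumptions made for illustration; they are not taken from the driver. The second method shows one possible way to avoid the exception, by clamping against the decoded string length, and is not necessarily the fix applied in this PR.

```java
import java.nio.charset.StandardCharsets;

class MaxFieldSizeDemo {

    // Mirrors the expression quoted above: the substring end index
    // is computed from the byte length, not the character count.
    static String truncateByByteLength(byte[] buf, int pos, int length, int maxFieldSize) {
        return new String(buf, pos, Math.min(maxFieldSize * 3, length), StandardCharsets.UTF_8)
            .substring(0, Math.min(maxFieldSize, length));
    }

    // Hypothetical alternative: clamp the end index to the decoded string length.
    static String truncateByCharLength(byte[] buf, int pos, int length, int maxFieldSize) {
        String decoded =
            new String(buf, pos, Math.min(maxFieldSize * 3, length), StandardCharsets.UTF_8);
        return decoded.substring(0, Math.min(maxFieldSize, decoded.length()));
    }

    public static void main(String[] args) {
        // "ść" written with escapes so the result does not depend on source file encoding.
        byte[] buf = "\u015B\u0107".getBytes(StandardCharsets.UTF_8); // 4 bytes, 2 characters
        int maxFieldSize = 3; // assumed value; any value above 2 triggers the problem here

        try {
            truncateByByteLength(buf, 0, buf.length, maxFieldSize);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("byte-length variant failed: " + e.getMessage());
        }

        System.out.println("char-length variant: "
            + truncateByCharLength(buf, 0, buf.length, maxFieldSize));
    }
}
```

With `maxFieldSize = 3`, the byte-based variant asks for `substring(0, 3)` on a two-character string, while the clamped variant returns both characters.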
The Travis failures are caused by environment issues and have nothing to do with my change.