[WIP][SPARK-42715][SQL] Tips for Optimizing NegativeArraySizeException #40341

Closed
wants to merge 1 commit into from

chong0929
Contributor

What changes were proposed in this pull request?

In ORC batch reads, byte arrays store the data of the columns being read. When the total data in a batch exceeds Int.MaxValue, a NegativeArraySizeException can be thrown. This change catches that exception and rethrows the same exception type with a friendlier message.
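
The failure mode described above can be sketched in plain Java. This is a minimal illustration, not the code in this PR: the class `NegativeArraySizeDemo`, the helpers `totalBytes` and `allocate`, and the message wording are all hypothetical. It assumes the buffer size is computed with `int` arithmetic, so a total of 2^31 bytes wraps to a negative value before the array is allocated. The message mentions `spark.sql.orc.columnarReaderBatchSize` only as one plausible tuning hint.

```java
class NegativeArraySizeDemo {

    // Hypothetical helper: computes a buffer size with int arithmetic,
    // which silently wraps once the product exceeds Integer.MAX_VALUE.
    static int totalBytes(int valueCount, int bytesPerValue) {
        return valueCount * bytesPerValue;
    }

    // Hypothetical helper: allocates the buffer and, on failure, rethrows the
    // same exception type with a more descriptive (illustrative) message.
    static byte[] allocate(int size) {
        try {
            return new byte[size];
        } catch (NegativeArraySizeException e) {
            NegativeArraySizeException friendly = new NegativeArraySizeException(
                "Batch data exceeds the maximum array size (computed size: " + size
                    + "). Consider reducing the batch size, for example via "
                    + "spark.sql.orc.columnarReaderBatchSize.");
            friendly.initCause(e); // keep the original exception as the cause
            throw friendly;
        }
    }

    public static void main(String[] args) {
        // 2048 values of 1 MiB each total 2^31 bytes, one more than Integer.MAX_VALUE.
        int size = totalBytes(2048, 1024 * 1024);
        System.out.println("computed size = " + size);
        try {
            allocate(size);
        } catch (NegativeArraySizeException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Rethrowing the same exception type keeps existing `catch` blocks and stack-trace searches working, while the message points the reader at a likely remedy.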

Why are the changes needed?

To provide a friendlier message when reading an ORC file fails with java.lang.NegativeArraySizeException.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

@github-actions github-actions bot added the SQL label Mar 8, 2023
@@ -204,7 +204,12 @@ public void initBatch(
    * by copying from ORC VectorizedRowBatch columns to Spark ColumnarBatch columns.
    */
   private boolean nextBatch() throws IOException {
-    recordReader.nextBatch(wrap.batch());
+    try {
Contributor

Will Parquet have the same issue?

Contributor Author

Good point, I will test that.

-    recordReader.nextBatch(wrap.batch());
+    try {
+      recordReader.nextBatch(wrap.batch());
+    } catch (NegativeArraySizeException e) {
Contributor

Is there a way to write a unit test that triggers and catches the exception?

Contributor Author

Thanks for the suggestions, they sound good. I will add that.

Contributor

I also ran into the same stack trace. How much of an adjustment would be appropriate? @chong0929

@chong0929 chong0929 changed the title [SPARK-42715][SQL] Tips for Optimizing NegativeArraySizeException [WIP][SPARK-42715][SQL] Tips for Optimizing NegativeArraySizeException Mar 10, 2023
@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jun 30, 2023
@github-actions github-actions bot closed this Jul 1, 2023
4 participants