HIVE-24948 Enhancing performance of OrcInputFormat.getSplits with bucket pruning #2126

Closed
EugeneChung wants to merge 1 commit

EugeneChung

What changes were proposed in this pull request?

Apply the bucket pruning BitSet to the list returned by AcidDirectory.getFiles(), so that OrcInputFormat.getSplits avoids the unnecessary overhead of building the whole list of OrcSplits.
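
As a rough illustration of the idea (not the actual patch), the sketch below filters a list of bucket file names against a BitSet of the buckets that survive pruning, before any per-file split work would be done. The class name, the method name, and the `bucket_<N>` file naming pattern are assumptions made for this example; the real change works on the entries returned by AcidDirectory.getFiles().

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Hive.
final class BucketPruningSketch {

    // Assumed naming convention: file names start with "bucket_" followed by digits.
    private static final Pattern BUCKET_NAME = Pattern.compile("^bucket_(\\d+)");

    private BucketPruningSketch() {
    }

    /**
     * Returns only the files whose bucket id is set in includedBuckets.
     * Files whose names do not match the assumed pattern are kept,
     * so the filter never drops data it cannot classify.
     */
    static List<String> pruneByBucket(List<String> fileNames, BitSet includedBuckets) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            Matcher m = BUCKET_NAME.matcher(name);
            if (!m.find()) {
                kept.add(name); // unknown layout: keep conservatively
                continue;
            }
            int bucketId = Integer.parseInt(m.group(1));
            if (includedBuckets.get(bucketId)) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

Filtering at this point means the cost of split generation scales with the number of surviving buckets rather than with the total number of files in the directory.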

Why are the changes needed?

To improve the performance of OrcInputFormat.getSplits when bucket pruning applies.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested in my company's environment by checking the result of a sample query.

…ket pruning

- apply bucket pruning bitset to the list from AcidDirectory.getFiles()

EugeneChung commented Mar 29, 2021

There is only one failure, but it does not seem to be related to my modification.

Client Execution succeeded but contained differences (error code = 1) after executing script_broken_pipe2.q 
24c24
< Caused by: java.io.IOException: Broken pipe
---
> Caused by: java.io.IOException: Stream closed
46c46
< Caused by: java.io.IOException: Broken pipe
---
> Caused by: java.io.IOException: Stream closed
49,58d48
< FAILED: AssertionError java.lang.AssertionError: Client Execution succeeded but contained differences (error code = 1) after executing script_broken_pipe2.q 
< 24c24
< < Caused by: java.io.IOException: Broken pipe
< ---
< > Caused by: java.io.IOException: Stream closed
< 46c46
< < Caused by: java.io.IOException: Broken pipe
< ---
< > Caused by: java.io.IOException: Stream closed
Caused by: java.io.IOException: Stream closed
	at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
	at java.io.OutputStream.write(OutputStream.java:116)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at org.apache.hadoop.hive.ql.exec.TextRecordWriter.write(TextRecordWriter.java:53)
	at org.apache.hadoop.hive.ql.exec.ScriptOperator.process(ScriptOperator.java:431)
	... 26 more


github-actions bot commented Jun 6, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Jun 6, 2021
@github-actions github-actions bot closed this Jun 15, 2021