HIVE-27686 ORC upgraded to 1.8.5. #4690

Merged: 1 commit, Oct 16, 2023

Contributor

@zratkai zratkai commented Sep 12, 2023

What changes were proposed in this pull request?

ORC is upgraded to 1.8.5, which contains a fix needed to use the ORC row-level filter. The tez.grouping.min-size setting also had to be changed so that the compaction tests have 4 buckets.

Why are the changes needed?

To be able to use the ORC row-level filter.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

How was this patch tested?

Manually.

@aturoczy aturoczy left a comment

Please remove the unnecessary Asserts

Change-Id: I9cef3ce5e91819ef2d2c169276aac96bcf0f80c8

sonarcloud bot commented Oct 4, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

No Coverage information
No Duplication information

Warning: The version of Java (11.0.8) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.

Contributor

@zhangbutao zhangbutao left a comment

Thanks for the change! IMO, we can remove totalSize from every UT and QTest, as its value changes between ORC versions and it is not a very important element for ORC UTs.
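
As an illustration of this suggestion (not code from the patch), here is a minimal sketch of a check that compares table parameters while skipping keys that depend on the ORC writer version. The class name `StableStatsCheck` and the helper `mismatches` are hypothetical, and the sketch assumes `totalSize` is the only version-sensitive key; the sample values are the ones visible in the diff in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class StableStatsCheck {
    // Assumption for this sketch: only totalSize depends on the ORC writer version.
    private static final Set<String> VERSION_SENSITIVE_KEYS = Collections.singleton("totalSize");

    /** Compares expected and actual table parameters, skipping version-sensitive keys. */
    public static List<String> mismatches(Map<String, String> expected, Map<String, String> actual) {
        List<String> problems = new ArrayList<>();
        // TreeMap gives a deterministic (alphabetical) reporting order.
        for (Map.Entry<String, String> e : new TreeMap<>(expected).entrySet()) {
            if (VERSION_SENSITIVE_KEYS.contains(e.getKey())) {
                continue;
            }
            String got = actual.get(e.getKey());
            if (!e.getValue().equals(got)) {
                problems.add(e.getKey() + ": expected " + e.getValue() + " but was " + got);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new HashMap<>();
        expected.put("numFiles", "1");
        expected.put("numRows", "4");
        expected.put("totalSize", "704");

        Map<String, String> actual = new HashMap<>(expected);
        actual.put("totalSize", "705"); // differs only in the version-sensitive key

        System.out.println(mismatches(expected, actual)); // prints []
    }
}
```

With a helper like this, a change in file size caused by an ORC upgrade would not fail the test, while a change in numFiles or numRows still would.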

@@ -848,7 +848,7 @@ public void testCompactStatsGather() throws Exception {
         .getParameters();
     Assert.assertEquals("The number of files is differing from the expected", "1", parameters.get("numFiles"));
     Assert.assertEquals("The number of rows is differing from the expected", "4", parameters.get("numRows"));
-    Assert.assertEquals("The total table size is differing from the expected", "704", parameters.get("totalSize"));
+    Assert.assertEquals("The total table size is differing from the expected", "705", parameters.get("totalSize"));

Contributor

I saw you have removed totalSize in the UT of TestCompactor.java, so should we remove this line as well?

Contributor Author

I think it would be better to handle this in a separate ticket as refactoring. Would this be okay so we can close this ticket?

Contributor

sure

Contributor

We're removing some totalSize assertions but not others. Let's keep it consistent and choose one of the options below now:

  1. remove all occurrences that have been changed here by the patch, so it's visible now
  2. remove only from qtests
  3. remove all occurrences from the code -> time-consuming, I'm not recommending this
  4. don't remove any totalSize assertions

I believe only 1) makes sense

Contributor

I just unresolved this conversation; it's time to finally address this whole totalSize issue somehow.
Option 5) would be to create another ticket to remove that stuff, merge ASAP, and then we can see how clean this upgrade is on its own, because masking/totalSize-related changes won't bring noise into this PR. So I guess this is even better than 1).

Contributor Author

Option 5 means we remove only the totalSize checks that are touched in this ticket and not the others, which would be a little weird, because it is hard to identify which ones to eliminate outside this ticket. How can the other ticket be reviewed without this one? So I would say: create a ticket with a clear description of what to delete, either a) from QTests only, or b) from QTests and unit tests. Or create 2 new tickets, one for QTests only and one for unit tests only. Also, since the build environment is really unstable, I had to run this build many times before it finally succeeded.

Contributor

Okay, makes sense. So we can agree to remove totalSize checks here from every place that was affected by this ORC upgrade, and do the rest in subsequent patches, like HIVE-27791.

@abstractdog abstractdog self-requested a review October 16, 2023 08:36
@abstractdog abstractdog merged commit 5d58a21 into apache:master Oct 16, 2023
5 checks passed
tarak271 pushed a commit to tarak271/hive-1 that referenced this pull request Dec 19, 2023
6 participants