Conversation

@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Oct 28, 2020

What changes were proposed in this pull request?

This PR aims to upgrade Apache ORC from 1.5.6 to 1.5.8.

Why are the changes needed?

This will bring eleven bug fixes.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CI with the existing test cases.

@pgaref
Contributor

pgaref commented Oct 28, 2020

Thanks for the patch @dongjoon-hyun -- can you please reopen the PR? I don't see the pre-commit test results at all (I guess they were never triggered).

@dongjoon-hyun
Member Author

Thanks, @pgaref . I closed and reopened this.

@dongjoon-hyun
Member Author

Hi, @sunchao .
Is there still no way to trigger the real CI?

@sunchao
Member

sunchao commented Oct 29, 2020

No. There is no jenkins file in branch-3.1 so there's no way to run CI at the moment. We'd have to do something similar to #1398 to enable that.

@dongjoon-hyun
Member Author

Do you think you can do that for the Apache Hive community?

@sunchao
Member

sunchao commented Oct 29, 2020

yeah I can help on that - I think it won't be too difficult after going through the process for branch-2.3.

@sunchao
Member

sunchao commented Oct 29, 2020

opened #1626

@dongjoon-hyun
Member Author

Thank you so much, @sunchao!

@sunchao
Member

sunchao commented Nov 2, 2020

@dongjoon-hyun can you re-trigger CI perhaps with an empty commit?

@dongjoon-hyun
Member Author

Sure!

Member

@sunchao sunchao left a comment

LGTM. All test failures are pre-existing and there are no new failures.

@dongjoon-hyun
Member Author

Thank you for your review and approval, @sunchao !

@sunchao
Member

sunchao commented Nov 3, 2020

@pgaref let me know if you want to take another look before I merge it. Thanks.

@pgaref
Contributor

pgaref commented Nov 3, 2020

Hey @sunchao, the ORC version bump should be safe, so the patch LGTM -- I see we have some test failures with the new CI. Is that expected, or something to look at?

Cheers

@sunchao
Member

sunchao commented Nov 3, 2020

Thanks @pgaref . Yes, those are existing test failures in the branch that are probably worth taking a look at. Not sure whether they are flaky or real issues.

@sunchao sunchao merged commit f87a7c7 into apache:branch-3.1 Nov 3, 2020
@sunchao
Member

sunchao commented Nov 3, 2020

Merged. Thanks @dongjoon-hyun

@dongjoon-hyun
Member Author

Thank you, @sunchao and @pgaref !

@dongjoon-hyun dongjoon-hyun deleted the HIVE-24316 branch November 3, 2020 16:29
@dongjoon-hyun
Member Author

Could you resolve HIVE-24316 please, @sunchao ?

@sunchao
Member

sunchao commented Nov 4, 2020

it's done

@dongjoon-hyun
Member Author

Thanks!

@glapark
Contributor

glapark commented Nov 12, 2020

Hello,

This commit seems to break four existing test cases in org.apache.hadoop.hive.ql.io.orc.TestOrcFile:

[ERROR] Failures:
[ERROR] TestOrcFile.testMemoryManagementV11:1992 stripe 1 is too long at 18454
[ERROR] TestOrcFile.testMemoryManagementV11:1992 stripe 1 is too long at 18454
[ERROR] TestOrcFile.testMemoryManagementV12:2029 stripe 1 is too long at 9626
[ERROR] TestOrcFile.testMemoryManagementV12:2029 stripe 1 is too long at 9626
[INFO]
[ERROR] Tests run: 44, Failures: 4, Errors: 0, Skipped: 0

In my own manual testing, the four tests succeed with HIVE-24331 (the last commit just before HIVE-24316 in branch-3.1), so there is a good chance that upgrading ORC to 1.5.8 introduces these errors.

The error is generated from the following assertTrue(), so replacing the constant 5000 with 2000 would fix the errors.

for(StripeInformation stripe: reader.getStripes()) {
  i += 1;
  assertTrue("stripe " + i + " is too long at " + stripe.getDataLength(),
      stripe.getDataLength() < 5000);
}

Could someone test the latest commit in branch-3.1 to see if the same errors can be reproduced?

--- Sungwoo

@pgaref
Contributor

pgaref commented Nov 12, 2020

Hey @glapark

This is most probably related to ORC-361 that changed the ORC MemoryManager implementation (to support multi-threaded writers).

> The error is generated from the following assertTrue(), so replacing the constant 5000 with 2000 would fix the errors.

I assume that you replaced it with 20_000 as the stripe sizes are around that number -- we should probably have the assert check the StripeSize which is about 50_000.
More details here: https://github.com/apache/orc/pull/433/files#diff-eae6097544d7965e078d582758f745fe52a6772bc71cbafe6068317a68595583R2493
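
To make the threshold discussion concrete, here is a minimal, self-contained sketch. It is not the actual TestOrcFile code: the class and method names are hypothetical, it does not call the ORC API, and the lengths are simply the two "stripe 1" values from the failure log above. It only shows how those reported lengths compare against the candidate limits 2000, 5000, and 20_000.

```java
import java.util.Optional;

public class StripeLengthCheck {

    /**
     * Hypothetical helper mirroring the assertion in the test: returns the
     * failure message for the first stripe whose data length is not strictly
     * below the limit, or an empty Optional if every stripe passes.
     */
    static Optional<String> firstViolation(long[] dataLengths, long limit) {
        for (int i = 0; i < dataLengths.length; i++) {
            if (dataLengths[i] >= limit) {
                // Stripes are numbered from 1, as in the original message.
                return Optional.of("stripe " + (i + 1) + " is too long at " + dataLengths[i]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Lengths taken from the log: testMemoryManagementV11 and V12.
        long[] observed = {18454L, 9626L};
        long[] candidateLimits = {2000L, 5000L, 20_000L};

        for (long limit : candidateLimits) {
            String result = firstViolation(observed, limit).orElse("all stripes pass");
            System.out.println("limit " + limit + ": " + result);
        }
    }
}
```

With these inputs, only the 20_000 limit passes; both 2000 and 5000 reject the 18454-byte stripe.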

@dongjoon-hyun
Member Author

Thank you guys. I'll take a look at them, too.

@dongjoon-hyun
Member Author

Got it. I found what I missed. I'll make a follow-up.
