HADOOP-17224. Install Intel ISA-L library in Dockerfile. #2537
(!) A patch to the testing environment has been detected. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
💔 -1 overall
💔 -1 overall
@tasanuma In the last build, indeed some EC tests failed with |
All of the OOMs are "unable to create new native thread", which indicates a ulimit or a resource shortage when creating an LWP. The first OOM is in TestJvmMetrics in hadoop-common. If ISA-L is related, the cause should be in the code path of ErasureCodeNative#loadLibrary. I don't have a clear insight yet. I think we have been familiar with test failures caused by "unable to create new native thread" for a long time.
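To separate this kind of failure from heap exhaustion when reading test results, a check along the following lines could help. This is only a minimal sketch: the class and method names are hypothetical and not part of Hadoop, and it assumes the JVM reports the error with the message text quoted above (other JVM versions may word it differently).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class NativeThreadOomCheck {
    // Message text as quoted in this thread; assumed, may differ across JVM versions.
    static final String NATIVE_THREAD_MSG = "unable to create new native thread";

    /**
     * Returns true if the throwable, or any throwable in its cause chain,
     * is an OutOfMemoryError whose message mentions native thread creation.
     */
    public static boolean isNativeThreadExhaustion(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (cur instanceof OutOfMemoryError
                    && msg != null
                    && msg.contains(NATIVE_THREAD_MSG)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Report how many threads this JVM currently has and its peak so far,
        // which can be compared against the process/thread ulimit of the host.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("live threads: " + threads.getThreadCount());
        System.out.println("peak threads: " + threads.getPeakThreadCount());

        Throwable sample = new OutOfMemoryError(NATIVE_THREAD_MSG);
        System.out.println("thread exhaustion: " + isNativeThreadExhaustion(sample));
    }
}
```

An OOM classified this way points at thread or process limits in the build environment rather than at heap usage of the code under test.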
Jenkins ran 4 times on this PR. The best result was 25 failures and the worst was 274 failures. According to some of the latest QBT results, the best is 17 failures (#349) and the worst is 224 failures (#348). The average is about 50 failures. I can't say for sure, but I don't think ISA-L has much to do with the OOMs.
@amahussein What do you think about it? |
@iwasakims, I cannot be fully confident about that. For sure, we do not want to blame those pre-existing failures on ISA-L. However, adding ISA-L could increase failures because of either the Hadoop code or the native code. I think there are two approaches:
|
@tasanuma, @iwasakims Based on the Yetus documentation:
|
@amahussein I didn't consider it. Thanks for trying it on HADOOP-17438. Let's see the result. I'm also paying attention to #2556, in which Akira is trying to reduce threadCount for unit tests. The result seems very good for now.
Hey @tasanuma and @iwasakims . |
Force-pushed from 65dbd42 to 4107c9b.
Thanks for letting me know, @amahussein. I rebased and pushed it again. |
💔 -1 overall
💔 -1 overall
I triggered the Yetus build to run all unit tests twice, and the failed tests were not caused by OOM. I think we can commit this again.
💔 -1 overall
@tasanuma Have you verified that all the EC unit tests are passing after adding the library (i.e., |
+1 (non-binding) |
+1 to commit this again.
Merged it. Thanks for your comments and reviews, @amahussein, @iwasakims, @ayushtkn! |
(cherry picked from commit d09e3c9)
NOTICE
Please create an issue in ASF JIRA before opening a pull request,
and you need to set the title of the pull request which starts with
the corresponding JIRA issue number. (e.g. HADOOP-XXXXX. Fix a typo in YYY.)
For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute