HADOOP-17125. Using snappy-java in SnappyCodec #2297
Conversation
@@ -1,166 +0,0 @@
/*
Per #2201 (comment): is that native code used in hadoop-mapreduce-client-nativetask? If so, we probably need to keep it for now.
Hmm, because we removed the native methods from the Java files, I think we no longer generate the .h files needed for compilation: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
[WARNING] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2297/src/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.c:32:10: fatal error: org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h: No such file or directory
[WARNING] #include "org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h"
[WARNING] ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[WARNING] compilation terminated.
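The missing header follows from the JNI build convention: `javac -h` (formerly `javah`) generates one `.h` file per class that declares `native` methods, with dots in the fully qualified class name turned into underscores. Once the `native` declarations are removed from the Java source, no header is generated and the leftover `.c` file fails to compile, as in the log above. A minimal sketch of the naming convention (hypothetical helper, not Hadoop code):

```java
public class JniHeaderName {
    // Maps a fully qualified class name to the header file name that
    // `javac -h` would generate for it: dots become underscores, ".h" appended.
    // (Full JNI name mangling also escapes '_' and non-ASCII characters;
    // plain class names like the one below contain neither.)
    static String headerFor(String fqcn) {
        return fqcn.replace('.', '_') + ".h";
    }

    public static void main(String[] args) {
        // prints: org_apache_hadoop_io_compress_snappy_SnappyDecompressor.h
        System.out.println(
            headerFor("org.apache.hadoop.io.compress.snappy.SnappyDecompressor"));
    }
}
```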
Btw, I don't see them used in hadoop-mapreduce-client-nativetask, unless I missed something. Let's wait for the build and tests.
The only test failure is in
, I guess it's because in
@dbtsai Yeah, let me look at it today. Hope to pass all tests soon.
@sunchao I think all tests passed. But there are two -1s, do you know what they mean?
@sunchao Thanks. I saw a checkstyle failure:
But I don't change its style in this diff.
And I think for checkstyle, it is just a -0? For the two cc links, I don't see any errors: there are some warnings, but they are not related to the change here. Are they known issues?
Yes. I also don't see any error in the log files. I think we can check the Yetus repo to see how it decides whether it is a -1 or a -0. cc @jojochuang @aajisaka @steveloughran do you have any idea what caused the CI failure here?
BTW I think we no longer need the
Yes, I plan to remove them once we get rid of the -1.
Checked the cc warnings and the related code. They were committed a long time ago, e.g. in 2014, and are not touched here. Many of the cc warnings are So I am not sure which one is better: fixing these compilation warnings, or ignoring them?
Yeah. Looks to me like we can just ignore these for now and proceed with the other things in this PR.
Looks like CI failed to fetch and install Yetus? @sunchao do you know how we can re-trigger the CI build and testing?
Just re-triggered the job, let's see what happens
It seems it still failed to fetch and install Yetus, and not just for this PR; other PRs also encountered it...
@sunchao Who should we notify about the CI issue?
@viirya interesting ... I think you can send an email to the Hadoop dev list (
OK, seems the CI is working now.
I ran a benchmark and compatibility test locally. I used SnappyCodec to write and read a ~200MB SequenceFile. Before and after this change, the performance is nearly the same. For the compatibility test, I wrote a SequenceFile using each of the two SnappyCodec implementations and read it back with the other. The files can be read without problems, and the file sizes are also the same.
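The compatibility check described in that comment, write with one codec implementation, read back with the other, and compare bytes, can be sketched as a generic round-trip harness. Here java.util.zip's GZIP streams stand in for the two SnappyCodec implementations, since the real test needs the Hadoop and snappy-java jars on the classpath (hypothetical harness, not the PR's actual test code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class RoundTrip {
    // Writer side: compress with implementation A.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Reader side: decompress with implementation B. In the actual test these
    // would be the native SnappyCodec and the snappy-java based SnappyCodec.
    static byte[] decompress(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "a ~200MB SequenceFile payload would go here".getBytes();
        // Cross-implementation round trip: bytes must come back identical.
        System.out.println(Arrays.equals(original, decompress(compress(original)))); // prints: true
    }
}
```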
Hmm, for CompressDecompressTester.java, it seems to me that it is from the original code?

    else if (compressor.getClass().isAssignableFrom(ZlibCompressor.class)) {
      return ZlibFactory.isNativeZlibLoaded(new Configuration());
  - }
  - else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)
  -     && isNativeSnappyLoadable())
  + }
  + else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class))

Anyway, I can fix it here if you think it is ok.
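Note the direction of the `isAssignableFrom` checks in that test code: `compressor.getClass().isAssignableFrom(SnappyCompressor.class)` asks whether a `SnappyCompressor` could be assigned to a variable of the compressor's runtime class, which is the reverse of an `instanceof` test. A quick standalone illustration (toy classes, not Hadoop's):

```java
public class AssignableDemo {
    static class Compressor {}
    static class SnappyCompressor extends Compressor {}

    public static void main(String[] args) {
        Compressor c = new SnappyCompressor();
        // instanceof-style direction: is c's runtime class a subtype of
        // Compressor? -> true
        System.out.println(Compressor.class.isAssignableFrom(c.getClass())); // prints: true
        // the test code's direction: can a SnappyCompressor be assigned to a
        // variable of c's runtime class? True here because the runtime class
        // is exactly SnappyCompressor, but false for any sibling subclass,
        // which is why the direction of the check matters.
        System.out.println(c.getClass().isAssignableFrom(SnappyCompressor.class)); // prints: true
    }
}
```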
The style issue was fixed in the last run. The CI failed because of unit tests and the ASF license check (I don't really see the file
Fixed another, and last, style issue. Checked with
The Yetus failures are all unrelated. One minor tweak suggested to that change on test reporting: I don't like ever losing the stacks of nested exceptions, so if you are changing that code, just throw the AssertionError which fail() would normally throw, with the caught exception as the cause. Not your fault, I know, but since you are there...
if (ex.getMessage() != null) {
  fail(joiner.join(name, ex.getMessage()));
} else {
  fail(joiner.join(name, ExceptionUtils.getStackTrace(ex)));
An NPE is why toString() is what new code should do.
Why don't we just throw new AssertionError(name + ex, ex)? That way, the stack trace doesn't get lost, which is something we never want to happen.
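The suggestion preserves the nested stack trace because `AssertionError(String, Throwable)` (available since Java 7) records the original exception as the cause, whereas `fail(message)` only carries the flattened message text. A minimal sketch of the difference (hypothetical names, not the PR's test code):

```java
public class PreserveCause {
    public static void main(String[] args) {
        Exception caught = new IllegalStateException("decompression failed");
        try {
            // Instead of fail(name + caught), which discards the original
            // exception object, keep it attached as the cause:
            throw new AssertionError("SnappyCodec: " + caught, caught);
        } catch (AssertionError e) {
            // The full nested stack trace is still reachable via getCause().
            System.out.println(e.getCause().getMessage()); // prints: decompression failed
        }
    }
}
```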
If making a new PR, the 'compile' is redundant given it's the Maven default? The license failure is:
No harm fixing it as part of this patch... add the 'jobTokenPassword' entry from below to ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/pom.xml
Otherwise the patch is looking good to me.
The native compile complaints seem unrelated...
…d for license check.
Thanks @steveloughran and @saintstack. Updated the diff based on your suggestions. |
💔 -1 overall
This message was automatically generated.
I think it's actually some test runner bug; really it should be cleaned up. But we can pull in the patch to shut it up.
Ok, I'm happy too. +1, merging to trunk and branch-3.3
This switches the SnappyCodec to use the snappy-java codec, rather than the native one. To use the codec, snappy-java.jar (from org.xerial.snappy) needs to be on the classpath. This comes in as an Avro dependency, so it is already on the hadoop-common classpath, as well as in hadoop-common/lib. The version used is now managed in the hadoop-project POM; initially 1.1.7.7. Contributed by DB Tsai and Liang-Chi Hsieh. Change-Id: Id52a404a0005480e68917cd17f0a27b7744aea4e
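The commit message says the snappy-java version is managed in the hadoop-project POM. A parent-POM entry for that typically looks like the following sketch; only the coordinates (org.xerial.snappy:snappy-java) and the version 1.1.7.7 come from the commit message, while the property name and layout are assumptions:

```xml
<!-- hadoop-project/pom.xml (sketch; property name is hypothetical) -->
<properties>
  <snappy-java.version>1.1.7.7</snappy-java.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.xerial.snappy</groupId>
      <artifactId>snappy-java</artifactId>
      <version>${snappy-java.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Child modules such as hadoop-common can then declare the dependency without a version and inherit it from the parent.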
Thanks for pushing this through @steveloughran +1 on master and branch-3.3.
Thanks @steveloughran @saintstack @sunchao @dbtsai
Thanks all for helping and pushing this through! This will greatly simplify how people deploy the Snappy native lib.
The JIRA on apache is offline & updated - we need to remember to update that, including something in the release notes
Looks like the JIRA is back now? https://issues.apache.org/jira/browse/HADOOP-17125
JIRA closed, added a release note.
Thanks @steveloughran - could you assign the JIRA to @viirya ?
@viirya ...what's your JIRA username?
@steveloughran Username is viirya too. Thanks.
@viirya assigned JIRA to you. You are also free to assign any other Hadoop JIRAs to yourself...
@steveloughran Thank you! I tried to assign this ticket, but it seems I cannot do it.
You needed to be listed in the project settings as someone with the right permissions. It's done now.
See https://issues.apache.org/jira/browse/HADOOP-17125 for details.
Discussed offline with @dbtsai and submitted this based on #2201.