AVRO-2456: Add interop test for the snappy and zstd codec #573
Conversation
I've just noticed AVRO-2436 broke the data interop test. After that change, the Python 2 interop test started generating snappy-compressed test data, but the Java interop test fails to read it because the required library is missing.
@@ -142,6 +142,14 @@
      <artifactId>javax.annotation-api</artifactId>
      <version>1.3.2</version>
    </dependency>
+    <dependency>
I think we should not define these dependencies for avro-ipc; we're only using them in the avro module itself.
Thanks for the review and the comment @Fokko!
Hmm, I think we need to add dependencies on snappy-java and zstd-jni to the avro-ipc module, since that module contains the data interop test for Java, which reads snappy- and zstd-compressed Avro files generated by other language bindings. Without those dependencies, ./build.sh test
on the top-level directory fails as follows:
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M3:test (default-test) @ avro-ipc ---
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
Reading data files from directory: /home/sekikn/avro/lang/java/ipc/../../../build/interop/data
[INFO] Running org.apache.avro.DataFileInteropTest
Reading with specific:
Reading: py_snappy.avro
Reading with generic:
Reading: py_snappy.avro
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.349 s <<< FAILURE! - in org.apache.avro.DataFileInteropTest
[ERROR] testGeneratedSpecific(org.apache.avro.DataFileInteropTest) Time elapsed: 0.333 s <<< ERROR!
org.apache.avro.AvroRuntimeException: Unrecognized codec: snappy
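For reference, a dependency declaration along these lines in lang/java/ipc/pom.xml would make the codec classes available to the interop test. This is only a sketch: the group and artifact IDs are the real coordinates for these codecs, but the version properties and the test scope are illustrative assumptions, not necessarily what this PR uses.

```xml
<!-- Sketch only: version properties are placeholders, and test scope is
     an assumption based on the codecs being needed only by the interop test. -->
<dependency>
  <groupId>org.xerial.snappy</groupId>
  <artifactId>snappy-java</artifactId>
  <version>${snappy.version}</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>com.github.luben</groupId>
  <artifactId>zstd-jni</artifactId>
  <version>${zstd-jni.version}</version>
  <scope>test</scope>
</dependency>
```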
In addition, I've just noticed we also have to add a dependency on zstd-jni to the avro-tools module. Without it, the currently released avro-tools can't read zstd-compressed Avro files on its own. (snappy-java is already among that module's dependencies.)
$ curl -sLO https://www-us.apache.org/dist/avro/avro-1.9.0/java/avro-tools-1.9.0.jar
$ java -jar avro-tools-1.9.0.jar tojson build/interop/data/java_snappy.avro > /dev/null
19/07/05 14:09:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ echo $?
0
$ java -jar avro-tools-1.9.0.jar tojson build/interop/data/java_zstandard.avro
19/07/05 14:09:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: com/github/luben/zstd/ZstdOutputStream
at org.apache.avro.file.ZstandardCodec.decompress(ZstandardCodec.java:78)
at org.apache.avro.file.DataFileStream$DataBlock.decompressUsing(DataFileStream.java:379)
at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:213)
at org.apache.avro.tool.DataFileReadTool.run(DataFileReadTool.java:80)
at org.apache.avro.tool.Main.run(Main.java:66)
at org.apache.avro.tool.Main.main(Main.java:55)
Caused by: java.lang.ClassNotFoundException: com.github.luben.zstd.ZstdOutputStream
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 6 more
$ echo $?
1
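The NoClassDefFoundError above is the classic symptom of a class compiled against an optional jar being loaded without that jar on the classpath. As a minimal stdlib-only sketch of the same failure mode, the probe below checks whether a class is loadable; the zstd class name is taken from the stack trace, everything else is illustrative:

```java
public class CodecClassProbe {
    // Returns true if the named class can be loaded from the current
    // classpath, false if loading fails with ClassNotFoundException.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The zstd class avro-tools failed to load; without the zstd-jni
        // jar on the classpath this reports false.
        System.out.println(isOnClasspath("com.github.luben.zstd.ZstdOutputStream"));
        // java.util.zip.Deflater ships with the JDK, so the deflate
        // codec never has this problem.
        System.out.println(isOnClasspath("java.util.zip.Deflater"));
    }
}
```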
Good catch, Seki. Thanks for the explanation.
Thanks @Fokko! May I add a dependency on zstd-jni to avro-tools' pom.xml within this PR, or should I do that in a separate JIRA issue and PR?
Thanks, @sekikn. A separate PR would be preferred for traceability.
Make sure you have checked all steps below.

Jira

Tests

This PR added snappy and zstd interop tests to the Java and Python 2 bindings. It also changed the Ruby binding to run its interop tests only for the null and deflate codecs, so that those tests pass.

I ran ./build.sh test locally in the top-level directory and confirmed that all tests passed.

Commits

Documentation