AVRO-2456: Add interop test for the snappy and zstd codec #573

Merged
merged 2 commits into from Jul 12, 2019

Conversation

sekikn
Contributor

@sekikn sekikn commented Jun 30, 2019

Make sure you have checked all steps below.

Jira

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

This PR adds snappy and zstd interop tests to the Java and Python2 bindings. It also changes the Ruby bindings to run the interop tests only for the null and deflate codecs, so that the Ruby interop tests pass.
I ran ./build.sh test locally in the top-level directory and confirmed that all tests pass.

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain Javadoc that explains what they do

@probot-autolabeler probot-autolabeler bot added build Java Pull Requests for Java binding Python Ruby labels Jun 30, 2019
@sekikn
Contributor Author

sekikn commented Jul 2, 2019

I've just noticed that AVRO-2436 broke the data interop test. Since that change, the Python2 interop test generates snappy-compressed test data, but the Java interop test fails to read it because the library it depends on is missing.
This PR also fixes that problem by revising pom.xml, so I'd appreciate it if you could review it ASAP. :)

@@ -142,6 +142,14 @@
<artifactId>javax.annotation-api</artifactId>
<version>1.3.2</version>
</dependency>
<dependency>
Contributor

I think we should not define these dependencies for avro-ipc. We're only using this in the avro dependency itself.

Contributor Author

Thanks for the review and the comment @Fokko!

Hmm, I think we need to add dependencies on snappy-java and zstd-jni to the avro-ipc module, since that module contains the data interop test for Java, which reads snappy- and zstd-compressed Avro files generated by the other language bindings. Without those dependencies, running ./build.sh test in the top-level directory fails as follows:

[INFO] 
[INFO] --- maven-surefire-plugin:3.0.0-M3:test (default-test) @ avro-ipc ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
Reading data files from directory: /home/sekikn/avro/lang/java/ipc/../../../build/interop/data
[INFO] Running org.apache.avro.DataFileInteropTest
Reading with specific:
Reading: py_snappy.avro
Reading with generic:
Reading: py_snappy.avro
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.349 s <<< FAILURE! - in org.apache.avro.DataFileInteropTest
[ERROR] testGeneratedSpecific(org.apache.avro.DataFileInteropTest)  Time elapsed: 0.333 s  <<< ERROR!
org.apache.avro.AvroRuntimeException: Unrecognized codec: snappy
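
To see which codec an interop file such as py_snappy.avro declares before a reader trips over it, the container header can be inspected directly. The following is only a minimal sketch based on the Avro object container file layout (the magic bytes "Obj" followed by 1, then a metadata map whose "avro.codec" entry names the codec, with "null" assumed when the entry is absent). The class and method names (AvroCodecPeek, codecOf) are hypothetical and are not part of Avro or of this PR.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Avro: reads only the container header.
final class AvroCodecPeek {

    // Decodes one Avro long: variable-length bytes, zigzag encoded.
    static long readLong(InputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("truncated varint");
            }
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
            if (shift > 63) {
                throw new IOException("varint too long");
            }
        }
        return (raw >>> 1) ^ -(raw & 1); // undo zigzag
    }

    // Reads a length-prefixed byte sequence (Avro "bytes" and "string").
    static byte[] readBytes(InputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) {
            throw new IOException("bad length: " + len);
        }
        byte[] buf = new byte[(int) len];
        new DataInputStream(in).readFully(buf);
        return buf;
    }

    // Returns the codec named in the header metadata, or "null" if absent.
    static String codecOf(InputStream in) throws IOException {
        byte[] magic = new byte[4];
        new DataInputStream(in).readFully(magic);
        if (magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j' || magic[3] != 1) {
            throw new IOException("not an Avro container file");
        }
        String codec = "null";
        // The metadata map is a series of blocks terminated by a zero count.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {
                count = -count;
                readLong(in); // a negative count is followed by the block size in bytes
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                byte[] value = readBytes(in);
                if (key.equals("avro.codec")) {
                    codec = new String(value, StandardCharsets.UTF_8);
                }
            }
        }
        return codec;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(codecOf(in));
        }
    }
}
```

Run against a file path given as the first argument, this prints the declared codec name without needing the Avro, snappy-java, or zstd-jni jars, which makes it usable even in the broken setup described above.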

In addition, I've just noticed that we also have to add a dependency on zstd-jni to the avro-tools module. Without it, the currently released avro-tools jar can't read zstd-compressed Avro files on its own. snappy-java is already among this module's dependencies.

$ curl -sLO https://www-us.apache.org/dist/avro/avro-1.9.0/java/avro-tools-1.9.0.jar
$ java -jar avro-tools-1.9.0.jar tojson build/interop/data/java_snappy.avro > /dev/null 
19/07/05 14:09:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ echo $?
0
$ java -jar avro-tools-1.9.0.jar tojson build/interop/data/java_zstandard.avro 
19/07/05 14:09:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: com/github/luben/zstd/ZstdOutputStream
	at org.apache.avro.file.ZstandardCodec.decompress(ZstandardCodec.java:78)
	at org.apache.avro.file.DataFileStream$DataBlock.decompressUsing(DataFileStream.java:379)
	at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:213)
	at org.apache.avro.tool.DataFileReadTool.run(DataFileReadTool.java:80)
	at org.apache.avro.tool.Main.run(Main.java:66)
	at org.apache.avro.tool.Main.main(Main.java:55)
Caused by: java.lang.ClassNotFoundException: com.github.luben.zstd.ZstdOutputStream
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 6 more
$ echo $?
1
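
A quick way to confirm whether a given classpath (for example, a shaded tools jar) actually contains the codec libraries is to probe for one class from each. This is only a sketch: com.github.luben.zstd.ZstdOutputStream comes from the stack trace above, org.xerial.snappy.Snappy is the main entry class of snappy-java, and the class name CodecClasspathCheck is hypothetical.

```java
// Hypothetical diagnostic, not part of Avro: reports whether the codec
// libraries are visible to the class loader that loaded this class.
final class CodecClasspathCheck {

    // Returns true if the named class can be located. Passing 'false' for
    // initialization means static initializers do not run, so no native
    // library is loaded by the probe itself.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, CodecClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] probes = {
            "org.xerial.snappy.Snappy",               // snappy-java
            "com.github.luben.zstd.ZstdOutputStream"  // zstd-jni
        };
        for (String name : probes) {
            System.out.println(name + ": " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Compiling this class and running it with the jar under test on the classpath (java -cp with both entries) would show which of the two libraries is bundled. The probe only checks that the class can be located, not that the native library loads on the current platform.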

@Fokko
Contributor

Fokko commented Jul 9, 2019 via email

@sekikn
Contributor Author

sekikn commented Jul 10, 2019

Thanks @Fokko! May I add a dependency on zstd-jni to avro-tools' pom.xml within this PR? Or should I do that in a separate JIRA issue and PR?

Contributor

@Fokko Fokko left a comment

Thanks, @sekikn. A separate PR would be preferred for traceability.

@Fokko Fokko merged commit fbfb135 into apache:master Jul 12, 2019
RyanSkraba pushed a commit to kojiromike/avro that referenced this pull request Jan 23, 2020
RyanSkraba pushed a commit to kojiromike/avro that referenced this pull request Jan 27, 2020
RyanSkraba pushed a commit to RyanSkraba/avro that referenced this pull request Jan 27, 2020
RyanSkraba pushed a commit to kojiromike/avro that referenced this pull request Jan 28, 2020
2 participants