native-image -H:Optimize=2 leads to wrong execution / code. #1995

Closed
michael-simons opened this issue Dec 16, 2019 · 10 comments


michael-simons commented Dec 16, 2019

Hello everyone.
My name is Michael. I work for Neo4j, and I contributed the Neo4j extension to the Quarkus project. That extension worked nicely in native mode but stopped doing so when Quarkus upgraded to GraalVM 19.3 (see the related issue quarkusio/quarkus#6115).

A little background to our driver: We shade Netty and have some substitutions in place to make SSL and some reflections work, which you can find here:

The bug happens when the driver receives a message from the server in a pack stream format. I have a fully runnable example here https://github.com/michael-simons/neo4j-java-driver-native-example, but you need to have a running Neo4j instance to make use of it.

I was, however, able to pinpoint our issue to the code inside Unpacker, which you can find here: https://github.com/neo4j/neo4j-java-driver/blob/4.0.0/driver/src/main/java/org/neo4j/driver/internal/packstream/PackStream.java#L401

Prior to GraalVM 19.3, the native image compiler produced working code regardless of the -H:Optimize=n setting. Since 19.3, the various methods, and I think the partial switch statements, seem to be folded into one, which leads to the wrong exceptions being thrown.

You'll find attached a reproducer that doesn't need a running Neo4j instance. Its only dependency is io.netty:netty-buffer, and it recreates the scenario in which the message handler receives a welcome message from our server. The class optimizationissue.Reproducer creates a Netty ByteBuf from the message that the server sent and decodes it as the original Neo4j driver would do. It uses a copy of PackStream$Unpacker that I linked above.

The message starts with a struct header, so we start by calling optimizationissue.PackStream.Unpacker#unpackStructHeader. And while the message fits a struct header, some other branch seems to be called.
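For readers who don't want to open the linked source, here is a minimal sketch of the kind of dispatch involved: a nibble check followed by a partial switch on the marker byte. It is a simplified illustration, not the driver's actual Unpacker. The class name StructHeaderSketch is hypothetical, java.nio.ByteBuffer stands in for the Netty ByteBuf, and the marker constants and sample bytes are assumptions based on my reading of the PackStream format.

```java
import java.nio.ByteBuffer;

// Simplified, hypothetical stand-in for the struct-header decoding path.
public class StructHeaderSketch {
    // Marker values assumed from the PackStream format.
    static final byte TINY_STRUCT = (byte) 0xB0; // high nibble marks a tiny struct
    static final byte STRUCT_8 = (byte) 0xDC;    // size follows as one unsigned byte
    static final byte STRUCT_16 = (byte) 0xDD;   // size follows as two unsigned bytes

    public static long unpackStructHeader(ByteBuffer in) {
        byte markerByte = in.get();
        byte markerHighNibble = (byte) (markerByte & 0xF0);
        byte markerLowNibble = (byte) (markerByte & 0x0F);

        // Tiny structs carry their field count in the low nibble.
        if (markerHighNibble == TINY_STRUCT) {
            return markerLowNibble;
        }
        // "Partial" switch: only the struct markers are handled here,
        // every other marker ends up in the default branch.
        switch (markerByte) {
            case STRUCT_8:
                return in.get() & 0xFF;
            case STRUCT_16:
                return in.getShort() & 0xFFFF;
            default:
                throw new IllegalStateException(
                        "Expected a struct, but got: 0x" + Integer.toHexString(markerByte & 0xFF));
        }
    }

    public static void main(String[] args) {
        // Illustrative bytes: a tiny struct with one field, then a signature byte.
        ByteBuffer message = ByteBuffer.wrap(new byte[] {(byte) 0xB1, 0x70});
        System.out.println(unpackStructHeader(message)); // prints 1
    }
}
```

On a regular JVM the tiny-struct branch is taken for such input. The behaviour described in this issue is that the optimized native image appears to end up in a different branch for a message that does fit a struct header.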

The expected output is {server=Neo4j/4.0.0-rc01, connection_id=bolt-536}.

If you add -H:Optimize=0 to the native image generation, things work as expected.

Find the complete project attached and thanks for looking into this.

graal-issue.zip

graal-issue-no-netty-dependencies.zip

michael-simons added a commit to michael-simons/neo4j-java-driver that referenced this issue Dec 16, 2019
@eginez
Contributor

eginez commented Dec 16, 2019

Hi @michael-simons, is this a problem with JDK 8 or JDK 11?

@michael-simons
Author

The reproducer above was created and verified to fail on macOS 10.13 with GraalVM 19.3 CE JDK 8.

With 19.2 and earlier, it works as expected.

The people in the linked Quarkus ticket who use the Neo4j driver are probably on Linux, or at least Quarkus’ CI is.

@michael-simons
Author

Just mvn clean package && ./target/reproducer should be enough to end with a stack trace and not with the expected exception.

@michael-simons
Author

Hi @eginez I can confirm that this happens on

  • macOS Graal 19.3 JDK 11
  • Linux amd64 Graal 19.3 JDK 8 and JDK 11

as well.

I also wanted to rule out the possibility that our usage of Netty leads to the error, so I created a second reproducer without Netty as a dependency. It is attached to the original comment as https://github.com/oracle/graal/files/3972548/graal-issue-no-netty-dependencies.zip. Hope that helps. Thank you!

@eginez
Contributor

eginez commented Dec 17, 2019

@michael-simons yes, I was able to replicate the problem, even at the head of master (building from source). We are working on it.

@thomaswue thomaswue added this to the 19.3.1 milestone Dec 18, 2019
@vjovanov vjovanov added the bug label Dec 18, 2019
@eginez
Contributor

eginez commented Dec 19, 2019

@michael-simons we have found the source of the bug and are in the process of verifying the fix. A commit with it will be merged shortly.

@eginez
Contributor

eginez commented Dec 19, 2019

@michael-simons the fix just landed in the master branch: 276ff2b

Feel free to verify it against the downstream dependencies.

@michael-simons
Author

I’ll gladly do this. Do you have snapshot builds or should I build GraalVM myself?

@eginez
Contributor

eginez commented Dec 19, 2019

The fastest option is to build from source until we have a build that includes it.

@michael-simons
Author

I can confirm that the Neo4j driver works correctly again with 19.3.0.2. Thanks so much for making this a priority.
