Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent NullPointerException under 9.1.1.0/9.1.2.0 #3999

Closed
abiacco opened this issue Jul 7, 2016 · 19 comments
Closed

Intermittent NullPointerException under 9.1.1.0/9.1.2.0 #3999

abiacco opened this issue Jul 7, 2016 · 19 comments

Comments

@abiacco
Copy link

@abiacco abiacco commented Jul 7, 2016

I'm running my jruby app as a war under Tomcat 8.0.35.
Under 9.1.1.0 and 9.1.2.0 (i haven't seen the error on <= 9.1.0.0, yet) I will receive, upon war deployment:

LoadError: load error: ethon/curls/options -- java.lang.NullPointerException: null
require at org/jruby/RubyKernel.java:944
block in require at WEB-INF/vendor/bundle/gems/activesupport-4.2.6/lib/active_support/dependencies.rb:274
load_dependency at WEB-INF/vendor/bundle/gems/activesupport-4.2.6/lib/active_support/dependencies.rb:240
require at WEB-INF/vendor/bundle/gems/activesupport-4.2.6/lib/active_support/dependencies.rb:274
at WEB-INF/vendor/bundle/gems/ethon-0.8.1/lib/ethon/curl.rb:2
require at org/jruby/RubyKernel.java:944

Now I thought this could be an ethon problem, but i've been using it a while and never saw this error (yet) up until trying 9.1.1.0/9.1.2.0.
In my testing to reproduce this, i am NOT hot deploying. I am stopping tomcat, removing the webapp dir, and starting tomcat every time.
@PetrKaleta mentioned that he was seeing this in issue #3680 under rake. I've only seen it once in rake upon building my webapp (out of a couple dozen builds), every other time it's on deployment.
It happens maybe once every 10 deployments.
Under suggestion from @enebo to enable ir.reading.debug I have posted my full log to:

https://gist.github.com/abiacco/897fb4829b1eb601670c09d9366c51f5

I do have a large log with ir.debug enabled as well. If requested, i prefer to provide that privately.

Thanks

-Tony

Environment

jruby 9.1.2.0 (2.3.0) 2016-05-26 7357c8f Java HotSpot(TM) 64-Bit Server VM 25.91-b14 on 1.8.0_91-b14 +jit [linux-x86_64]

CentOS release 6.7 (Final)
Tomcat 8.0.35
java server jre 8u91
warbler 2.0.3
bundler 1.12.5

Also tested on jruby 9.1.1.0 and java 8u77

Linux app-01.local 2.6.32-573.18.1.el6.x86_64 #1 SMP Tue Feb 9 22:46:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Expected Behavior

No NullPointerException and rack failure :)

Actual Behavior

Intermittent NullPointerException from ethon gem.

@abiacco
Copy link
Author

@abiacco abiacco commented Jul 11, 2016

After some testing on 9.1.0.0 I hit the error on 1 out of the 100 consecutive deploys i did.

@headius
Copy link
Member

@headius headius commented Jul 11, 2016

Intermittent usually means concurrency. I'll see if I can reproduce.

@headius
Copy link
Member

@headius headius commented Jul 11, 2016

Ok, the link you provided for ir.reading.debug output appears to be broken. Perhaps you can update that.

I'd be more interested in getting the following: reproduce with -d -Xbacktrace.style=full -Xjit.threshold=0 passed to JRuby, so we get the LoadError raw output and a full trace of all Ruby and JRuby code together.

Without a NullPointException trace (which -d should give you) I can only guess at this right now.

@abiacco
Copy link
Author

@abiacco abiacco commented Jul 11, 2016

@abiacco
Copy link
Author

@abiacco abiacco commented Jul 11, 2016

I'll try with the additional flags you requested. I can figure out their java/tomcat equivalents, except for -d. maybe -Djruby.debug.fullTrace=true?

@enebo enebo added this to the JRuby 9.1.3.0 milestone Jul 12, 2016
@enebo enebo added the ir label Jul 12, 2016
@enebo
Copy link
Member

@enebo enebo commented Jul 12, 2016

So looking at that gist I can surmise when we prepare this code for compilation we do some data flow analysis (live variable). During that analysis we ask for the data for a particular BB and it returns null. This means something has a reference to and ID in our CFG which was never setup during our DFG construction (missed or dangling edge)

I think DataFlowGraph.buildFlowGraph will follow all nodes in the CFG. Unless we somehow have some errant incoming/outgoing edge referring to a node which no longer exists (which would be an invariant broken in CFG).

@abiacco we really really hope this is reproducible with -Xjit.threshold=0 (from Java end -Djruby.jit.threshold=0)

@abiacco
Copy link
Author

@abiacco abiacco commented Jul 12, 2016

I didn't really understand any of that but ok :)

Last night i ran a 100 deployment test and did not hit the error at all using:

-Djruby.jit.threshold=0 -Djruby.backtrace.style=full -Djruby.debug.fullTrace=true

I started the test again this morning without those 3 params and hit the error on deployment # 7

@abiacco
Copy link
Author

@abiacco abiacco commented Jul 19, 2016

Here's the gist with the extra trace flags

https://gist.github.com/abiacco/fc907d8afc57a56b3c11f17afc065680

@headius
Copy link
Member

@headius headius commented Aug 11, 2016

Can you still reproduce this with a master build (from http://ci.jruby.org)?

If so, I need you to pass -d to JRuby to force it to dump the underlying exception causing the LoadError in your traces. As it is I can't see what line the NPE actually happened at.

@abiacco
Copy link
Author

@abiacco abiacco commented Aug 12, 2016

I'll give it a go.

Regarding the -d, i still don't know how to pass this down through tomcat with a property.

@headius
Copy link
Member

@headius headius commented Aug 12, 2016

For Tomcat, set JVM property "jruby.cli.debug" to "true". That should be equivalent to passing -d.

@abiacco
Copy link
Author

@abiacco abiacco commented Aug 13, 2016

roger

@abiacco
Copy link
Author

@abiacco abiacco commented Aug 15, 2016

@headius whelp, 500 deployments with snapshot and I could not get the error to rear its head.

@headius
Copy link
Member

@headius headius commented Aug 15, 2016

@abiacco Ok. We're pushing to get 9.1.3.0 out so we'll call this fixed for now. If you can reproduce we'll reopen and look into it.

@headius headius closed this Aug 15, 2016
@abiacco
Copy link
Author

@abiacco abiacco commented Aug 16, 2016

@headius Yep yep, that's fair. And i'll re-test once 9.1.3.0 is out obviously. Much appreciated

@JasonLunn
Copy link

@JasonLunn JasonLunn commented Nov 28, 2016

@abiacco - I'm seeing this intermittently on JRuby 9.1.6.0

@PetrKaleta
Copy link

@PetrKaleta PetrKaleta commented Nov 28, 2016

Same here, happening again in JRuby 9.1.6.0

@enebo
Copy link
Member

@enebo enebo commented Nov 29, 2016

@JasonLunn @PetrKaleta can you supply your backtraces? I am trying to make sure this is the same NPE as in https://gist.github.com/abiacco/fc907d8afc57a56b3c11f17afc065680

@abiacco
Copy link
Author

@abiacco abiacco commented Nov 29, 2016

FYI, i haven't seen my NPE anymore, as of yet. i've been using 9.1.5.0 and just started with 9.1.6.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.