Rails csfr_meta_tag requires minutes to render due to jruby use of /dev/random on a virtual machine #4685

Closed
roboyeti opened this Issue Jun 23, 2017 · 6 comments

roboyeti commented Jun 23, 2017

Environment

  • jruby 1.5.6 (ruby 1.8.7 patchlevel 249) (2014-02-03 6586) (OpenJDK 64-Bit Server VM 1.7.0_131) [amd64-java]
  • rails 4.2.8
  • Linux XXXXXXXX 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
  • System is a virtual machine

Expected Behavior

After converting a Rails application from MRI to JRuby, views should render in a reasonable amount of time (at least as fast as under MRI) on a virtual machine.

Actual Behavior

Client requests to even a basic Rails view using a layout that includes csrf_meta_tags may take several minutes to render. Welcome#index works fine, as does using "render layout: false" in the controller method, but including the line <%= csrf_meta_tags %> slows responses to minutes per request at best. Presumably, JRuby reads from /dev/random, and the entropy pool is not replenished fast enough on a virtual machine.

Given how common VMs are, how badly reads from /dev/random can block, and how widely csrf_meta_tags is used in Rails, JRuby should provide a better out-of-the-box solution. Really, /dev/urandom should probably be used (if not initially, then as a fallback when /dev/random blocks). Most of the "concerns" raised on the man page of /dev/urandom are extremely unlikely or situational (see links below).

Failing that, a timeout on /dev/random reads, with a warning before trying /dev/random again, would be great. The problem can be difficult to isolate, and emitting warnings from a read loop with a timeout seems a reasonable way to tell JRuby users why their software is suddenly frozen for unpredictable (or at least not easily predictable) amounts of time while working fine under MRI.
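
A minimal sketch of the timeout-and-warn idea in Ruby. The helper name `random_bytes_with_warning` and its `warn_after` parameter are hypothetical and are not part of JRuby or Rails. It runs the random read in a worker thread and prints a warning each time the wait exceeds `warn_after` seconds, instead of aborting the read:

```ruby
require "securerandom"

# Hypothetical helper (an assumption for illustration, not a JRuby API).
# Waits for SecureRandom in a worker thread and warns while it is blocked.
def random_bytes_with_warning(n, warn_after: 2.0, out: $stderr)
  worker = Thread.new { SecureRandom.random_bytes(n) }
  waited = 0.0
  # Thread#join(limit) returns nil when the limit expires before the thread ends.
  until worker.join(warn_after)
    waited += warn_after
    out.puts "warning: random source has blocked for #{waited}s; " \
             "the entropy pool may be exhausted"
  end
  worker.value
end

token = random_bytes_with_warning(32)
puts token.unpack("H*").first
```

The read still blocks for as long as the underlying source does; the only change is that the user gets a hint about where the time is going.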

See:
https://www.2uo.de/myths-about-urandom/
https://unix.stackexchange.com/questions/324209/when-to-use-dev-random-vs-dev-urandom

Workaround Used

Installing "haveged" resolved the issue as a workaround.
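
One way to check whether the entropy pool is the bottleneck, before and after installing haveged, is to read the kernel's entropy estimate, which Linux exposes at /proc/sys/kernel/random/entropy_avail. The sketch below assumes a Linux host; the helper name `entropy_avail` is hypothetical, and it returns nil where the file is not available:

```ruby
ENTROPY_AVAIL_PATH = "/proc/sys/kernel/random/entropy_avail"

# Hypothetical helper: returns the kernel's entropy estimate in bits,
# or nil when the proc file cannot be read (for example, not on Linux).
def entropy_avail(path = ENTROPY_AVAIL_PATH)
  return nil unless File.readable?(path)
  Integer(File.read(path).strip)
end

bits = entropy_avail
if bits.nil?
  puts "entropy estimate not available on this system"
else
  puts "kernel entropy estimate: #{bits} bits"
end
```

A value that stays low while requests hang would be consistent with the blocking described above, though it does not prove which layer is reading /dev/random.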

Related issue (closed, but not really resolved as of 9.1.12.0): #1896

roboyeti changed the title from "Jruby + rails csfr_meta_tag requires minutes to render due to jruby use of /dev/random on a virtual machine" to "Rails csfr_meta_tag requires minutes to render due to jruby use of /dev/random on a virtual machine" Jun 23, 2017

headius Jun 23, 2017

Member

I hate to give bad news, but obviously we're not going to be patching anything for a JRuby 1.5.x release. 😄

As you've found this is a known issue that affects not only JRuby but many other random number generators. We continue to wrestle with the crypto libraries on the JVM, trying to find the right balance of RNG and PRNG but there are many combinations.

As an example, we recently thought we'd solved this by explicitly using an OpenSSL-like PRNG seeded from /dev/urandom. It stopped our blocking for a while. Then a configuration change in the JVM seemed to start using /dev/random again. Then an update of our Bouncy Castle crypto library started using /dev/random in that library as well. It's rather frustrating.

At this point, given the many variables, we have generally been recommending the haveged workaround. Otherwise there are wiki articles about configuring JVM crypto and other flags to do the right thing. But I'm not sure we have a good way to "solve" this other than educating folks.

If you are interested in prototyping some mechanism for "probing" the RNG to pick the right configuration, we'd happily work with you on that. I expect the main challenge will be that it's hard or impossible to query all the various layers of JVM security to know what they're currently using, and varying ways to configure them to use something specific.

kares added this to the Won't Fix milestone Jun 24, 2017

roboyeti Jun 25, 2017

Well, before you get too excited about the version of jruby this is under, that was a copy-and-paste mistake from the wrong terminal window. jruby 1.5.6 is the default on the system, but I am actually running the rails app with the following:

jruby 9.1.12.0 (2.3.3) 2017-06-15 33c6439 OpenJDK 64-Bit Server VM 24.131-b00 on 1.7.0_131-b00 +jit [linux-x86_64]

Sorry for the mistake. I was a bit annoyed and exhausted after being up for hours trying to figure out why it wasn't working to hit an impending deployment deadline.

You just have to ask yourselves over there: if rails is the "killer app" of ruby, then having the most basic rails install not work properly under jruby, where the only way to find out about the problem is to already kind of know what the problem is, is probably not ideal for converting people to jruby. A wiki article that requires knowing enough about the issue to find it is probably not the best solution.

I understand it is a frustrating problem, especially as the "man in the middle" role of jruby. I will start thinking about possible options and digging into the issue to see if I can help at all.

ryanrolland Sep 30, 2017

I believe I have a similar problem when trying to access a '.js.erb' resource via link_to (remote) in rails 4 and jruby 2.3.3 (linux vmware host). I am getting a random (~1min) delay accessing this resource. I am also using OpenJDK, so I think that might be related. Using 'jstack' I was able to pull up the stack trace below (it references IO inside the bouncycastle lib):
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at sun.security.provider.SeedGenerator$URLSeedGenerator.getSeedBytes(SeedGenerator.java:539)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:144)
at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:139)
at java.security.SecureRandom.generateSeed(SecureRandom.java:533)
at org.bouncycastle.crypto.prng.BasicEntropySourceProvider$1.getEntropy(Unknown Source)
at org.bouncycastle.crypto.prng.drbg.HashSP800DRBG.getEntropy(Unknown Source)
at org.bouncycastle.crypto.prng.drbg.HashSP800DRBG.reseed(Unknown Source)
at org.bouncycastle.crypto.prng.drbg.HashSP800DRBG.generate(Unknown Source)
at org.bouncycastle.crypto.prng.SP800SecureRandom.nextBytes(Unknown Source)
- locked <0x00000000f13c7948> (a org.bouncycastle.crypto.prng.SP800SecureRandom)
at org.bouncycastle.jcajce.provider.drbg.DRBG$Default.engineNextBytes(Unknown Source)
at java.security.SecureRandom.nextBytes(SecureRandom.java:468)
at org.jruby.ext.openssl.Random.generate(Random.java:314)
at org.jruby.ext.openssl.Random.random_bytes(Random.java:288)
at org.jruby.ext.openssl.Cipher.random_iv(Cipher.java:1333)
at org.jruby.ext.openssl.Cipher$INVOKER$i$0$0$random_iv.call(Cipher$INVOKER$i$0$0$random_iv.gen)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:129)
at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:339)
at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:73)
at org.jruby.ir.interpreter.InterpreterEngine.interpret(InterpreterEngine.java:83)
at org.jruby.internal.runtime.methods.MixedModeIRMethod.INTERPRET_METHO

headius Nov 10, 2017

Member

@roboyeti @ryanrolland Circling back to this for a moment...

I am not sure what the best solution is. Apparently most Java folks don't run into this because otherwise there'd be more posts about what to do.

One thing I can recommend to all of you is this flag:

-Djava.security.egd=file:/dev/urandom

This tells the JDK to use urandom instead of random, which generally avoids the entropy exhaustion problem on Linux.
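
As a sketch of one way to apply the flag without editing every command line: the JRuby launcher passes `-J`-prefixed options through to the JVM, so the flag can be appended to `JRUBY_OPTS` for child processes. The helper name `with_urandom_flag` is hypothetical, and the sketch assumes the launcher in use honors `JRUBY_OPTS`:

```ruby
# The flag from the comment above, with JRuby's -J prefix for JVM options.
EGD_FLAG = "-J-Djava.security.egd=file:/dev/urandom"

# Hypothetical helper: returns an options string that contains the egd flag,
# leaving any java.security.egd setting that is already present untouched.
def with_urandom_flag(opts)
  parts = opts.to_s.split
  parts << EGD_FLAG unless parts.any? { |o| o.include?("java.security.egd=") }
  parts.join(" ")
end

env = { "JRUBY_OPTS" => with_urandom_flag(ENV["JRUBY_OPTS"]) }
puts env["JRUBY_OPTS"]
# Example use (not executed here): system(env, "jruby", "-S", "rails", "server")
```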

What To Do?

At this point I think we have a few options:

  • When running from our launcher, we could always specify the java.security.egd flag. It is generally accepted by most security folks I've encountered that urandom is fine to use, so this isn't really a major security issue. And to @roboyeti's point, the experience we provide when you don't know about this problem is horrible.
  • We could try to do a better job of determining what source is actually being used at runtime, and twiddle it to do something different. This may require invasive reflection, however, since these implementation details are not generally exposed to user code (perhaps with good reason).
  • We could ditch our Java-wrapping SecureRandom and do something closer to what MRI does, using urandom directly with a PRNG, perhaps one from the BouncyCastle library we ship for SSL support. This would probably be the most work.
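
For the second option, a rough starting point is to print what the JVM has been configured to use. The sketch below reads the `java.security.egd` system property and the `securerandom.source` security property when running on JRuby. The helper name `jvm_random_config` is hypothetical. It reports configured values only and cannot show what each layer (for example, Bouncy Castle) actually reads at runtime, which is the hard part described above:

```ruby
# Hypothetical helper: reports the configured JVM entropy settings on JRuby,
# or nil on other Ruby implementations.
def jvm_random_config
  return nil unless RUBY_PLATFORM == "java"
  require "java"
  {
    # Set by -Djava.security.egd=...; nil when the flag was not given.
    egd: java.lang.System.getProperty("java.security.egd"),
    # Default seed source from the JDK's java.security configuration file.
    securerandom_source: java.security.Security.getProperty("securerandom.source")
  }
end

config = jvm_random_config
if config.nil?
  puts "not running on JRuby; nothing to inspect"
else
  config.each { |key, value| puts "#{key}: #{value.inspect}" }
end
```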

Ping @enebo since this continues to bite people.

headius added a commit that referenced this issue Nov 10, 2017

headius Nov 10, 2017

Member

Ok, I went ahead and did it. Seems like this just needs to be the default given the endless problems we have had. The patch in 1a4f888 can be done to any JRuby version, for those following along. We'll also need to get this into the native jruby-launcher before we can call this good.

headius modified the milestones: Won't Fix, JRuby 9.1.15.0 Nov 10, 2017

headius added a commit to jruby/jruby-launcher that referenced this issue Nov 12, 2017

headius Nov 12, 2017

Member

Ok, I have updated JRuby's bash script and the native launcher gem (gem install jruby-launcher) to both force the entropy source to /dev/urandom.

I did not update the Windows executable, but I don't believe this property does anything in Windows anyway.

Going to call this fixed, as of JRuby 9.1.15 and jruby-launcher 1.1.3.
