Add locking around read/write in NativeSsl. #490

Merged
merged 1 commit into from
May 16, 2018

@flooey flooey commented May 15, 2018

This uses a read/write lock around the ssl instance variable for
NativeSsl. The write lock is only taken during close(), where ssl is
cleared, so all other operations can proceed in parallel with one
another. I only added locking to the read- and write-style methods in
the class, not to methods that only read or write a property. The
latter tend to be used only right when the SSL is created, and locking
everywhere would add a lot of noise to the code, but we may want to add
that as well for complete safety.
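
As an illustration of the locking scheme described above, here is a minimal sketch. It is not the actual NativeSsl code: the class name `GuardedHandle`, the `long handle` field standing in for the native ssl pointer, and the arithmetic standing in for a native call are all assumptions made for the example.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for NativeSsl; `handle` plays the role of the native ssl pointer. */
final class GuardedHandle {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private long handle; // 0 means the handle has been cleared by close()

    GuardedHandle(long handle) {
        this.handle = handle;
    }

    /** Read-style operation: any number of threads may hold the read lock at once. */
    long use() {
        lock.readLock().lock();
        try {
            if (handle == 0) {
                throw new IllegalStateException("closed");
            }
            return handle * 2; // placeholder for a native call that dereferences the handle
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Only close() takes the write lock, so it waits for in-flight operations to finish. */
    void close() {
        lock.writeLock().lock();
        try {
            handle = 0; // real code would also free the native object here
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean isClosed() {
        lock.readLock().lock();
        try {
            return handle == 0;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Because every operation except close() takes only the read lock, operations do not block one another. The write lock in close() ensures that no thread is in the middle of using the handle when it is cleared.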

This should solve some longstanding but infrequent crashes we've seen
that involve race conditions with finalizers and other related
situations.

Fixes #455.

@flooey flooey requested a review from nmittler May 15, 2018 15:32

flooey commented May 15, 2018

The benchmark numbers here are suspiciously good: they indicate a net positive effect on performance. I ran them a second time because I didn't believe the first result, and got a similar one; the numbers below are from the second run. The improvement is probably an artifact of something else on my machine running during the reference benchmark but not during the run with the new code. Still, it suggests the cost of this change isn't particularly high.

      NONE    64         CONSCRYPT   17688345.778 ±  3111118.874  ops/s (+13.31%)
      NONE    64  CONSCRYPT_ENGINE   21273607.537 ±  3189187.740  ops/s ( +0.96%)
      NONE   512         CONSCRYPT  114418552.695 ± 17302028.251  ops/s ( -5.82%)
      NONE   512  CONSCRYPT_ENGINE  148139832.426 ±  8139463.613  ops/s ( -0.28%)
      NONE  4096         CONSCRYPT  553848173.878 ± 23717893.410  ops/s ( +6.62%)
      NONE  4096  CONSCRYPT_ENGINE  586239295.383 ± 32966261.566  ops/s ( +4.44%)
NO_CHANNEL    64         CONSCRYPT   15810682.732 ±  1267327.055  ops/s ( +2.39%)
NO_CHANNEL    64  CONSCRYPT_ENGINE   23022021.773 ±  4186527.252  ops/s ( +8.68%)
NO_CHANNEL   512         CONSCRYPT  114978083.465 ± 18055875.469  ops/s ( +3.37%)
NO_CHANNEL   512  CONSCRYPT_ENGINE  150916941.956 ± 14622379.269  ops/s ( -0.01%)
NO_CHANNEL  4096         CONSCRYPT  542585253.944 ± 51784004.786  ops/s ( -3.49%)
NO_CHANNEL  4096  CONSCRYPT_ENGINE  573461370.069 ± 61598200.653  ops/s ( +4.91%)
   CHANNEL    64         CONSCRYPT   18837200.891 ±  3417826.217  ops/s ( +3.81%)
   CHANNEL    64  CONSCRYPT_ENGINE   19933630.917 ±   931112.292  ops/s ( +5.08%)
   CHANNEL   512         CONSCRYPT  132525078.943 ± 16666955.880  ops/s ( +0.61%)
   CHANNEL   512  CONSCRYPT_ENGINE  124909429.417 ±  7940473.947  ops/s ( -5.52%)
   CHANNEL  4096         CONSCRYPT  581208155.722 ± 62190229.829  ops/s ( +6.37%)
   CHANNEL  4096  CONSCRYPT_ENGINE  545679332.419 ± 39780463.154  ops/s ( -8.59%)
Total diff: +36.85%

try {
    if (isClosed() || fd == null || !fd.valid()) {
        throw new SocketException("Socket is closed");
    }
Collaborator

This if () block is repeated in several places. Why not extract it into a separate function (maybe annotated with "@GuardedBy")?

Contributor Author

That seems like a good idea for a future change, yeah.
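
For reference, the extraction suggested above might look roughly like the following sketch. This is not code from the change itself: the class `ClosedCheckSketch`, the helper name `ensureOpen`, the `closed` flag and the placeholder `read` method are hypothetical. The locking contract is written as a comment because the exact `@GuardedBy` annotation to use is not specified in the discussion.

```java
import java.io.FileDescriptor;
import java.net.SocketException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of factoring the repeated closed-check into one helper. */
final class ClosedCheckSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean closed; // stand-in for the state that isClosed() inspects

    /**
     * Callers must hold `lock` (read or write) when calling this.
     * This is the contract a GuardedBy-style annotation could document.
     */
    private void ensureOpen(FileDescriptor fd) throws SocketException {
        if (closed || fd == null || !fd.valid()) {
            throw new SocketException("Socket is closed");
        }
    }

    /** Placeholder read-style method showing how the helper would be used. */
    int read(FileDescriptor fd) throws SocketException {
        lock.readLock().lock();
        try {
            ensureOpen(fd);
            return 0; // placeholder for the native read
        } finally {
            lock.readLock().unlock();
        }
    }

    void close() {
        lock.writeLock().lock();
        try {
            closed = true;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```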

-    return NativeCrypto.ENGINE_SSL_do_handshake(ssl, this, handshakeCallbacks);
+    lock.readLock().lock();
+    try {
+        return NativeCrypto.ENGINE_SSL_do_handshake(ssl, this, handshakeCallbacks);
Collaborator

Why do some methods have the isClosed check while others do not? (Other examples: readDirectByteBuffer, writeDirectByteBuffer and forceRead.)

Contributor Author

The pattern is that the methods used by SSLSocket have the check and the ones used by SSLEngine don't, but they should probably all check for being closed. (Note that this only affects which exception is thrown, not correctness, as the native code checks for ssl == 0 and throws NullPointerException in that case. It'd be better to throw SSLException if it's closed.) That's another thing I'd like to defer to a future change.
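
A rough sketch of what checking for closure in an engine-style method could look like, so that callers see an SSLException instead of the NullPointerException raised by the native layer. The class `EngineStyleSketch`, the message text and the `nativeDoHandshake` placeholder are assumptions for illustration, not the real NativeSsl or NativeCrypto API.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import javax.net.ssl.SSLException;

/** Hypothetical sketch: fail with SSLException in Java before reaching native code. */
final class EngineStyleSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long ssl; // stand-in for the native pointer; 0 once closed

    EngineStyleSketch(long ssl) {
        this.ssl = ssl;
    }

    int doHandshake() throws SSLException {
        lock.readLock().lock();
        try {
            if (ssl == 0) {
                // Without this check the native side would see ssl == 0 and throw NPE.
                throw new SSLException("SSL is closed");
            }
            return nativeDoHandshake(ssl);
        } finally {
            lock.readLock().unlock();
        }
    }

    void close() {
        lock.writeLock().lock();
        try {
            ssl = 0;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Placeholder for the real native call; always reports success here. */
    private static int nativeDoHandshake(long ssl) {
        return 0;
    }
}
```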


flooey commented May 16, 2018

@narayank As a general comment, I'd like to keep this as small as possible so it can be cherry-picked with minimal conflicts and risk, so I'd rather put refactorings into another change.

@narayank narayank left a comment

I still think that the extraction of the exception-throwing "if" into a separate function could have been done in this change, but I don't feel too strongly about it.


flooey commented May 16, 2018

Alright, I reran the benchmarks on a GCE instance, which should hopefully be more reproducible, and got more sensible results with an average slowdown of 0.50%. I think that's a completely acceptable cost for safety from race conditions, so I'm going to commit this.

      NONE    64         CONSCRYPT   15840794.891 ±   177249.518  ops/s ( -2.03%)
      NONE    64  CONSCRYPT_ENGINE   19785608.294 ±   359805.154  ops/s ( +6.38%)
      NONE   512         CONSCRYPT  105130527.679 ±  6637201.340  ops/s ( -2.40%)
      NONE   512  CONSCRYPT_ENGINE  123731249.013 ±  3022181.491  ops/s ( +1.23%)
      NONE  4096         CONSCRYPT  379936100.486 ± 14893116.887  ops/s ( +1.38%)
      NONE  4096  CONSCRYPT_ENGINE  387862424.866 ± 29583017.114  ops/s ( +1.05%)
NO_CHANNEL    64         CONSCRYPT   15792030.955 ±   266760.226  ops/s ( -1.70%)
NO_CHANNEL    64  CONSCRYPT_ENGINE   19274677.547 ±   819181.444  ops/s ( -6.39%)
NO_CHANNEL   512         CONSCRYPT   96848260.291 ±  7593676.189  ops/s ( -1.76%)
NO_CHANNEL   512  CONSCRYPT_ENGINE  124166548.959 ±  3571961.449  ops/s ( +2.64%)
NO_CHANNEL  4096         CONSCRYPT  371600657.432 ± 21942165.965  ops/s ( -2.14%)
NO_CHANNEL  4096  CONSCRYPT_ENGINE  381451289.512 ± 25145805.846  ops/s ( -7.04%)
   CHANNEL    64         CONSCRYPT   15105381.731 ±  3949621.294  ops/s ( -9.48%)
   CHANNEL    64  CONSCRYPT_ENGINE   19340130.747 ±   525339.894  ops/s ( +2.56%)
   CHANNEL   512         CONSCRYPT  112860409.783 ±  2730012.452  ops/s ( +4.42%)
   CHANNEL   512  CONSCRYPT_ENGINE  109600841.619 ± 29698766.004  ops/s ( -4.36%)
   CHANNEL  4096         CONSCRYPT  408662983.396 ± 13115129.241  ops/s ( +4.80%)
   CHANNEL  4096  CONSCRYPT_ENGINE  379264813.630 ±  4637778.248  ops/s ( +3.92%)
Average diff: -0.50%
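
The reported average can be cross-checked by taking the arithmetic mean of the eighteen per-row percentages in the table above. The sketch below does only that. The class name `AverageDiff` is made up for the example, and it assumes the summary line is a plain unweighted mean of the rows.

```java
/** Recomputes the summary line from the per-row diffs of the GCE benchmark run. */
final class AverageDiff {
    // Per-row percentage diffs copied from the table above, in row order.
    static final double[] DIFFS = {
        -2.03,  6.38, -2.40,  1.23,  1.38,  1.05,   // NONE
        -1.70, -6.39, -1.76,  2.64, -2.14, -7.04,   // NO_CHANNEL
        -9.48,  2.56,  4.42, -4.36,  4.80,  3.92    // CHANNEL
    };

    /** Unweighted arithmetic mean of the given values. */
    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        System.out.printf("Average diff: %+.2f%%%n", mean(DIFFS));
    }
}
```

The rows sum to -8.92, and dividing by 18 gives roughly -0.496, which matches the reported -0.50% after rounding.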

Successfully merging this pull request may close these issues.

Possible race condition in ssl pointer free vs finalizer