Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #4462 - Prevent jetty 10 WebSocket close deadlocks #4472

Merged
merged 25 commits into from
Jan 28, 2020

Conversation

lachlan-roberts
Copy link
Contributor

@lachlan-roberts lachlan-roberts commented Jan 10, 2020

Closes #4462

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
@lachlan-roberts lachlan-roberts marked this pull request as ready for review January 13, 2020 23:25
@joakime
Copy link
Contributor

joakime commented Jan 13, 2020

Something is going on with this build / test.
It's taking far too long (currently at 1 hour 47 minutes and counting)

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
@joakime
Copy link
Contributor

joakime commented Jan 14, 2020

Last change breaks 8 test cases.

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
@lachlan-roberts
Copy link
Contributor Author

I changed it so we no longer fail the callback with a ISE if a frame is sent and we are closed, now it is a ClosedChannelException which seems more correct. The javax API close throws IOException so now if you do a second close you will get the ClosedChannelException.

@sbordet
Copy link
Contributor

sbordet commented Jan 17, 2020

@lachlan-roberts I would like to see, if it's not there yet, a test that cements the behavior of core for closes: that for 2 concurrent closes, the loser gets an exception thrown.

Similarly, I would like to see a test that proves that it's still possible to write from onWebSocketClose() if the close is normal (if the test is not there already).

Would be great if you can test concurrency of flushBatch() and close() as well and again cement the expected behavior in a test.

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
@lachlan-roberts
Copy link
Contributor Author

@simone I introduced the extra tests you wanted and addressed the review points.

Similarly, I would like to see a test that proves that it's still possible to write from onWebSocketClose() if the close is normal (if the test is not there already).

It is never possible to write from onWebSocketClose(), even when the close is normal, this is notified when the close handshake is completely done and the connection is closed and not when you just received a close frame.

@sbordet
Copy link
Contributor

sbordet commented Jan 20, 2020

It is never possible to write from onWebSocketClose(), even when the close is normal, this is notified when the close handshake is completely done and the connection is closed and not when you just received a close frame.

That is so wrong.
Applications must be able to send messages from onClose events, things like "last processed item" and such. Otherwise there is no point in notifying the event, if applications cannot do anything from the callback apart logging.

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
b.block(getBlockingTimeout(), TimeUnit.MILLISECONDS);
}

private long getBlockingTimeout()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should repeat the pattern (mistake?) of HTTP here. With HTTP we have a separately configurable blocking timeout, that only if not set do we default to idleTimeout. This way we can handle (if need be) the timeouts for sending a huge frame that might take longer than the idle timeout, even though it is never actually idle.

However, in HTTP, we have deprecated the blocking timeout in favour of the data rate mechanisms.... Which we don't really have here (but are probably good ideas for @ extensions?), but it may still be good to have to turn off the blocking timeout if we have confidence in our callback mechanism.

@sbordet - thoughts? I'm only making this a comment as I don't think we should hold this PR up while we ponder this.

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - still lots of questions about this, but I think we should merge and resolve in other PRs

@lachlan-roberts lachlan-roberts merged commit f6fd3c4 into jetty-10.0.x Jan 28, 2020
@lachlan-roberts lachlan-roberts deleted the jetty-10.0.x-4462-WSCloseWithLockHeld branch January 28, 2020 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WebSocketSession.close() calls onWebSocketClose() with lock held
4 participants