8373409: java/net/httpclient/http3/H3ErrorHandlingTest.java failed due to deadlock#28788
8373409: java/net/httpclient/http3/H3ErrorHandlingTest.java failed due to deadlock#28788djelinski wants to merge 4 commits intoopenjdk:masterfrom
Conversation
|
👋 Welcome back djelinski! A progress list of the required criteria for merging this PR into |
|
@djelinski This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 85 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
|
@djelinski The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
|
I checked all other uses of EDIT - checked the |
Webrevs
|
| DrainingConnection draining = new DrainingConnection(connection.connectionIds(), idleTimeout); | ||
| // we can ignore stateless reset in the draining state. | ||
| remapPeerIssuedResetToken(connection, draining); | ||
| draining.startTimer(); |
There was a problem hiding this comment.
shouldn't we start the timer only if the connection has been added, and therefore call startTimer in compute?
There was a problem hiding this comment.
I'd rather not call this from compute; every method called from the compute lambda increases the risk of reintroducing the deadlock, so I'd like to keep the lambda to a minimum.
Most of the time the connection will be added; the only case where it won't is when there are multiple threads attempting to close the connection in parallel. The timer task only removes the connection from the endpoint, so, worst case, we will remove the connection IDs and the reset tokens twice. The second removal will likely be a no-op, unless we somehow manage to reassign the IDs or the tokens to a different connection.
There was a problem hiding this comment.
A look at startTimer lets me think that it should be safe, but OK. I suspect the double removal would not be an issue - IIRC we check with == on the removed connection, but we would be potentially adding an event to the timer queue which will wake up the timer queue for nothing. We could add a boolean field to ClosingConnection that we could set to true if the connection is added and check that after compute has been called, and start the timer then. I'll let you decide if it's worth it.
There was a problem hiding this comment.
We don't check with ==; technically we could, but with multiple threads accessing the map, I'm not sure if we would guarantee that all connection IDs are unmapped when the connection is removed.
Multiple threads racing to update the connection map should be rare. Most of the time the compute calls will replace the connection. so the extra check is probably not worth the effort.
I'll move the startTimer call after the connection map updates; I observed occasional failures to update the map, because the timer fired before the map was updated.
| connection.localConnectionIdManager().close(); | ||
| var closingConnection = new ClosingConnection(connection.connectionIds(), idleTimeout, datagram); | ||
| remapPeerIssuedResetToken(connection, closingConnection); | ||
| closingConnection.startTimer(); |
Thanks for checking the other uses of |
|
Thanks @dfuch for the review! /integrate |
|
Going to push as commit 386ad61.
Your commit was automatically rebased without conflicts. |
|
@djelinski Pushed as commit 386ad61. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This PR fixes a deadlock between the
localConnectionIdManagerand theconnectionsmap by closing the manager before callingconnections.compute.No new tests; the issue requires a complex setup to reproduce, and the new code is easy enough to reason about. Existing tests continue to pass.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28788/head:pull/28788$ git checkout pull/28788Update a local copy of the PR:
$ git checkout pull/28788$ git pull https://git.openjdk.org/jdk.git pull/28788/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28788View PR using the GUI difftool:
$ git pr show -t 28788Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28788.diff
Using Webrev
Link to Webrev Comment