8312433: HttpClient request fails due to connection being considered idle and closed #15012

Closed
wants to merge 5 commits into from

Conversation

jaikiran
Member

@jaikiran jaikiran commented Jul 25, 2023

Can I please get a review of this change which proposes to fix the issue noted in https://bugs.openjdk.org/browse/JDK-8312433?

In JDK 20, we introduced a way to identify and close idle HTTP2 connections in the HttpClient implementation through https://bugs.openjdk.org/browse/JDK-8288717. Whenever an HTTP2 stream gets closed and there are no other streams on the connection, we create an idle timeout event that times out after a (configurable) duration. If any more streams get opened on that connection after the event has been registered, the event is cancelled. If no streams are created during that period, the event fires and we close the connection.
We have a race condition in this implementation, as noticed in https://bugs.openjdk.org/browse/JDK-8312433. When an HTTP2 connection is created, we pool that connection. Any subsequent request against the same target server will reuse this pooled connection if it's still open. When user code issues a request, we check out a connection from this pool (if relevant) and then use that connection to create an HTTP2 stream and issue the request. This sequence has a small window of time in which a pooled connection that has an idle timeout event scheduled (because it had no active streams) is handed out of the pool and, before it can create a new HTTP2 stream, gets closed concurrently by the idle connection timeout event. The HTTP2 stream creation then fails with an exception, which effectively fails the user-initiated request.

The commit in this PR addresses it by introducing an internal method tryReserveForPoolCheckout() on Http2Connection. This method is responsible for coordinating with the idle connection management to ensure that if this connection does get handed out of the pool, the idle connection event doesn't end up closing it.
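
To make the coordination described above concrete, here is a minimal sketch of one way a pool checkout and an idle timeout event could be serialized through a single lock. It is an illustration under stated assumptions, not the code in this PR: the class PooledConnectionSketch, its fields, and the methods openStream(), streamClosed() and idleTimeoutFired() are hypothetical names, and only tryReserveForPoolCheckout() and stateLock are names taken from this conversation. The boolean return type is also an assumption.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model for illustration only; this is NOT the JDK's Http2Connection.
final class PooledConnectionSketch {
    private final ReentrantLock stateLock = new ReentrantLock();
    // All fields below are read and written only while holding stateLock.
    private boolean idleEventPending;
    private boolean closed;
    private int openStreams;

    // Called by the pool before handing this connection out.
    // Returns false if the idle event already closed the connection.
    boolean tryReserveForPoolCheckout() {
        stateLock.lock();
        try {
            if (closed) {
                return false;
            }
            // Cancel the pending idle event so it can no longer close us.
            idleEventPending = false;
            return true;
        } finally {
            stateLock.unlock();
        }
    }

    // Opens a new stream; fails if the connection has been closed.
    void openStream() {
        stateLock.lock();
        try {
            if (closed) {
                throw new IllegalStateException("connection closed");
            }
            openStreams++;
            idleEventPending = false;
        } finally {
            stateLock.unlock();
        }
    }

    // Closes one stream; registers an idle event when no streams remain.
    void streamClosed() {
        stateLock.lock();
        try {
            if (openStreams > 0 && --openStreams == 0 && !closed) {
                idleEventPending = true;
            }
        } finally {
            stateLock.unlock();
        }
    }

    // Called by the timer when the idle timeout elapses.
    // Returns true only if this call actually closed the connection.
    boolean idleTimeoutFired() {
        stateLock.lock();
        try {
            if (!idleEventPending || closed) {
                return false; // cancelled by a checkout or a new stream
            }
            closed = true;
            return true;
        } finally {
            stateLock.unlock();
        }
    }

    public static void main(String[] args) {
        PooledConnectionSketch conn = new PooledConnectionSketch();
        conn.openStream();
        conn.streamClosed(); // last stream gone, idle event now pending
        System.out.println("reserved: " + conn.tryReserveForPoolCheckout());
        System.out.println("late idle event closed it: " + conn.idleTimeoutFired());
        conn.openStream(); // does not throw, because the reservation won
    }
}
```

Because the reservation and the timeout handler both take the same lock, exactly one of them wins: either the checkout cancels the event, or the event closes the connection and the checkout is refused, so the pool can fall back to a fresh connection.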

A new jtreg test has been added to test this issue. This test consistently reproduces the issue without this fix and passes with this fix. Ideally, a new test method could have been added to the existing test/jdk/java/net/httpclient/http2/IdleConnectionTimeoutTest.java, but that test has multiple @run actions with different timeout values ranging all the way up to 30 seconds, and it focuses more on parsing the timeout value and honouring it. Introducing this test method in that existing test would end up testing this against all those values for no reason, and would also increase the duration of that test. So I decided to introduce a new test which uses a much lower timeout value.

tier1, tier2 and tier3 testing has passed with this change, and multiple runs of the tests under test/jdk/java/net/httpclient have also passed.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8312433: HttpClient request fails due to connection being considered idle and closed (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15012/head:pull/15012
$ git checkout pull/15012

Update a local copy of the PR:
$ git checkout pull/15012
$ git pull https://git.openjdk.org/jdk.git pull/15012/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15012

View PR using the GUI difftool:
$ git pr show -t 15012

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15012.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jul 25, 2023

👋 Welcome back jpai! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 25, 2023
@openjdk

openjdk bot commented Jul 25, 2023

@jaikiran The following label will be automatically applied to this pull request:

  • net

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the net net-dev@openjdk.org label Jul 25, 2023
@mlbridge

mlbridge bot commented Jul 25, 2023

Webrevs

@@ -131,7 +131,7 @@ class Http2Connection {

     private static final int MAX_CLIENT_STREAM_ID = Integer.MAX_VALUE; // 2147483647
     private static final int MAX_SERVER_STREAM_ID = Integer.MAX_VALUE - 1; // 2147483646
-    private IdleConnectionTimeoutEvent idleConnectionTimeoutEvent; // may be null
+    private volatile IdleConnectionTimeoutEvent idleConnectionTimeoutEvent; // may be null
Member

What was the reason for this change? idleConnectionTimeoutEvent is always protected by stateLock.

Member Author

Hello Daniel, you are right. I missed cleaning up that part from my early experiments. I'll fix that and do the necessary changes shortly.

Member

@djelinski djelinski left a comment

Hi Jaikiran, thanks for taking care of this. Changes look good, except for volatile usage that looks unnecessary.

@@ -196,31 +196,65 @@ class Http2Connection {
// and has not sent the final stream flag
final class IdleConnectionTimeoutEvent extends TimeoutEvent {

private boolean fired;
// expected to be accessed/updated with "stateLock" being held
private volatile boolean cancelled;
Member

cancelled doesn't need to be volatile if it's consistently protected by the same lock
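
As a small illustration of the point above: when every read and write of a field happens while holding the same lock, the lock's acquire and release already provide the visibility that volatile would otherwise give. The sketch below is a generic example, not code from this PR; the class GuardedFlagSketch and its methods are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example; not taken from Http2Connection.
final class GuardedFlagSketch {
    private final ReentrantLock stateLock = new ReentrantLock();
    // Guarded by stateLock, so deliberately NOT volatile.
    private boolean cancelled;

    void cancel() {
        stateLock.lock();
        try {
            cancelled = true;
        } finally {
            stateLock.unlock();
        }
    }

    boolean isCancelled() {
        stateLock.lock();
        try {
            return cancelled;
        } finally {
            stateLock.unlock();
        }
    }

    // Sets the flag on another thread and waits until this thread observes it.
    // The loop terminates because each locked read happens-after the locked write.
    static boolean demo() {
        GuardedFlagSketch flag = new GuardedFlagSketch();
        Thread writer = new Thread(flag::cancel);
        writer.start();
        while (!flag.isCancelled()) {
            Thread.onSpinWait();
        }
        return flag.isCancelled();
    }

    public static void main(String[] args) {
        System.out.println("observed cancellation: " + demo());
    }
}
```

The guarantee only holds if no code path touches the field without the lock; a single unlocked read would bring back the need for volatile or another form of synchronization.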

@jaikiran
Member Author

jaikiran commented Jul 27, 2023

I've now updated the PR to remove the update which used volatile and cleaned up the usage of the idleConnectionTimeoutEvent field a bit. Tests continue to pass with this change.

Member

@djelinski djelinski left a comment

LGTM. Thanks!

@openjdk

openjdk bot commented Jul 27, 2023

@jaikiran This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8312433: HttpClient request fails due to connection being considered idle and closed

Reviewed-by: djelinski

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 27, 2023
@jaikiran
Member Author

Thank you Daniel for the review. CI tests passed with these changes. I'll go ahead with integrating this.

@jaikiran
Member Author

/integrate

@openjdk

openjdk bot commented Jul 27, 2023

Going to push as commit 486c784.
Since your change was applied there have been 2 commits pushed to the master branch:

  • 271417a: 8312579: [JVMCI] JVMCI support for virtual Vector API objects
  • 44576a7: 8312466: /bin/nm usage in AIX makes needs -X64 flag

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 27, 2023
@openjdk openjdk bot closed this Jul 27, 2023
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Jul 27, 2023
@openjdk

openjdk bot commented Jul 27, 2023

@jaikiran Pushed as commit 486c784.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@jaikiran jaikiran deleted the 8312433 branch July 27, 2023 12:15
Labels
integrated Pull request has been integrated net net-dev@openjdk.org
2 participants