8302635: Race condition in HttpBodySubscriberWrapper when cancelling request #12587

Closed
wants to merge 6 commits into from

@dfuch dfuch commented Feb 16, 2023

The HttpBodySubscriberWrapper is a class that ensures that a subscriber will be subscribed to before it is completed. It also provides hooks to its two subclasses (one for HTTP/1, one for HTTP/2) that allow the subclasses to register the subscriber with the HttpClient at subscription time, and to unregister it when it is eventually completed, or when the subscription is cancelled.

There is however a race condition that can happen when a subscription is cancelled: unregister can be called before register. The CancelRequestTest has been observed failing once or twice on personal jobs. Though the particular mechanics of this race are hard to understand, the logs of the tests have brought sufficient evidence that this is what was happening.

The symptom is finding one subscriber still registered after completion of the exchange:

test CancelRequestTest.testGetSendAsync("https://localhost:42711/https1/x/same/interrupt", true, true): failure
java.lang.AssertionError: WARNING: tracker for HttpClientImpl(13) has outstanding operations:
Pending HTTP Requests: 0
Pending HTTP/1.1 operations: 0
Pending HTTP/2 streams: 0
Pending WebSocket operations: 0
Pending TCP connections: 0
Pending Subscribers: 1
Total pending operations: 0
Facade referenced: true
Selector alive: true
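
To illustrate the symptom, here is a minimal sketch that replays the bad ordering sequentially rather than with real threads. The class `UnguardedRegistrationDemo`, its `pendingSubscribers` set, and the `register`/`unregister` helpers are hypothetical stand-ins, not the HttpClient's actual tracking code. The only assumption is that unregistering a subscriber that has not been added yet is a no-op.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the client's subscriber tracking; not JDK code.
final class UnguardedRegistrationDemo {
    static final Set<Object> pendingSubscribers = ConcurrentHashMap.newKeySet();

    static void register(Object subscriber)   { pendingSubscribers.add(subscriber); }
    static void unregister(Object subscriber) { pendingSubscribers.remove(subscriber); }

    // Replays the problematic ordering and returns how many subscribers remain.
    static int replayCancelBeforeSubscribe() {
        pendingSubscribers.clear();
        Object subscriber = new Object();
        unregister(subscriber); // cancellation path wins the race: nothing to remove yet
        register(subscriber);   // subscription path runs afterwards: entry is never removed
        return pendingSubscribers.size();
    }

    public static void main(String[] args) {
        System.out.println("Pending Subscribers: " + replayCancelBeforeSubscribe());
    }
}
```

With this ordering one entry is left behind, which matches the "Pending Subscribers: 1" line in the tracker output.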

The proposed fix hoists the special hooks for register/unregister into the superclass, merges the various volatile boolean states into a single int state, and protects the state changes to subscribed/register/unregister with the same subscription lock.
If cancelling the subscription happens at around the same time that the subscriber is subscribed, this ensures that the subscriber won't be removed from the map before it is added.
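
The general idea can be sketched as follows. This is not the code from the PR: `SketchRegistry`, `SketchWrapper`, the flag names, and their bit values are all hypothetical, and the choice to skip registration entirely when cancellation arrives first is just one possible way of keeping the two operations ordered. The sketch only shows how a single int of bit flags, mutated under one lock, makes register and unregister observe each other.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the client's map of pending subscribers.
final class SketchRegistry {
    final Set<Object> pending = ConcurrentHashMap.newKeySet();
}

// Hypothetical wrapper; flag names and values are illustrative only.
final class SketchWrapper {
    private static final int SUBSCRIBED   = 0x01;
    private static final int REGISTERED   = 0x02;
    private static final int UNREGISTERED = 0x04;

    private final ReentrantLock lock = new ReentrantLock();
    private final SketchRegistry registry;
    private int state; // only read or written while holding the lock

    SketchWrapper(SketchRegistry registry) { this.registry = registry; }

    // Subscription path: registers unless cancellation already happened.
    void onSubscribe() {
        lock.lock();
        try {
            if ((state & SUBSCRIBED) != 0) return; // subscribe at most once
            state |= SUBSCRIBED;
            if ((state & UNREGISTERED) == 0) {
                registry.pending.add(this);
                state |= REGISTERED;
            }
        } finally {
            lock.unlock();
        }
    }

    // Cancellation or completion path: removes only what was actually added.
    void cancelOrComplete() {
        lock.lock();
        try {
            if ((state & UNREGISTERED) != 0) return; // unregister at most once
            state |= UNREGISTERED;
            if ((state & REGISTERED) != 0) registry.pending.remove(this);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SketchRegistry registry = new SketchRegistry();
        for (int i = 0; i < 1000; i++) {
            SketchWrapper w = new SketchWrapper(registry);
            Thread subscribe = new Thread(w::onSubscribe);
            Thread cancel = new Thread(w::cancelOrComplete);
            subscribe.start();
            cancel.start();
            subscribe.join();
            cancel.join();
        }
        System.out.println("Pending Subscribers: " + registry.pending.size());
    }
}
```

Whichever thread takes the lock first, the second one sees the flag the first one set, so no entry is left in the registry once both calls have returned.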


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8302635: Race condition in HttpBodySubscriberWrapper when cancelling request

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/12587/head:pull/12587
$ git checkout pull/12587

Update a local copy of the PR:
$ git checkout pull/12587
$ git pull https://git.openjdk.org/jdk pull/12587/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 12587

View PR using the GUI difftool:
$ git pr show -t 12587

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12587.diff

bridgekeeper bot commented Feb 16, 2023

👋 Welcome back dfuchs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title from "8302635" to "8302635: Race condition in HttpBodySubscriberWrapper when cancelling request" on Feb 16, 2023
@openjdk openjdk bot added the rfr label (Pull request is ready for review) on Feb 16, 2023
openjdk bot commented Feb 16, 2023

@dfuch The following label will be automatically applied to this pull request:

  • net

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the net label (net-dev@openjdk.org) on Feb 16, 2023
mlbridge bot commented Feb 16, 2023

Webrevs

static final AtomicLong IDS = new AtomicLong();
final long id = IDS.incrementAndGet();
final BodySubscriber<T> userSubscriber;
final AtomicBoolean completed = new AtomicBoolean();
final AtomicBoolean subscribed = new AtomicBoolean();
volatile int state;
Member

Hello Daniel, should we make this private and all the newly introduced state values, too? Or maybe you want to leave it package private to be consistent with some other fields here?

Member Author

Good idea.

@@ -125,34 +172,139 @@ private void propagateError(Throwable t) {
}
Member

I haven't fully wrapped my head around the possible flow of the user subscription, but is there a possibility that this call to onError(...) here results in a reentrant call to this propagateError()? For that matter, not just reentrant but perhaps a concurrent call from a different thread? The reason I ask is, should we call this onError just once? The java.util.concurrent.Flow.Subscriber.onError(Throwable) method says this:

Method invoked upon an unrecoverable error encountered by a
Publisher or Subscription, after which no other Subscriber
methods are invoked by the Subscription.

So I'm wondering if we should maintain some state to only invoke it once?

Member Author

This is the purpose of this class: complete() and propagateError() should ensure that onError() is only called once in the wrapped subscriber. The markCompleted() call should ensure that, even if there is a reentrant call.

Member

You are right - the error propagation through propagateError() always happens through the complete() method of this wrapper class and that method has the necessary state management to call this only once.
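
The once-only guarantee discussed in this thread can be sketched with a compare-and-set guard. This is an illustration, not the wrapper's real implementation: the class `OnceOnlyCompletionSketch` and its `Consumer<Throwable>` downstream are hypothetical, and the sketch assumes that `markCompleted()` behaves like an atomic false-to-true transition on the `completed` flag shown in the snippet above.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch of a once-only completion guard; not the JDK class.
final class OnceOnlyCompletionSketch {
    private final AtomicBoolean completed = new AtomicBoolean();
    private final Consumer<Throwable> downstreamOnError;

    OnceOnlyCompletionSketch(Consumer<Throwable> downstreamOnError) {
        this.downstreamOnError = downstreamOnError;
    }

    // Returns true only for the single caller that flips the flag.
    boolean markCompleted() {
        return completed.compareAndSet(false, true);
    }

    // Forwards the error downstream at most once, even if called
    // reentrantly or from several threads.
    void complete(Throwable t) {
        if (markCompleted()) {
            downstreamOnError.accept(t);
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        AtomicReference<OnceOnlyCompletionSketch> self = new AtomicReference<>();
        self.set(new OnceOnlyCompletionSketch(t -> {
            calls.incrementAndGet();
            self.get().complete(t); // reentrant call: the guard ignores it
        }));
        self.get().complete(new RuntimeException("boom"));
        System.out.println("onError calls: " + calls.get());
    }
}
```

Because the flag is flipped before the downstream callback runs, a callback that calls back into `complete` finds the guard already set and returns without invoking the downstream again.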

…ttpBodySubscriberWrapper.java


make `state` private
Member

@jaikiran jaikiran left a comment

The changes look good to me. HttpBodySubscriberWrapper and Http1Exchange files will need a copyright year update, before integrating.

openjdk bot commented Feb 17, 2023

@dfuch This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8302635: Race condition in HttpBodySubscriberWrapper when cancelling request

Reviewed-by: jpai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 18 new commits pushed to the master branch:

  • cd77fcf: 8290822: C2: assert in PhaseIdealLoop::do_unroll() is subject to undefined behavior
  • 57c9bc3: 8302335: IGV: Bytecode not showing
  • 57fde75: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64
  • b8c9d6c: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false
  • dc55a7f: 8302202: Incorrect desugaring of null-allowed nested patterns
  • c4ffe4b: 8301494: Replace NULL with nullptr in cpu/arm
  • 4f1cffd: 8302674: Parallel: Remove unused methods in MutableNUMASpace
  • c91cd28: 8301481: Replace NULL with nullptr in os/windows
  • 47ca577: 8301491: C2: java.lang.StringUTF16::indexOfChar intrinsic called with negative character argument
  • 49eb68b: 8296158: Refactor the verification of CDS region checksum
  • ... and 8 more: https://git.openjdk.org/jdk/compare/f558a6c5992cf5168e44d73e84e7713728a3ed9b...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label (Pull request is ready to be integrated) on Feb 17, 2023
openjdk bot commented Feb 17, 2023

⚠️ @dfuch This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

dfuch commented Feb 17, 2023

/integrate

openjdk bot commented Feb 17, 2023

Going to push as commit edf238b.
Since your change was applied there have been 18 commits pushed to the master branch:

  • cd77fcf: 8290822: C2: assert in PhaseIdealLoop::do_unroll() is subject to undefined behavior
  • 57c9bc3: 8302335: IGV: Bytecode not showing
  • 57fde75: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64
  • b8c9d6c: 8302158: PPC: test/jdk/jdk/internal/vm/Continuation/Fuzz.java: AssertionError: res: false shouldPin: false
  • dc55a7f: 8302202: Incorrect desugaring of null-allowed nested patterns
  • c4ffe4b: 8301494: Replace NULL with nullptr in cpu/arm
  • 4f1cffd: 8302674: Parallel: Remove unused methods in MutableNUMASpace
  • c91cd28: 8301481: Replace NULL with nullptr in os/windows
  • 47ca577: 8301491: C2: java.lang.StringUTF16::indexOfChar intrinsic called with negative character argument
  • 49eb68b: 8296158: Refactor the verification of CDS region checksum
  • ... and 8 more: https://git.openjdk.org/jdk/compare/f558a6c5992cf5168e44d73e84e7713728a3ed9b...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated label (Pull request has been integrated) on Feb 17, 2023
@openjdk openjdk bot closed this on Feb 17, 2023
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels on Feb 17, 2023
openjdk bot commented Feb 17, 2023

@dfuch Pushed as commit edf238b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dfuch dfuch deleted the HttpBodySubscriberWrapper-8302635 branch February 20, 2023 17:32
Labels
integrated (Pull request has been integrated), net (net-dev@openjdk.org)
2 participants