
MaxRequestLimitExceededException exception from closed connection is confusing #279

Merged: 1 commit merged into apple:master on Feb 8, 2019

Conversation

@jayv (Contributor) commented Jan 29, 2019

Motivation

With an HttpConnection (not an HttpClient), if a request is sent on a closed connection, users receive a MaxRequestLimitExceededException instead of a ClosedChannelException (or any IOException). This is confusing because it is not an IOException but a RetryableException.
By combining the connection status and the transportError we can provide clearer error messages to the user.

Modifications

  • enhance RequestConcurrencyController to allow exposing an unavailable connection state
  • enhance ConcurrentRequestsHttpConnectionFilter to leverage transport errors and connection state

Result

Users won't get confusing error messages suggesting wrong configuration when the connection is closed.
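The idea can be sketched in plain Java (a hypothetical, simplified model; the real RequestConcurrencyController and ConcurrentRequestsHttpConnectionFilter are more involved): track a transport error alongside the pending-request count, and prefer the connection-closed cause over the concurrency-limit error when both could apply. The exception names mirror the ones discussed in this PR, while the gate class itself is a made-up stand-in.

```java
// Illustrative sketch only: a simplified request gate that, like the PR,
// distinguishes "connection already closed" from "too many concurrent
// requests". RequestGate is a hypothetical stand-in for the real controller.
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class ConnectionClosedException extends RuntimeException {
    ConnectionClosedException(String message, Throwable cause) {
        super(message, cause);
    }
}

class MaxRequestLimitExceededException extends RuntimeException {
    MaxRequestLimitExceededException(String message) {
        super(message);
    }
}

public class RequestGate {
    private final int maxConcurrency;
    private final AtomicInteger pendingRequests = new AtomicInteger();
    // Set once by the transport when the connection goes away.
    private final AtomicReference<Throwable> transportError = new AtomicReference<>();

    public RequestGate(int maxConcurrency) {
        this.maxConcurrency = maxConcurrency;
    }

    public void connectionClosed(Throwable cause) {
        transportError.compareAndSet(null, cause);
    }

    public void requestStarted() {
        Throwable closed = transportError.get();
        if (closed != null) {
            // Surface the real transport cause instead of a misleading
            // "max requests exceeded" error on a dead connection.
            throw new ConnectionClosedException("Connection is closed", closed);
        }
        if (pendingRequests.incrementAndGet() > maxConcurrency) {
            pendingRequests.decrementAndGet();
            throw new MaxRequestLimitExceededException(
                    "Max concurrent requests (" + maxConcurrency + ") exceeded");
        }
    }

    public void requestFinished() {
        pendingRequests.decrementAndGet();
    }
}
```

With this shape, a request attempted on a closed connection carries the original transport error as its cause, which is what the test assertions later in this conversation check for.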

@jayv (Contributor, Author) commented Jan 29, 2019

This is a stacked PR depending on #278

@jayv jayv requested review from NiteshKant, Scottmitch, lewisd32, ddossot and idelpivnitskiy and removed request for NiteshKant and Scottmitch January 29, 2019 18:49
@lewisd32 (Contributor) left a comment

Overall looks good, just a few small things.

try (ServerContext serverContext = HttpServers.forPort(0)
.listenAndAwait((ctx, request, responseFactory) ->
// builds up concurrency, allows test to complete quickly
Single.<HttpResponse>never().timeout(ofMillis(10)));
Contributor:

This might be a bit short for CI.

Collaborator:

Ya, a time dependency here will lead to flaky tests. One way to make sure that we actually have concurrent requests is to have a response payload and not consume it in the client.
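The point about timing-based concurrency tests being flaky can be illustrated with JDK-only primitives: keep work "in flight" behind a latch until every task has started, so the observed concurrency is exact and never depends on a timeout. This is a generic sketch, not the actual ServiceTalk test.

```java
// Generic sketch: latches make observed concurrency deterministic,
// unlike a timeout, which can expire too early on a slow CI machine.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class DeterministicConcurrency {
    /** Runs {@code requests} tasks and returns the maximum observed concurrency. */
    public static int maxObserved(int requests) {
        CountDownLatch allStarted = new CountDownLatch(requests);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    int now = inFlight.incrementAndGet();
                    max.accumulateAndGet(now, Math::max);
                    allStarted.countDown();
                    try {
                        release.await(); // stay "in flight" until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    inFlight.decrementAndGet();
                });
            }
            allStarted.await(); // concurrency built up without sleeping
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return max.get();
    }
}
```

Because every task increments the counter before counting down `allStarted`, the maximum is guaranteed to reach the requested level regardless of scheduling.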

@jayv jayv force-pushed the maxconcurrmsg branch 2 times, most recently from 1f5cb8f to 68ba44f, February 2, 2019 01:52
@jayv jayv requested a review from lewisd32 February 2, 2019 01:52
@NiteshKant (Collaborator) left a comment

I have only reviewed the commit aabe0eb; the other commits, as I understand, are for the related PR.

@jayv (Contributor, Author) commented Feb 7, 2019

@NiteshKant PTAL, addressed comments.

@NiteshKant (Collaborator) left a comment

One question about unused loggers, else LGTM.

@@ -29,6 +32,8 @@
pendingRequestsUpdater = newUpdater(AbstractRequestConcurrencyController.class,
"pendingRequests");

private static final Logger LOGGER = LoggerFactory.getLogger(AbstractRequestConcurrencyController.class);
Collaborator:

Are we using these loggers in sub-classes?

Contributor (Author):

Also noticed, removing as we speak.

@jayv jayv merged commit 7c66579 into apple:master Feb 8, 2019
@jayv jayv deleted the maxconcurrmsg branch February 8, 2019 01:38
@idelpivnitskiy (Member) left a comment

A few minutes late review :)

package io.servicetalk.client.api;

/**
* Thrown when the connection is no longer available.
Member:

Is "available" the right word in this case?
Apache HTTP client's javadoc describes the same exception as:

Signals that the connection has been closed unexpectedly.

Contributor (Author):

We probably use it in a different context, and we have the cause to give more detail on whether it was actually "unexpected" or a normal close, i.e. the user closed the connection (or the application is shutting down).

Member:

I'm not pointing to the "unexpected" word, just feel that "is no longer available" is not the same as "closed".

/**
* Thrown when the connection is no longer available.
*/
public class ConnectionClosedException extends RuntimeException {
Member:

Should it extend IOException?

Contributor (Author):

I'm not sure; the exception does not originate from the transport layer but from a ConnectionFilter.

try (ServerContext serverContext = HttpServers.forPort(0)
.listenStreamingAndAwait((ctx, request, responseFactory) -> {
ctx.onClose().subscribe(serverClosed);
return Single.success(responseFactory.ok().setHeader(HttpHeaderNames.CONNECTION, "close"));
Member:

You already have a static import for success

public void throwMaxConcurrencyExceededOnOversubscribedConnection() throws Exception {
try (ServerContext serverContext = HttpServers.forPort(0)
.listenAndAwait((ctx, request, responseFactory) ->
Single.success(responseFactory.ok().payloadBody("Test", textSerializer())));
Member:

You already have a static import for success

} catch (ExecutionException e) {
assertThat(e.getCause(), instanceOf(ConnectionClosedException.class));
assertThat(e.getCause().getCause(), instanceOf(IOException.class));
assertThat(e.getCause().getCause(), equalTo(ioEx.get())); // Assert connection reset
Member:

This test doesn't properly check that this is a "connection reset" exception. If you run this test until failure, it may throw:

<io.servicetalk.transport.netty.internal.CloseHandler$CloseEventObservedException:
CHANNEL_CLOSED_INBOUND(The transport backing this connection has been shutdown (read))
[id: 0xf6de3cd5, L:/127.0.0.1:58678 ! R:127.0.0.1/127.0.0.1:58677]>

The fun thing is that this test fails because netty's AbstractChannel.toString() produces a different result depending on the Channel state (active/inactive). Without this toString() behavior we could never have discovered that this is not a "connection reset" exception, because CloseEventObservedException is also an instance of IOException 😄

This is why the master build failed. I tried to fix it, but didn't understand why it may throw CloseEventObservedException: CHANNEL_CLOSED_INBOUND in this case.

Contributor (Author):

It doesn't matter whether it is actually a connection reset; there is no reliable way in the JDK to assert that anyway. All this test needs to check is that subsequent requests after a connection close also receive the original cause of the close.

I'm not sure I understand how this test can fail due to toString() if we're comparing references:

assertThat(e.getCause().getCause(), equalTo(ioEx.get())); // Assert connection reset
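The reference-comparison point holds because Throwable does not override equals(), so Hamcrest's equalTo() between two Throwables falls back to Object.equals(), i.e. reference identity; the assertion can only pass when the exact same exception instance is propagated. A minimal JDK-only illustration:

```java
import java.io.IOException;

public class ThrowableEquality {
    public static void main(String[] args) {
        IOException original = new IOException("Connection reset by peer");
        IOException lookalike = new IOException("Connection reset by peer");
        // Throwable inherits Object.equals(), so equality is identity:
        System.out.println(original.equals(original));  // true: same instance
        System.out.println(original.equals(lookalike)); // false: same message, different instance
    }
}
```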

Member:

Try to run it locally with the "until failure" setting in IntelliJ IDEA.

Member:

It doesn't matter whether it is actually a connection reset.

If there is no strong requirement to receive exactly a connection reset exception in this case, consider removing the comment // Assert connection reset from the test. When I run it locally it returns a connection reset in most cases, but 1 out of 10 runs fails with the other exception described in my original message.

Contributor (Author):

Try to run it locally with "until failure" setting in Intellij IDEA.

I was not suggesting I didn't believe you ;) I was just saying that the toString() is a red herring. It proves that we're capturing 2 different close events on the transport at 2 different phases of the channel.

The point of this test was to provoke a reset (unexpected close), which seems to be happening either way (fail or pass), but we don't need to assert that.

What I was hoping to achieve in this test was to ensure we get the right close event, CHANNEL_CLOSED_INBOUND, and ideally a consistent root cause.

But we may need to refactor the transport layer a bit; right now channelInactive(..) bypasses the close handler, which I suspect is what you see 10% of the time, because the ordering of events doesn't seem deterministic. With some tweaking of payload size and socket options I managed to reduce it to 0.05%, but I think I may just have to remove this assert for now.
The important part is that we have the right event, CHANNEL_CLOSED_INBOUND.

What I don't fully understand yet is how resp1 sees the exception from the inactive channel while resp2 gets an exception from the active channel. I suspect this is because of the way our read/write sources are hooked up, but I need to dig a little more.
