Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close an HTTP/1.1 connection after delay #5616

Merged
merged 28 commits into from
May 11, 2024

Conversation

dachshu
Copy link
Contributor

@dachshu dachshu commented Apr 17, 2024

Motivation:

Modifications:

  • Remove the force shutdown mode of AbstractHttpResponseHandler that is activated when a user sets the Connection: close HTTP header
  • Create a schedule that waits and closes the connection to wait for the client-side connection to close, rather than closing the HTTP/1.1 connection immediately.

Result:

  • Closes Close an HTTP/1.1 connection after delay #4849
  • When a user wants to close an HTTP/1.1 server connection, the armeria server first waits for the client to close the connection for a while to make the connection state to CLOSED instead of TIME_WAIT.

@CLAassistant
Copy link

CLAassistant commented Apr 17, 2024

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! Left some minor comments 🙇

@@ -780,7 +780,9 @@ private void handleRequestOrResponseComplete() {
// Stop receiving new requests.
handledLastRequest = true;
if (unfinishedRequests.isEmpty()) {
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE);
ctx.executor().schedule(() ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ServiceRequestContext#executor isn't necessarily the same as Channel#eventLoop

final EventLoopGroup serviceWorkerGroup = serviceCfg.serviceWorkerGroup();
if (serviceWorkerGroup == config.workerGroup()) {
serviceEventLoop = channelEventLoop;
needsDirectExecution = true;
} else {
serviceEventLoop = serviceWorkerGroup.next();
needsDirectExecution = serviceEventLoop == channelEventLoop;
}
final DefaultServiceRequestContext reqCtx = new DefaultServiceRequestContext(
serviceCfg, channel, serviceEventLoop, config.meterRegistry(), protocol,
nextRequestId(routingCtx, serviceCfg), routingCtx, routingResult, req.exchangeType(),
req, sslSession, proxiedAddresses, clientAddress, remoteAddress, localAddress,
req.requestStartTimeNanos(), req.requestStartTimeMicros(), serviceCfg.contextHook());

We probably want to close from the channel's event loop (otherwise, an exception will be raised)

Suggested change
ctx.executor().schedule(() ->
ctx.channel().eventLoop().schedule(() ->

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed it. Thanks for letting me know.
I fixed it to use the channel's event loop.

ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE);
ctx.executor().schedule(() ->
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE),
1000, TimeUnit.MILLISECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional nit; using seconds is probably slightly easier to reason about

Suggested change
1000, TimeUnit.MILLISECONDS);
1, TimeUnit.SECONDS);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this value can be configured in milliseconds on fast networks.

@jrhee17
Copy link
Contributor

jrhee17 commented Apr 17, 2024

#5616 (comment)
Please also sign the CLA 🙇

@jrhee17 jrhee17 added the defect label Apr 17, 2024
@jrhee17 jrhee17 added this to the 1.29.0 milestone Apr 17, 2024
@@ -780,7 +780,9 @@ private void handleRequestOrResponseComplete() {
// Stop receiving new requests.
handledLastRequest = true;
if (unfinishedRequests.isEmpty()) {
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE);
ctx.executor().schedule(() ->
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE),
Copy link
Contributor

@ikhoon ikhoon Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we check if the channel is still active? Meanwhile, the channel may have been closed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking! I added to check ctx.channel().isActive() before executing the close task.

Copy link
Member

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a test case. You might want to implement a simple low-level HTTP/1 client in your test to know the socket-level timings.

I'd recommend checking out the source code of Http1ServerKeepAliveTest to get the basic ideas.

@dachshu dachshu force-pushed the ISSUE-4849-close-connection-after-delay branch from 42f87ea to 4a7f0ef Compare April 24, 2024 00:33
@jrhee17
Copy link
Contributor

jrhee17 commented Apr 30, 2024

Looking into the failing test, I guess there is a case where Connection: close header isn't sent but we set needsDisconnection() somewhere.

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! Left a few style related comments 🙇

Comment on lines 798 to 802
ctx.channel().eventLoop().schedule(() -> {
if (ctx.channel().isActive()) {
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE);
}
}, Flags.defaultHttp1ConnectionCloseDelayMillis(), TimeUnit.MILLISECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you align these lines? (you may also want to check if code style is correctly set in your IDE ref: https://armeria.dev/community/developer-guide#setting-up-your-ide)

Suggested change
ctx.channel().eventLoop().schedule(() -> {
if (ctx.channel().isActive()) {
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE);
}
}, Flags.defaultHttp1ConnectionCloseDelayMillis(), TimeUnit.MILLISECONDS);
ctx.channel().eventLoop().schedule(() -> {
if (ctx.channel().isActive()) {
ctx.writeAndFlush(Unpooled.EMPTY_BUFFER).addListener(CLOSE);
}
}, Flags.defaultHttp1ConnectionCloseDelayMillis(), TimeUnit.MILLISECONDS);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for letting me know. I have checked that the code style setting was not set correctly.
I have fixed the code above.

@@ -121,7 +121,11 @@ private static void urlPathAssertion(HttpStatus expected, String path) throws Ex
try (Socket s = new Socket(NetUtil.LOCALHOST, server.httpPort())) {
s.setSoTimeout(10000);
s.getOutputStream().write(requestString.getBytes(StandardCharsets.US_ASCII));
assertThat(new String(ByteStreams.toByteArray(s.getInputStream()), StandardCharsets.US_ASCII))
final BufferedReader in = new BufferedReader(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) I think the previous approach of collecting all bytes was correct (in the off-chance that the bytes are fragmented). What do you think of reverting this file?

Copy link
Contributor Author

@dachshu dachshu May 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the following implementation, the ByteStreams.toByteArray(s.getInputStream()) call returns the value when the inputStream is closed which is only happened when the connection is closed from the server.
https://github.com/google/guava/blob/master/guava/src/com/google/common/io/ByteStreams.java#L195~L198

With this PR, I changed the server to wait for the client to close first instead of closing the connection immediately. However in this test, the client receives the Connection: Close header from the server but ignores it. So the test waits for the server to close the connction, causing the test timeout.
I tried to adjust the defaultHttp1ConnectionCloseDelayMillis using @SetSystemProperty or System.setProperty() to reduce the test time, but it didn't work well in shadedTest 😭
Since only the first line is needed for this test validation, I changed it to read only the first line and check the result to make the test take less time.
If it is a problem, I would consider reducing the defaultHttp1ConnectionCloseDelayMillis.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I think I missed this point. I think it's fine to leave as-is then. Thanks for the explanation 👍

I tried to adjust the defaultHttp1ConnectionCloseDelayMillis using @SetSystemProperty or System.setProperty() to reduce the test time, but it didn't work well in shadedTest

I assume this is because Flags is already initialized when other tests are run or ServerExtension is constructed.

}

@Test
void shouldWaitForDisconnectByClientSideFirst() throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question) I didn't understand the intention of this test. If the functionality isn't very different from shouldDelayDisconnectByServerSideIfClientDoesNotHandleConnectionClose, what do you think of just removing it?

Copy link
Contributor Author

@dachshu dachshu May 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shouldWaitForDisconnectByClientSideFirst test checks if the server sends the Connection: close header and waits for the client to close the connection first, rather than closing the connection immediately. It additionally checks if the server closes the connection after the client closes it.
On the other hand, the shouldDelayDisconnectByServerSideIfClientDoesNotHandleConnectionClose test verifies that if the client receives the Connection: close header but ignores it without attempting to close the connection, and the close connection task scheduled by the server executes well waiting for the set time on the sever side before closing the connection.
Therefore, I wrote two tests because I think the purpose of the two tests and what they verify are different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I think I missed that point in the last review cycle. Thanks for the explanation 👍

Comment on lines 68 to 69
final short localPort = (short) random.nextInt(Short.MAX_VALUE + 1);
try (Socket socket = new Socket("127.0.0.1", server.httpPort(), null, localPort)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's 1) overkill to check the socket state as this is out of our control 2) binding to a random socket introduces unnecessary flakiness.

Instead of testing the OS's behavior of managing TCP states, what do you think of sticking to testing the behavior of the server-side closing the connection? (So what do you think of removing the localPort related logic?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the purpose of this PR is to prevent the server's socket state from getting in TIME_WAIT by waiting for the client to close the connection first. Therefore, I thought it was necessary to check which side closed the connection, and as a way to check this behaviour I thought it was meaningful to try to reuse the client socket and check which side performed the close first through immediate reusability.
In order to try to reuse the client socket, it is necessary to test by setting the port number directly instead of using the port number assigned by the OS, so I implemented the test by setting the localPort number.
I agree with the risk of using a random port number as you mentioned, and I found there is a function socket.getLocalPort(). So I fixed it to reuse the port number assigned from the OS.
However, if checking the socket state like this is unnecessary, I would remove the part about checking if the socket is in TIME_WAIT or CLOSE state through socket reuse!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as the test consistently passes it is OK to keep the behavior 👍

@jrhee17
Copy link
Contributor

jrhee17 commented Apr 30, 2024

Went through the code, and I think the issue is the following:

Overall, I think it's probably fine to just close the connection immediately if we can't write the 413 response headers. This isn't a perfect solution, but at least preserves backwards compatibility.

  • We can check encoder.isResponseHeadersSent(id, 1) at the following location, and close the channel immediately if the response headers have already been written.
  • Otherwise the 413 with Connection: close will be written, so we can preserve the previous behavior.

@dachshu
Copy link
Contributor Author

dachshu commented May 6, 2024

#5616 (comment)

When also I analysed the reason for the failure of this test, I found that the response is sent immediately at 404. And when the content size exceeds maxRequestLength while reading the rest of the request content, the Connetion: close header is not sent and the server closes the connection. The problem was that the connection discarding flag was set and the will be closed soon was reused in the next test. Because this PR does not check if a response has already been sent, but just waits for the set time before closing the connection, the client didn't know that the connection was about to be closed and reused it causing the connection to be closed during the test.
To solve this, I was trying to figure out how to make it fail fast, as I think the correct behaviour is to ignore the content if it's a 404.
If I knew the length of the content, it shouldn't be a problem. But I thought there might be an issue with using chunked transfer encoding.
I also wondered if it would be better to just set discarding to true and read the request to the end and send the response, since even in the 404, the server reads all the content sent by the client anyway.
When I thought about the method you advised, it might be a problem when it takes a long time to reach the content max length so the same phenomenon can occur like in the test. It would look like the connection is suddenly closed from the client's point of view.
However, I will modify it to ensure backwards compatibility as you advised and consider a better way as another issue.
Thanks for your advice 🙇

@dachshu dachshu force-pushed the ISSUE-4849-close-connection-after-delay branch from fd7e68e to bbc35b7 Compare May 7, 2024 11:35
Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for merging to me. Thanks for tackling this difficult issue @dachshu! 🙇 👍 🙇

@@ -121,7 +121,11 @@ private static void urlPathAssertion(HttpStatus expected, String path) throws Ex
try (Socket s = new Socket(NetUtil.LOCALHOST, server.httpPort())) {
s.setSoTimeout(10000);
s.getOutputStream().write(requestString.getBytes(StandardCharsets.US_ASCII));
assertThat(new String(ByteStreams.toByteArray(s.getInputStream()), StandardCharsets.US_ASCII))
final BufferedReader in = new BufferedReader(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I think I missed this point. I think it's fine to leave as-is then. Thanks for the explanation 👍

I tried to adjust the defaultHttp1ConnectionCloseDelayMillis using @SetSystemProperty or System.setProperty() to reduce the test time, but it didn't work well in shadedTest

I assume this is because Flags is already initialized when other tests are run or ServerExtension is constructed.

Comment on lines 119 to 135
final BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
assertThat(in.readLine()).isEqualTo("HTTP/1.1 200 OK");

String line;
boolean hasConnectionClose = false;
while ((line = in.readLine()) != null) {
if ("connection: close".equalsIgnoreCase(line)) {
hasConnectionClose = true;
}
if (line.isEmpty() || line.contains(":")) {
continue;
}
if (line.startsWith("OK")) {
break;
}
}
assertThat(hasConnectionClose).isTrue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Suggested change
final BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
assertThat(in.readLine()).isEqualTo("HTTP/1.1 200 OK");
String line;
boolean hasConnectionClose = false;
while ((line = in.readLine()) != null) {
if ("connection: close".equalsIgnoreCase(line)) {
hasConnectionClose = true;
}
if (line.isEmpty() || line.contains(":")) {
continue;
}
if (line.startsWith("OK")) {
break;
}
}
assertThat(hasConnectionClose).isTrue();
final BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
assertThat(in.readLine()).isEqualTo("HTTP/1.1 200 OK");
in.readLine(); // content-type
in.readLine(); // content-length
in.readLine(); // server
in.readLine(); // date
assertThat(in.readLine()).isEqualToIgnoringCase("connection: close");
assertThat(in.readLine()).isEmpty();
assertThat(in.readLine()).isEqualToIgnoringCase("OK");

Copy link
Contributor Author

@dachshu dachshu May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to read reads each line!

Comment on lines 72 to 87
final BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
assertThat(in.readLine()).isEqualTo("HTTP/1.1 200 OK");

String line;
boolean hasConnectionClose = false;
while ((line = in.readLine()) != null) {
if ("connection: close".equalsIgnoreCase(line)) {
hasConnectionClose = true;
}
if (line.isEmpty() || line.contains(":")) {
continue;
}
if (line.startsWith("OK")) {
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, can we just explicitly check each line?

Suggested change
final BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
assertThat(in.readLine()).isEqualTo("HTTP/1.1 200 OK");
String line;
boolean hasConnectionClose = false;
while ((line = in.readLine()) != null) {
if ("connection: close".equalsIgnoreCase(line)) {
hasConnectionClose = true;
}
if (line.isEmpty() || line.contains(":")) {
continue;
}
if (line.startsWith("OK")) {
break;
}
}
final BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
assertThat(in.readLine()).isEqualTo("HTTP/1.1 200 OK");
in.readLine(); // content-type
in.readLine(); // content-length
in.readLine(); // server
in.readLine(); // date
assertThat(in.readLine()).isEqualToIgnoringCase("connection: close");
assertThat(in.readLine()).isEmpty();
assertThat(in.readLine()).isEqualToIgnoringCase("OK");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored it too 🙇

}

@Test
void shouldWaitForDisconnectByClientSideFirst() throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I think I missed that point in the last review cycle. Thanks for the explanation 👍

assertThatThrownBy(
() -> {
final Socket reuseSock = new Socket("127.0.0.1", server.httpPort(), null, socketPort);
reuseSock.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that we know why this logic exists

Suggested change
reuseSock.close();
// close the socket in case initializing the socket doesn't throw an exception
reuseSock.close();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for good suggestions 🙇

Comment on lines 68 to 69
final short localPort = (short) random.nextInt(Short.MAX_VALUE + 1);
try (Socket socket = new Socket("127.0.0.1", server.httpPort(), null, localPort)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as the test consistently passes it is OK to keep the behavior 👍

@jrhee17
Copy link
Contributor

jrhee17 commented May 8, 2024

However, I will modify it to ensure backwards compatibility as you advised and consider a better way as another issue.

Thanks! Let us know if you would like to try handling this as a follow-up issue.

Copy link
Member

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once @jrhee17's comments are all addressed. Nice job, @dachshu!

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left minor comments.

@@ -1159,4 +1159,20 @@ default Long defaultUnhandledExceptionsReportIntervalMillis() {
default DistributionStatisticConfig distributionStatisticConfig() {
return null;
}

/**
* Returns the default time in milliseconds to wait before closing an HTTP1 connection when a server needs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
* Returns the default time in milliseconds to wait before closing an HTTP1 connection when a server needs
* Returns the default time in milliseconds to wait before closing an HTTP/1 connection when a server needs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for helping to correct 🙇

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left just small suggestions. Thanks!

@dachshu
Copy link
Contributor Author

dachshu commented May 9, 2024

However, I will modify it to ensure backwards compatibility as you advised and consider a better way as another issue.

Thanks! Let us know if you would like to try handling this as a follow-up issue.

Yes, I would like to!

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks good to me! 👍 👍

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, thanks a lot, @dachshu! 👍 👍 👍

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @dachshu! 🙇‍♂️👍

@trustin trustin merged commit 550b931 into line:main May 11, 2024
15 checks passed
Dogacel pushed a commit to Dogacel/armeria that referenced this pull request Jun 8, 2024
Motivation:
- Closes line#4849
- To suppress a server socket from remaining in the `TIME_WAIT` state
instead of `CLOSED` when a connection is closed.

Modifications:
- Remove the force shutdown mode of `AbstractHttpResponseHandler` that
is activated when a user sets the `Connection: close` HTTP header
- Create a schedule that waits and closes the connection to wait for the
client-side connection to close, rather than closing the HTTP/1.1
connection immediately.

Result:
- Closes line#4849
- When a user wants to close an HTTP/1.1 server connection, the armeria
server first waits for the client to close the connection for a while to
make the connection state to `CLOSED` instead of `TIME_WAIT`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Close an HTTP/1.1 connection after delay
6 participants