fix: auto retry once for connection closed #3426

Merged
merged 8 commits into from
Feb 6, 2024

Conversation

inventvenkat
Contributor

@inventvenkat inventvenkat commented Jan 23, 2024

Type of Change

  • Bugfix
  • New feature
  • Enhancement
  • Refactoring
  • Dependency updates
  • Documentation
  • CI/CD

Description

The hyper crate has a bug that occasionally lets a request be sent on a dead pooled connection, so the request fails to reach the server. The hyper issue can be tracked at the link below.
Error: IncompleteMessage: connection closed before message completed - Link

Temporary fix: when a call fails with this error, retry once by sending the request to the server again.

Permanent fix: upgrade once the reqwest crate releases a stable version built on hyper 1.0.

Another Link
This is just due to the racy nature of networking. hyper has a connection pool of idle connections, and it selected one to send your request. Most of the time, hyper will receive the server’s FIN and drop the dead connection from its pool. But occasionally, a connection will be selected from the pool and written to at the same time the server is deciding to close the connection. Since hyper already wrote some of the request, it can’t really retry it automatically on a new connection, since the server may have acted already
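
The temporary fix described above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the code in this PR: `send_with_single_retry` and the `Other` variant are hypothetical names, and the real call path is async and clones the request up front. Only the `ConnectionClosed` variant name is taken from the change itself.

```rust
/// Hypothetical stand-in for the client error type; only the variant name
/// `ConnectionClosed` comes from the PR.
#[derive(Debug, PartialEq)]
pub enum ApiClientError {
    ConnectionClosed,
    Other(String),
}

/// Calls `send` once and, only if it fails with `ConnectionClosed`,
/// calls it exactly one more time. Every other outcome is returned as-is,
/// so a second `ConnectionClosed` is surfaced to the caller.
pub fn send_with_single_retry<T, F>(mut send: F) -> Result<T, ApiClientError>
where
    F: FnMut() -> Result<T, ApiClientError>,
{
    match send() {
        Err(ApiClientError::ConnectionClosed) => send(),
        other => other,
    }
}
```

Because the second attempt is not wrapped in another retry, the helper performs at most two sends per call.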

Additional Changes

  • This PR modifies the API contract
  • This PR modifies the database schema
  • This PR modifies application configuration/environment variables

Motivation and Context

A few external API calls fail because of this connection-closed error.

How did you test it?

By running the test case collections.

Checklist

  • I formatted the code cargo +nightly fmt --all
  • I addressed lints thrown by cargo clippy
  • I reviewed the submitted code
  • I added unit tests for my changes where possible
  • I added a CHANGELOG entry if applicable

@inventvenkat inventvenkat changed the title from "fix: add keepalive while idle configurations for reqwest" to "fix: auto retry once for connection closed" on Jan 29, 2024
Member

@NishantJoshi00 NishantJoshi00 left a comment

Other than that, looks good to me

Comment on lines 667 to 684
Err(error) => {
    if error.current_context() == &errors::ApiClientError::ConnectionClosed {
        metrics::AUTO_RETRY_CONNECTION_CLOSED.add(&metrics::CONTEXT, 1, &[]);
        match cloned_send_request {
            Some(cloned_request) => {
                metrics_request::record_operation_time(
                    cloned_request,
                    &metrics::EXTERNAL_REQUEST_TIME,
                    &[metrics_tag],
                )
                .await
            }
            None => Err(error),
        }
    } else {
        Err(error)
    }
}
Member

Suggested change
- Err(error) => {
-     if error.current_context() == &errors::ApiClientError::ConnectionClosed {
-         metrics::AUTO_RETRY_CONNECTION_CLOSED.add(&metrics::CONTEXT, 1, &[]);
-         match cloned_send_request {
-             Some(cloned_request) => {
-                 metrics_request::record_operation_time(
-                     cloned_request,
-                     &metrics::EXTERNAL_REQUEST_TIME,
-                     &[metrics_tag],
-                 )
-                 .await
-             }
-             None => Err(error),
-         }
-     } else {
-         Err(error)
-     }
- }
+ Err(error) if error.current_context() == &errors::ApiClientError::ConnectionClosed => {
+     metrics::AUTO_RETRY_CONNECTION_CLOSED.add(&metrics::CONTEXT, 1, &[]);
+     match cloned_send_request {
+         Some(cloned_request) => {
+             metrics_request::record_operation_time(
+                 cloned_request,
+                 &metrics::EXTERNAL_REQUEST_TIME,
+                 &[metrics_tag],
+             )
+             .await
+         }
+         None => Err(error),
+     }
+ },
+ err @ Err(_) => err
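
For readers unfamiliar with the pattern in this suggestion, here is a minimal, self-contained sketch of why the two shapes behave the same. The `ClientError` enum and the `nested` / `guarded` functions are hypothetical and heavily simplified (no metrics, no async, no optional cloned request); they only illustrate moving the `if` into a match guard and passing every other error through with an `@` binding.

```rust
#[derive(Debug, PartialEq)]
enum ClientError {
    ConnectionClosed,
    Timeout,
}

/// Original shape: a nested `if` inside the `Err` arm.
fn nested<R>(result: Result<u32, ClientError>, retry: R) -> Result<u32, ClientError>
where
    R: FnOnce() -> Result<u32, ClientError>,
{
    match result {
        Ok(value) => Ok(value),
        Err(error) => {
            if error == ClientError::ConnectionClosed {
                retry()
            } else {
                Err(error)
            }
        }
    }
}

/// Suggested shape: the condition becomes a match guard, and the
/// remaining errors are returned untouched via `err @ Err(_)`.
fn guarded<R>(result: Result<u32, ClientError>, retry: R) -> Result<u32, ClientError>
where
    R: FnOnce() -> Result<u32, ClientError>,
{
    match result {
        Ok(value) => Ok(value),
        Err(error) if error == ClientError::ConnectionClosed => retry(),
        err @ Err(_) => err,
    }
}
```

The guarded form drops one level of nesting and makes the fall-through case explicit, which is the point of the suggestion.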

@inventvenkat
Copy link
Contributor Author

@jarnura, "Error: IncompleteMessage: connection closed before message completed" is the error hyper reports when the connection is closed before the message could be completed from the client side, which implies that we were not able to send the message to the server.
https://docs.rs/hyper/0.14.28/hyper/struct.Error.html#method.is_incomplete_message
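
As an illustration of how such an error could be detected when it is wrapped by a higher-level client error, here is a minimal sketch that walks the `source()` chain and downcasts each link. `TransportError` and `ClientError` are hypothetical stand-ins; with the real crates one would presumably downcast to hyper's error type and call the `is_incomplete_message` method linked above, but that wiring is an assumption and is not shown here.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for a low-level transport error.
#[derive(Debug)]
struct TransportError {
    incomplete_message: bool,
}

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "transport error")
    }
}

impl Error for TransportError {}

/// Hypothetical stand-in for a higher-level client error that wraps it.
#[derive(Debug)]
struct ClientError {
    source: TransportError,
}

impl fmt::Display for ClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "client error")
    }
}

impl Error for ClientError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Walks the error chain and reports whether any link is a
/// `TransportError` flagged as an incomplete message.
fn is_connection_closed(err: &(dyn Error + 'static)) -> bool {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if let Some(transport) = e.downcast_ref::<TransportError>() {
            if transport.incomplete_message {
                return true;
            }
        }
        current = e.source();
    }
    false
}
```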

metrics::AUTO_RETRY_CONNECTION_CLOSED.add(&metrics::CONTEXT, 1, &[]);
match cloned_send_request {
    Some(cloned_request) => {
        metrics_request::record_operation_time(
Member

Here we retry on every connection-closed error, but for non-idempotent requests this can cause business-level errors, and auditing such cases is complex.
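
One possible way to address this concern, sketched under assumptions and not something this PR implements, is to gate the retry on the HTTP method. The `Method` enum, `is_idempotent`, and `should_retry` below are hypothetical names; the set of idempotent methods follows the HTTP semantics specification (TRACE is omitted for brevity).

```rust
/// Hypothetical subset of HTTP methods used by the client.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Method {
    Get,
    Head,
    Put,
    Delete,
    Options,
    Post,
    Patch,
}

/// Methods that HTTP semantics define as idempotent: repeating the
/// request has the same intended effect as sending it once.
pub fn is_idempotent(method: Method) -> bool {
    matches!(
        method,
        Method::Get | Method::Head | Method::Put | Method::Delete | Method::Options
    )
}

/// Hypothetical stand-in for the client error type.
#[derive(Debug, PartialEq)]
pub enum ApiClientError {
    ConnectionClosed,
    Other,
}

/// Retry only when the connection was closed AND replaying the request is safe.
pub fn should_retry(method: Method, error: &ApiClientError) -> bool {
    *error == ApiClientError::ConnectionClosed && is_idempotent(method)
}
```

With a gate like this, a POST that fails with a closed connection would surface the error instead of being replayed, at the cost of not recovering those calls automatically.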

Contributor Author

NishantJoshi00
NishantJoshi00 previously approved these changes Feb 5, 2024
jarnura
jarnura previously approved these changes Feb 5, 2024
@Narayanbhat166
Member

@inventvenkat instead of doing this, what if we do not have any idle_pools? AFAIK the error is caused mainly by the idle pools. Would we take any significant performance hit without idle pools?

@Gnanasundari24 Gnanasundari24 added this pull request to the merge queue Feb 6, 2024
Merged via the queue into main with commit 94e9b26 Feb 6, 2024
10 of 12 checks passed
@Gnanasundari24 Gnanasundari24 deleted the fix/connection_closed branch February 6, 2024 09:29
@inventvenkat
Contributor Author

@inventvenkat instead of doing this, what if we do not have any idle_pools? AFAIK the error is caused mainly by the idle pools. Would we take any significant performance hit without idle pools?

Disabling the timeout via pool_idle_timeout didn't help in this case.

Development

Successfully merging this pull request may close these issues.

Error: IncompleteMessage: connection closed before message completed
7 participants