ingress: Add failfast to the forwarder #1035

Merged on Jun 7, 2021 (1 commit)

@olix0r olix0r commented Jun 7, 2021

The ingress-mode proxy's forwarding stack--used when a request does not
set the `l5d-dst-override` header--has no failfast implementation. This
means that when a connection can't be obtained for the endpoint,
requests are buffered indefinitely.

This change adds a failfast layer so that these requests fail eagerly
after 3s of unavailability, causing the server-side connection to be
dropped (so that the application client may re-resolve the endpoint).
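
To illustrate the behavior described above, here is a minimal, standard-library-only sketch of a failfast gate. It is not the proxy's actual layer (which is built on the proxy's own service stack); the names `FailFast`, `Decision`, and `poll` are hypothetical and exist only for this example. The gate records when the inner service first became unavailable and starts failing requests once that state has lasted for the configured window (3s in this change).

```rust
use std::time::{Duration, Instant};

/// What the gate tells the caller to do with a pending request.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// The inner service is ready: dispatch the request.
    Forward,
    /// The inner service is not ready, but not for long enough to fail yet.
    Wait,
    /// The inner service has been unavailable for the whole window: fail now.
    FailFast,
}

/// Tracks how long the inner service has been continuously unavailable.
pub struct FailFast {
    max_unavailable: Duration,
    unavailable_since: Option<Instant>,
}

impl FailFast {
    pub fn new(max_unavailable: Duration) -> Self {
        Self {
            max_unavailable,
            unavailable_since: None,
        }
    }

    /// Decide what to do at time `now`, given the inner service's readiness.
    pub fn poll(&mut self, inner_ready: bool, now: Instant) -> Decision {
        if inner_ready {
            // Any readiness clears the unavailability timer.
            self.unavailable_since = None;
            return Decision::Forward;
        }
        // Remember the first moment the inner service was seen unavailable.
        let since = *self.unavailable_since.get_or_insert(now);
        if now.saturating_duration_since(since) >= self.max_unavailable {
            Decision::FailFast
        } else {
            Decision::Wait
        }
    }
}
```

With a 3s window, a gate that first sees the endpoint unavailable at t=1s keeps returning `Wait` until t=4s, returns `FailFast` from then on, and goes back to `Forward` as soon as the endpoint is ready again. In the proxy, the failfast error is what causes the server-side connection to be dropped.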

This is really a temporary solution. We should probably avoid
implementing reconnection at all in this case so that connection errors
can be used in place of failfast errors.

Related to linkerd/linkerd2#6184

@olix0r olix0r requested a review from a team June 7, 2021 17:44

@hawkw hawkw left a comment

👍 let's get this in as a stopgap solution!

@olix0r olix0r merged commit 8d9c8df into main Jun 7, 2021
@olix0r olix0r deleted the ver/ingress-forward-failfast branch June 7, 2021 22:24
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Jun 9, 2021
This release fixes a problem with the HTTP body buffering that was added
to support gRPC retries. The proxy would buffer all request bodies,
regardless of size or retry configurations. This has been fixed so that
only requests with a retry configuration are buffered (and only when
their bodies are less than 64KB).
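
The buffering rule above can be read as a simple predicate. The sketch below is only an illustration of that rule, not the proxy's code: `should_buffer` and `MAX_BUFFERED_BYTES` are hypothetical names, "64KB" is assumed to mean 64 * 1024 bytes, and treating a body of unknown length as not bufferable is an assumption made for this example.

```rust
/// Assumed cap: "64KB" interpreted as 64 * 1024 bytes.
pub const MAX_BUFFERED_BYTES: u64 = 64 * 1024;

/// Returns true when a request body should be buffered for a possible retry.
///
/// `body_len` is the declared body length, if known. In this sketch an
/// unknown length is conservatively treated as not bufferable (assumption).
pub fn should_buffer(has_retry_config: bool, body_len: Option<u64>) -> bool {
    match (has_retry_config, body_len) {
        // Only retryable requests with a body under the cap are buffered.
        (true, Some(len)) => len < MAX_BUFFERED_BYTES,
        // No retry configuration, or unknown length: never buffer.
        _ => false,
    }
}
```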

This release also fixes an issue with the outbound ingress-mode proxy
where forwarded HTTP traffic could fail to detect when the target pod
was deleted, retrying connections forever. This only impacted traffic
forwarded directly to pod IPs (and not load balanced services). This has
been fixed temporarily by adding a failfast layer that triggers 502
errors when the endpoint has disconnected, which causes the connection
to be torn down so that the ingress may reconnect. A more robust
solution will replace this in the future.

Furthermore, core dependencies have been updated including: futures,
hyper, socket2, and tokio.

---

* Fix MacOS conditional build in telemetry::process (linkerd/linkerd2-proxy#1023)
* deps: update `futures` to 0.3.15 (linkerd/linkerd2-proxy#1022)
* tracing: Split HTML-formatting into admin module (linkerd/linkerd2-proxy#1025)
* tracing: Simplify initialization (linkerd/linkerd2-proxy#1026)
* Replace linkerd-drain with drain from crates.io (linkerd/linkerd2-proxy#1027)
* app: Move the admin server into a subcrate (linkerd/linkerd2-proxy#1028)
* inbound: Simplify protocol-detection skipping (linkerd/linkerd2-proxy#1031)
* proxy-api: Update proxy-api to use the main branch (linkerd/linkerd2-proxy#1029)
* outbound: don't double-wrap replay bodies (linkerd/linkerd2-proxy#1036)
* ingress: Add failfast to the forwarder (linkerd/linkerd2-proxy#1035)
* Update tokio, hyper, and socket2 (linkerd/linkerd2-proxy#1037)
* Implement reconnect as a NewService (linkerd/linkerd2-proxy#1032)
* Introduce the tonic-watch crate (linkerd/linkerd2-proxy#1034)
* service-profiles: Wrap receiver types (linkerd/linkerd2-proxy#1038)
* retry: only wrap bodies when a request can be retried (linkerd/linkerd2-proxy#1039)
Pothulapati pushed a commit to linkerd/linkerd2 that referenced this pull request Jun 10, 2021