Proxy freeze on outbound request with long GET request #2859
Comments
@tanuck thanks for the report! we're actively looking into this class of issue, but the repro we've been tracking sounds a bit different.
This is surprising! Can you share specific repro steps? Is there a curl command that can trigger this behavior, for instance?
@olix0r yeah, there seem to be a few issues revolving around CPU/memory spikes, but I couldn't find anything with symptoms similar to what we were seeing. There's an nginx yaml and an example curl command that can reproduce this in the following gist. FYI I've just verified this in
Can you try running
Will do, my hunch from our investigations is that this will resolve it... will give it a go now.
@olix0r yup, that fixes it. Is there a request header that can be set to disable the h2 upgrade?
For now, I'd just leave this in this state (and we may want to change the default on master until we identify the underlying issue). I'd hope the header wouldn't be generally useful outside of the rare case where the h2 codepath has a bug.
I've managed to reproduce this using the provided configuration and done some testing. This definitely seems like it's an h2 issue. The destination proxy logs these messages repeatedly:
While the source proxy just logs:
Here's a gist with a longer section of the logs, starting from when the outbound client connection is established, and a full log dump starting from when the two proxies were initially spun up.
There's an upstream PR hyperium/h2#360 in progress that should fix this issue. Thanks @tanuck for the report and repro steps!
I believe this was fixed in
@tanuck We believe that we've fixed this issue in stable-2.3.1 and edge-19.5.4. Please reopen this issue if you still see issues!
@olix0r yeah, confirmed the fix in
Bug Report
What is the issue?
When making an outbound GET request (other HTTP methods may also be affected) with a long URL, the request blocks and the proxy exhibits an endless memory leak and CPU spike until it freezes permanently. This usually results in pod restarts because health checks cannot reach the application container.
How can it be reproduced?
We originally reproduced this by rendering a Grafana chart with a long Prometheus query (>6000 characters). This caused the proxy to freeze when Traefik forwarded the request to the Grafana pod.
You can also exec into any meshed pod and perform an outbound HTTP request to another pod with an arbitrarily large URL (~6 kB), as sketched below.
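For concreteness, a minimal sketch of that second repro path is below. The pod, container, and service names are placeholders (not the ones from the gist above); any HTTP/1.1 service reachable through the mesh should do.

```sh
# Build a query string of roughly 6 kB.
LONG_QUERY=$(head -c 6000 /dev/zero | tr '\0' 'a')

# From inside any meshed pod, issue an outbound GET with the oversized URL.
# <meshed-pod>, <app-container>, and the nginx service name are placeholders.
kubectl exec -it <meshed-pod> -c <app-container> -- \
  curl -sv "http://nginx.default.svc.cluster.local/?q=${LONG_QUERY}"
```

On the affected proxy versions this request was reported to hang while the proxy's CPU and memory usage climb; on the fixed releases it completes normally.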
Logs, error output, etc.
No logs are produced at the default log levels.
linkerd check output
Environment
Possible solution
Additional context
If there is some maximum acceptable query/URL length somewhere in the Rust stack, the proxy should at least handle it gracefully back to the client rather than freeze.