-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SSE connections leaked #468
Comments
Ok, looking at the metrics from Prometheus, it does confirm the extension is part of the issue, but it is kinda hard to know where exactly the problem is since VSCode does not provide any way to debug extension requests. Trying to get more info, I set a proxy as suggested from this user in this issue, and from what I can tell, they look normal. The request is being closed/stop when a watched agent gets stopped for example. Talking with @bcpeinhardt maybe it is better, for now, to undo the changes from #440 until we have a better idea on what is happening. Another option, is to use the web socket endpoint instead of SSE and see if it fixes the issue. Thoughts? cc.: @spikecurtis |
My guess is Axios is not closing the connection with the server correctly due to its own internal or because of our custom streaming implementation. I'm starting to think the best move here is to move to the web socket connection as we do in the UI. |
We need to understand the cause of the leak so that we can be sure we have fixed it. Ideally this means writing a unit test that reproduces the issue, then fixing it. If we don't understand the cause we won't know whether swapping SSE for Websocket truly fixes the issue, or just makes it less likely. |
I'm going to post all my investigation findings here so we can keep track on what investigation paths we already tried. |
I went through this part of the code: Lines 112 to 131 in 0d97271
It is where we define a custom stream, but after read the ReadableStream and IncomingMessage docs, our implementation looks correct and the connections should be closed as expected. One thing that I was not able to do in this part of the code was to debug the events. For some reason, only the |
Investigation note: After debugging a bit more I figured out the issue is not on removing the singleton and not with Axios. I tried to use const eventSource = new EventSource(watchUrl.toString(), {
fetch: (input, init) =>
fetch(input, {
...init,
headers: {
...init?.headers,
"Coder-Session-Token": token ?? "",
},
}),
}) Another big change we made was to update the |
Investigation note: After downgrading
We may better need to understand how to implement these agent/dispatcher correctly. |
We leak SSE connections to
/api/v2/workspaceagents/{id}/watch-metadata
in at least the following scenarios, since v1.4.2.I have a strong suspicion that #440 is involved, although possibly not the root cause.
Likely the cause of https://github.com/coder/customers/issues/741
How to reproduce
coderd_api_concurrent_requests
gauge via the Prometheus endpoint on Coder server.Either:
Then:
The text was updated successfully, but these errors were encountered: