slow initial connection #1200
Comments
I don't think I have ever encountered this on Chrome |
Try FF/VDI, the lag can be up to one minute. |
I've replicated this with offline mode on the Vue 3 branch :( |
Assuming this is just an issue on VDI, what could be the cause? |
I'm assuming this is a general issue exacerbated by a laggy platform; you can see multiple connection attempts in the console. |
(What's VDI - virtual desktop?) |
I've done some investigation into this but have come up empty-handed. I've also tried migrating from So far the error has only been reported in RedHat builds of Firefox where it presents like so:
This error persists across multiple retries and may last for over a minute before clearing. It happens randomly on page load; sometimes it connects straight away. |
P.S. I used to see this when developing the UI, but never investigated the cause. Sometimes Firefox would keep showing the red icon at the top, indicating that I was offline, along with those JS messages. Other times it happened just after I opened the browser for the first time. It happened with just that five workflow and with multiple workflows (i.e. probably not related to the amount of data). However, when I used the fake UI server (offline mode, or random tests in Python with pre-made requests/responses) I did not get it. So my guess would be something between the browser, the UIS, and the first XHR and/or WSS request. Maybe setting a debugger in the parts of the UIS graphene or web request handling code that could be busy (handling an old request, waiting for some background task?) would replicate the issue... |
This report on Bugzilla looks very similar:
As does this issue:
There are a bunch of other similar-sounding reports out there with varying levels of detail. Here are some summarised statements pulled out of these reports:
I've replicated the issue with:
|
Issue now replicated on:
Chromium error:
Gecko error:
|
Connection Timeout
After some experimenting, I've seen a couple of cases where connections failed on connection timeout. The timeout starts at 1s and increases with each failure. This diff bumps the default 1s connection timeout up to 10s:
diff --git a/src/graphql/index.js b/src/graphql/index.js
index cc252c96..141ee0ef 100644
--- a/src/graphql/index.js
+++ b/src/graphql/index.js
@@ -74,7 +74,9 @@ export function getCylcHeaders () {
 export function createSubscriptionClient (wsUrl, options = {}, wsImpl = null) {
   const opts = Object.assign({
     reconnect: true,
-    lazy: false
+    lazy: false,
+    minTimeout: 10000, // default 1000
+    // timeout: 300000 // default 30000
   }, options)
   const subscriptionClient = new SubscriptionClient(wsUrl, opts, wsImpl)
   // these are the available hooks in the subscription client lifecycle
This appears to clear the connection timeout issues; however, it looks like a distraction that does not solve the main issue.
Connection closed before it's established
Call stack (note line numbers shifted slightly by diff): onclose first argument
|
Looks like I can reproduce the problem by using a custom network profile in Chrome with latency of ≥ 3 secs! This might explain why you in particular keep seeing this, as you are plagued by bad ping times? |
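To make the interaction between latency and the growing connection timeout concrete, here is a minimal sketch of a timeout that starts at a minimum and increases after each failure. The defaults of 1000 ms and 30000 ms come from the comments in the diff above; the growth factor of 1.2 and the function names are assumptions for illustration only, not taken from the client library.

```javascript
// Models a connection timeout that starts at `min` ms and grows after each
// failed attempt, capped at `max` ms. `factor` is an ASSUMED growth rate.
function * connectTimeouts ({ min = 1000, max = 30000, factor = 1.2 } = {}) {
  for (let attempt = 0; ; attempt++) {
    yield Math.min(max, Math.round(min * Math.pow(factor, attempt)))
  }
}

// Number of attempts that time out before the timeout first exceeds
// `latencyMs`. Returns Infinity if even the `max` timeout is too short.
function failedAttemptsBefore (latencyMs, opts = {}) {
  const max = opts.max ?? 30000
  if (latencyMs >= max) return Infinity
  let failed = 0
  for (const timeout of connectTimeouts(opts)) {
    if (timeout > latencyMs) return failed
    failed += 1
  }
}

// Compare the default minimum with the 10s minimum from the diff.
console.log(failedAttemptsBefore(3000))
console.log(failedAttemptsBefore(3000, { min: 10000 }))
```

Under these assumed numbers, a connection that needs 3 seconds to establish would outlast the first seven timeouts when starting from 1s, and none when starting from 10s. That is consistent with the timeout variant of the problem going away after enough retries, but it says nothing about connections that are closed for other reasons.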
I think that would cause the connection timeout issue I mentioned above which I think presents in the same way, though should go away after enough repeats (timeouts willing). (note ping time issues shouldn't affect connections to local servers). To tell the difference, I've been jamming debugging into This bit handles connection timeout issues:
But this one is the real issue:
|
From testing with Dave on Friday (who seldom notices the issue), we found evidence of this issue in the console more often than not, i.e. the issue hits most of the time, but isn't noticed unless it persists for long enough. Worth checking the console for these messages. |
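For anyone checking their console, a small wrapper like the one below can make the messages easier to read by recording whether each close happened before or after the socket opened, and how long the attempt lasted. This is a sketch only: `watchConnection` is a hypothetical helper, not part of the UI code or the subscription client, and it relies only on the standard `open` and `close` events.

```javascript
// Hypothetical helper: attach to any WebSocket-like EventTarget and report
// whether each close arrived before or after the connection opened.
// `report` and `now` are injectable so the helper can be exercised in tests.
function watchConnection (socket, report = console.warn, now = Date.now) {
  const started = now()
  let opened = false
  socket.addEventListener('open', () => { opened = true })
  socket.addEventListener('close', (event) => {
    report({
      phase: opened ? 'closed-after-open' : 'closed-before-open',
      elapsedMs: now() - started,
      code: event.code,
      reason: event.reason,
      wasClean: event.wasClean
    })
  })
  return socket
}

// Usage in a browser console (wsUrl is whatever URL the UI connects to):
//   watchConnection(new WebSocket(wsUrl))
```

Note that a client-side connection timeout also closes the socket before it opens, so the `phase` field alone does not separate the two failure modes discussed above. The elapsed time is the hint: a close that lands close to the configured timeout is likely a timeout, while a much earlier one is more likely the other problem.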
This comment was marked as off-topic.
From manual inspection, the websocket error code is 1006 which is:
This appears to mean that the connection was closed unexpectedly (no close message), but I'm not sure how it's determining that. Is ping-pong established this early on? Note, the reason field is blank. This exactly matches the response you get when you stop a running server. |
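As background on the 1006 code: in RFC 6455 (section 7.4.1) it is a reserved value that an endpoint must not send in a Close frame. The client reports it locally when the connection ends without a Close frame, so it does not depend on ping-pong being established, and a blank reason is expected. The sketch below labels a close event accordingly; `describeClose` and the lookup table are illustrative names, and the table lists only a few of the defined codes.

```javascript
// A few close codes from RFC 6455 section 7.4.1 (not an exhaustive list).
const CLOSE_CODES = {
  1000: 'Normal Closure',
  1001: 'Going Away',
  1005: 'No Status Received',
  1006: 'Abnormal Closure'
}

// Hypothetical helper: summarise the fields of a close event.
function describeClose ({ code, reason = '', wasClean = false }) {
  return {
    code,
    name: CLOSE_CODES[code] || 'Unknown',
    // 1005 and 1006 are reserved: they never travel in a Close frame and
    // are filled in by the local endpoint, so no peer-supplied reason exists.
    localOnly: code === 1005 || code === 1006,
    reason,
    wasClean
  }
}

// Example: the event described in this comment.
console.log(describeClose({ code: 1006, reason: '', wasClean: false }))
```

Because 1006 is generated on the client side, it cannot by itself show whether the server, a proxy, or the browser ended the connection.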
@oliver-sanders ' in this issue show what I can now easily reproduce locally 🎉 I was reading an article on JS and performance for Autosubmit UI, when I noticed the author commented something interesting about browser settings that control the maximum number of connections to a server. The author also mentioned this being a problem for WebSockets, which is why some people use a service worker to have a single client connecting to the server. Anyhoo, I opened Firefox, then Everything was working fine in the UI, and I started closing my tabs and... eventually it connected. I can't spend much more time on this issue, but I think
Open questions (that I wish I had time to keep digging 🤓)
So if anyone is experiencing this a lot, and if you have a lot of tabs, go to about:config and increase that number (assuming you consider it safe as the server & client memory would be affected). Possible solution:
I don't have this issue as we are not using WebSockets in the Autosubmit UI, so this is as far as my curiosity took me 🙂 , hope that helps. Bruno |
Just commenting (and following) that I see this a lot too. But I haven't seen this error message mentioned, so I'll add it (it always happens after the "can't establish a connection to the server" line).
|
It isn't just if you have many tabs. I have a single tab open and this happens. The page eventually loads, but it is slow. |
Updates
OP
Sometimes it takes a long time for the UI to connect with the backend.
It seems to get there in the end but it often takes in excess of 45 seconds to establish the subscription connection. The issue occurs randomly when the page is loaded and causes the no-connection error box to hang around.
Whilst we are waiting for the connection this message appears multiple times (Firefox):
This suggests multiple failed connection attempts; however, after adding some logging to the UIS (backend server), I can see no evidence of connection attempts (at least none which actually get into the websocket code). The UIS responds quickly once it receives the request.
It's not clear what's causing this. However, it is worth noting that we are using a deprecated framework for subscriptions, so we should try moving to something more modern and see if the problem goes away by itself - #1028.