View Workflow Page Not Receiving Progress Updates #4301
Comments
Would you like to see if you can identify the offending change? |
Sure I'll have a look |
I have a feeling it has something to do with the changes here: 6e3c5be#diff-884cea1f0a354998c12cefc2b8097a881fe6d621fa91c1c549d241d6023fa5b5L42 . In my browser I added a breakpoint here ui/src/app/shared/services/requests.ts and here ui/src/app/shared/services/requests.ts . In 2.10 it seems like the error (where the error is actually undefined) is caught and received by the chained |
What is the UI behaviour? Does it show an error? |
No error in the UI that I can see. It just stays static in the state it was in when the page (workflow-details) loaded. |
Locally I reverted just these changes 6e3c5be#diff-0670537605e4212d6dada675c6498eabe38d45eb0d3594d59c0779905a83f18bR319-R322 and after the first request times out another one is established. I have an nginx proxy set up locally to mimic the ingress in my deployment, with configuration like:
devServer: {
  historyApiFallback: {
    disableDotRule: true
  },
  proxy: {
    "/api": {
      "target": isProd ? "" : "http://localhost:2345",
      "secure": false
    }, |
Can you test with |
@alexec I had to modify my proxy config since the CLI only works with a gRPC connection:
However, the watch still times out after one minute because of nginx's default keep-alive setting of 1 minute. This keep-alive setting is also the reason why the UI request times out. If the request times out, the client should restart the watch, in my opinion.
|
@alexmt any thoughts on what Argo CD does about this? Does it send some kind of keep-alive? |
The way the UI is coded is that it should present an error notice if there is an error, and then automatically reconnect. |
That's what I thought as well, since the EventSource handles that, but it doesn't seem to work. In the event of an upstream timeout it does appear the request ends with a response code of 200, so it's possible the logic re-establishing the connection isn't triggered in this case. However, the same response is returned on 2.10 and it does establish a new request. |
So we think that the request ends with 200, but we do not reconnect like we should. |
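To make the suspected gap concrete, here is a minimal sketch of a watch that treats a clean end of the stream as a reason to resubscribe, not only an error. It is not the code in ui/src/app/shared/services/requests.ts; the names `StreamHandlers`, `Connect`, `watch`, and `maxConnections` are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape of a streaming connection. The transport (for example an
// EventSource wrapper) is assumed to call exactly one of onEnd/onError when
// the stream stops.
interface StreamHandlers {
    onMessage: (data: string) => void;
    onEnd: () => void; // stream closed cleanly, e.g. a proxy timeout that still ends with HTTP 200
    onError: (err?: unknown) => void; // stream failed
}

type Connect = (handlers: StreamHandlers) => void;

// Opens a stream and re-opens it whenever it ends cleanly.
// maxConnections is an illustrative cap so the sketch always terminates.
function watch(
    connect: Connect,
    onMessage: (data: string) => void,
    onError: (err?: unknown) => void,
    maxConnections: number
): {connections: number} {
    const state = {connections: 0};
    const open = (): void => {
        if (state.connections >= maxConnections) {
            return;
        }
        state.connections++;
        connect({
            onMessage,
            onEnd: open, // the key point: a clean end also starts a new watch
            onError // errors are surfaced to the caller instead of retried blindly
        });
    };
    open();
    return state;
}
```

Errors are deliberately handed back to the caller here, so that the decision about whether an error is worth retrying can be made separately.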
Signed-off-by: Alex Collins <alex_collins@intuit.com>
Can you please test |
I've made another change. Can you pull the image again and try again? |
That seems to work! Very neat. Should we make this change for the other EventSource subscribers? |
@terev would you be interested in taking over on this fix? |
I think we should maybe determine if the error is a disconnect (so reconnecting is a good idea) rather than 403 or 500 (which would result in going into a hot-loop). |
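One way to sketch that distinction is shown below. It is only an illustration under assumptions: `isRetryable` and `backoffMs` are hypothetical helpers, not existing functions in the UI, and treating 502/503/504 as transient is an assumption based on the gateway timeout reported for the list page.

```typescript
// Decides whether a terminated watch should be re-established.
// status is undefined when the connection dropped without an HTTP response.
function isRetryable(status?: number): boolean {
    if (status === undefined) {
        return true; // network drop, or an error object that carries no status
    }
    if (status === 200) {
        return true; // stream ended cleanly, e.g. an upstream idle timeout
    }
    // Assumption: gateway errors are transient. 403, 500, and other codes are
    // not retried, to avoid a hot-loop.
    return status === 502 || status === 503 || status === 504;
}

// Exponential backoff with a cap, so repeated disconnects do not hammer the server.
// The default base and cap values are arbitrary placeholders.
function backoffMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
    return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

A caller would check `isRetryable` when the stream stops and, if it returns true, wait `backoffMs(attempt)` milliseconds before opening a new watch, resetting `attempt` once a message arrives.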
Sweet that makes sense |
Signed-off-by: Alex Collins <alex_collins@intuit.com> Signed-off-by: Alex Capras <alexcapras@gmail.com>
Summary
It seems like the view workflow page is no longer continuously receiving events. On 2.10.2 this worked: the progress of the workflow could be viewed without refreshing. The SSE endpoint used to retrieve these events, /workflow-events, seems to be called once; after that request times out, a new watch request is not started. This may not be the only affected SSE endpoint.
Update: This also appears to occur on the workflow list page, where the SSE request eventually terminates with a 504 Gateway Timeout.
Diagnostics
GKE 1.15
On 2.11.4 I've noticed that the workflow graph isn't updating anymore. I upgraded to 2.11.4 from 2.10.2, where this was working.
This has occurred when viewing every in-progress workflow I've tried.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.