Page constantly reloads by itself after a websocket connection fails #14797

Closed
echarlus opened this issue Oct 11, 2022 · 38 comments

Comments

@echarlus

Description of the bug

On some occasions, the browser starts reloading the current page in an endless loop.
The Chrome console shows JS exceptions.
The network activity also shows a recurring POST being made to the server, which results in an exception on the client.
The only way out of this state is to force-reload the whole page.

Expected behavior

The page should not reload forever.

Minimal reproducible example

Unfortunately, as this is fairly random, I do not have a sample to provide.
Here's a screenshot of the console (the image load that results in a 404 is "normal": the page probably tries to refetch this resource, which does not exist in this particular case, each time it is reloaded).

image

logs.zip

Here are logs from the client (chrome console + chrome network activity) and the corresponding tomcat logs on the server side taken when the issue occurred.

Versions

  • Vaadin 23.2.3 with flow 23.2.4 (as I had also faced issue Resynchronizing UI by client's request #14232)
  • Java version: 11
  • OS version: Mac OS 12.6
  • Browser version (if applicable): Chrome 106.0.5249.103
  • Application Server (if applicable): Linux with Tomcat 9.0.65 on Java openjdk version "11.0.16.1" 2022-08-12
  • IDE (if applicable): IntelliJ IDEA 2022.2.2
@caalador
Contributor

Would it be possible to strip away everything from the app so that it still fails with the issue?
From this it would seem like something goes wrong after the 404 for the image.

@echarlus
Author

The 404 for the image does not come from the app itself, since it's a direct URL to a resource embedded in the WAR file: /img/areas/logo_21.png, for example. How could this affect the behavior of the app? I was assuming this type of request is handled directly by the browser and the app server, without any Vaadin interaction. Am I wrong?

@caalador
Contributor

Mainly because I heard of another project where a resync happens, and there it also seems to happen after a failing image download. Though that one seems to log "Gave up waiting for message {} from the server" and then resync.

But without a project to test with, this is a guessing game, and the guesses are not good ones.

@echarlus
Author

I'm assuming there are other conditions involved: depending on the user profile, I have a lot of accounts with the missing image, and they do not experience the issue (or only randomly). I have seen it occasionally with no evident pattern, so it's quite hard to figure out a way to trigger it.
I can make a basic app with an image that fails to load 100% of the time, but one would then still need to trigger the issue, which is another story. I'll try anyway and let you know.

@echarlus
Author

echarlus commented Oct 13, 2022

@caalador I cannot reproduce this on demand, but one of my users reported that the issue occurred today during a presentation.
I checked the logs and found this in both catalina.out and access.log:

10.1.0.5 - - [13/Oct/2022:15:42:09 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 39
10.1.0.5 - - [13/Oct/2022:15:42:11 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 31623
10.1.0.5 - - [13/Oct/2022:15:42:12 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 217
10.1.0.5 - - [13/Oct/2022:15:42:17 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 109182
10.1.0.5 - - [13/Oct/2022:15:42:17 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 39
10.1.0.5 - - [13/Oct/2022:15:42:22 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 106920
10.1.0.5 - - [13/Oct/2022:15:42:24 +0200] "POST /?v-r=heartbeat&v-uiId=2 HTTP/1.1" 200 -
10.1.0.5 - - [13/Oct/2022:15:42:24 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 39
10.1.0.5 - - [13/Oct/2022:15:42:29 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 106795
10.1.0.5 - - [13/Oct/2022:15:42:30 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 39
10.1.0.5 - - [13/Oct/2022:15:42:30 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 137
10.1.0.5 - - [13/Oct/2022:15:42:30 +0200] "POST /?v-r=uidl&v-uiId=2 HTTP/1.1" 200 39

At the same time, in catalina.out :

[http-nio-8080-exec-7] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-10] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
oct. 13, 2022 3:42:23 PM fr.ocell.ovisuserver.ui.MultiIndicatorsChart timeLineRangeChanged
FINE: timeLineRangeChanged for chart fr.ocell.ovisuserver.ui.MultiIndicatorsChart@777e909a min=Wed Oct 05 22:14:36 CEST 2022 max=Thu Oct 13 17:07:24 CEST 2022 changed=true
oct. 13, 2022 3:42:23 PM fr.ocell.ovisuserver.ui.MultiIndicatorsChart timeLineRangeChanged
FINE: timeLineRangeChanged for chart fr.ocell.ovisuserver.ui.MultiIndicatorsChart@71d3168d min=Wed Oct 05 22:21:16 CEST 2022 max=Thu Oct 13 17:07:20 CEST 2022 changed=true
[http-nio-8080-exec-5] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout

The user told me he was on Wi-Fi and that everything else seemed to work fine.
FYI, for this particular user the image that caused a 404 in the logs I sent above was present, so no 404 was returned, and I believe the issue is not linked to that.
So this still happens even with the latest version, and it seems to be random, which makes it a real pain to track down.
Any idea what I could do to gather more context information when this happens?
This is really a killer and should be tracked down and fixed.
Thanks.

@caalador
Contributor

The request and response payloads might help to see whether some component or add-on causes it.
As of now, we don't know whether it is the server that moves the sync id forward or the client that steps wrong, nor what happened just before the resync.
Do you ever get this locally with the applications?

@echarlus
Author

@caalador If you load the .xhr file I provided in the zip referenced in the bug report (see the logs.zip link just below the console screenshot), you'll have the payload details for every request. Unfortunately I do not have them for the latest report I got from a customer, but the capture I did manage to take might help you. It contains the Chrome network panel logs (full request history plus payloads), the Chrome console output, and the Tomcat logs. I hope this information will be enough, because I do not see how I could get more if the issue occurs again.

@caalador
Contributor

The HAR starts with the first resynchronize request.
syncId/clientId: request 21/18 vs response 24/19, then request 24/19 vs response 26/20.
So the syncId is off by multiple steps every time (it is updated several times for a single request), which makes the resync fail.
Do you have automatic push in the application(s)?
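
The comparison above can be sketched in a few lines of Java. This is only an illustration: `SyncIdJumps` and `Exchange` are hypothetical names, not Vaadin classes, and the sketch assumes that in a healthy exchange the syncId in the response is exactly one higher than the syncId the client sent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Vaadin: flags exchanges whose syncId did not advance by one. */
final class SyncIdJumps {

    /** One exchange as read from the HAR: the syncId sent by the client and the syncId in the reply. */
    static final class Exchange {
        final int requestSyncId;
        final int responseSyncId;

        Exchange(int requestSyncId, int responseSyncId) {
            this.requestSyncId = requestSyncId;
            this.responseSyncId = responseSyncId;
        }

        /** How far the server id moved for this single request. */
        int jump() {
            return responseSyncId - requestSyncId;
        }
    }

    /** Returns the exchanges where the id did not advance by exactly one (the assumed healthy step). */
    static List<Exchange> suspicious(List<Exchange> exchanges) {
        List<Exchange> result = new ArrayList<>();
        for (Exchange e : exchanges) {
            if (e.jump() != 1) {
                result.add(e);
            }
        }
        return result;
    }
}
```

Fed with the two pairs quoted above, (21, 24) and (24, 26), both exchanges are flagged, with jumps of 3 and 2 respectively.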

@echarlus
Author

Yes, Push is set to automatic.

caalador added a commit that referenced this issue Oct 14, 2022
Synchronize pwa handler on the
requestHandlerMap instead of
locking the session.

Locking and unlocking session may
fire a push event that might make the
server client sync faulty.

touches #14797
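
As a rough illustration of the idea in this commit message (not the actual Flow change), lazy initialization can be guarded by the map's own monitor instead of a session lock. `HandlerRegistry` and `getOrCreate` are hypothetical names used only for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for a lazily populated handler map; not the actual Flow class. */
final class HandlerRegistry<H> {
    private final Map<String, H> requestHandlerMap = new HashMap<>();

    /**
     * Creates the handler for a path at most once. The map's own monitor
     * guards the check-then-create step, so no session lock is taken here
     * and no unlock side effect (such as a push) can be triggered.
     */
    H getOrCreate(String path, Supplier<H> factory) {
        synchronized (requestHandlerMap) {
            H handler = requestHandlerMap.get(path);
            if (handler == null) {
                handler = factory.get();
                requestHandlerMap.put(path, handler);
            }
            return handler;
        }
    }
}
```

Two calls with the same path return the same instance, and the factory runs only once.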
@caalador
Contributor

Flow 23.2.5 (out with Vaadin 23.2.6) got the pwa change. Does that help with the issue or is it still happening?

@echarlus
Author

Thanks for the update. I'll check that next week and get back to you.

@alexanoid

I have pretty much the same issue with resync and Vaadin 23.2.5: https://stackoverflow.com/questions/74147029/vaadin-23-unable-to-fully-reload-application-after-redeploy

@tiagomartins91

Same in Vaadin 14. After redeploying, resync happens a lot.

@echarlus
Author

echarlus commented Nov 8, 2022

@caalador Unfortunately, the latest Flow version (Vaadin 23.2.6 with Flow 23.2.5) does not fix the issue. One of my customers just experienced it again.
I found this in the logs:
[http-nio-8080-exec-5] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-4] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-7] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-9] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-1] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-6] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
[http-nio-8080-exec-10] WARN com.vaadin.flow.server.communication.ServerRpcHandler - Resynchronizing UI by client's request. A network message was lost before reaching the client and the client is reloading the full UI state. This typically happens because of a bad network connection with packet loss or because of some part of the network infrastructure (load balancer, proxy) terminating a push (websocket or long-polling) connection. If you are using push with a proxy, make sure the push timeout is set to be smaller than the proxy connection timeout
The page reloaded itself in a loop (see video).

reload.mp4

Any other hints on your side as to what could be causing this?

@caalador
Contributor

caalador commented Nov 9, 2022

Based on the previous investigation, we noted that (in this instance) the sync id jumps by 3 at a time for some reason.
It would seem that something increments the internal server id, but on the server that is only done for UIDL creation. I, at least, cannot see what could cause this.

@alexanoid

alexanoid commented Nov 19, 2022

I can also collect logs to help get this issue solved. Which logs are you interested in? I very often see the resync issue when using my app on my iPhone. I use Nginx, with Undertow or Tomcat embedded via Spring Boot.

Please tell me which logs to activate, for example for which Java packages. I'm sure I can reproduce this resync issue quickly.

@mcollovati
Collaborator

I'm not completely sure, but the additional increment may be caused by a PUSH message produced when UIDLRequestHandler unlocks the session.
As reported in #14887, it may happen that the UI is still dirty after the UIDL response has been generated, and when the VaadinSession is unlocked this causes push to be invoked.
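
The suspected sequence can be modelled with a toy class. This is a sketch of the hypothesis only: `ToySession` and its methods are made up for illustration and are not Vaadin API, and the real locking and push logic is considerably more involved.

```java
/** Toy model of the suspected sequence; names are illustrative, not Vaadin API. */
final class ToySession {
    private int serverSyncId;
    private boolean uiDirty;

    ToySession(int initialSyncId) {
        this.serverSyncId = initialSyncId;
    }

    /** Builds a UIDL response: bumps the id once and clears the dirty flag. */
    int createUidlResponse() {
        uiDirty = false;
        return ++serverSyncId;
    }

    /** Something marks the UI dirty again after the response was already built. */
    void markDirty() {
        uiDirty = true;
    }

    /** On unlock, a still-dirty UI is pushed, which bumps the id a second time. */
    void unlock() {
        if (uiDirty) {
            createUidlResponse();
        }
    }

    int serverSyncId() {
        return serverSyncId;
    }
}
```

Starting from id 21, a response followed by `markDirty()` and `unlock()` leaves the server at 23 while the reply to the request carried 22. Without the dirty flag the server stays at 22.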

@echarlus
Author

@mcollovati That would be great news. Is anyone working on issue #14887?

@alexanoid

Today I am seeing this issue a lot:

error

@alexanoid

@mcollovati Taking into account the mentioned issue with a dirty UI, would it make sense to use manual push mode instead of automatic to try to avoid this issue?

@mcollovati
Collaborator

It may be worth a try. At the very least, it could provide some additional information to help pin down the issue.

@alexanoid

Sure! Already reconfigured the system for the manual push. Now, monitoring the logs.

@alexanoid

resync1

Still resyncs :(

caalador added a commit that referenced this issue Nov 28, 2022
Add debug logging for incrementing
server id to be able to see what is
calling it during runtime.

Targets #14797
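
A minimal sketch of this kind of debug logging, assuming java.util.logging and a hypothetical `TracedServerId` wrapper (this is not the code that was added to Flow): each increment logs a throwable at FINE level so the caller's stack trace shows up in the log.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical counter wrapper; illustrates the idea only, not the code added to Flow. */
final class TracedServerId {
    private static final Logger LOG = Logger.getLogger(TracedServerId.class.getName());
    private int serverId;

    /** Increments the id and, when FINE logging is enabled, records who asked for it. */
    synchronized int increment() {
        serverId++;
        if (LOG.isLoggable(Level.FINE)) {
            // A throwable created here carries the stack trace of the caller.
            LOG.log(Level.FINE, "Server id incremented to " + serverId,
                    new Throwable("increment called from"));
        }
        return serverId;
    }

    synchronized int current() {
        return serverId;
    }
}
```

The stack trace is only built when FINE is enabled for this logger, so the extra cost is limited to debugging sessions.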
@mshabarov mshabarov moved this from Needs triage to P1 - High priority in OLD Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 29, 2022
@mshabarov
Contributor

Waiting for a release with the debug improvements. That release can then be put under monitoring to collect more stack traces for us, and with that information we can move forward with fixing the issue.

@mshabarov mshabarov self-assigned this Nov 29, 2022
@jonasrotilli

I reported this issue last July and I am still having it: #14232
In fact, it got worse. I have the latest version (23.2.9), and it happens when I leave the application unused for a few minutes.
Curiously, it happens a lot more on Mac computers. I tested it on more than one Mac and it happened on each of them. On Windows, by contrast, the problem occurs much less frequently.
I tested it in Google Chrome and Safari, with the same behavior in both. If I open the browser's log, it shows an error:

(in promise) Error: Client is resynchronizing at FlowClient.42a5821f.js:1:42438

We need an urgent solution for this: the situation is becoming unsustainable, and many customers are complaining.

@jonasrotilli

For me it was resolved with version 23.2.13.

@mcollovati
Collaborator

@echarlus @alexanoid Did you try the latest Vaadin version with the fix for #14887? Did it solve the issues?

@echarlus
Author

Since I upgraded to the 23.3 version that includes the fix, I have not had any issues. I'm now testing 23.3.6, in which I see there are more changes to push and event management on the server side that are supposed to prevent some UI resync events. I hope it will not create new issues. I'll keep you posted.

@mcollovati
Collaborator

@echarlus Do you have any updates? Did the changes fix the issue for you?

@echarlus
Author

So far everything seems to be OK; I've not had the problem since 23.2.5.

@mcollovati
Collaborator

Thank you for the feedback. I am closing the issue, as the problem seems to be resolved.
Please open another ticket if the problem happens again.

OLD Vaadin Flow bugs & maintenance (Vaadin 10+) automation moved this from P1 - High priority to Closed Apr 28, 2023