wamp.close.transport_lost [WAMP transport was lost without closing the session before] #514
Comments
Which version of labgrid is your custom fork based on?
Your log shows the exporter logfile, can you print the coordinator log file around the same timestamps? It would be interesting to see whether this is the coordinator dropping the exporter due to a ping timeout. It would also be relevant to know what kind of workloads are run on your exporter; maybe we still have a bottleneck somewhere in the exporter implementation which prevents timely responses to the coordinator pings.
We have 6 exporters, only 4 of which were active at the time the log was gathered. I had to take the logs from today and replace the real workstation/exporter/coordinator names and IPs with dummy ones for security reasons. These are the new names you can search for:
See attached the log, with the exporter part at the beginning and the coordinator part at the end. They are on different machines (I just concatenated the logs).
Your coordinator log is off by an hour. It might be easier to search for lines like:
in the exporter log yourself.
Yes, I forgot to mention that the coordinator machine is in a different timezone, which is why it is one hour off. I have seen "kicking exporter" in the coordinator log only once, but that was 2 days ago and for another exporter.
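For anyone lining up the two logs by hand, a minimal sketch of shifting the coordinator timestamps by a fixed offset may help. It assumes each log line starts with a timestamp of the form `YYYY-MM-DD HH:MM:SS`; that layout, and the names `shift_line` and `shift_log`, are assumptions for illustration and not something labgrid or crossbar provides. Lines without a leading timestamp are passed through unchanged.

```python
import re
from datetime import datetime, timedelta

# Assumed (hypothetical) timestamp layout at the start of each log line.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def shift_line(line, offset):
    """Return the line with its leading timestamp moved by offset."""
    match = TS_RE.match(line)
    if match is None:
        # No leading timestamp (e.g. a traceback line): leave it alone.
        return line
    shifted = datetime.strptime(match.group(0), TS_FORMAT) + offset
    return shifted.strftime(TS_FORMAT) + line[match.end():]


def shift_log(lines, hours=1):
    """Shift every timestamped line by the given number of hours.

    Use a negative value if the log is ahead rather than behind.
    """
    offset = timedelta(hours=hours)
    return [shift_line(line, offset) for line in lines]
```

After shifting, the two logs can be concatenated and sorted by their leading timestamp to compare events around the same moment.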
This indicates that crossbar stopped responding to low-level WebSocket pings from the exporter. This could be caused by network issues or an unresponsive crossbar process. Note that the coordinator is not involved in answering these WebSocket pings from exporters. We've merged some stability fixes for the communication between exporters and coordinator in PR #547. It doesn't look like your issues are exactly like the symptoms in that case, but it would be useful to test anyway. If you still get WebSocket ping timeouts, it could be useful to check with Wireshark whether the TCP connection still ACKs the WebSocket ping request or not...
Hi, we will try the patch at our next rebase. For now, we have worked around the issue by adding a sleep of 2 minutes to a loop that was checking a file from the exporter; this somehow solved the problem.
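For readers who hit the same problem, the workaround described here amounts to polling at a slower rate. A rough sketch, assuming the check is wrapped in a callable: `poll_until` and its parameters are hypothetical names, and how the file on the exporter is actually checked is not shown in this thread, so `check` stands in for it.

```python
import time


def poll_until(check, interval_s=120.0, max_checks=None, sleep=time.sleep):
    """Call check() repeatedly, sleeping interval_s seconds between attempts.

    check      -- hypothetical callable returning True once the file is ready
    interval_s -- pause between attempts (120 s mirrors the 2-minute sleep)
    max_checks -- optional cap on attempts; None means poll forever
    sleep      -- injectable for testing; defaults to time.sleep
    """
    attempts = 0
    while max_checks is None or attempts < max_checks:
        if check():
            return True
        attempts += 1
        # Pausing here keeps the loop from issuing requests back to back.
        sleep(interval_s)
    return False
```

A longer interval means fewer requests per hour going through the exporter, which fits the earlier suggestion that the exporter may not answer pings in time under load, although the thread does not confirm that this is why the sleep helped.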
Closing, I expect this to be fixed either by the update or a local workaround. Ping me for a reopen if required.
Hi,
We have multiple exporters running as services and they often fail (at least 2-3 times per day) with the error below. Is there any specific setting or configuration that we can use to avoid these failures?