Event stream cannot always be opened #6338
Comments
"Last activity" uses the application-level event stream, whereas viewing an end device's events opens a separate end device event stream. I remember an earlier issue where the event stream would not open if too many connections were already open or pending. IIRC, this happened because event streams were not closed after leaving the live data view. A regression there could be one possible cause. I'll check to see if we can rule this out.
That's likely. It seems also that there are only six concurrent connections allowed per browser. So if we already consume two (application + device) per device tab, this quickly adds up, not even considering what the other tabs keep open. In any case, I really think we should switch to websockets for the Console which does not fall under our API compatibility commitment, and recommend against using the SSE endpoint for use in browsers. This is not about being right or wrong, about who to blame or about fixing this for a particular browser/tabs open/network combination. This is simply about improving user experience and saving support time on our side. I discussed this offline with @adriansmares, mentioning him to share his thoughts himself for the record. I don't think we should escalate this now to endpoints that the Console uses instead of gRPC gateway directly. The omitted fields issue is very annoying too, not just for the Console, but for everyone, not only when using our gRPC API but also in webhooks and MQTT. We never wanted to touch this because of gogoproto, but now we're cleared to gradually and incrementally improve our developer experience on this front. This is not particular to the Console, so let's focus on a websockets event stream. As websockets are bidirectional, we can let the browser send "request" messages to subscribe and unsubscribe from events, filtering event names and/or verbose mode. This would allow us to maintain one websocket connection and multiplex entity events, as long as the Console correctly unsubscribes when the user is navigating away. Backend wise this would mean multiple event subscriptions per websocket connection that are dynamically created and released, and events are all funneled in JSON over the websocket connection. For background, this is really painful for customers and very hard to debug. Example user report from earlier this week:
And then a screenshot showing events that just stop with a warning and don't recover.
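To make the multiplexing proposal above concrete, here is a minimal sketch of the per-connection bookkeeping it implies: the browser sends "subscribe" and "unsubscribe" messages, and the backend keeps several dynamically created subscriptions per connection. The message shapes, field names and the `ConnectionSubscriptions` class are all hypothetical and are not part of The Things Stack API; the socket itself is left out so that only the subscription logic is shown.

```typescript
// Hypothetical message shapes; none of these names come from The Things Stack.
type ClientMessage =
  | { type: "subscribe"; id: number; identifiers: string[]; names?: string[] }
  | { type: "unsubscribe"; id: number };

// Hypothetical, simplified event: a name plus the entity it belongs to.
interface StackEvent {
  name: string;
  identifier: string;
}

interface Subscription {
  identifiers: Set<string>;
  // undefined means "no name filter": every event name matches.
  names: Set<string> | undefined;
}

// Bookkeeping for ONE connection that carries many subscriptions.
class ConnectionSubscriptions {
  private subs = new Map<number, Subscription>();

  // Apply a request message sent by the browser.
  handle(msg: ClientMessage): void {
    if (msg.type === "subscribe") {
      this.subs.set(msg.id, {
        identifiers: new Set(msg.identifiers),
        names: msg.names ? new Set(msg.names) : undefined,
      });
    } else {
      // Unsubscribing releases the subscription, e.g. when the user navigates away.
      this.subs.delete(msg.id);
    }
  }

  // Return the ids of the subscriptions that should receive this event.
  match(event: StackEvent): number[] {
    const ids: number[] = [];
    this.subs.forEach((sub, id) => {
      if (!sub.identifiers.has(event.identifier)) return;
      if (sub.names && !sub.names.has(event.name)) return;
      ids.push(id);
    });
    return ids;
  }

  get size(): number {
    return this.subs.size;
  }
}
```

In this sketch each matched id would be attached to the JSON frame sent over the single connection, so the client can route the event to the right view. If the client forgets to unsubscribe, the subscription stays in the map, which is the same class of leak discussed in this thread.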
Please note that the current event stream is based on HTTP2 and all of the streams go via the same physical connection - there is multiplexing already in HTTP2, and everything goes via a single TLS connection. You can test this today by opening more than 6 tabs with different end devices (or even the same one) and observing that you still receive the traffic. The 6 connection limit is not relevant for the issue that we are having right now. If we move to WebSockets, we cannot use HTTP2 (there are no WebSockets in HTTP2) and then we really have at most 6 tabs open at the same time, because a WebSockets connection is really one physical TLS connection. The tradeoff between WebSockets and HTTP2 long polling is not as trivial as we are making it look here. I still believe that we are mishandling the streams in the Console and this causes issues. We've been using these event streams for years, but have only recently started to receive reports regarding the missing events or frozen streams. The problem is real, but I don't think that refactoring this to WebSockets is the solution that we should be rushing towards.
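The stream handling mentioned in this thread (closing event streams when the user leaves the live data view) can be sketched as follows. This is an illustration only and not the Console's actual code: `StreamRegistry` and the stream keys are hypothetical. It relies only on the standard `AbortController`, whose signal can be passed to `fetch(url, { signal })` so that aborting the controller cancels the request.

```typescript
// Hypothetical helper: keeps one AbortController per open event stream.
class StreamRegistry {
  private controllers = new Map<string, AbortController>();

  // Register a stream and return the signal to pass to fetch(url, { signal }).
  // Re-opening an existing key aborts the previous stream first.
  open(key: string): AbortSignal {
    this.close(key);
    const controller = new AbortController();
    this.controllers.set(key, controller);
    return controller.signal;
  }

  // Abort and forget a single stream, e.g. when its view is left.
  close(key: string): void {
    const controller = this.controllers.get(key);
    if (controller) {
      controller.abort();
      this.controllers.delete(key);
    }
  }

  // Abort every stream that is still open.
  closeAll(): void {
    this.controllers.forEach((controller) => controller.abort());
    this.controllers.clear();
  }

  get openCount(): number {
    return this.controllers.size;
  }
}
```

A view would call `open("device")` when it starts streaming and `close("device")` in its teardown hook. Watching `openCount` while navigating between views is one cheap way to check whether streams are being left open.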
This might have been resolved via #6387. @johanstokking can you check if you can still recreate this?
Good, let's close this and I'll reopen if I encounter this again.
Summary
The event stream cannot always be opened and/or it gets aborted, so that the live traffic view does not work.
Steps to Reproduce
Unfortunately, I do not have clear reproduction steps. It has happened multiple times now.
So far I only encountered this in the end device live traffic view, being unable to see the simulated uplinks. I do get to see the "last activity" timer updated, but this is probably working locally.
Current Result
The previous events are shown, but new events do not come in.
Expected Result
The live traffic view works.
Relevant Logs
When this happens, I see in the browser network panel that the POST request to /api/v3/events fails with NS_BINDING_ABORTED.
URL
No response
Deployment
The Things Stack Community Edition
The Things Stack Version
3.26.1
Client Name and Version
Other Information
No response
Proposed Fix
According to https://stackoverflow.com/questions/704561/ns-binding-aborted-shown-in-firefox-with-httpfox, this may be related to caching. I also see other potential reasons.