Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sending a keepalive packet to SSE item state connections #3086

Merged
merged 1 commit into from Sep 26, 2022

Conversation

digitaldan
Copy link
Contributor

Browser SSE implementations have no support for monitoring connection states once a connection is made. This makes detecting the health of our UI problematic. This PR adds a separate 'keepalive' message to the SSE stream using the 'keepalive' event name, so should not affect existing clients who may not support this new message. The message itself contains the interval value so clients can easily discover and setup their own watchdog functions.

See openhab/openhab-webui#1441 for information on the UI point of view of this issue.

Signed-off-by: Dan Cunningham dan@digitaldan.com

@digitaldan digitaldan added the work in progress A PR that is not yet ready to be merged label Sep 18, 2022
@digitaldan digitaldan requested a review from a team as a code owner September 18, 2022 18:32
@J-N-K
Copy link
Member

J-N-K commented Sep 18, 2022

Why can't the ALIVE be used for that?

@digitaldan
Copy link
Contributor Author

It could, but i wanted to send something A) much more frequently then the Alive message and B) i wanted to send down the server configured interval so clients did not have to guess or know ahead of time what that is.

I was also not sure the history of the ALIVE message with regards to sitemaps, so did not want to change anything there, like frequency and possibly have an unintended consequences.

@J-N-K
Copy link
Member

J-N-K commented Sep 18, 2022

It was only recently introduced and the only reason was to detect broken connections. I don’t think it‘s used by any client. So I would prefer modifying that one.

@digitaldan
Copy link
Contributor Author

If its not used by clients, and was recently added, then yes i agree it makes sense to have one version, i'll update the PR shortly.

@digitaldan
Copy link
Contributor Author

I just pushed a simplified version.

@J-N-K
Copy link
Member

J-N-K commented Sep 19, 2022

I was talking about the event introduced here: #2983

Edit: This was a response to a comment by @lolodomo which seems to have disappeared somehow.

@lolodomo
Copy link
Contributor

I deleted my message because I have a doubt it was correct. I will first check.

private static class ServerAliveEvent {
public final String type = "ALIVE";
private static class KeepaliveEvent {
public final int interval = KEEPALIVE_INTERVAL_SECONDS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the type field as it is expected in any message.
And if you keep value "ALIVE", it would avoid changing again what was done after @J-N-K change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who is expecting this message? If you read the thread above it sounds like the assumption is that the UI's are not aware of this message? Maybe you could explain how UIs use the "ALIVE" message ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until now, all messages sent by openHAB have a type field.
The type was used to distinguish an ALIVE message from any other "standard" message.
This is used in the remote openHAB binding.
Basic UI is using the same kind of check but this is for sitemap SSE messages (so not applicable here). But htose SSE messages also have a type field.
I don't know what is the standard with SSE message but having a type looks like a good idea to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also suggest to keep the type and add the interval as additional field. So everyone not interested (already filtering out ALIVE currently) does not have to change the code. And I agree that always having a type for an event sounds like a good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will refactor this again and merge the two concepts.

And I agree that always having a type for an event sounds like a good idea.

I agree, but SSE connections have an event name attribute which allows you to specify the type of event, right now the ALIVE message is sending an event type of event (event->event ?) which is a little confusing, and why i called my event keepalive since that is the event type of the object i was sending.

Since it sounds like there are legacy clients who depend on the event->event->ALIVE message, i will use this as well and add the additional interval field as well as turning up the frequency. I'll update the PR shortly.

logger.debug("Sending keepalive to SSE connections");
OutboundSseEvent keepaliveEvent = sse.newEventBuilder().name("keepalive")
.mediaType(MediaType.APPLICATION_JSON_TYPE).data(new KeepaliveEvent()).build();
itemStatesBroadcaster.send(keepaliveEvent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is new. We have to study if it could impact any sitemap UI. If I am not wrong, the core UI is listening for item state changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm not following? If you mean the "keepalive" event, yes it is new, but as a UI you would specifically need to listen to this event type to process the message. If its a standard browser using the "onMessage" callback, then there would not be an issue, if we have clients who have rolled their own SSE client implementation, then i could see this possibly being an issue.

My original approach was not to touch the ALIVE message for fear of breaking something sitemap related fyi.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you added the line "itemStatesBroadcaster.send(keepaliveEvent);" in addition to the already present line "topicBroadcaster.send(keepaliveEvent);". I am not sure what this line with itemStatesBroadcaster is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what this line with itemStatesBroadcaster is doing.

The /states endpoint, which the main UI uses, is using the itemStatesBroadcaster, not the topicBroadcaster , so i'm sending this new message event of type keepalive with a payload of the JSON encoded KeepaliveEvent object to those connections as well as the topicBroadcaster which was the original impl. In terms of the main UI, this will have no affect unless the client starts to listen to this specific event type of "keepalaive", item events are sent with no event type, which is the default. I will refactor this again to use the ALIVE type, as to not break existing impls.

@digitaldan
Copy link
Contributor Author

Here is the result of the new PR as seen in a dev console when using the Main UI. The alive message now has an event type of "event" and a data type of "ALIVE", which seems confusing to me, but i guess this is needed to make other legacy clients work who are not checking the SSE type and just looking at the data.

image

@lolodomo
Copy link
Contributor

If you really think the type is not required, that's fine for me too. I agree that the type field in that case is more or less redundant with the event name.
No need to consider the remote openHAB binding as a constraint here. I will adjust it whatever is decided here and it will take 15 minutes of coding and test. The binding is already filtering on events with name "message" So if your event has "alive" or "keepalive" as name, it will be easy to filter it, even more than checking a type field.

Sorry if I change my mind today, but I am myself very open to the best solution if we agree that the only known impacted components are the MainUI and the remote openHAB binding.

@J-N-K
Copy link
Member

J-N-K commented Sep 21, 2022

Are you sure that existing clients do not rely on each message in the /states stream to be a state-message?

@digitaldan
Copy link
Contributor Author

but I am myself very open to the best solution if we agree that the only known impacted components are the MainUI and the remote openHAB binding

Understood, i also realize how touchy some of the sitemap based logic is out there and appreciate not wanting to break things for those clients!

I did some digging around on our official clients, here is what i can find that uses these endpoints:

  • /rest/events/states is only being used by the Main UI and it only subscribes to message events (default)
  • /rest/events is being used by everything else
    • all openhab-ui projects, so Main UI, Habbot, Habpanel and Classic as seen here Ui Search
    • Remote openHAB
    • Android client as seen here Android Search

The subscriptions to /rest/events could be aversely affected with changes here as most are ignoring the SSE type field and just operating off the data object. Some clients may just ignore unknown data types, others may complain in logging, and some may break if the JSON object does not have a type field.

I'm ok with how it is now, it should work with all clients afaik and solves the issue i am trying to fix. The other solution would be to go back to my original PR which crafted a specific keep alive message just for the /reset/event/states which only affects the Main UI , but again i'm good with this as is.

@digitaldan
Copy link
Contributor Author

One other thing i just realized, no client is actually checking for an SSE event type of "event", and "ALIVE" is the only message using it, so we could rename this sse type to something more descriptive like "alive" so future clients could listen for this specific event (or not). This should not have any affect on current clients. All other messages use the default SSE event type of "message" for both /rest/events and /rest/events/state

@lolodomo
Copy link
Contributor

lolodomo commented Sep 22, 2022

  • all openhab-ui projects, so Main UI, Habbot, Habpanel and Classic as seen here Ui Search

Your search also finds Basic UI for web audio support. For everything else, Basic UI is based on other SSE events (sitemap SSE events).

@lolodomo
Copy link
Contributor

One other thing i just realized, no client is actually checking for an SSE event type of "event", and "ALIVE" is the only message using it, so we could rename this sse type to something more descriptive like "alive" so future clients could listen for this specific event (or not). This should not have any affect on current clients. All other messages use the default SSE event type of "message" for both /rest/events and /rest/events/state

Good idea IMHO

@spacemanspiff2007
Copy link
Contributor

HABApp is currently filtering {"type":"ALIVE"} messages. Adding the interval will break things so a heads up will be nice before this gets merged.

@digitaldan
Copy link
Contributor Author

Adding the interval will break things so a heads up will be nice before this gets merged.

This is an extra field on a JSON object, why would it break HABApp? Is it not parsing this object using a JSON library?

@spacemanspiff2007
Copy link
Contributor

No - this was added quick and dirty.
String compare is cheap so why parse it into a json when the event gets discarded anyway.

Obviously it's an easy fix but it still has to be done otherwise it'll generate a huge error message with trace back in the log every time this event is received.

Signed-off-by: Dan Cunningham <dan@digitaldan.com>
@digitaldan
Copy link
Contributor Author

String compare is cheap so why parse it into a json when the event gets discarded anyway.

Probably for this exact reason, future proofing :-) I would go ahead and parse the message as JSON like any other message now in anticipation of this PR (or optionally check the SSE type field for "alive" and ignore that).

@lolodomo @J-N-K I think this is ready, thanks for reviewing so far. I also have the client side code working which i will open a PR for @ghys to review.

@digitaldan digitaldan removed the work in progress A PR that is not yet ready to be merged label Sep 23, 2022
@digitaldan digitaldan changed the title [WIP] Support sending a keepalive packet to SSE item state connections Support sending a keepalive packet to SSE item state connections Sep 23, 2022
@J-N-K
Copy link
Member

J-N-K commented Sep 23, 2022

I am still not convinced that this should be part of the /states stream. This is specifically for states and ignores all other events. Shouldn't a client subscribe /events if it is interested in anything other than states? @openhab/core-maintainers can someone who is more familiar with SSE design than me comment here? Thanks.

Code looks good otherwise.

@lolodomo
Copy link
Contributor

That's fine for me but I have no opinion on last @J-N-K comment which has to be considered.
I am also asking myself if an event every 10 seconds is not too much?

@digitaldan
Copy link
Contributor Author

I am still not convinced that this should be part of the /states stream.

The main UI uses this call almost exclusively during normal use, the calls to /rest/event are only used when touching the settings part of the app. So in my opinion it's the most important stream to add this to, at least for the health of the app.

So only sending this on the /rest/event endpoint would do nothing for our Main UI.

The keepalive issue is a general SSE problem, not specific to openHAB, so i feel like adding this to all SSE connections is a good idea, and does not pollute the message stream.

Plus this is using a SSE event type called "alive", our main UI only subscribes to "messages", so if we pushed this change out and did nothing to our Main UI, it would have no knowledge of this message and never see it. I feel like thats a good messaging design pattern.

/rest/event/states was specifically added by @ghys i believe for the main UI, so i'm not sure why we would be hesitant to make changes to it to accommodate improvements.

@digitaldan
Copy link
Contributor Author

I am also asking myself if an event every 10 seconds is not too much?

I guess this is a question of how long would a client want to know that they are no longer getting messages from the server? For me 10 seconds is pretty long, plus the client needs to pad that a bit, so its more like 12 or 15 seconds of being disconnected before the UI can do anything like display a message or pop up a warning.

I did initially think about adding a parameter to the sse endpoint to allow a client to specify an interval value, I could see value in doing that across the board on both endpoints.

@lolodomo
Copy link
Contributor

I am preparing a code optimization in the remote openHAB binding related to this change.

lolodomo added a commit to lolodomo/openhab-addons that referenced this pull request Sep 23, 2022
Depends on openhab/openhab-core#3086

Signed-off-by: Laurent Garnier <lg.hc@free.fr>
@ghys
Copy link
Member

ghys commented Sep 24, 2022

The main UI uses this call almost exclusively during normal use, the calls to /rest/event are only used when touching the settings part of the app. So in my opinion it's the most important stream to add this to, at least for the health of the app.

I would agree to this, from the perspective of the main UI, when you're not in the administration area, and simply viewing pages, the most important type of SSE to keep alive is the state tracker i.e. /rest/events/states (you also have a /rest/events connection but only if you enabled web audio support in Help & About, this was added recently; if you haven't, there's no connection made).

So introducing a keepalive event for the "regular" SSE connection, while useful in general, from a main UI end user perspective, - that is, a user who never uses/doesn't have access to the admin area - would only make sure the dedicated SSE connection for web audio is still alive. If you're an admin and you suspect the SSE connection(s) are dead, you can navigate to another screen and go back, as the SSE connections are closed and open when navigating (unless you have the developer sidebar open!), so they're usually short lived, as opposed to user pages which can remain on the foreground for a long time.

It would however benefit other clients like HABPanel who only use the regular SSE endpoint only.

Since there's no type information in state trackers' event data, the only clean way is to change the type of the SSE event itself - then clients like the UI can subscribe to them with eventSource.addEventHandler('alive', ...).

See https://developer.mozilla.org/en-US/docs/Web/API/EventSource#events.
Note that eventSource.onmessage is equivalent to eventSource.addEventHandler('message', ...).

Plus this is using a SSE event type called "alive", our main UI only subscribes to "messages", so if we pushed this change out and did nothing to our Main UI, it would have no knowledge of this message and never see it. I feel like thats a good messaging design pattern.

👍

@J-N-K
Copy link
Member

J-N-K commented Sep 25, 2022

@spacemanspiff2007 I think that the way this is now should be merged, so an event with type alive (instead of event like it was before) that contains {"type" : "ALIVE", "interval" : 10} (interval added). How long do you need for the necessary adjustments?

@J-N-K J-N-K added enhancement An enhancement or new feature of the Core REST/SSE labels Sep 25, 2022
Copy link
Member

@J-N-K J-N-K left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, waiting for @spacemanspiff2007 before merging.

@spacemanspiff2007
Copy link
Contributor

@J-N-K thank you for the consideration.
I'll provide a fix probably next week since I am currently busy.
Since I know what I have to do this can be merged.
Thank you!

@J-N-K J-N-K merged commit b808ea6 into openhab:main Sep 26, 2022
@J-N-K J-N-K added this to the 3.4 milestone Sep 26, 2022
splatch pushed a commit to ConnectorIO/copybara-hab-core that referenced this pull request Jul 12, 2023
Signed-off-by: Dan Cunningham <dan@digitaldan.com>
GitOrigin-RevId: b808ea6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement or new feature of the Core REST/SSE
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants