Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[boschshc] Cache mDNS-based bridge discovery results #16211

Merged
merged 5 commits into from Jan 15, 2024

Conversation

david-pace
Copy link
Member

@david-pace david-pace commented Jan 5, 2024

The bridge discovery participant receives lots of mDNS events. Previously, all events that contained IP addresses of potential bridges were actively contacted using HTTP requests. On some systems eventually the long polling stops due to too many requests.

With this change, we

  • only consider mDNS events where the name property starts with "Bosch SHC"
  • cache already discovered bridges so we don't have to contact them over and over again
  • make sure that this happens in a thread-safe manner because the mDNS events are handled in individual concurrently running threads

closes #16174

The bridge discovery participant receives lots of mDNS events.
Previously, all events that contained IP addresses of potential bridges
were actively contacted using HTTP requests. On some systems eventually
the long polling stops due to too many requests.

With this change, we
* only consider mDNS events where the name property starts with "Bosch
SHC"
* cache already discovered bridges so we don't have to contact them over
and over again
* make sure that this happens in a thread-safe manner because the mDNS
events are handled in individual concurrently running threads

Signed-off-by: David Pace <dev@davidpace.de>
@david-pace david-pace requested a review from jlaur January 5, 2024 00:18
@david-pace david-pace linked an issue Jan 5, 2024 that may be closed by this pull request
@david-pace david-pace added the bug An unexpected problem or unintended behavior of an add-on label Jan 5, 2024
Copy link
Contributor

@lsiepel lsiepel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caching these results is a smart way to workaround the performance issue, but is filtering on the name not allready enough? Maybe i'm wrong, but as these results are cached permanently, there is no way to fix an earlier cached entry.
For instance if you re-connect a bridge under an IP that is allready cached, one could only disable/enable the binding to get the cache cleared.

@GerdZanker
Copy link
Contributor

Caching these results is a smart way to workaround the performance issue, but is filtering on the name not allready enough?

The executed REST request for the Bosch SHC could cases issues - see related discussion.
The REST request for other network members will avoid side effects too.

Maybe i'm wrong, but as these results are cached permanently, there is no way to fix an earlier cached entry. For instance if you re-connect a bridge under an IP that is allready cached, one could only disable/enable the binding to get the cache cleared.

We should mention this in the Binding ReadMe.
From my point of view this is a good trade-off between functionality and simple code.

Copy link
Contributor

@GerdZanker GerdZanker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caching works for my test environment.
Based on the traces I can also confirm that we avoid with the name check many other REST requests.

@david-pace david-pace added the work in progress A PR that is not yet ready to be merged label Jan 5, 2024
@mike-bike
Copy link

Hi all, quick update: without the excessive mDNS requests/discoveries my test PI2 is still working stable. Whatever the appropriate fix in the binding is, the amount auf discovery calls need to be drastically minimized. Caching might be a workaround but there might also be better solutions possible to manage that. I am wondering how other bindings handle that. Or is Bosch suddenly excessively sending mDNS announcements? As this cannot be controlled by any 3rd party, it needs to be managed.
Important aspects to consider might be:

  • if the discovery is only required to identify new bridges or is it also needed to get notified about new devices?
  • IP address changes should not be an issue. Either the user adds a DNS name then the resolver need to manage that, or the user need to change the IP address in the configuration manually.
  • How does the binding check that a bridge with a changed address is actually be the same and hence no need to reconfigure anything? I'd assume that needs to be done by a unique ID provide by the bridge anyway...

Thanks for all your effort and time resolving the issue and maintaining the binding.

Please let me know if I could support with any sort of testing. I am happy helping with the new generation II devices. I am really keen to get the Roller Shutter, Dimmer, and Relais supported.

Have a great weekend!

Cheers, Michael

* use openHAB cache implementation with 10 minutes timeout
* set TTL of openHAB discovery results to 10 minutes as well
* remove obsolete parameter and member
* enhance Javadocs
* adapt/enhance unit tests

Signed-off-by: David Pace <dev@davidpace.de>
@david-pace david-pace removed awaiting feedback work in progress A PR that is not yet ready to be merged labels Jan 10, 2024
Signed-off-by: David Pace <dev@davidpace.de>
@david-pace
Copy link
Member Author

david-pace commented Jan 14, 2024

Hi @mike-bike, thank you for your comments. To be honest I don't know why we get so many mDNS discovery events, and whether this could be a bug in an underlying implementation. But I think it is still a good idea to cache the results for a certain amount of time. Currently I get ~175 mDNS events in 5 minutes. I only see that they contain different combinations of IP addresses and metadata. Even if this is reduced to 100, 50 or 25 events in 5 minutes, I would still say we should not query the bridge for every event we receive. Once we discovered a bridge successfully, we should remember this result for a few minutes in my opinion.

To answer your other questions:

  • The mDNS discovery is just used to discover bridges. The other devices are discovered with a different mechanism directly through REST requests to the bridge.
  • The IP address of the bridge must be configured to be static in the network. But if it should change for some reason it is configurable.
  • If the IP changes the configuration must be changed manually. There is no feature to detect that the IP address changed automatically (and this is also not planned so far).

However, your comments brought me to the conclusion that the Thing ID we propose by default for discovered bridges is not optimal. Currently it is the IP address with dashes, e.g. 192-168-0-42. But since the IP address can theoretically change we should rather use an identifier that does not change, for example the MAC address. We will solve this in another Github issue / pull request: #16279

Thank you for your support and your valuable input ❤️

@GerdZanker @jlaur @wborn if you also agree that caching the discovered bridges for a few minutes is the right approach, this PR is ready to be reviewed and merged. Of course it still would be interesting whether you are aware of any changes/bugs in the underlying mDNS framework. Thank you all for reviewing the code 👍

@mike-bike
Copy link

Hi @mike-bike, thank you for your comments. To be honest I don't know why we get so many mDNS discovery events, and whether this could be a bug in an underlying implementation. But I think it is still a good idea to cache the results for a certain amount of time. Currently I get ~175 mDNS events in 5 minutes. I only see that they contain different combinations of IP addresses and metadata. Even if this is reduced to 100, 50 or 25 events in 5 minutes, I would still say we should not query the bridge for every event we receive. Once we discovered a bridge successfully, we should remember this result for a few minutes in my opinion.

Absolutely! The binding cannot control how many mDNS events will be triggered. It would be the responsibility of the binding to be sensitive to stress on the network and especially on the bridges. Would it be possible to define the TTL as a user attribute in the bridge binding?

To answer your other questions:

  • The mDNS discovery is just used to discover bridges. The other devices are discovered with a different mechanism directly through REST requests to the bridge.

That's good to know. Then mDNS rediscoveries could be significantly reduced without negative user experience.

  • The IP address of the bridge must be configured to be static in the network. But if it should change for some reason it is configurable.
  • If the IP changes the configuration must be changed manually. There is no feature to detect that the IP address changed automatically (and this is also not planned so far).

I typically rely on the resolver and use DNS names rather than IP addresses. But this is my personal preference - even on static or fixed assigned addresses. If for whatever reason, I'd need to change network configuration, I'd just need to update the /etc/hosts or rely on the DHCP server to manage that. I think, that's why DNS and resolvers have been invented for😀

However, your comments brought me to the conclusion that the Thing ID we propose by default for discovered bridges is not optimal. Currently it is the IP address with dashes, e.g. 192-168-0-42. But since the IP address can theoretically change we should rather use an identifier that does not change, for example the MAC address. We will solve this in another Github issue / pull request: #16279

MAC address would be my second preferred option. I think the Bosch bridge announces itself with a unique name. Mine had shc1083a7. If that's available during discovery, I'd suggest using that (assuming it is really a unique ID provided by Bosch)

Thank you for your support and your valuable input ❤️

Thanks for your support!

Copy link
Contributor

@GerdZanker GerdZanker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @david-pace for your discovery improvements.
It will reduce a lot the REST requests. The SHC requests will happen only every 10 min with your ``ExpiringCacheMap``` and other mDNS entries will never be requested, because the name doesn't match!

@GerdZanker GerdZanker self-requested a review January 15, 2024 18:46
Copy link
Contributor

@lsiepel lsiepel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@lsiepel lsiepel merged commit 777def6 into openhab:main Jan 15, 2024
3 checks passed
@lsiepel
Copy link
Contributor

lsiepel commented Jan 15, 2024

@david-pace and @jlaur is this something to consider for patching 4.1 ?

@lsiepel lsiepel added this to the 4.2 milestone Jan 15, 2024
@david-pace david-pace deleted the 16174-mdns-discovery-caching branch January 15, 2024 21:26
@david-pace
Copy link
Member Author

Thanks for reviewing and merging this PR @lsiepel 👍

In my opinion it's not an absolutely critical fix, but it certainly wouldn't be bad to limit the requests on 4.1 as well. What do you think @jlaur?

@jlaur
Copy link
Contributor

jlaur commented Jan 15, 2024

In my opinion it's not an absolutely critical fix, but it certainly wouldn't be bad to limit the requests on 4.1 as well. What do you think @jlaur?

If the benefits outweights the risk of regressions, feel free to cherry-pick into 4.1.x @lsiepel. I did not read it myself, so can't assess the risk in general, or if there are any compatibility issues between 4.1 and 4.2 affecting this PR.

andrasU pushed a commit to andrasU/openhab-addons that referenced this pull request Jan 27, 2024
* [boschshc] Cache mDNS-based bridge discovery results

The bridge discovery participant receives lots of mDNS events.
Previously, all events that contained IP addresses of potential bridges
were actively contacted using HTTP requests. On some systems eventually
the long polling stops due to too many requests.

With this change, we
* only consider mDNS events where the name property starts with "Bosch
SHC"
* cache already discovered bridges so we don't have to contact them over
and over again
* make sure that this happens in a thread-safe manner because the mDNS
events are handled in individual concurrently running threads

Signed-off-by: David Pace <dev@davidpace.de>
Signed-off-by: Andras Uhrin <andras.uhrin@gmail.com>
austvik pushed a commit to austvik/openhab-addons that referenced this pull request Mar 27, 2024
* [boschshc] Cache mDNS-based bridge discovery results

The bridge discovery participant receives lots of mDNS events.
Previously, all events that contained IP addresses of potential bridges
were actively contacted using HTTP requests. On some systems eventually
the long polling stops due to too many requests.

With this change, we
* only consider mDNS events where the name property starts with "Bosch
SHC"
* cache already discovered bridges so we don't have to contact them over
and over again
* make sure that this happens in a thread-safe manner because the mDNS
events are handled in individual concurrently running threads

Signed-off-by: David Pace <dev@davidpace.de>
Signed-off-by: Jørgen Austvik <jaustvik@acm.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior of an add-on
Projects
None yet
7 participants