Google Home option to disable adding devices to known_devices.yaml #21357

Closed
kmlucy opened this issue Feb 23, 2019 · 32 comments · Fixed by #26035

Comments

@kmlucy

kmlucy commented Feb 23, 2019

Home Assistant release with the issue:
0.88.0

Last working Home Assistant release (if known):
N/A

Operating environment (Hass.io/Docker/Windows/etc.):
Docker

Component/platform:
https://www.home-assistant.io/components/googlehome/

Description of problem:
The Google Home component automatically adds new devices it finds to the known_devices.yaml file. Even if track_new_devices is false, it still adds the devices to the file, just with track set to false. The problem is that many bluetooth devices randomize their MAC addresses if they are not paired, so it winds up adding thousands of entries to the known_devices.yaml file. This results in Home Assistant taking multiple minutes to restart.

I first added the Google Home tracker about a month ago, and my known_devices.yaml file had already gotten to nearly 10MB, and startup of device tracker was taking over two minutes. When I cleared all the untracked devices from my file, startup time dropped to under 10 seconds.

I tried just removing the write permission bit from the known_devices.yaml file, but as I believe Home Assistant runs as root inside the Docker container, it just added the write bit back. As of now, my only option is to keep a separate known_devices.yaml file with only the tracked devices, and overwrite the normal one before every restart, which is not a sustainable option. An option like look_for_new_devices with a default of true would take care of this problem.
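
For anyone stuck with the same manual cleanup in the meantime, here is a minimal sketch of the "keep only tracked devices" workaround. It is not part of Home Assistant. The function name prune_untracked is hypothetical, and it assumes the simple layout where each device is a non-indented key followed by indented fields, one of which is `track: true` or `track: false`. Top-level comments and anything more exotic would need a real YAML parser.

```python
def prune_untracked(yaml_text: str) -> str:
    """Return yaml_text with every top-level entry lacking 'track: true' removed.

    Hypothetical helper: assumes each device is a non-indented key
    followed by indented "field: value" lines.
    """
    kept, block = [], []

    def flush():
        # Keep the finished block only if one of its indented lines is "track: true".
        if block and any(line.strip() == "track: true" for line in block[1:]):
            kept.extend(block)
        block.clear()

    for line in yaml_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device entry.
            flush()
        block.append(line)
    flush()
    return "\n".join(kept) + ("\n" if kept else "")
```

Run it against a copy of the file while Home Assistant is stopped, and keep a backup, since the running instance may rewrite the file.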

Problem-relevant configuration.yaml entries:

googlehome:
  devices:
    - host: 192.168.5.27
      rssi_threshold: -80
    - host: 192.168.5.72
      rssi_threshold: -80
    - host: 192.168.5.81
      rssi_threshold: -80
    - host: 192.168.5.45
      rssi_threshold: -80
    - host: 192.168.5.36
      rssi_threshold: -80
    - host: 192.168.5.63
      rssi_threshold: -80

Traceback (if applicable):
N/A

Additional information:
N/A

@tjorim
Contributor

tjorim commented Feb 24, 2019

> The Google Home component automatically adds new devices it finds to the known_devices.yaml file. Even if track_new_devices is false, it still adds the devices to the file, just with track set to false.

That's just how the track_new_devices option works, as noted in the docs: https://www.home-assistant.io/components/device_tracker/:
"Note that setting track_new_devices: false will still result in new devices being recorded in known_devices.yaml, but they won’t be tracked (track: false)."

@andrewsayre
Member

@ludeeus anything we can do to improve the UX here?

@ludeeus ludeeus self-assigned this Feb 24, 2019
@ludeeus
Member

ludeeus commented Feb 24, 2019

Limiting the device_type can certainly help.
Usually, you do not need or want to track devices that are LE only (device type 2).
https://www.home-assistant.io/components/googlehome/#device_types

device_types: [1, 3]

@kmlucy
Author

kmlucy commented Feb 24, 2019

@tjorim I'm aware that's how it currently works, but the current behavior is causing the extended startup times I mentioned above.

@ludeeus That could probably help, but it would only slow down how quickly the file becomes oversized. I want to avoid having to manually go in and clear out all the unneeded entries every so often.

@ludeeus
Member

ludeeus commented Feb 24, 2019

The issue is with LE devices that change their MAC address; if you do not track LE devices, you will not have this issue :)

Note: I'm not discarding the idea of a look_for_new_devices key, just trying to help you with what you can do now

@kmlucy
Author

kmlucy commented Feb 24, 2019

I'll add the device types filter to my config and see how that goes. I'm hoping adding an option to not look for new devices won't be a huge amount of work, and will definitely help for people who do need to track LE only devices.

@ludeeus
Member

ludeeus commented Feb 24, 2019

As of now, I'm not sure which option would be best.

  • Add logic similar to the track_new_devices option in device_tracker/bluetooth_le_tracker.py.
  • Add an include option to specify the MAC addresses:
googlehome:
  - host: 192.168.2.2
    include:
      - 23:34:f3:d3:32

Option 2 will give more control and be lighter on the system, since it does not have to read and parse the known_devices file.
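
To make option 2 concrete, here is a rough sketch of how an include list could filter scan results. The names normalize_mac and filter_included are hypothetical and this is not the component's actual code. It only assumes that a scan yields MAC address strings.

```python
def normalize_mac(mac: str) -> str:
    """Upper-case a MAC and use ':' separators so comparisons are consistent."""
    return mac.strip().upper().replace("-", ":")


def filter_included(seen_macs, include):
    """Return only the scanned MACs that appear in the configured include list."""
    allowed = {normalize_mac(m) for m in include}
    return [m for m in seen_macs if normalize_mac(m) in allowed]
```

Anything not on the list is dropped before it could ever reach known_devices.yaml, so randomized addresses never accumulate.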

@awarecan
Contributor

I vote option 1 to deal with BLE devices

@kmlucy
Author

kmlucy commented Feb 26, 2019

I would also vote for option 1 so there aren't two places to keep up with tracked devices. I would rather everything be managed in one place (known_devices.yaml).

@HudsonMC16

I like the idea of it being light on the system, per Ludeeus' option 2. Can the parsing of known_devices.yaml be limited to just startup, or is that already the current behavior?

@ludeeus
Member

ludeeus commented Feb 28, 2019

Since it would be a config option (you have to restart for it to take effect), loading on startup might work.

@sibero80

Given that this component is both a SENSOR and a DEVICE_TRACKER, would it not make sense to be consistent with how these kinds of devices have been historically configured?
I understand the Device Type solution, but I also think that adding the default options would be a great step towards configuration and documentation consistency.

    new_device_defaults:
      track_new_devices: true
      hide_if_away: false

@HudsonMC16

The issue is that known_devices.yaml is being polluted with thousands of entries. Device defaults don't address that, as they are still recorded in the yaml file, just not tracked in HA.

@apetlock

apetlock commented May 4, 2019

I've run into this issue, not specifically with Google Home, but just devices cluttering up my known_devices.yaml file. I'd prefer not to track every device on my network as being "Home" just for simplicity's sake, just the few I want to track. I ended up adding an add_new_devices option that defaults to True. If False, it just doesn't add anything new that it finds.

new_device_defaults:
  add_new_devices: false
  track_new_devices: false
  hide_if_away: false
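
The intended difference between the two flags can be sketched roughly as follows. This illustrates the behaviour described above and is not the code from the fork. NewDeviceDefaults and handle_new_device are hypothetical names, and known_devices.yaml is modelled as a plain dict.

```python
from dataclasses import dataclass


@dataclass
class NewDeviceDefaults:
    add_new_devices: bool = True    # proposed: record unseen devices at all?
    track_new_devices: bool = True  # existing: value of "track" for recorded devices


def handle_new_device(mac: str, known: dict, defaults: NewDeviceDefaults) -> bool:
    """Record a newly seen device if allowed; return True if it was written."""
    if mac in known:
        return False  # already recorded, nothing to add
    if not defaults.add_new_devices:
        return False  # proposed option: ignore unknown devices entirely
    # Current behaviour: the entry is always written, only "track" varies.
    known[mac] = {"track": defaults.track_new_devices}
    return True
```

With add_new_devices: false, nothing is written. With only track_new_devices: false, the entry is still written, just with track set to false.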

I've forked and pushed a commit and if it's something the HA team wants I can do a pull request to fix both this issue and #20022, and then add the appropriate documentation.

Also while in there I can add more features for granularity if folks want, but I felt this was a pretty straightforward solution for those who want it. Any more suggestions appreciated.

@mrand

mrand commented May 4, 2019

@apetlock Presumably the documentation would need to be updated as well? Hopefully it could be made crystal clear the difference between "add_new" and "track_new" (it isn't entirely clear to me).

@apetlock

apetlock commented May 4, 2019

> @apetlock Presumably the documentation would need to be updated as well? Hopefully it could be made crystal clear the difference between "add_new" and "track_new" (it isn't entirely clear to me).

Yeah, I will add and make a pull request for the docs too, as I could see the confusion. Also could just change the option name to something else if anyone has better ideas.

@elupus
Contributor

elupus commented May 4, 2019

Since I can't see it mentioned here: what is the benefit of adding a device to known_devices if track_new is false?

@apetlock

apetlock commented May 5, 2019

@elupus It still adds them to known_devices.yaml and they show up in History as well, all being labeled as "Home" even if track is false and they don't show up in search lists.

Between this issue and the other one listed, either hundreds of devices are added (which affects restart/load times according to those users), or some devices change their hardware ID, show up multiple times, and keep adding to the file. My issue is more like the former, although I haven't seen performance issues yet. I added the devices I wanted to known_devices.yaml, and those are the ones I want to maintain. There's no reason for anything else to be added to that file, at least in my case (and apparently in others).

This could also be fixed simply by making track_new_devices: false just not add anything to known_devices.yaml as well. That'd be a single line of code, excluding the test. (Although from an ease-of-use perspective I can see why it is that way, which is why defaulting to True would be best for both track and/or add.)

It could also be considered a privacy issue of sorts. My particular device_tracker is the ubiquiti unifi component, which means every time a guest connects to the network, their device info is added (effectively permanently until I remove it manually).

It was enough of a desire that I decided to make the code change on my local setup, at the very least, and saw that a couple open issues were related.

@mrand

mrand commented May 5, 2019

> either hundreds of devices are added

If HA is deployed in a large "house," like a college house (frat, sorority, or otherwise), I think it could be thousands.

Or even a small house with a scanner located where it might pick up devices from a bustling nearby sidewalk in an urban area: many thousands.

@elupus
Contributor

elupus commented May 5, 2019

What is the use case for adding to known_devices if track_new is false?

@bagobones

Maybe instead of constantly scanning and adding devices a service call could be created that just adds devices for a short time window after calling the service?
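
As a rough sketch of that idea: a service call could open a window, and the tracker would only record unknown devices while it is open. DiscoveryWindow is a hypothetical class, not an existing Home Assistant API, and the clock is injectable purely so the logic can be exercised without waiting.

```python
import time


class DiscoveryWindow:
    """Allow new devices to be added only for a limited time after open()."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._open_until = None  # closed until a service call opens it

    def open(self, seconds: float) -> None:
        """What a hypothetical 'add devices' service call would invoke."""
        self._open_until = self._clock() + seconds

    def is_open(self) -> bool:
        """True while the window is open; the tracker would check this before adding."""
        return self._open_until is not None and self._clock() < self._open_until
```

The scanner itself keeps running as before. Only the step that writes new entries would be gated on is_open().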

Also, the more Minis you have, the more of a problem this is. Why doesn't this platform treat all discovered devices as one device, like all of the other device trackers? It would cut down on the known-device clutter a lot for those of us with multiple devices.

@HudsonMC16

@bagobones having them as separate devices opens up the possibility of using them for room-level presence detection. If we can get these little issues worked out, that's how I was planning on using them.

@bagobones

> @bagobones having them as separate devices opens up the possibility of using them for room-level presence detection. If we can get these little issues worked out, that's how I was planning on using them.

More in line with the way other modules work, the tracked devices are always considered one device, and any tracker that tracks the same device simply updates its current state. The tracked device is supposed to represent the device, not the tracker, and could be updated by multiple trackers that support the same device type.

I would suggest that simply adding an attribute for closest_mini: 192.168.1.x would give you the same functionality. The component as a whole would take the reports from ALL minis and for any BLE mac addresses that are the same, the closest_mini would always return the name / IP of the mini closest to that device.
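
A minimal sketch of that aggregation, assuming each mini reports (host, MAC, RSSI) tuples and treating the strongest RSSI as a rough proxy for "closest". The function name and the data shape are hypothetical, not the component's current structure.

```python
def closest_mini(reports):
    """Map each MAC to the (mini_host, rssi) pair with the strongest signal.

    reports: iterable of (mini_host, mac, rssi) tuples, rssi in dBm
    (values closer to zero mean a stronger signal).
    """
    best = {}
    for host, mac, rssi in reports:
        if mac not in best or rssi > best[mac][1]:
            best[mac] = (host, rssi)
    return best
```

Each MAC then maps to a single tracked device, with the winning host exposed as the closest_mini attribute.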

You could also make the naming consistent with the bluetooth_le_tracker and just use the MAC field prefixed with BLE_ for BLE devices and a normal mac for others like the bluetooth_tracker.

You might also be able to allow the user to define the zone name for each mini, and when polled the component simply returns the zone name for the closest mini as the current location.

The current naming of known devices for this module is completely inconsistent with other components and doesn't really allow the user to rename them to something friendly like the other components currently do.

@HudsonMC16

@bagobones are there any other presence detection components which are distributed like this, with the ability to add multiple nodes? I'm not up-to-date on every component, but happy bubbles is the first one that comes to mind, and it requires you running a separate server to crunch the data and publish messages to mqtt.

This component is different. Closest mini wouldn't be appropriate in all use cases, as triangulation would sometimes be necessary.

To do the automations from within home assistant, it must be structured like this, or major changes would need to be made to the component. I'm not saying that shouldn't happen, but it's outside the scope of this discussion.

@elupus
Contributor

elupus commented May 15, 2019

@bagobones isn't the whole problem with BLE devices that their MAC is randomized? That is a privacy feature to avoid things like presence detection. You can never attribute them in that state to a tracked device.

@bagobones

> @bagobones are there any other presence detection components which are distributed like this, with the ability to add multiple nodes?

If we take signal strength out of it, separate components can already update the state of the same object. For example, if you set up the iOS app AND the ubiquiti unifi tracker, a phone will be tracked both by WiFi AND by iOS GPS; it will simply take the most recent hit. This works without multiple nodes.

This component, however, has been set up as a platform instead of individual device trackers, so I would suspect that, logic-wise, the platform could internally do something more intelligent.

@bagobones

bagobones commented May 15, 2019

> @bagobones isn't the whole problem with BLE devices that their MAC is randomized? That is a privacy feature to avoid things like presence detection. You can never attribute them in that state to a tracked device.

Not all BLE devices randomize their addresses; you can purchase BLE tags that do not randomize on Aliexpress etc. Those devices would work great.

However, smartphones and many other devices with privacy enabled will randomize, which is why constantly adding all discovered devices to known devices is so bad.

@HudsonMC16

> If we take signal strength out of it, separate components can already update the state of the same object. For example, if you set up the iOS app AND the ubiquiti unifi tracker, a phone will be tracked both by WiFi AND by iOS GPS; it will simply take the most recent hit. This works without multiple nodes.

@bagobones I don't think this is true. This is only the case if you combine the device trackers as a "person" entity. Otherwise, they are separate device trackers and recorded as such. Heck, even if you do combine them as a person, the device trackers are still recorded separately.

@bagobones

bagobones commented May 16, 2019

> @bagobones I don't think this is true. This is only the case if you combine the device trackers as a "person" entity. Otherwise, they are separate device trackers and recorded as such. Heck, even if you do combine them as a person, the device trackers are still recorded separately.

The Person entity is a VERY recent addition. I have been using this for a long time: items that track a device based on the MAC address can share the single logical item and update it. It has worked this way for as long as I have used HA.

Edit: The person entity lets you combine trackers that DON'T share anything in common except the person.

https://www.home-assistant.io/components/device_tracker/

"Multiple device trackers can be used in parallel, such as Owntracks and Nmap. The state of the device will be determined by the source that reported last."
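
A toy illustration of that "last reporter wins" rule, keyed by MAC. This is a sketch of the documented behaviour, not Home Assistant's internal code. merge_reports and the tuple layout are hypothetical.

```python
def merge_reports(reports):
    """Resolve one state per MAC from several sources; the latest report wins.

    reports: iterable of (timestamp, source, mac, state) tuples.
    Returns {mac: (state, source)}.
    """
    latest = {}
    for ts, source, mac, state in reports:
        # On equal timestamps, the report seen later in the input wins.
        if mac not in latest or ts >= latest[mac][0]:
            latest[mac] = (ts, source, state)
    return {mac: (state, source) for mac, (_, source, state) in latest.items()}
```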

This is why the current implementation of the googlehome component is so strange. In no way whatsoever should there be more than one known device object for each Bluetooth device tracked.

@HudsonMC16

@bagobones I see, that's news to me. Personally, I don't think a departure from existing implementations of device trackers is a bad thing. Just because no existing implementation currently functions this way doesn't mean one should be forbidden. That's why we have these discussions. Your use case and assumed implementation don't necessarily match every other user's use case, or mean it's the best way of leveraging the capabilities of the Google Home API in Home Assistant.

Something that might work is if the rssi values of each Google home were included as attributes of the device tracker. Might need rssi and last update time to make it complete, but seems doable. Would mean major changes to the existing component, however, which someone would have to implement 😉

@bagobones

> @bagobones I see, that's news to me. Personally, I don't think a departure from existing implementations of device trackers is a bad thing.

I disagree; there needs to be some level of consistency in large shared projects in terms of implementation. You can have diversity in functionality and extend the implementations, however.

How a device_tracker should respond and how the fields should be used is actually fairly clear here.

https://www.home-assistant.io/components/device_tracker/

Expectations:

  • If a tracked device has a UNIQUE MAC address, it should be used to identify the device.
  • No component should, under normal use, fill the hard drive by filling the known_devices.yaml file with random devices (a new configuration attribute for scanning, or a service, is needed to deal with randomized IDs). This could be a big problem for Pi users with small SD cards, not to mention that it will constantly slow HA down.
  • Setting the zone from the device_tracker is also supported in the documentation.
  • Having attributes that show the last RSSI for each Mini that saw the device at the same time would still fit under extending the functionality.

@stale

stale bot commented Aug 15, 2019

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
