Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not make Life360 entities unavailable for server errors #76141

Closed
wants to merge 3 commits into from

Conversation

pnbruckner
Copy link
Contributor

Proposed change

Converting the Life360 integration from a "legacy" tracker to a new entity-based one,
and using the standard DataUpdateCoordinator and corresponding CoordinatorEntity
classes, had an unexpected and unintentional side-effect when there are errors
retrieving data from the Life360 server. This now causes the state of the corresponding
device_tracker entities to become unavailable which did not happen before.

This can (and has) caused user automations to fire when they shouldn't (because the
devices no longer appear "Home", or in some other zone the automation cares about.)

This change prevents the entities from changing to unavailable, but rather to retain their
current state, just as they did before, and as other device tracker integrations do in a
similar scenario.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

The integration reached or maintains the following Integration Quality Scale:

  • No score or internal
  • 馃 Silver
  • 馃 Gold
  • 馃弳 Platinum

To help with the load of incoming pull requests:

This prevents firing automations due to temporary network errors.
@project-bot project-bot bot added this to Needs review in Dev Aug 3, 2022
@project-bot project-bot bot moved this from Needs review to By Code Owner in Dev Aug 3, 2022
@MartinHjelmare
Copy link
Member

I don't think we should change it to the legacy behavior. The standard for entities is to become unavailable if the connection to the service provider is lost.

@pnbruckner
Copy link
Contributor Author

I don't think we should change it to the legacy behavior. The standard for entities is to become unavailable if the connection to the service provider is lost.

Except that many other integrations that use the newer TrackerEntity class do not become unavailable when they lose connection. I browsed the code of several (mobile_app, owntracks, etc.) and none of them even have an available method on their tracker entity classes (but many do on their other entity classes, such as sensors.) This seems to strongly suggest that this standard doesn't, and shouldn't, apply to device_tracker entities.

I wouldn't really call this "legacy" behavior. It's just acknowledging that a device_tracker changing to unavailable can have (and does in practice) undesirable side-effects (such as firing automations that should only fire when a device leaves home.)

@MartinHjelmare
Copy link
Member

Lacking an available property in existing integrations doesn't mean it's standard for device tracker entities to not follow our normal design guide that says that entities should become unavailable when they loose the connection.

If all entities follow our guidelines it becomes predictable for users how to interact with them.

@pnbruckner
Copy link
Contributor Author

Lacking an available property in existing integrations doesn't mean it's standard for device tracker entities to not follow our normal design guide that says that entities should become unavailable when they loose the connection.

Well, when they pretty much all do it, that seems to say something.

I understand the desire for predictability. But there's also practical issues to consider. How does an automation deal with a device tracker becoming unavailable? Even if it ignores transitions to unavailable, how does it decide what to do when it becomes available again and the state changes to home (or whatever)? Is this a real not_home to home event that should trigger a lot of actions? In order to determine that it would need to keep track of what the entity's state was before it became unavailable. That just complicates matters for users when it is often hard enough to write robust automations. I can tell you from logged issues and community forum discussions that users are having a hard time with this. And, to avoid these difficulties, it seems reasonable to duplicate the behavior of other device tracker integrations (which also leads to more predictability, meaning life360 isn't an outlier that is harder to use than other tracker integrations.)

So, I basically disagree with you on the need for a special case for device trackers. How would you suggest I "raise" this issue to a broader discussion?

@MartinHjelmare
Copy link
Member

Here's a similar discussion:
home-assistant/architecture#668

Both Netgear and Mikrotik device trackers that use the update coordinator implements our standard unavailable guideline, so Life360 will not be an outlier.

@pnbruckner
Copy link
Contributor Author

pnbruckner commented Aug 3, 2022

That discussion, if anything, seems to prove my point. Having an unavailable state during a communication "blackout" only complicates things and makes it harder to write robust automations.

If you think about it some more, any entity that samples, or updates periodically, (which is basically every entity) only provides valid data for instances in time. Just because they retain their state between sample points doesn't mean the real source of the information isn't changing. The thing it samples could change many times between samples. Yet we don't have unavailable states between those sample points because the real values aren't known. A communication blackout is basically just a longer time between samples. Why, then, should that cause an unavailable state? That's very inconsistent.

I suppose it could be argued that a communication error is another data point. But at the same time, it's very possible a communication error could happen between sample points (if communications were tried during that time); it's just not seen because it was of short duration and just so happened to be between sample points.

To me, the main point is that a HA entity (of any type, but for this discussion I'm focusing on device trackers) should update only when it has new useful data. What happens between those times is irrelevant to the state of that entity. If someone really wanted to know if there were communication problems, they could see that in the log. Or, a binary sensor could be created that indicates a period of unavailability, just like many integrations provide today. In fact, I'd be happy to add that to this integration. To me, that would be the proper way (i.e., out of band) to create a state (or event, when it changes) that a system could check or respond to if it wanted.

Regarding other device tracker integrations, I looked briefly at mobile_app, owntracks, geofency, gpslogger & locative. They don't create unavailable states when data isn't available. They simply don't update (because the data "coordinator" they're using simply doesn't send them data when it doesn't have any.) What I'm suggesting doing with this integration is the same thing. Just because it happens to use a common update coordinator isn't a reason for it to become unavailable. That's just an implementation detail.

So, there are some existing tracker integrations that become unavailable when there is a communications error, but there are also many others that do not. In my opinion, none of them should. If the inconsistency is allowed, and the majority seem to work the way I think they all should, and especially since the current behavior of the life360 integration is causing real problems for users which is extremely easy to solve, then I don't see why this change shouldn't be allowed.

@jerobins
Copy link

jerobins commented Aug 4, 2022

A Person becomes unknown when the device tracker API skips a beat? Every single HA users needs to create a bank of sensors to filter out unavailable/unknown states for every tracker? Or install another integration to fix the device trackers? I came looking for a github issue only to find a PR with a philosophical discussion; I'm getting texts from my kids wondering why everything in the house shut off while they are sitting in it. Netgear and Mikrotik aren't cloud polling; that's practically apples and oranges.

@pnbruckner pnbruckner marked this pull request as draft August 4, 2022 01:28
@pnbruckner pnbruckner marked this pull request as ready for review August 4, 2022 02:09
@frenck
Copy link
Member

frenck commented Aug 4, 2022

I agree with Martin on this one.

@MartinHjelmare
Copy link
Member

The point is that we standardize how the common entity state properties should be set in the state machine. If users need an improved automation experience for a state, that improvement should be done in the automation layer, not in the entity state layer.

See the different automation trigger helpers for different trigger types, number, sun, zone etc.

I won't discuss more in this PR. You're welcome to open an architecture discussion. Thanks!

@pnbruckner
Copy link
Contributor Author

@MartinHjelmare @frenck thanks for listening. I'm closing this PR and will open a different one that just fixes the bug without changing the overall behavior.

@pnbruckner pnbruckner closed this Aug 4, 2022
Dev automation moved this from By Code Owner to Cancelled Aug 4, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Aug 5, 2022
@pnbruckner pnbruckner deleted the life360-unavailable branch August 10, 2022 01:00
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Dev
  
Cancelled
5 participants