Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persist pyscript.* states between home assistant restarts #48

Merged
merged 10 commits into from
Oct 21, 2020

Conversation

swazrgb
Copy link
Contributor

@swazrgb swazrgb commented Oct 17, 2020

With this change any states set by pyscript in the pyscript domain
will be persisted when home assistant next reboots.

For example when a script runs pyscript.abc = 'foo' then this value
will be persisted.

Fixes #47

With this change any states set by pyscript in the pyscript domain
will be persisted when home assistant next reboots.

For example when a script runs `pyscript.abc = 'foo'` then this value
will be persisted.

Fixes custom-components#47
@dlashua
Copy link
Contributor

dlashua commented Oct 17, 2020

I tested this with the following code:

@service
def testset(**params):
    try:
        cnt = int(pyscript.testsetcnt)
    except:
        pyscript.testsetcnt = 0
        cnt = 0

    if "name" not in params:
        log.error("name required")
        return

    if "value" not in params:
        log.error("value required")
        return

    attributes = params.get("attributes", {})
    if not isinstance(attributes, dict):
        log.error("attributes must be a dict")
        return

    state.set(f'pyscript.{params["name"]}', params["value"], attributes)
    cnt = cnt + 1
    pyscript.testsetcnt = cnt

There is an issue.

If I set a value and then restart, that value is there. However, if I then restart again (without setting it again) that value goes away.

Also, we likely need a way to REMOVE a set value.

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 17, 2020 via email

# have this var tracked for restore
restore_data = await RestoreStateData.async_get_instance(cls.hass)
restore_data.async_restore_entity_added(var_name)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How expensive is it to call these functions every time a pyscript state variable is set? Would it be better to only do it the first time each state variable is set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very experienced with the Home Assistant codebase or python async code.

The async_restore_entity_added simply puts the entry on a set, so it isn't expensive at all. The get_instance method I'm not sure, I expect it to be cheap due to the singleton? https://github.com/home-assistant/core/blob/dev/homeassistant/helpers/restore_state.py#L65

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've since introduced our own set to track persisted states, so this code block will only evaluate once.

@craigbarratt
Copy link
Member

@swazrgb - thanks for the PR!

@dlashua - we could extend del to delete a state variable:

del pyscript.var1

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 17, 2020

The state expiration built into restore_state works differently than I first assumed, and will likely not be relevant to how pyscript utilizes it.

There are two scenarios to consider for deleting state variables:

  1. A state variable is registered by pyscript, and the pyscript that registered it has since been uninstalled. Should the state variable stick around forever? Probably not.

  2. A script registers a state variable, and wants to then delete it. This might not necessarily be in-scope for this PR, as it also applies to other kinds of state.

For scenario 1 I am considering the following solutions:

  1. Persist pyscript.* state variables that are either written OR read. So if pyscript writes a state variable, then restarts, and reads it, it's all good. But if for some reason the read were to happen within a conditional, then it wouldn't be persisted.

  2. Provide a function that must be called at the top-level by pyscript for any state variables that should be persisted. This simply bridges to async_restore_entity_added. This method could also add a convenience argument for setting a default value.

  3. Combine both solutions. Automagically persist during write and read, but also provide the function for complex scenarios where the write/read might happen conditionally and not every time home assistant is running. This might perhaps be confusing, since people will assume it "Just Works" until it doesn't.

I'm leaning towards solution 3, as that would still allow succinct scripts for simple cases. So I'll work on updating the PR accordingly. However, do please let me know if you have a different preference or additional thoughts. Thanks!

Automatically persist pyscript.* state variables that are read
from pyscript, either directly or using a state_trigger.

Also provide a state.persist method to pyscript to explicitly
register variables to be persisted.
@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

The example provided by @dlashua will still not automatically work as-is with my latest commit, as the state variable names are entirely dynamic. The only way to support this out of the box is to permanently persist all pyscript.* variables, even those of scripts that have since been removed. This might just be the easiest solution, but feels dirty.

This example explicitly registers the state variable, which is static, and works as expected:

state.persist("pyscript.count_hours", default_value=0)

@time_trigger("period(0:00, 1 hour)")
def hourly():
    pyscript.count_hours = int(pyscript.count_hours) + 1

This example doesn't explicitly register the state variable, but does have a state_trigger, so it also works as expected:

@state_trigger("True or pyscript.some_var")
def some_var_changed():
    log.error(f"some_var: {pyscript.some_var}")

@service
def set_some_var(value):
    pyscript.some_var = value

However, the following example DOES NOT work as expected! The state variable is set conditionally, and never read or explicitly registered (because it is dynamic):

@service
def set_var(key, value):
    # This var will be lost after restart!
    state.set(f"pyscript.{key}", value)

A user could work around this by using a special prefix, though:

for var_name in state.names("pyscript"):
    if var_name.startswith("pyscript.set_var_"):
        log.error(f"Explicitly persisting: {var_name}")
        state.persist(var_name)

# or: [state.persist(var) for var in state.names("pyscript") if var.startswith("pyscript.set_var_")]

@service
def set_var(key, value):
    state.set(f"pyscript.set_var_{key}", value)

Users could also just choose to permanently persist all variables by dropping the if statement. Depends on the user, I guess.

It might also be possible to simplify the final example by supplying a state.persist_prefix function if this pattern winds up being common.

Thoughts?

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

I wound up implementing state.persist_prefix, as this already came up a few times when rewriting my pyscripts.

So now the following can be done:

state.persist("pyscript.some_var", default_value=123)
state.persist_prefix("pyscript.set_var_")

@service
def set_var(key, value):
    state.set(f"pyscript.set_var_{key}", value)

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

It is now also possible to do the following:

state.persist("pyscript.light", "on", {
  'brightness': 255,
  'color': [255, 255, 255]
})

If the state already existed and a new attribute is added to the default_attributes, then this attribute is patched in. Existing attributes that aren't mentioned in persist will be kept.

@dlashua
Copy link
Contributor

dlashua commented Oct 18, 2020

To make sure I understand what you have here...

In order to be sure a variable persists, I will need to do one of the following at some point after EACH Home Assistant start:

  1. set it using state.set('pyscript.test_me', '123'), or pyscript.test_me = '123'
  2. get it using state.get('pyscript.test_me'), or state.get_attr('pyscript.test_me'), or use it directly like if pyscript.test_me == '123'
  3. call state.persist('pyscript.test_me')
  4. call state.persist_prefix('pyscript.test_')

In addition to the above requirements, only entities beginning with pyscript. (i.e in the pyscript domain) will be persisted.

In practice, any entity I want persisted should be explicitly persisted using state.persist() or state.persist_prefix() unless I can be certain that code using methods 1 or 2 above will be run shortly after pyscript starts.

There is no need to "delete" persisted entities. Simply removing the pyscript code using that entity and any calls to state.persist() for that entity will cause the entity to be "removed" after two restarts of Home Assistant.

For additional clarity, using the entity in a Home Assistant Automation, Template Sensor/Binary Sensor, or Script will not ensure its persistence. One of the 4 methods listed above must be used in pyscript code.

If this is correct, I think the above text (or something like it) should be included in the Documentation.

We should also consider REQUIRING the use of state.persist() or state.persist_prefix() (i.e. don't automatically persist the entities on set/get). Since the pyscript user will have to know the entity name (or a prefix) anyway, but may not fully realize they've placed a set or get in a code location that is not guaranteed to run shortly after startup, requiring state.persist() will ensure that they do the right thing. Additionally, it'll ensure that no entities are persisted unexpectedly.

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

@dlashua That is all largely correct.

In order to be sure a variable persists, I will need to do one of the following at some point after EACH Home Assistant start:

Or mention the entity in a @state_trigger

In addition to the above requirements, only entities beginning with pyscript. (i.e in the pyscript domain) will be persisted.

I've decided to limit the scope of all of this to pyscript., since it seems to be unusual in Home Assistant ecosystem to create arbitrary domains that aren't backed by a component, and my purpose here is for pyscript scripts to define their own new states that can then be used as usual.

In practice, any entity I want persisted should be explicitly persisted using state.persist() or state.persist_prefix() unless I can be certain that code using methods 1 or 2 above will be run shortly after pyscript starts.

Yes. Or having a @state_trigger that watches this entity.

There is no need to "delete" persisted entities. Simply removing the pyscript code using that entity and any calls to state.persist() for that entity will cause the entity to be "removed" after two restarts of Home Assistant.

Correct.

For additional clarity, using the entity in a Home Assistant Automation, Template Sensor/Binary Sensor, or Script will not ensure its persistence. One of the 4 methods listed above must be used in pyscript code.

Correct.

If this is correct, I think the above text (or something like it) should be included in the Documentation.

I agree that the documentation should encourage users to always call state.persist or state.persist_prefix. Probably with a default value, since that is very handy.

Disabling the automatic persisting on read/write will make the implementation of persist_prefix a bit more complex. The reason it works right now is:

  1. On the first run, when persist_prefix is ran, it doesn't do anything. The states it should tell home assistant to persist don't yet exist, so it can't enumerate them. Nothing happens.

  2. On the first run a write happens, the state is created AND persisted.

  3. Home assistant restarts. During pyscript startup the persisted state is loaded back in. Then once the pyscript script starts, it enumerates the states, finds its prefixed states, and tells home assistant to keep persisting it.

So if we were to completely disable the automatic persisting on read/write, then this would have to be solved otherwise. Possibly by checking the set of states that have been passed to persist_prefix during write, seeing if any of the prefixes match, and then telling home assistant to persist it there.

@dlashua
Copy link
Contributor

dlashua commented Oct 18, 2020

If having it in @state_trigger also persists it, then it is likely to do the right thing for most of the users most of the time. So, with that, I think you can leave it as it is and just ensure the documentation is good. I'm happy to help with this.

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

If having it in @state_trigger also persists it, then it is likely to do the right thing for most of the users most of the time.

This is my thinking too.

I think you can leave it as it is and just ensure the documentation is good. I'm happy to help with this.

Thanks! I was hoping for further review from @craigbarratt before writing docs that describe these new features, but I've given you write access to my fork so if you feel so inclined, have at it 😄

@craigbarratt
Copy link
Member

craigbarratt commented Oct 18, 2020

Thanks for the progress on this.

I still don't understand the use cases well enough to know what the best solution is. Fixed "persistent" parameters would be done via config. So I presume it's mainly internal states (of anything) that you want to persist so a restart is not so disruptive.

First, stepping back a little, HASS state variables aren't a great way to preserve useful information since they are coerced into strings, although their attributes can be arbitrary types (I presume they at least need to be serializable). On the plus side, HASS already supports persistence of state variables, so that saves us from having to develop and maintain some other mechanism.

Since the HASS state variable namespace is global, there is also the issue of different pyscript apps colliding if they fail to use unique names.

Combining these pros/cons, I somewhat agree that persisting HASS state variables is the right choice, although the namespace collision risks are a concern. The alternative of persisting designated global variables in each app's context would avoid the namespace collision, and allow richer data types to be persistent (so long as they are serializable), but then we'd have to come up with a new way to do persistence, which is highly undesirable. I do worry about the performance also, if every persistent variable setting causes a disk write (at least eventually). I'm not sure how HASS does that - are they cached and flushed only periodically? Presumably it turns into a database write, so it should be relatively efficient.

I'd prefer not to have several different explicit and implicit mechanisms (eg, get and set) that denote persistence. I think it would be cleaner if there was just one way to do it. For example, state.persist could behave like a set that you simply add names to (or discard them). Or we could have functions like state.persist_add() and state.persist_stop(). Those wouldn't be restricted to just names starting with pyscript., but that would be strongly encouraged. Each script or app would explicitly add their state variables on startup. Assuming the set of names itself is not persisted, then when you disable or remove a script, its persisted variables will disappear after the 2nd restart.

Alternatively, we could have a persist attribute on state variables which could be set or cleared. On restart that persist attribute is set back to False, so that a disabled script also will have its persisted variables disappear on the 2nd restart. Example:

pyscript.my_important_var.persist = True

Reactions?

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

My intention is to create HASS states that can be used in other automations, while also having them be persisted like input_string and friends are. So having a separate variable storage within the context of a pyscript app does sound like a good idea, but a different use-case.

Namespace collisions are a risk, but that is a risk that already exists. Any pyscript app that exists today can already set arbitrary states and collide with other pyscript apps (or other custom components or anything, really). Adding persistence to these states doesn't change anything about that. Some apps seem to solve this by letting the user configure the state to use, where the user is expected to pre-create a input_* component, which is a hassle. Now instead they could use a static prefix to identify their app, or let the user configure the prefix.

I want to make it as easy as possible for me, as a developer writing automations for my own home, to declare new persisted state variables that I can use to drive my home. That means I would really prefer to be able to do this all from within pyscript, without having to go through other config files for each state parameter I wish to store. Especially when those state parameters may depend upon auto-discovery of other entities, and as such cannot be predicated ahead of time easily.

I do agree that combining implicit and explicit registration of persisted variables is probably going to be confusing, though the current implementation does the right thing for most use-cases.

My preferred approach to requiring explicit persistence would be to keep the state.persist (with default value & attribute capability) and state.persist_prefix functions, and only persist variables that are registered there. These prefixed state variables must be re-registered as persistent with HASS each time the pyscript component is loaded by HASS, else they will be discarded

This does mean a new set will have to be introduced of all prefixes that get passed to persist_prefix, and this set must be checked during each pyscript.* variable write. The performance of this might not be ideal, although it would only impact non-persisted pyscript. prefixed state variable writes. Probably not too many of those?

Alternatively we could drop persist_prefix entirely, and and simply require the script to also call state.persist when it writes the prefixed variable, but that would be significantly less convenient, since state.persist must then be placed twice within the user's pyscript script, once in a loop on the top-level to re-register previously created dynamic state variables, and once within the code block that actually declares the dynamic state variable. So, that'd be something like this:

for var_name in state.names("pyscript"):
    if var_name.startswith("pyscript.set_var_"):
        state.persist(var_name)

@service
def set_var(key, value):
    state.persist("pyscript.set_var_{key}")
    state.set(f"pyscript.set_var_{key}", value)

I'd probably wind up having to include such a snippet in every pyscript I make, then. So I'd like a solution that keeps the capability of state.persist_prefix

I do worry about the performance also, if every persistent variable setting causes a disk write (at least eventually). I'm not sure how HASS does that - are they cached and flushed only periodically? Presumably it turns into a database write, so it should be relatively efficient.

This shouldn't be that much of a concern, the states are collected and flushed periodically (15 minutes) and upon HASS shutdown. A lot of components are also already utilizing this state storage by using the RestoreEntity mixin class.

So with all that said I propose the following:

# This is required for any state variable that should be persisted. Best practice is to put this on the top-level in a pyscript. If this method is not executed during the pyscript lifecycle the persisted value will eventually be lost.
state.persist("pyscript.my_var", default_value="on", default_attributes={'some_attr': 123})

# Alternatively it is possible to register all states with a certain prefix to be persisted
state.persist_prefix("pyscript.some_prefix_")

The persist function would remain unchanged as in the current PR.

The persist_prefix function would internally do the following:

    @classmethod
    async def persist_prefix(cls, prefix):
        """Ensures all pyscript domain state variable with prefix are persisted."""
        if prefix.count(".") != 1 or not prefix.startswith("pyscript."):
            raise NameError(f"invalid prefix {prefix} (should be 'pyscript.entity')")
        
        # Store the prefix to ensure the prefixed var is also persisted if it doesn't yet exist
        cls.persisted_prefixes.add(prefix)

        for name in await cls.names("pyscript"):
            if name.startswith(prefix):
                await cls.register_persist(name)

And the set function would do the following to ensure these prefixed vars also get persisted:

    @classmethod
    async def set(cls, var_name, value=None, new_attributes=None, **kwargs):
        """Set a state variable and optional attributes in hass."""
        # ...

        if var_name.startswith("pyscript.") and var_name not in cls.persisted_vars and any(var_name.startswith(prefix) for prefix in cls.persisted_prefixes):
            await cls.register_persist(var_name)

        cls.hass.states.async_set(var_name, value, new_attributes)

And the changes to notify_add and get would be reverted.

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

I wanted to add that I don't think it would be desirable to allow pyscript to let states outside of the pyscript domain to be persisted. This would let you break other components that do not expect their state to be persisted. Although it would be neat to be able to create arbitrary domains for grouping your state variables, this seems to go against how the home assistant state storage is intended to be used. Also only considering state variables prefixed with pyscript. helps performance. Maybe I'm wrong though, I only recently started developing for home assistant.

I also am not a big fan of the pyscript.my_important_var.persist = True because that would collide with the syntax in my other PR (#49), and it looks like you're setting an attribute there. And the persist attribute isn't special in HASS, so I don't think it should be in pyscript either.

@dlashua
Copy link
Contributor

dlashua commented Oct 18, 2020

I still don't understand the use cases well enough to know what the best solution is.

There are several situations where persisting state is useful. I'm sure this list isn't comprehensive.

  1. Storing data that is expensive to recalculate. If I have a pyscript that runs at midnight every night, accesses the Home Assistant events database and determines how many hours my HVAC system was running that day and then stores it using state.set('sensor.yesterday_hvac_hours', calculated_value). This could be used, perhaps, to add some notifications to let me know if I'm using more energy today heating and cooling my house so that, if I'm trying to keep my electric bill down, I can do something about it. When Home Assistant is restarted, that data is lost and now has to be recalculated.

  2. Storing data based on events that can't easily be recalculated. Let say I am automating some kind of plant watering and care system with artificial lights. I need the lights to be on for 8 hours a day, but they are noisy, so I only want to run them when I'm away from home, if possible. So I have:

@state_trigger('binary_sensor.home == 'off')
def grow_lights_trigger():
  if pyscript.performed_grow_lights == "off":
    turn_on_grow_lights(60 * 60 * 8)

@time_trigger('startup')
def grow_lights_startup():
  if pyscript.last_grow_lights_on == 0:
    return

  seconds = pyscript.last_grow_lights + (60 * 60 * 8) - time.time()
  if seconds > 0:
    turn_on_grow_lights(seconds)

@time_trigger('cron(0 0 * * *)')
def reset_grow_lights():
  pyscript.performed_grow_lights = "off"

def turn_on_grow_lights(seconds)
  task.unique('grow_lights', kill_me=True)

  light.turn_on('light.grow_lights)
  if pyscript.last_grow_lights_on == 0:
    pyscript.last_grow_lights_on = time.time()
  task.sleep(seconds)
  pyscript.last_grow_lights_on = 0
  light.turn_off('light.grow_lights)
  pyscript.performed_grow_lights = "on"

I'm sure I missed something as I'm just making this up in the comment window, but I think you get the idea.

There are ways around this. Primarily, we use input_text to handle something light last_grow_lights_on and we use input_boolean to handle performed_grow_lights. But it would be really nice if pyscript could remember this data without having to take the extra steps to create those "helpers".

As @swazrgb indicated, namespace collisions are a risk. But it's somewhat in the nature of Home Assistant. These days, when there's a collision on, say binary_sensor.test, Home Assistant will create a binary_sensor.test_2 instead. While this prevented the collision, it doesn't do what the user wants because binary_sensor.test may or may not hold the data they want. I can't even count how many times this has bitten me. So in the case of pyscript, it will be entirely up to the user to avoid these collisions. But, at least there are fewer integrations competing in that namespace, so, less likely.

The "right" way to do this is to register each entity using all the entity registry stuff that I've never bothered to unravel the code for. Home Assistant will, then, try to assign an entity_id according to the "name" of the entity, but will name it whatever it wants if it can't. Then the user can change the entity_id in the UI. But, doing this makes it harder to access that entity in other pyscripts because it may have been named differently than you expect.

@swazrgb
Copy link
Contributor Author

swazrgb commented Oct 18, 2020

I also use this functionality to create derived states, which can be based on other state changes & events. Due to them being based on events some kind of persistence is necessary.

An example is the hue dimmer switch, this sends an event for each button press. I'd like to store the dim or scene state in a pyscript state variable (or attribute thereof), which ideally has persistence. Without persistence all the lights reset when HA restarts.

@craigbarratt
Copy link
Member

Thanks for the really excellent comments and background. It's really helpful to understand potential use cases.

Ok, my preference is to keep things as simple as possible, so let's go with just:

# This is required for any state variable that should be persisted. Best practice is to put this on the top-level in a pyscript. If this method is not executed during the pyscript lifecycle the persisted value will eventually be lost.
state.persist("pyscript.my_var", default_value="on", default_attributes={'some_attr': 123})

Scripts that want persistence would have a sequence of calls like this, with the default values being applied only the first time. And that should be all that you need - there shouldn't be any need to call state.persist later.

While I do hear your points about the benefits of persist_prefix, I don't like the action and its effect being potentially far apart (if it applies to variables not yet created), so let's not have that. I also agree using a "special" attribute as I proposed is a bad idea.

It sounds like we should add support for using del on a state variable, which should also remove the persist if present. I'm happy to do that separately after I merge this PR, or feel free to add that.

Thanks again for your patience - this looks like a very useful feature.

@dlashua
Copy link
Contributor

dlashua commented Oct 21, 2020

I have updated this branch to include only state.persist(). I have corrected the documentation. I have tested with the following script:

state.persist("pyscript.boot_count", default_value="0")


@time_trigger("startup")
def inc_boot_count():
    try:
        cur = int(pyscript.boot_count)
    except NameError:
        cur = 0

    pyscript.boot_count = cur + 1

@@ -551,12 +551,11 @@ variable has a numeric value, you might want to convert it to a numeric type (eg
Persistent State
^^^^^^^^^^^^^^^^

Two methods are provided to explicitly indicate that a particular entity_id should be persisted.
This method is provided to indicate that a particular entity_id should be persisted. This is only effective for entitys in the `pyscript` domain.

``state.persist(entity_id, default_value=None)``
Indicates that the entity named in `entity_id` should be persisted. Optionally, a default value can be provided.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add optional default_attributes=None argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@craigbarratt craigbarratt merged commit 827a1fc into custom-components:master Oct 21, 2020
@craigbarratt
Copy link
Member

Looks great - thanks @swazrgb and @dlashua.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Declare new restorable state from pyscript
3 participants