Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple instances of the same class conflict #33

Closed
dlashua opened this issue Oct 9, 2020 · 12 comments
Closed

multiple instances of the same class conflict #33

dlashua opened this issue Oct 9, 2020 · 12 comments

Comments

@dlashua
Copy link
Contributor

dlashua commented Oct 9, 2020

It's taken a bit of effort to boil this down to only the part that seems to be broken. Hopefully, this will be broken for you too.

I've put the pieces in a gist so it's a bit easier to read and copy:

https://gist.github.com/dlashua/67f3accdaf9bac8e4ed056196b039ae4

Notice that the log lines stop showing input_boolean.test_1 once the "on" message is received. Also, in Home Assistant, input_boolean.test_1 never turns off.

@dlashua
Copy link
Contributor Author

dlashua commented Oct 9, 2020

And, as an added point of debugging, swapping the order of these in configuration.yaml does what you'd expect... test_1 eventually turns off, while test_2 logs up to "on" and then never turns off.

@craigbarratt
Copy link
Member

Yikes - thanks for the example. I've stripped it down to this, which definitely does the wrong thing. Somehow the two instances get task.sleep mixed up during the parallel execution. Will debug later today.

import time
registered_triggers = []
class Test:
    def __init__(self, timeout):
        self.timeout = timeout
        @time_trigger('startup')
        def startup_trigger():
            log.info(f'__init__ {self.timeout} {self}: startup trigger')
            self.wait()
        registered_triggers.append(startup_trigger)

    def wait(self):
        log.info(f"waiting for {self.timeout} {self}")
        t0 = time.monotonic()
        task.sleep(self.timeout)
        delay = time.monotonic() - t0
        log.info(f"finished waiting for {self.timeout} {self} (took {delay:.3f}sec)")
    
t1 = Test(5)
task.sleep(3)
t2 = Test(10)

output shows last two statements have the wrong delay:

__init__ 5 <custom_components.pyscript.eval.Test object at 0x109de9700>: startup trigger
waiting for 5 <custom_components.pyscript.eval.Test object at 0x109de9700>
__init__ 10 <custom_components.pyscript.eval.Test object at 0x109bbab80>: startup trigger
waiting for 10 <custom_components.pyscript.eval.Test object at 0x109bbab80>
finished waiting for 10 <custom_components.pyscript.eval.Test object at 0x109bbab80> (took 1.995sec)
finished waiting for 5 <custom_components.pyscript.eval.Test object at 0x109de9700> (took 13.011sec)

@dlashua
Copy link
Contributor Author

dlashua commented Oct 9, 2020

Excellent. I'm glad you were able to reproduce.

In my version, it was a bit harder to watch the logs, but, as best as I can tell, one of the tasks actually stops running (either it doesn't trigger or it finds an exception that doesn't make it to the logs). In fact, I thought I was looking at a task.unique() bug at first, because I was using it in my code and it is supposed to kill other tasks. Part of the reason I stripped it down like I did was to remove any use of task.unique to see if the problem still persisted. But, before all of that was done, I could see in the logs that 2 (of the 4) instances of this app wouldn't even get to the task.sleep() call in some iterations of my testing. No error, it would just not issue the next log line that should have occurred on the next tick for that particular chain of events.

@dlashua
Copy link
Contributor Author

dlashua commented Oct 10, 2020

In an attempt to grok enough of the AST bits to fix this myself, I have run the code you pasted above. For me, it's working fine:

2020-10-10 05:47:19 INFO (MainThread) [custom_components.pyscript.file.test.startup_trigger] __init__ 10 <custom_components.pyscript.eval.Test object at 0x7fb066681850>: startup trigger
2020-10-10 05:47:19 INFO (MainThread) [custom_components.pyscript.file.test] waiting for 10 <custom_components.pyscript.eval.Test object at 0x7fb066681850>
2020-10-10 05:47:19 INFO (MainThread) [custom_components.pyscript.file.test.startup_trigger] __init__ 5 <custom_components.pyscript.eval.Test object at 0x7fb05b1076a0>: startup trigger
2020-10-10 05:47:19 INFO (MainThread) [custom_components.pyscript.file.test] waiting for 5 <custom_components.pyscript.eval.Test object at 0x7fb05b1076a0>
2020-10-10 05:47:24 INFO (MainThread) [custom_components.pyscript.file.test] finished waiting for 5 <custom_components.pyscript.eval.Test object at 0x7fb05b1076a0> (took 5.001sec)
2020-10-10 05:47:29 INFO (MainThread) [custom_components.pyscript.file.test] finished waiting for 10 <custom_components.pyscript.eval.Test object at 0x7fb066681850> (took 10.001sec)

However, that is with MY original test code commented out. As soon as I add my test code back in, your test code breaks:

2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_2: STARTING
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: STARTING
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.file.test.startup_trigger] __init__ 5 <custom_components.pyscript.eval.Test object at 0x7fb06d8ea430>: startup trigger
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.file.test] waiting for 5 <custom_components.pyscript.eval.Test object at 0x7fb06d8ea430>
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.file.test.startup_trigger] __init__ 10 <custom_components.pyscript.eval.Test object at 0x7fb067b289a0>: startup trigger
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.file.test] waiting for 10 <custom_components.pyscript.eval.Test object at 0x7fb067b289a0>
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager.startup_trigger] input_boolean.test_2: startup trigger
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_2: on
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager.startup_trigger] input_boolean.test_1: startup trigger
2020-10-10 05:52:22 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: on
2020-10-10 05:52:24 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: pending_off
2020-10-10 05:52:24 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: delay off for 5
2020-10-10 05:52:24 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: pending_off
2020-10-10 05:52:24 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: delay off for 5
2020-10-10 05:52:27 INFO (MainThread) [custom_components.pyscript.file.test] finished waiting for 10 <custom_components.pyscript.eval.Test object at 0x7fb067b289a0> (took 5.001sec)
2020-10-10 05:52:29 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: off
2020-10-10 05:52:29 INFO (MainThread) [custom_components.pyscript.apps.occupancy_manager] input_boolean.test_1: off
2020-10-10 05:52:32 INFO (MainThread) [custom_components.pyscript.file.test] finished waiting for 5 <custom_components.pyscript.eval.Test object at 0x7fb06d8ea430> (took 10.002sec)

@dlashua
Copy link
Contributor Author

dlashua commented Oct 10, 2020

Taking the content of my def startup() and moving it into def startup_trigger() causes the correct input_booleans to turn on and off, however, the time in task.sleep() is incorrect.

@dlashua
Copy link
Contributor Author

dlashua commented Oct 10, 2020

As expected, though it can't hurt to confirm, converting my test case as well as my actual application to not use a class at all and, instead, pass around a data dict to non-class defined functions causes everything to work exactly as I expect it would with no issues.

@craigbarratt
Copy link
Member

craigbarratt commented Oct 10, 2020

Thanks for the additional info. Yes, this is another bug related to variable scoping for closures, in this case specific to classes. self inside the closure is bound to self in __init__, but that binding is incorrectly across all instances of the class, rather than being unique per instance. So as you figured out, self is not unique across class instances when the class function contains a closure. I'm still thinking about how to best fix this - it's quite tricky.

@craigbarratt
Copy link
Member

Here's another test case that exhibits the bug (at least for me):

class ClosureInstanceBug:
    def __init__(self, name):
        self.name = name
        log.info(f"__init__ {self} {self.name}")
        task.sleep(2)
        def closure():
            self.any_function()
        log.info(f"func {self} {self.name}")
        
@time_trigger("startup")
def func1():
    c1 = ClosureInstanceBug("c1")
    
@time_trigger("startup")
def func2():
    task.sleep(1)
    c2 = ClosureInstanceBug("c2")

When you run it, c2 finishes before c1.

@dlashua
Copy link
Contributor Author

dlashua commented Oct 10, 2020

Does it affect only closures in __init__ or anywhere in any method of the class?

@craigbarratt
Copy link
Member

craigbarratt commented Oct 11, 2020

Scratch that. The variable binding in closures inside class instances turned out to be correct. The bug was that the execution context was being bound to the class methods when the class was defined. So the two async functions were calling their instance's methods with a common execution context, which includes the symbol table, so bad things happen. Should be fixed now with 8dbde8b.

@dlashua
Copy link
Contributor Author

dlashua commented Oct 11, 2020

Initial tests show that it is working correctly. I'm going to leave it running live so I can SEE it working and report back after a bit.

Thanks for digging through this!

@dlashua
Copy link
Contributor Author

dlashua commented Oct 12, 2020

This seems to be resolved with the latest commits. Closing.

@dlashua dlashua closed this as completed Oct 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants