Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

list_services() not showing newly registered services #977

Closed
blyons16 opened this issue May 18, 2020 · 60 comments
Closed

list_services() not showing newly registered services #977

blyons16 opened this issue May 18, 2020 · 60 comments

Comments

@blyons16
Copy link

For the past few months, I have been struggling with "unknown domain" and other errors which I have traced to missing services. I have posted that journey here: https://community.home-assistant.io/t/unknown-domain-default-remote-in-call-service/173595/6

I spent the weekend spinning up a new RPi4, and in that process had plenty of opportunity to experiment with different methods, configuration, and timing. I think I have found the root cause.

It appears that the list of services returned by the AppDaemon function self.list_services() is only updated once, at the launch of AppDaemon. If any new services come online after AD startup, they will not be returned by list_services(). Since, after a cold system start, it takes a while for z-wave services to be registered, this is why I am experiencing the problem mostly with z-wave. However, the problem happens with any newly added services with new integrations or components. Since AppDaemon runs in a separate container from Core HA, newly registered components regularly appear when AppDaemon has not been restarted.

I proved this by having my initialize() function test for the existence of a zwave device, and if it is not ready, set a listener for a z-wave service_registered event. Then, when that event is triggered, get list_services() still does not show the services that was just registered. And if you manually run list_services() later, it still doesn't show up.

Example code snippets:

        if self.get_state("zwave.skimmer_pump_outlet") != 'ready':
            self.zwave_initializer = self.listen_event(self.initialize_zwave_listener,"service_registered",domain="zwave",service="test_node")
...
    def initialize_zwave_listener(self, name, id, type, event = None, function = None, data = None, kwargs = None):
        if "initialize" in self.logging_features:
            #message = "Event handler - name={}, id={}, type={}, event={}, function={}, data={}, kwargs={}".format(name, id, type, event, function, data, kwargs)
            #self.log(message)
            self.log("Initializing z-wave")
        self.cancel_listen_event(self.zwave_initializer)
        services_now = self.list_services(namespace="global")
        if len([i for i in self.expected_services + services_now if i not in self.expected_services or i not in services_now]) == 0:
            self.log("Services list matches")
        elif len(self.expected_services) < len(services_now):
            self.log("More services than expected")
        else:
            self.log("Services list does NOT match")

The only workaround I have found is to manually restart AppDaemon after all services have been registered, and only after that, compare my stored list of expected services with the list returned by list_services().

Currently running:
HassOS 3.13
Supervisor 222
HA Core 0.109.6
AppDaemon 4.0.2.5

@Odianosen25
Copy link
Collaborator

Hello @blyons16,

Yes it’s a bug in AD and there is presently a PR to fix it here #965.

You can pull it and merge it to your test system and see if it fixes it. It should fix it, so be rest assured in the next release, it’s gonna be fixed.

Regards

@blyons16
Copy link
Author

Thanks much! I do appreciate the work you do.

@blyons16
Copy link
Author

blyons16 commented Jun 2, 2020

FWIW, every major system has been updated, yet the problem remains. Now running
HassOS 4.8
Supervisor 225
HA Core 0.110.4
AppDaemon 4.0.2.6

I am not "resting assured." In looking at the above linked PR, it looks like it is being blocked by one reviewer?

@Odianosen25
Copy link
Collaborator

@blyons16 the PR has not been merged, so you will have to merge it locally in your own instance

@blyons16
Copy link
Author

blyons16 commented Jun 5, 2020

Please excuse my ignorance, but I have no idea how to "merge it locally." Can you direct me to some documentation of how to do that? Is that even possible with what used to be known as "hassio?" I thought that was locked down to prevent deviations from the standard releases.

@ReneTode
Copy link
Contributor

ReneTode commented Jun 5, 2020

you are right, you cant do that with hassio (at least not easy, when you want the change to stay)

@definitio
Copy link

I got the problem again after Home Assistant 0.113 update. Dev version worked fine with 0.112.

@dekiesel
Copy link

I have the same (?) issue.

Appdaemon 4.0.5
Home assistant 115

Basic services like "light" are missing until I restart appdaemon.

I think it is because home assistant now uses lazy loading when starting so appdaemon doesn't get the full list if services.

@ReneTode
Copy link
Contributor

make sure that HA is completely loaded and ready before AD starts.
if needed use the options for delay in the configuration.

@dekiesel
Copy link

dekiesel commented Sep 30, 2020

@ReneTode

Thanks, will do. Wouldn't it be nicer though if appdaemon waited with loading the services list until the "ha started" event was sent?

edit: Just to be sure, you are talking about app_init_delay, right?

@ReneTode
Copy link
Contributor

AD waits for that event.
but HA isnt ready when it sends that event.
the definition from "ready" has changed.

there are several ways to delay ADs connection to HA.
time delay, or wait for certain events, or entities if im not wrong. but its all in the docs. (configuration from hass plugin.)

@dekiesel
Copy link

Thanks, I missed this part of the documentation.

https://appdaemon.readthedocs.io/en/latest/CONFIGURE.html#hass-plugin-startup-conditions

Excellent, thank you!

@bg1000
Copy link
Contributor

bg1000 commented Sep 30, 2020

I think ideally there would be a second event from home assistant indicating when it is fully initialized. I am, however, grateful that appdaemon includes these very well thought out startup options.

For those following, the documentation does state this but I found it easy to misunderstand:
The conditions you place under plugin_startup_conditions are only checked if home assistant is restarted while appdaemon is running. These checks are not made if appdaemon is restarted. So for example:

      plugin_startup_conditions:
        state: { entity: cover.garage_door }
        delay: 60

Based on this entry appdaemon will delay startup of the HASS plugin for 60 seconds and wait (if needed) for the existence of entity cover.garage_door if you restart home assistant while appdaemon continues to run. If you restart appdaemon or restart appdaemon and home assistant this entry does nothing. If you want the same functionality in either case you would need to change the entry to:

      appdaemon_startup_conditions:
        state: { entity: cover.garage_door }
        delay: 60
      plugin_startup_conditions:
        state: { entity: cover.garage_door }
        delay: 60

@ReneTode
Copy link
Contributor

i think it is something that is missed while creating it.

the plugin startup conditions should work also for the first time the plugin is started.
@acockburn am i correct with that?

if so then its a bug that the plugin startup conditions are not checked when AD is restarted.

allthough when i think about it more, i guess the problem lies in that when 1 plugin initialisation is hold up all others are also hold up.

@bg1000
Copy link
Contributor

bg1000 commented Sep 30, 2020

@ReneTode , @acockburn -
I honestly don't know if it's a bug or intended functionality. In my (brief) testing it worked the way I stated in my comment per the appdaemon.log file. It does produce the desired behavior when configured in a certain way. I was confused by it initially but now see that it gives you the flexibility to do different checks for a HASS restart vs. an Appdaemon restart which might have a use case. If it is intended and you think an example like above would help the docs I am willing to contribute. If you decide it might be a bug and want more details on versions, testing, etc. please let me know.

@acockburn
Copy link
Member

Replying on my phone so can’t easily check but there are separate arguments for plug-in restart vs start delays/conditions

@definitio
Copy link

definitio commented Sep 30, 2020

I have delays (120 seconds) in both appdaemon_startup_conditions and plugin_startup_conditions but it doesn't help. AppDaemon always loads less then 100 services (it should be ~200) and I need to restart it (I have an app reloader for that).

2020-09-30 23:42:55.965353 INFO AppDaemon: Found 8 total apps,
2020-09-30 23:42:55.965607 INFO AppDaemon: Starting Apps with 8 workers and 8 pins,
2020-09-30 23:42:55.972741 INFO AppDaemon: Running on port 5050,
2020-09-30 23:42:55.973378 INFO AppDaemon: Got initial state from namespace mqtt,
2020-09-30 23:42:55.973639 INFO MQTT: MQTT Plugin initialization complete,
2020-09-30 23:43:00.947744 INFO HASS: Connected to Home Assistant 0.115.6,
2020-09-30 23:43:01.105211 INFO HASS: Evaluating startup conditions,
2020-09-30 23:43:01.105532 INFO HASS: Delaying startup for 120 seconds,
2020-09-30 23:45:01.118631 INFO AppDaemon: Got initial state from namespace default,
2020-09-30 23:45:03.102077 INFO AppDaemon: Scheduler running in realtime,
2020-09-30 23:45:03.104131 INFO AppDaemon: Adding /conf/apps to module import path,
2020-09-30 23:45:03.113135 INFO AppDaemon: Loading App Module: ...
2020-09-30 23:45:03.116690 INFO AppDaemon: Initializing app ...
event fire mqtt,
event fire default,
event fire admin,
event fire rules,
event fire mqtt,
event fire default,
event fire admin,
event fire rules,
2020-09-30 23:45:03.219107 ERROR reloader: Loaded services: 86,

@ReneTode
Copy link
Contributor

@definitio

just to make sure:

when you start AD 120 seconds after you started HA you get less then 100 services,
when you restart AD after that you get 200?

how about when you delay AD for 5 mins?

@ReneTode
Copy link
Contributor

@bg1000 i didnt say there shouldnt be seperate startup conditions, and they are there for a good reason. as intended.

plugin restart isnt the same as HASS restart. allthough when you restart HASS you will also see that the plugin will be restarted.
but also when you dont restart HASS but AD loses connection to HASS for a few seconds.

i just concluded that the plugin start conditions should always take effect when the plugin starts. (initial start and restart)
if thats not the case, then indeed that should be communicated better (mention in the docs that it doesnt effect first start for example, or rename it to plugin_restart_conditions)

i never used those conditions, but if the plugin start conditions are intentionally not for the first plugin start, i always advised people the wrong way.

@definitio
Copy link

@ReneTode It still loads less then 100 services with 300 seconds delays too.

@ReneTode
Copy link
Contributor

ReneTode commented Oct 1, 2020

hmm, thats a very strange problem.

what if you dont start AD, when you start HA and you start AD 5 mins after HA is started manually?

@definitio
Copy link

All services will be loaded in this case. I have a problem only when HA and AD start simultaneously.

@ReneTode
Copy link
Contributor

ReneTode commented Oct 1, 2020

i just looked at your log again.
and i see what happens.

the plugin connects before the delay.
so that is definitely the problem.

i see the big problem with that though, AD needs to be connected to HA to be able to evaluate if the startup conditions are met, but as soon as HA is connected there is no update from the services anymore.

and i dont believe HA has a new_service, or service_added event, so AD cant listen for new services.

the only solution i see (but thats not for hassio users) is to delay the startup from the AD service.

@electrofloat
Copy link

electrofloat commented Oct 3, 2020

Unfortunately I also regularly have an issue where AppDaemon cannot see a zwave service (2020-10-03 08:00:00.010497 WARNING AppDaemon: Unknown service (default/zwave/set_config_parameter) in call_service) no matter how much delay I put in.

Also.. do multiple startup_conditions work in the State condition type? Because AppDaemon logs only the last one always:

      appdaemon_startup_conditions:
        state: {entity: "zwave.z_wave_sigma_designs_uzb_z_wave_usb_adapter", value: {state: "ready"}}
        state: {entity: "zwave.eurotronic_eur_spiritz_wall_radiator_thermostat_2", value: {state: "ready"}}
        state: {entity: "zwave.aeon_labs_zw100_multisensor_6", value: {state: "ready"}}
      plugin_startup_conditions:
        state: {entity: "zwave.z_wave_sigma_designs_uzb_z_wave_usb_adapter", value: {state: "ready"}}
        state: {entity: "zwave.eurotronic_eur_spiritz_wall_radiator_thermostat_2", value: {state: "ready"}}
        state: {entity: "zwave.aeon_labs_zw100_multisensor_6", value: {state: "ready"}}
2020-10-03 14:37:55.691215 INFO HASS: Evaluating startup conditions
2020-10-03 14:37:55.692947 INFO HASS: Startup condition met: hass state=RUNNING
2020-10-03 14:37:55.696974 INFO HASS: Startup condition met: zwave.aeon_labs_zw100_multisensor_6={'state': 'ready'}
2020-10-03 14:37:55.697107 INFO HASS: All startup conditions met

@ReneTode
Copy link
Contributor

ReneTode commented Oct 5, 2020

yes multiple conditions work if you do it right ;)

        state: {entity: "zwave.z_wave_sigma_designs_uzb_z_wave_usb_adapter", value: {state: "ready"}}
        state: {entity: "zwave.eurotronic_eur_spiritz_wall_radiator_thermostat_2", value: {state: "ready"}}
        state: {entity: "zwave.aeon_labs_zw100_multisensor_6", value: {state: "ready"}}

is no list, so yaml only sees the last line.

        - state: {entity: "zwave.z_wave_sigma_designs_uzb_z_wave_usb_adapter", value: {state: "ready"}}
        - state: {entity: "zwave.eurotronic_eur_spiritz_wall_radiator_thermostat_2", value: {state: "ready"}}
        - state: {entity: "zwave.aeon_labs_zw100_multisensor_6", value: {state: "ready"}}

should work.

@electrofloat
Copy link

I created a new issue about the startup conditions, I don't want to hijack this one. ( #1051)

@bbrendon
Copy link
Contributor

bbrendon commented Feb 1, 2021

I just had an issue where I was getting a lot of Unknown domain (default/script) in call_service from errors. This is appdaemon 4.0.5 so I guess this problem is still ongoing. Restarting appdaemon fixed it as stated above.

Strange I've been running AppD for all these years and never noticed this one.

@ReneTode
Copy link
Contributor

ReneTode commented Feb 1, 2021

Strange I've been running AppD for all these years and never noticed this one.

the problem is not AD, but HA.
nowadays HA tells that its ready before all integrations are loaded. (they changed that a few months ago to startup the frontend faster) and that means that not all services are available.
and as far as i know there still isnt an event that tells that new services are added to HA, so AD cant reload based on that.

if the problem comes back, the best way it to add a delay to the start from AD.

@Odianosen25
Copy link
Collaborator

Actually I believe this has been fixed in dev presently

@ReneTode
Copy link
Contributor

ReneTode commented Feb 1, 2021

is it?
i dont remember anything about that and i still advise people to add a delay, so please confirm, so that i know for sure that there has been found a way that AD updates when HA adds new services.

@ReneTode
Copy link
Contributor

ReneTode commented Feb 1, 2021

listen for service_registered event

if i look at it, then that comment says that AD now listens for newly registered events.
please odia, when you merge a PR (from anyone) make sure there is a changelog entry.
that makes my life so much easier ;)

@Odianosen25
Copy link
Collaborator

Ah just checked and this was actually merged in 4.0.4, which is part of the present release; it’s actually in the change log.

Now if this issue still exists, then the fix not working properly. Need to investigate a bit more on that then.

@ReneTode
Copy link
Contributor

ReneTode commented Feb 2, 2021

i cant find it in the changes from 4.0.4, but if it is added there, it certainly isnt solved ;)
because 4.0.4 is already over an half year old, and there have been reported several issues after that.

@ruifung
Copy link

ruifung commented Feb 5, 2021

I do still occasionally get issues like WARNING AppDaemon: Unknown domain (default/remote) in call_service from time to time until I restart appdaemon.

@ReneTode
Copy link
Contributor

ReneTode commented Feb 5, 2021

@ruifung only after a restart, right?
so when you have restarted HA.

in that case i suspect that your remote integration is starting slow, so i advise to add a delay to appdaemon.yaml

@bbrendon
Copy link
Contributor

bbrendon commented Feb 6, 2021

Happened again. This seems to be more prevalent with the recent versions of HASS. Until someone figures out a fix I'll just add an appdaemon restart to the HASS restart script.

@ReneTode
Copy link
Contributor

ReneTode commented Feb 7, 2021

in general i always advised to restart AD when you restart HA.
it is a matter that is simular to dependant apps.

there always have been, and probably always will be issues to be taken care off, when you restart HA and dont restart AD.
also problems arrise when you restart AD but you dont restart devices that run AD dashboards.

in all the years i use AD/HA/Dashboard, i never have restarted the base program (HA) without the depending programms (AD)

for addons (or depending programs like AD) its hardly possible to make sure everything keeps running, with a constant changing environment as HA is.

@bg1000
Copy link
Contributor

bg1000 commented Feb 9, 2021

@ReneTode - I understand your point about dependent applications. For me, this problem directly correlated to the change in HA to bring the gui up before the app has fully initialized. Before this, I don't think I ever had to restart Appdaemon after restarting HA.

I appreciate the changes that were made to allow a delay and check other startup conditions. I contributed a small PR to the docs on this which is now in the dev branch. I use both of these in my configuration but it still isn't 100% reliable and I end up manually restarting Appdaemon after HA is fully up. This isn't the end of the world but it feels like this one thing doesn't work as well as it used to which is a little frustrating sometimes.

@Odianosen25
Copy link
Collaborator

Hello @bg1000,

Please can you try this PR #1147, as it is to fix this issue.

It works for me locally, but want some confirmation before merging it.

If you can test and get back to me, it will be great.

Regards

@bg1000
Copy link
Contributor

bg1000 commented Feb 9, 2021

@Odianosen25 I'd be happy to try this. I am currently running 4.0.5 with docker. It looks like all of the code changes for PR1147 are in appdaemon/plugins/hass/hassplugin.py. Did you want me to apply this to 4.0.5 or to the current dev branch or something else?

@Odianosen25
Copy link
Collaborator

@bg1000,

I think it’s better if tried against the dev, branch as it’s against that the PR was made.

regards

@bg1000
Copy link
Contributor

bg1000 commented Feb 9, 2021

@Odianosen25 - No problem. I should be able to work on this tonight. I'll get back to you with some results or if I have any questions/problems.

@bg1000
Copy link
Contributor

bg1000 commented Feb 10, 2021

@Odianosen25 It's probably a little early to declare victory but so far so good! I have dev + PR1147 running and have done 5 restarts of homeassistant so far without any issues. Before I started testing I removed all startup checks/delays from my appdaemon.yaml as well. I'll try to get this up around 20 in the next day or two and touch base when I do. Thank you for working on this.

This is likely a separate issue but the container would not build for me as pulled from the repo. I had to add "ARG CRYPTOGRAPHY_DONT_BUILD_RUST=1" to the Dockerfile due to the cryptology build failing. I don't see how your PR could have effected this. I do see some changes in requirements.txt between master and dev. If you agree that this is worth looking into I can open as a separate issue with better documentation of the error.

@Odianosen25
Copy link
Collaborator

@bg1000,

Thanks for confirming, and someone else just also confirmed it seems to work for them, so most likely all good now. Will merge the PR later today into dev.

On the second issue you are right the build fails, and thanks for the pointer as I hadn’t been able to properly look into it yet. You can create a separate issue, or a PR to fix it.

Though I would have preferred avoiding the need to add that arg to the docker file, but will see how it goes.

regards

@Odianosen25
Copy link
Collaborator

Ok this has been merged and @bg1000, I have fixed the build issue.

Regards

@bg1000
Copy link
Contributor

bg1000 commented Feb 10, 2021

@Odianosen25 - Your fix to the Dockerfile is obviously better than what I did. I fetched the current dev branch. It built and started up without issue. Thanks again for working on this.

@Odianosen25
Copy link
Collaborator

Yeah thanks for the acknowledgement @bg1000, it was a security related issue and avoiding rust from my research wasn't a good idea; so had to find another solution.

If any other thing, can always let us know.

@bbrendon
Copy link
Contributor

Looks like its still happening. I left home and my doors didn't lock.

There are 3 script calls in the AD function and all threw warnings.

2021-02-20 20:13:40.972912 WARNING AppDaemon: Unknown domain (default/script) in call_service from shutdown
2021-02-20 20:14:04.632776 WARNING AppDaemon: Unknown domain (default/script) in call_service from shutdown
2021-02-20 20:14:07.636178 WARNING AppDaemon: Unknown domain (default/script) in call_service from shutdown

@bg1000
Copy link
Contributor

bg1000 commented Feb 21, 2021

@bbrendon I believe the fix is only in the dev branch currently. Are you running the current dev branch?

@bbrendon
Copy link
Contributor

I'm not sure how to tell what version i have... this?

# git log --abbrev-commit --oneline|head
0d7a4657 Merge pull request #1157 from Odianosen25/thread_duration_warning_threshold-default
a5ca2348 Update CONFIGURE.rst
c95ac2c1 Merge pull request #117 from AppDaemon/dev

@bg1000
Copy link
Contributor

bg1000 commented Feb 22, 2021

@bbrendon The recent fix @Odianosen25 put in for this was #1147. You can check for this with git log | grep '#1147'. Mine looks like the following:
Merge pull request #1147 from Odianosen25/fix-hass-service

@bbrendon
Copy link
Contributor

Yes. I have that comment in my git log.

@bg1000
Copy link
Contributor

bg1000 commented Feb 22, 2021

Not what I was hoping you would say . . .

@Odianosen25
Copy link
Collaborator

Fact is with HA plugin, I still think some time like 10 secs should be added to the plugin startup no matter what.

Just help to ensure at least no matter what happens, The plugin will come up right.

Regards

@bbrendon
Copy link
Contributor

Here is the fix. Run this in cron. :)

import subprocess

a = subprocess.run(
    ["ps -eo pid,etimes,cmd | awk '/python3 -m homeass/ && $2 > 60 {print $2}'"],
    shell=True,
    text=True,
    stdout=subprocess.PIPE,
)
hass_sec = int(a.stdout.strip())

b = subprocess.run(
    ["ps -eo pid,etimes,cmd |grep '[p]ython3 -m appdaemon'| awk '{print $2}'"],
    shell=True,
    text=True,
    stdout=subprocess.PIPE,
)
appd_sec = int(b.stdout.strip())

if hass_sec < appd_sec + 60:
    # restart appdaemon
    print("restarting appdaemon")
    c = subprocess.run(
        ["systemctl restart appdaemon.service"],
        shell=True,
        text=True,
        stdout=subprocess.PIPE,
    )


@definitio
Copy link

definitio commented Apr 4, 2021

Even 120 seconds is not enough for me. My workaround is to use python_script that call a service instead of call_service():

domain, service = data.get('service_name').split('/', maxsplit=1)
service_data = data.get('service_data')

logger.debug("Running {}.{} with data: {}".format(domain, service, service_data))
hass.services.call(domain, service, service_data)

Usage:

self.call_service('python_script/call_service', service_name='', service_data={})

@Odianosen25 Can you reopen the issue? It's definitely not fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants