Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zeroconf.get_service_info returns None #288

Closed
pawlizio opened this issue Aug 10, 2020 · 20 comments
Closed

Zeroconf.get_service_info returns None #288

pawlizio opened this issue Aug 10, 2020 · 20 comments

Comments

@pawlizio
Copy link

I have a velux KLF200 and trying to indentify its IP by it's mdns name. However using your example code does find the name of the device, but get_service_info always runs into timout.

I aready increased the timeout to 60000, but still I only return None due to timeout. The strange thing is, when using in parallel Bonjour browser get_service_info immediately returns correct information.

@jstasiak
Copy link
Collaborator

Please provide the OS you're running, Python version, python-zeroconf version the code you run and the logging output with the debug logging enabled.

@pawlizio
Copy link
Author

pawlizio commented Aug 12, 2020

Basically I'm running zeroconf via Home Assistant in my case on a Raspian OS with docker. For debugging I used however my computer with Windows OS. I'm trying to write a zeroconf discovery for VELUX KLF200. Velux provide on their API specification on chapter "15 Appendix 3: Identifying IP address of a KLF200 device using mDNS protocol" a way to discover KLF200 on the network. https://velcdn.azureedge.net/~/media/com/api/klf200/technical%20specification%20for%20klf%20200%20api-ver3-18.pdf

Trying this with Bonjour Browser also discovers this device, but using your zeroconf implementation only discovers the service but does not provide the service_info.

I forked your latest zeroconf and run it on my computer with the following modified code from your README:

from zeroconf import ServiceBrowser, Zeroconf, ServiceInfo
import logging

fh = logging.FileHandler("zeroconf.log")
fh.setLevel(logging.DEBUG)
ch = logging.FileHandler("zeroconf.log")
ch.setLevel(logging.WARNING)
nh = logging.StreamHandler()
nh.setLevel(logging.WARNING)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fh.setFormatter(formatter)
ch.setFormatter(formatter)
log = logging.getLogger("zeroconf")
log.setLevel(logging.DEBUG)
log.addHandler(fh)
log.addHandler(ch)
log.addHandler(nh)


class MyListener:
    
    def remove_service(self, zeroconf, type, name):
        print("Service %s removed" % (name,))

    def add_service(self, zeroconf, type, name):
        log.warning("Added service: %s", name)
        if name.startswith("VELUX_KLF_LAN"):
            log.warning("Found VELUX_KLF_LAN device")
            info = zeroconf.get_service_info(type, name, timeout=3000)
            log.warning("Got service info for %s, service info: %s" % (name, info))
            info = zeroconf.get_service_info(type, name, timeout=30000)
            log.warning("Service %s added by using Bonjour Browser in parallel, service info: %s" % (name, info))

    def update_service(self, zeroconf, type, name):
        log.warning("Got update for service: %s", name)
        if name.startswith("VELUX_KLF_LAN"):
            log.warning("Update available for VELUX_KLF_LAN device")
            info = zeroconf.get_service_info(type, name, timeout=3000)
            log.warning("Service %s update is available with the following info: %s"% (name, info))
            info = zeroconf.get_service_info(type, name, timeout=30000)
            log.warning("Service %s updated by using Bonjour Browser in parallel, service info: %s" % (name, info))

zeroconf = Zeroconf()
listener = MyListener()
browser = ServiceBrowser(zeroconf, "_http._tcp.local.", listener)
try:
    input("Press enter to exit...\n\n")
finally:
    zeroconf.close()

Attached is the log output. In the first 3 seconds you can see that none is returned as service info. I already tried to increase but without success. Now I added a second try where I increased the timing in order to open Bonjour Browser in parallel. Now you can see that also zeroconf provides a service info.
zeroconf.log

@emontnemery
Copy link
Collaborator

emontnemery commented Aug 13, 2020

If I read the log correctly, this is the question from zeroconf intended to get the service info.
We receive the question from ourselves on all three interfaces, and also a partial answer (onlysrv and txt is answered) from the velux device:

2020-08-13 00:25:00,067 - zeroconf - DEBUG - Sending (71 bytes #1) <DNSOutgoing:{multicast=True, flags=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[quada,in,VELUX_KLF_LAN_5FCC._http._tcp.local.]], answers=[], authorities=[], additionals=[]}> as b'\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\xc0\x0c\x00\x01\x00\x01\xc0\x0c\x00\x1c\x00\x01'...
2020-08-13 00:25:00,068 - zeroconf - DEBUG - Received from '172.18.144.1':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=4, n_ans=0, n_auth=0, n_add=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[quada,in,VELUX_KLF_LAN_5FCC._http._tcp.local.]], answers=[]}> (71 bytes) as [b'\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\xc0\x0c\x00\x01\x00\x01\xc0\x0c\x00\x1c\x00\x01']
2020-08-13 00:25:00,068 - zeroconf - DEBUG - Received from '192.168.178.49':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=4, n_ans=0, n_auth=0, n_add=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[quada,in,VELUX_KLF_LAN_5FCC._http._tcp.local.]], answers=[]}> (71 bytes) as [b'\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\xc0\x0c\x00\x01\x00\x01\xc0\x0c\x00\x1c\x00\x01']
2020-08-13 00:25:00,068 - zeroconf - DEBUG - Received from '127.0.0.1':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=4, n_ans=0, n_auth=0, n_add=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[quada,in,VELUX_KLF_LAN_5FCC._http._tcp.local.]], answers=[]}> (71 bytes) as [b'\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\xc0\x0c\x00\x01\x00\x01\xc0\x0c\x00\x1c\x00\x01']
2020-08-13 00:25:00,071 - zeroconf - DEBUG - Received from '192.168.178.76':5353 (socket 668): <DNSIncoming:{id=0, flags=33792, n_q=0, n_ans=2, n_auth=0, n_add=0, questions=[], answers=[record[srv,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=120/119,VELUX_KLF_LAN_5FCC.local.:80, record[txt,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=120/119,b'\x1dpath=V'...]}> (168 bytes) as [b'\x00\x00\x84\x00\x00\x00\x00\x02\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x80\x01\x00\x00\x00x\x00 \x00\x00\x00\x00\x00P\x12VELUX_KLF_LAN_5FCC\x05local\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00\x10\x80\x01\x00\x00\x00x\x00\x1e\x1dpath=VELUX_KLF_LAN_5FCC.local']
2020-08-13 00:25:00,270 - zeroconf - DEBUG - Sending (152 bytes #1) <DNSOutgoing:{multicast=True, flags=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC.local.], question[quada,in,VELUX_KLF_LAN_5FCC.local.]], answers=[(record[srv,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=120/119,VELUX_KLF_LAN_5FCC.local.:80, 1597271100270.5242), (record[txt,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=120/119,b'\x1dpath=V'..., 1597271100270.5242)], authorities=[], additionals=[]}> as b'\x00\x00\x00\x00\x00\x04\x00\x02\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\x12VELUX_KLF_LAN_5FCC\xc0*\x00\x01\x00\x01\xc0;\x00\x1c\x00\x01\xc0\x0c\x00!\x80\x01\x00\x00\x00w\x00\x08\x00\x00\x00\x00\x00P\xc0;\xc0\x0c\x00\x10\x80\x01\x00\x00\x00w\x00\x1e\x1dpath=VELUX_KLF_LAN_5FCC.local'...
2020-08-13 00:25:00,271 - zeroconf - DEBUG - Received from '172.18.144.1':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=4, n_ans=2, n_auth=0, n_add=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC.local.], question[quada,in,VELUX_KLF_LAN_5FCC.local.]], answers=[record[srv,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=119/119,VELUX_KLF_LAN_5FCC.local.:80, record[txt,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=119/119,b'\x1dpath=V'...]}> (152 bytes) as [b'\x00\x00\x00\x00\x00\x04\x00\x02\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\x12VELUX_KLF_LAN_5FCC\xc0*\x00\x01\x00\x01\xc0;\x00\x1c\x00\x01\xc0\x0c\x00!\x80\x01\x00\x00\x00w\x00\x08\x00\x00\x00\x00\x00P\xc0;\xc0\x0c\x00\x10\x80\x01\x00\x00\x00w\x00\x1e\x1dpath=VELUX_KLF_LAN_5FCC.local']
2020-08-13 00:25:00,271 - zeroconf - DEBUG - Received from '192.168.178.49':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=4, n_ans=2, n_auth=0, n_add=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC.local.], question[quada,in,VELUX_KLF_LAN_5FCC.local.]], answers=[record[srv,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=119/119,VELUX_KLF_LAN_5FCC.local.:80, record[txt,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=119/119,b'\x1dpath=V'...]}> (152 bytes) as [b'\x00\x00\x00\x00\x00\x04\x00\x02\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\x12VELUX_KLF_LAN_5FCC\xc0*\x00\x01\x00\x01\xc0;\x00\x1c\x00\x01\xc0\x0c\x00!\x80\x01\x00\x00\x00w\x00\x08\x00\x00\x00\x00\x00P\xc0;\xc0\x0c\x00\x10\x80\x01\x00\x00\x00w\x00\x1e\x1dpath=VELUX_KLF_LAN_5FCC.local']
2020-08-13 00:25:00,271 - zeroconf - DEBUG - Received from '127.0.0.1':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=4, n_ans=2, n_auth=0, n_add=0, questions=[question[srv,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[txt,in,VELUX_KLF_LAN_5FCC._http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC.local.], question[quada,in,VELUX_KLF_LAN_5FCC.local.]], answers=[record[srv,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=119/119,VELUX_KLF_LAN_5FCC.local.:80, record[txt,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=119/119,b'\x1dpath=V'...]}> (152 bytes) as [b'\x00\x00\x00\x00\x00\x04\x00\x02\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x00\x01\xc0\x0c\x00\x10\x00\x01\x12VELUX_KLF_LAN_5FCC\xc0*\x00\x01\x00\x01\xc0;\x00\x1c\x00\x01\xc0\x0c\x00!\x80\x01\x00\x00\x00w\x00\x08\x00\x00\x00\x00\x00P\xc0;\xc0\x0c\x00\x10\x80\x01\x00\x00\x00w\x00\x1e\x1dpath=VELUX_KLF_LAN_5FCC.local']
2020-08-13 00:25:00,286 - zeroconf - DEBUG - Received from '192.168.178.76':5353 (socket 668): <DNSIncoming:{id=0, flags=33792, n_q=0, n_ans=2, n_auth=0, n_add=0, questions=[], answers=[record[srv,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=120/120,VELUX_KLF_LAN_5FCC.local.:80, record[txt,in-unique,VELUX_KLF_LAN_5FCC._http._tcp.local.]=120/120,b'\x1dpath=V'...]}> (168 bytes) as [b'\x00\x00\x84\x00\x00\x00\x00\x02\x00\x00\x00\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00!\x80\x01\x00\x00\x00x\x00 \x00\x00\x00\x00\x00P\x12VELUX_KLF_LAN_5FCC\x05local\x00\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x00\x10\x80\x01\x00\x00\x00x\x00\x1e\x1dpath=VELUX_KLF_LAN_5FCC.local']

I assume these are the questions from Bonjour, received on both interfaces:

2020-08-13 00:25:10,968 - zeroconf - DEBUG - Received from '192.168.178.49':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=2, n_ans=4, n_auth=0, n_add=0, questions=[question[ptr,in,_http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC.local.]], answers=[record[ptr,in,_http._tcp.local.]=4490/4490,DaumlechNAS._http._tcp.local., record[ptr,in,_http._tcp.local.]=4490/4490,Kyocera ECOSYS M5521cdw._http._tcp.local., record[ptr,in,_http._tcp.local.]=120/120,VELUX_KLF_WIFI_3725._http._tcp.local., record[ptr,in,_http._tcp.local.]=120/120,VELUX_KLF_LAN_5FCC._http._tcp.local.]}> (190 bytes) as [b'\x00\x00\x00\x00\x00\x02\x00\x04\x00\x00\x00\x00\x05_http\x04_tcp\x05local\x00\x00\x0c\x00\x01\x12VELUX_KLF_LAN_5FCC\xc0\x17\x00\x01\x00\x01\xc0\x0c\x00\x0c\x00\x01\x00\x00\x11\x8a\x00\x0e\x0bDaumlechNAS\xc0\x0c\xc0\x0c\x00\x0c\x00\x01\x00\x00\x11\x8a\x00\x1a\x17Kyocera ECOSYS M5521cdw\xc0\x0c\xc0\x0c\x00\x0c\x00\x01\x00\x00\x00x\x00\x16\x13VELUX_KLF_WIFI_3725\xc0\x0c\xc0\x0c\x00\x0c\x00\x01\x00\x00\x00x\x00\x15\x12VELUX_KLF_LAN_5FCC\xc0\x0c']
2020-08-13 00:25:10,968 - zeroconf - DEBUG - Received from '172.18.144.1':5353 (socket 668): <DNSIncoming:{id=0, flags=0, n_q=2, n_ans=0, n_auth=0, n_add=0, questions=[question[ptr,in,_http._tcp.local.], question[a,in,VELUX_KLF_LAN_5FCC.local.]], answers=[]}> (59 bytes) as [b'\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x05_http\x04_tcp\x05local\x00\x00\x0c\x00\x01\x12VELUX_KLF_LAN_5FCC\xc0\x17\x00\x01\x00\x01']

With answer:

2020-08-13 00:25:10,972 - zeroconf - DEBUG - Received from '192.168.178.76':5353 (socket 668): <DNSIncoming:{id=0, flags=33792, n_q=0, n_ans=3, n_auth=0, n_add=0, questions=[], answers=[record[ptr,in,_http._tcp.local.]=120/120,VELUX_KLF_WIFI_3725._http._tcp.local., record[ptr,in,_http._tcp.local.]=120/120,VELUX_KLF_LAN_5FCC._http._tcp.local., record[a,in-unique,VELUX_KLF_LAN_5FCC.local.]=120/120,192.168.178.76]}> (183 bytes) as [b'\x00\x00\x84\x00\x00\x00\x00\x03\x00\x00\x00\x00\x05_http\x04_tcp\x05local\x00\x00\x0c\x00\x01\x00\x00\x00x\x00&\x13VELUX_KLF_WIFI_3725\x05_http\x04_tcp\x05local\x00\x05_http\x04_tcp\x05local\x00\x00\x0c\x00\x01\x00\x00\x00x\x00%\x12VELUX_KLF_LAN_5FCC\x05_http\x04_tcp\x05local\x00\x12VELUX_KLF_LAN_5FCC\x05local\x00\x00\x01\x80\x01\x00\x00\x00x\x00\x04\xc0\xa8\xb2L']
2020-08-13 00:25:10,972 - zeroconf - WARNING - Service VELUX_KLF_LAN_5FCC._http._tcp.local. added by using Bonjour Browser in parallel, service info: ServiceInfo(type='_http._tcp.local.', name='VELUX_KLF_LAN_5FCC._http._tcp.local.', addresses=[b'\xc0\xa8\xb2L'], port=80, weight=0, priority=0, server='VELUX_KLF_LAN_5FCC.local.', properties={b'path': b'VELUX_KLF_LAN_5FCC.local'})

Bonjour asks for ptr and a, whereas zeroconf asks for srv, txt, a and quada.
Maybe you can try to restrict zeroconf questions to a, just for testing:

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    #out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    #out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)

                    #out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    #out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)

                    if self.server is not None:
                        out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        #out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        #out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        #out.add_answer_at_time(
                        #    zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        #)

One thing which seems a bit odd is that zeroconf opens two sets of sockets, and this is only done in the ZeroConf constructor which indicates a 2nd ZeroConf instance being created.
Are you running the example above directly from the command line or from within Home Assistant somehow?
Update the log format to include the thread info also, something like: "%(asctime)s %(levelname)s (%(threadName)s) [%(name)s] %(message)s"

2020-08-13 00:24:59,851 - zeroconf - DEBUG - Created socket <socket.socket fd=668, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('0.0.0.0', 5353)>
2020-08-13 00:24:59,856 - zeroconf - DEBUG - Created socket <socket.socket fd=760, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('172.18.144.1', 5353)>
2020-08-13 00:24:59,857 - zeroconf - DEBUG - Created socket <socket.socket fd=608, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('192.168.178.49', 5353)>
2020-08-13 00:24:59,857 - zeroconf - DEBUG - Created socket <socket.socket fd=604, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('127.0.0.1', 5353)>
2020-08-13 00:25:00,409 - zeroconf - DEBUG - Created socket <socket.socket fd=748, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('0.0.0.0', 5353)>
2020-08-13 00:25:00,416 - zeroconf - DEBUG - Created socket <socket.socket fd=1016, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('127.0.0.1', 5353)>
2020-08-13 00:25:00,417 - zeroconf - DEBUG - Created socket <socket.socket fd=1020, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('192.168.178.49', 5353)>
2020-08-13 00:25:00,418 - zeroconf - DEBUG - Created socket <socket.socket fd=636, family=AddressFamily.AF_INET, type=SocketKind.SOCK_DGRAM, proto=0, laddr=('172.18.144.1', 5353)>

@pawlizio
Copy link
Author

No, I always disconnected Home Assistant from the Network when performing this test. Attached is the modified version. I used visual studio code to run the code, but I also included one log done by calling python zc_test.py from Windows PowerShell.
zeroconf.log
zeroconf_from_PowerShell.log

@pawlizio
Copy link
Author

My problem is also that home assistant zeroconf component does not trigger other components if service_info is None in order to prevent the browser thread from collapsing. Otherwise I could use the name of the service to get the IP via socket.gethostbyname(hostname) where hostname is the service name without the last dot.

https://github.com/home-assistant/core/blob/ee64aafc3932ea0a7a76a33d1827db0c78fc0ed3/homeassistant/components/zeroconf/__init__.py#L215-L220

@emontnemery
Copy link
Collaborator

emontnemery commented Aug 13, 2020

Edit: Never mind what I wrote previously, I didn't realize the logs you uploaded include several iterations of running your test program. Maybe you could add something to the log when the program starts so it's possible to see where an iteration starts?

Can you try to modify like this instead:

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    #out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)

                    #out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    #out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)

                    if self.server is not None:
                        out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        #out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        #out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        #out.add_answer_at_time(
                        #    zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        #)

prevent the browser thread from collapsing

That comments just means that there would be exceptions thrown if trying to use the returned None as a service info object.

@pawlizio
Copy link
Author

This modification now works. I got immediately the service info. What does this means now?

zeroconf.log

@emontnemery
Copy link
Collaborator

emontnemery commented Aug 14, 2020

The difference between Avahi and zeroconf was that Avahi asked fewer questions when trying to get the service info.
The modification removed some questions when trying to get the service info.

Since it's now working, I think it means the Velux device has a broken mDNS implementation.

Can you try this also:

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)

                    out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)

                    if self.server is not None:
                        out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        #out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        #out.add_answer_at_time(
                        #    zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        #)

@pawlizio
Copy link
Author

pawlizio commented Aug 14, 2020

I checked your proposal, but it fails, I got no service info. Therefore I tried to identify which question causes this problem and figured out that as soon as I include the "_TYPE_TXT" this issue occurs. All other questions are OK. Following example provides the service info:

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)

                    # out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)

                    if self.server is not None:
                        out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        out.add_answer_at_time(
                            zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        )

zeroconf_with_your_last_configuration.log

zeroconf_with_my_above_configuration.log

Also I removed now the MyListener class as follows:

class MyListener:
    
    def remove_service(self, zeroconf, type, name):
        print("Service %s removed" % (name,))

    def add_service(self, zeroconf, type, name):
        log.warning("Added service: %s", name)
        if name.startswith("VELUX_KLF_LAN"):
            log.warning("Found VELUX_KLF_LAN device")
            info = zeroconf.get_service_info(type, name, timeout=3000)
            log.warning("Got service info for %s, service info: %s" % (name, info))
            # info = zeroconf.get_service_info(type, name, timeout=30000)
            # log.warning("Service %s added by using Bonjour Browser in parallel, service info: %s" % (name, info))

    def update_service(self, zeroconf, type, name):
        log.warning("Got update for service: %s", name)
        if name.startswith("VELUX_KLF_LAN"):
            log.warning("Update available for VELUX_KLF_LAN device")
            info = zeroconf.get_service_info(type, name, timeout=3000)
            log.warning("Service %s update is available with the following info: %s"% (name, info))
            # info = zeroconf.get_service_info(type, name, timeout=30000)
            # log.warning("Service %s updated by using Bonjour Browser in parallel, service info: %s" % (name, info))

@emontnemery
Copy link
Collaborator

OK, so it seems the device only sends back an answer to the TXT question when there are other questions as well.
Maybe the resulting message becomes too large?

@jstasiak Could/should we avoid asking all the questions every time if we have recent answers?
Something like:

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    if not zc.cache.has_recent(self.name, _TYPE_SRV, _CLASS_IN):
                        out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)

                    if not zc.cache.has_recent(self.name, _TYPE_TXT, _CLASS_IN):
                        out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)

                    if self.server is not None:
                        if not zc.cache.has_recent(self.name, _TYPE_A, _CLASS_IN):
                            out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                            out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        if not zc.cache.has_recent(self.name, _TYPE_AAAA, _CLASS_IN):
                            out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                            out.add_answer_at_time(
                                zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                            )

@jstasiak
Copy link
Collaborator

I have no immediate opinion on this but it's worth exploring.

@pawlizio
Copy link
Author

OK, I think the main reason zeroconf does not get an IP is that Velux KLF200 only responses with maximum 2 records, so if I change the sequence as below, I receive a service info but never see a TXT record. I don't know if this is a performance issue. The first response only includes one srv-record , while the second has additionally an a-record.

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)
                    if self.server is not None:
                        out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        out.add_answer_at_time(
                            zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        )
                    out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)
                    zc.send(out)
                    next_ = now + delay
                    delay *= 2

zeroconf_with_changed_sequence.log

Also that one works, but KLF200 never gives a record for _TYPE_AAAA

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)
                    out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)
                    zc.send(out)
                    if self.server is not None:
                        out2 = DNSOutgoing(_FLAGS_QR_QUERY)
                        out2.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        out2.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        out2.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        out2.add_answer_at_time(
                            zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        )
                        zc.send(out2)

zeroconf_2_questions_per_frame.log

For testing I also seperated all Questions:

                    out1 = DNSOutgoing(_FLAGS_QR_QUERY)
                    out1.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                    out1.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)
                    zc.send(out1)
                    out2 = DNSOutgoing(_FLAGS_QR_QUERY)
                    out2.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                    out2.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)
                    zc.send(out2)
                    if self.server is not None:
                        out3 = DNSOutgoing(_FLAGS_QR_QUERY)
                        out3.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                        out3.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        zc.send(out3)
                        out4 = DNSOutgoing(_FLAGS_QR_QUERY)
                        out4.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                        out4.add_answer_at_time(
                            zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                        )
                        zc.send(out4)

zeroconf_all_questions_seperated.log

@jstasiak
Copy link
Collaborator

That's... interesting. I'm wondering what the right workaround for this on our side could be.

@pawlizio
Copy link
Author

Is there a possibility to skip questions in a message which are already answered?

@emontnemery
Copy link
Collaborator

@jstasiak not asking questions we already know would work, right?

@jstasiak
Copy link
Collaborator

Oh yeah, I would think so, yes.

@pawlizio
Copy link
Author

pawlizio commented Aug 14, 2020

OK, any proposals how to implement?
Is this a way?

                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    if not zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN):
                        out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN), now)
                    if not zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN):
                        out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                        out.add_answer_at_time(zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN), now)

                    if self.server is not None:
                        if not zc.cache.get_by_details(self.name, _TYPE_A, _CLASS_IN):
                            out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                            out.add_answer_at_time(zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN), now)
                        if not zc.cache.get_by_details(self.name, _TYPE_AAAA, _CLASS_IN):
                            out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                            out.add_answer_at_time(
                                zc.cache.get_by_details(self.server, _TYPE_AAAA, _CLASS_IN), now
                            )

At least it provides reasonable results:
zeroconf.log

@emontnemery
Copy link
Collaborator

@pawlizio yeah, like that, but instead of just checking if the answer is in the cache or not maybe also add a check for age if it is cached.

@pawlizio
Copy link
Author

I'm a pretty inexperienced with all this stuff, so it's quite hard to understand the details. How to check the age and what would be an acceptable value?

@pawlizio
Copy link
Author

pawlizio commented Aug 14, 2020

Maybe something like this?

    def request(self, zc: 'Zeroconf', timeout: float) -> bool:
        """Returns true if the service could be discovered on the
        network, and updates this object with details discovered.
        """
        now = current_time_millis()
        delay = _LISTENER_TIME
        next_ = now + delay
        first = next_
        last = now + timeout

        record_types_for_check_cache = [(_TYPE_SRV, _CLASS_IN), (_TYPE_TXT, _CLASS_IN)]
        if self.server is not None:
            record_types_for_check_cache.append((_TYPE_A, _CLASS_IN))
            record_types_for_check_cache.append((_TYPE_AAAA, _CLASS_IN))
        for record_type in record_types_for_check_cache:
            cached = zc.cache.get_by_details(self.name, *record_type)
            if cached:
                self.update_record(zc, now, cached)

        if self.server is not None and self.text is not None and self._addresses:
            return True

        try:
            zc.add_listener(self, DNSQuestion(self.name, _TYPE_ANY, _CLASS_IN))
            while self.server is None or self.text is None or not self._addresses:
                if last <= now:
                    return False
                if next_ <= now:
                    out = DNSOutgoing(_FLAGS_QR_QUERY)
                    cached_entry = zc.cache.get_by_details(self.name, _TYPE_SRV, _CLASS_IN)
                    if not cached_entry or cached_entry.created < first:
                        out.add_question(DNSQuestion(self.name, _TYPE_SRV, _CLASS_IN)) 
                        out.add_answer_at_time(cached_entry, now)
                    cached_entry = zc.cache.get_by_details(self.name, _TYPE_TXT, _CLASS_IN)
                    if not cached_entry or cached_entry.created < first:
                        out.add_question(DNSQuestion(self.name, _TYPE_TXT, _CLASS_IN))
                        out.add_answer_at_time(cached_entry, now)

                    if self.server is not None:
                        cached_entry = zc.cache.get_by_details(self.server, _TYPE_A, _CLASS_IN)
                        if not cached_entry or cached_entry.created < first:
                            out.add_question(DNSQuestion(self.server, _TYPE_A, _CLASS_IN))
                            out.add_answer_at_time(cached_entry, now)
                        cached_entry = zc.cache.get_by_details(self.name, _TYPE_AAAA, _CLASS_IN)
                        if not cached_entry or cached_entry.created < first:
                            out.add_question(DNSQuestion(self.server, _TYPE_AAAA, _CLASS_IN))
                            out.add_answer_at_time(cached_entry, now)
                    zc.send(out)
                    next_ = now + delay
                    delay *= 2

                zc.wait(min(next_, last) - now)
                now = current_time_millis()
        finally:
            zc.remove_listener(self)

        return True

zeroconf.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants