Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ServiceEvent::ServiceRemoved doesn't work #54

Open
WilliamVenner opened this issue Aug 9, 2022 · 11 comments
Open

ServiceEvent::ServiceRemoved doesn't work #54

WilliamVenner opened this issue Aug 9, 2022 · 11 comments

Comments

@WilliamVenner
Copy link

Hi,

I noticed that ServiceEvent::ServiceRemoved never fires. I tried shutting down the service daemon, I tried stop_browse on the service daemon, but the other computer never sees that my computer stopped listening/browsing/broadcasting.

Both computers are Windows.

Am I understanding correctly this event should fire when a device goes away?

@keepsimple1
Copy link
Owner

Can you please try to call unregister ? If it didn't work, please let me know.

@WilliamVenner
Copy link
Author

Hi, that does work!

Should the event be firing when a peer exits ungracefully though?

I want to show to the user on the UI that a peer has gone away, even if ungracefully.

@WilliamVenner
Copy link
Author

WilliamVenner commented Aug 10, 2022

While we're here... is there any way to uniquely identify a service/peer? What's stopping me advertising as a fake device or something? Am I misunderstanding the mDNS protocol, or surely is there a way to get the responder's IP address (and not the one they specify in the advertisement, but the actual IP address from the socket)?

@keepsimple1
Copy link
Owner

keepsimple1 commented Aug 11, 2022

I don't think our current implementation detects peer going away ungracefully, even though we use defined TTL values. I can look into this a bit.

I'm not sure about the identity of a service/peer other than it's endpoint (instance name, IP:port), but for mDNS, there is spec for unique resource records identified by its name, rrtype, and rrclass. A resource record can be unique or shared. See this RFC section. Currently we haven't implemented the Conflict Resolution mentioned in the RFC yet.

Regarding the IP address, we distinguish between the responder's IP and the announced service IP. The mDNS protocol specifies the service IP via A (for IPv4) records. Currently we don't log / keep the responder's IP, one reason being that socket2 0.4 branch does not have a safe recv_from API. We could make an effort to get the responder IP, but in the general mDNS sense, it is not useful. At least, I'm not aware of use cases.

@WilliamVenner
Copy link
Author

I don't think our current implementation detects peer going away ungracefully, even though we use defined TTL values. I can look into this a bit.

For my application, I want a list of discovered devices, which removes devices as they go away, just like a Wi-Fi list or something. It would definitely be nice UX. The other issue with not being able to detect when a device goes away is that, if a router gives the IP address to a different device, and then it tries to broadcast itself with mdns-sd, it doesn't seem to... update that state? It won't actually post a ServiceFound event because it thinks the device has already spoken to it, when in fact it's a different device, just with the same IP address. (Honestly that might be a bug in my code, but it would be worth observing)

Regarding the IP address, we distinguish between the responder's IP and the announced service IP.

I think it's important for security to not trust the advertised IP address(es) at all, at least for my application. In a worst case scenario, my application can grant an attacker (virtually) RAT access to the machine if they pull some dirty tricks with the advertised IP address. Maybe I'm being paranoid, but it definitely would be nice to have the actual socket IP address of the advertising service.

@keepsimple1
Copy link
Owner

For my application, I want a list of discovered devices, which removes devices as they go away, just like a Wi-Fi list or something. It would definitely be nice UX.

How quickly would you like to detect the devices going away in this use case?

@WilliamVenner
Copy link
Author

10 seconds would probably make sense from a UI perspective. It would be nice if that was configurable, of course.

@dalesmith3
Copy link

The RFCs do have some specs on TTL values. Let's see. https://datatracker.ietf.org/doc/html/rfc6762#section-10

As a general rule, the recommended TTL value for Multicast DNS
resource records with a host name as the resource record's name
(e.g., A, AAAA, HINFO) or a host name contained within the resource
record's rdata (e.g., SRV, reverse mapping PTR record) SHOULD be 120
seconds.

The recommended TTL value for other Multicast DNS resource records is
75 minutes.

2 minutes sure seems like a long time. (and 75 even more so!), so it's always best to send out those "going away" messages (can't remember what they are called right now).

@dalepsmith
Copy link
Contributor

Doh! Sorry about repeated info on the TTL values. Have to read more carefully.

@keepsimple1
Copy link
Owner

No worries. Btw, the same spec also talks about the difficulty / trade-off of detecting stale service / rdata due to the long TTL:

If Multicast DNS resource records follow the recommendation and have a TTL of 75
   minutes, that means that stale data could persist in the system for a
   little over an hour.  Making the default RR TTL significantly lower
   would reduce the lifetime of stale data, but would produce too much
   extra traffic on the network.  Various techniques are available to
   minimize the impact of such stale data, outlined in the five
   subsections below.

In addition to the "goodbye" going away packet, the following method seems interesting to me too:

https://datatracker.ietf.org/doc/html/rfc6762#section-10.4

Sometimes a cache record can be determined to be stale when a client
   attempts to use the rdata it contains, and the client finds that
   rdata to be incorrect.
<snip>
 The software implementing the Multicast DNS resource record cache
   should provide a mechanism so that clients detecting stale rdata can
   inform the cache.

We can probably look into adding support for such technique.

@keepsimple1
Copy link
Owner

keepsimple1 commented May 6, 2023

https://datatracker.ietf.org/doc/html/rfc6762#section-10.4
Cache Flush on Failure Indication

 The software implementing the Multicast DNS resource record cache
   should provide a mechanism so that clients detecting stale rdata can
   inform the cache.

   When the cache receives this hint that it should reconfirm some
   record, it MUST issue two or more queries for the resource record in
   dispute.  If no response is received within ten seconds, then, even
   though its TTL may indicate that it is not yet due to expire, that
   record SHOULD be promptly flushed from the cache.

Maybe we can support a new method in ServiceDaemon that allows the client ask the daemon to verify / check a given SRV data, say daemon.verify(full_service_instance_name) .

Such an approach lets the user (i.e. the application) to decide the timing of the checks, which avoids a hard-coded / rigid time setting inside this library. And such API does not add any overhead for other applications that don't need to such periodical checks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants