
core/wide-area: fix for CVE-2024-52615#662

Merged
pemensik merged 2 commits into avahi:master from msekletar:cve-2024-52615
Jun 19, 2025

Conversation

@msekletar
Collaborator

No description provided.

@msekletar msekletar marked this pull request as draft November 27, 2024 17:15
@msekletar
Collaborator Author

The idea behind the changes is straightforward: to avoid reusing the same source port, we open a new socket every time we send packets to DNS servers.

I've run some basic tests using avahi-resolve, so now I am submitting this here to get feedback from others as well as CI results.

Btw, maybe there is a better way to go about this, or we could remove the wide-area feature completely, as I think it is very seldom used.

@siteshwar
Contributor

There is 1 new finding reported by OpenScanHub: https://openscanhub.fedoraproject.org/task/26177/log/added.html.

It does not look like a bug at first glance, but you may want to take a look at it. Thanks!

@evverx
Collaborator

evverx commented Nov 28, 2024

we can remove wide area feature completely

I don't think it should be removed. It's part of https://www.rfc-editor.org/rfc/rfc6763 and, for example, https://www.rfc-editor.org/rfc/rfc8766.html specifies a network proxy that uses Multicast DNS to automatically populate the wide-area unicast Domain Name System namespace, which covers real-world use cases. That being said, in its current form it should certainly be off by default, the documentation should probably say that it's experimental (or something like that), and avahi should talk to a local resolver to offload all the DNS work onto it. Even that doesn't fix all the weird interactions with mDNS, though.

Ideally it should be made to work properly and be covered with tests (currently it isn't tested at all upstream). As far as I understand, @pemensik was interested in that, but given that it isn't trivial I wouldn't expect it to be fixed anytime soon, so I'd wait for @pemensik here. Personally I think that, with its limited resources, avahi should focus on mDNS first.

It does not look like a bug at first glance, but you may want to take a look at it.

I think there are places in the avahi codebase where this pattern is used (historically); that snippet came from the existing code and replaced the same warning, which is marked as fixed in https://openscanhub.fedoraproject.org/task/26177/log/fixed.html.

@msekletar msekletar marked this pull request as ready for review June 3, 2025 16:06
@msekletar
Collaborator Author

Actually, I've received a request for a fix for this CVE (in the RHEL context), so it would be good to move forward with the upstream inclusion as well.

@msekletar
Collaborator Author

@pemensik PTAL

Member

@pemensik pemensik left a comment


So far I found just a small issue with an extra close() call.

The change gets the randomization working: each query has a random source port and waits for the response on it. It is not ideal, since with many pending queries each one requires a separate socket. But since this is usually used by localhost clients only, I think it is sufficient for now.

@pemensik
Member

pemensik commented Jun 4, 2025

Ideally there should be a limited bank of random descriptors used by the process, which would start reusing an already open random socket once the defined maximum of random sockets is reached. It could be 100-200 sockets. We could store an array on AvahiWideAreaLookupEngine, from which each lookup would borrow an fd, watch and usage-counted structure. Something like:

struct AvahiSharedSocket {
    int fd;              // initialize to -EBADF
    int usage;           // initialize to 1 on fd assignment, increment on reuse
    AvahiWatch *watch;   // the watch is created per-socket, not per lookup anyway
    AvahiAddress server; // need to reuse only the same target?
};

struct AvahiWideAreaLookupEngine {
    // ...
    size_t last_reused_socket;
    AvahiSharedSocket random_sockets[AVAHI_WIDE_AREA_RANDSOCK_MAX];
};

But this would get complicated by minor details if sockets are to be reused properly. I propose to avoid it unless proven necessary. Clients doing requests are likely on localhost and somewhat trusted.

handle_packet and socket_event seem somewhat prepared for sharing multiple lookups on a single socket. After all, they originally shared one socket per address family.

@pemensik
Member

pemensik commented Jun 4, 2025

Experiments with strace show it takes quite a long time until those random sockets are closed, unless another query triggers cleanup of previous (dead) lookups. It seems to wait over 300 seconds to close those sockets. Especially when handle_packet calls add_to_cache(), a positive answer was received and no other should arrive (or be processed). It may make sense to close the socket right away for faster reuse, as in the send_to_dns_server() function. Unlike multicast queries, DNS queries should receive at most one response, and handle_packet should ensure that at most one callback is emitted per lookup. Possible network duplicates would still be received but would fail to find a non-dead lookup. Keeping the socket open avoids ICMP errors being sent back if it is closed too early, but currently those sockets stay open far too long.

But the management of live lookups is quite non-obvious, and it is not clear why it works the way it currently does. I guess keeping the socket open longer than needed is the safer option for now.

Ideally the socket should be closed 10-200 milliseconds after an answer with a known lookup id was received on it. But changing that here is risky. lookup_stop might just change the callback to NULL the first time and set l->time_event to some short timeout. Then sender_timeout_callback could detect l->callback == NULL, close the socket, and also free time_event.

@pemensik
Member

pemensik commented Jun 4, 2025

Hmm, maybe limiting the total number of outgoing random sockets is more important than I thought at first. Each one uses not only a file descriptor of the process but also binds to a locally unique UDP port. It may prevent other programs running on the same host from making their own queries, for example the system DNS resolver. The limit of ~32k ephemeral ports is system-wide, not a per-process limitation.

@msekletar
Collaborator Author

Actually, the number of UDP sockets used is implicitly limited by the number of open file descriptors, which by default (soft limit) is 1024.

@msekletar
Collaborator Author

@pemensik What do you think about my latest comment?

@pemensik
Member

Okay, I made a mistake when checking the default service file descriptors, though they are usually unlimited.

$ systemctl show avahi-daemon | grep NOFILE
LimitNOFILE=524288
LimitNOFILESoft=1024

But at least this default limit is more than enough to prevent other local services from being restricted by avahi lookups. We do not have to invent more complex logic since we already have a relatively strict limit on the descriptors used.

@pemensik
Member

I did not want to introduce a new simple limit on max lookups, but reusing an existing one is kind of okay. Yes, the existing implementation should not hit the file descriptor limit even with heavy usage of wide-area DNS.

I have not tested how it would respond to a socket creation error. It would be great to test that somehow.

@msekletar
Collaborator Author

I've tried to test it with a local named instance and many parallel avahi-resolve invocations. I was unable to reach the limit because local lookups are fast, and I started running into D-Bus related issues before exhausting the number of open FDs. After the test the number of file descriptors stabilized and returned to the "quiescent" state.

@pemensik
Member

Okay, looks good! Merging it.

@msekletar
Collaborator Author

@pemensik The Packit CI failure should be investigated, but it seems unrelated to the proposed change. Here is a Fedora Rawhide scratch build without any changes that seems to fail with the same error: https://koji.fedoraproject.org/koji/taskinfo?taskID=134148227

@pemensik pemensik requested review from pemensik and removed request for pemensik June 19, 2025 18:39
@pemensik pemensik dismissed their stale review June 19, 2025 18:44

Changes were addressed, nothing more needed

@pemensik pemensik merged commit 25911be into avahi:master Jun 19, 2025
26 of 31 checks passed
@pemensik
Member

Ah, okay, merged finally. I got confused by the UI while trying to cancel my own change request and could not find the correct button. The Rawhide failures should be addressed, but this is definitely improving the situation. Thank you!

@evverx
Collaborator

evverx commented Jun 19, 2025

Here is Fedora Rawhide scratch build w/o any changes which seems to fail with the same error

It's been failing since April (with FORTIFY_SOURCE): #699

@evverx
Collaborator

evverx commented Jan 15, 2026

I added the "merged-but-needs-fixing" label because this change introduced #810. That's a really fast way for one unprivileged user to consume all of avahi's file descriptors. In practice, in most cases resolv.conf would probably point to 127.0.0.53, so it's minor, but still.

@evverx evverx added the wide-area DNS-SD queries using unicast DNS label Jan 15, 2026