Some sockets are opened but never read. UDP socket buffer memory usage increases #171
Comments
@nicoladefranceschi I have the same issue. Could you please be more specific about how you hotfixed it? Or provide the hotfixed version of your zeroconf.py? |
I tried @nicoladefranceschi's suggestion. While the UDP receive packets issue seems to be mitigated, there is a CPU spike whenever the sockets are cleaned up, which causes other services to fail. The problem is that zeroconf seems to attach to all interfaces. A docker host has multiple interfaces, and zeroconf attaches to all of them, which leads to UDP memory continuously increasing and eventually to a crash. |
I have the same issue that is being reported: docker host saturating memory because zeroconf is attaching to all interfaces. |
I solved the problem by using macvlan on my docker container - https://blog.oddbit.com/post/2018-03-12-using-docker-macvlan-networks/ |
Thanks @kdvlr! |
I use macvlan as @kdvlr suggested, but zeroconf still binds five sockets to port 5353 while reading from only one. |
Sorry for the late reply. I haven't solved this issue either. I'm wondering whether this also happens on other installations and is simply ignored, or whether it can be solved by properly configuring zeroconf, but I doubt it. |
Bump. |
The recvq issue hit a few of my programs as well. I'd be willing to write a fix for zeroconf if the authors would be interested in and willing to merge a fix. Here's some code that should fix it for users of the zeroconf library if upstream doesn't fix the problem or until they do release a fix for this issue. For ServiceBrowsers:
For publishing services:
The
This hacky fix is released under the same license as |
@ioerror please open a PR if you have a fix in mind👍 |
Agreed, a fix would be most appreciated. |
#270 is almost ready to be merged (I think specifying IPv6 interfaces by index may be broken now but I'll verify) – I'd appreciate if someone verified that that PR (branch |
@jstasiak I'm happy to test a branch once the conflicts are resolved. I have a test application which regularly triggers this issue once my helper function (see above) is removed. |
@ioerror did you have a chance to test this? I'd like to merge and release this ASAP but I'll at least need a confirmation that this doesn't break stuff that was working before. :) |
@jstasiak the branch appears to fail in CI. Is it really ready for testing? I could ignore that and try anyway, but I thought it better to wait for CI to report that it is A-OK first. |
@ioerror it's just linting failing, the actual tests pass. |
@jstasiak I checked out https://github.com/jstasiak/python-zeroconf/tree/rebase-188 today. I'm sad to report that without my
Branch |
Thank you. This is really strange, are you absolutely sure you're running the checked out version of the library as opposed to one installed on your system in a virtualenv or globally? Because only one socket should be bound to Can you set the |
Just to test, I've removed every trace of
In one terminal, I have opened my
When I also open my
I think that means that I was wrong! Hooray! I must have somehow had a stale branch interfering with my test. If there are other tests, I'm happy to do more. I think it looks like now it is fixed, though there are two |
There should be one socket bound to |
This contains two major changes:

* Listen on data from respond_sockets in addition to listen_socket
* Do not bind respond sockets to 0.0.0.0 or ::/0

The description of the original change by Emil:

<<<
Without either of these changes, I get no replies at all when browsing for services using the browser example. I'm on a corporate network, and when connecting to a different network it works without these changes, so maybe it's something about the network configuration in this particular network that breaks the previous behavior.

Unfortunately, I have no idea how this affects other platforms, or what the changes really mean. However, it works for me and it seems reasonable to get replies back on the same socket where they are sent.
>>>

The tests pass and it's been confirmed to a reasonable degree that this doesn't break the previously working use cases.

Additionally this removes a memory leak where data sent to some of the respond sockets would not be ever read from them (#171).

Co-authored-by: Emil Styrke <emil.styrke@axis.com>
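For readers following along, here is a minimal sketch of the first change: watching the respond sockets alongside the listen socket so that nothing is left unread. It is not the code from the PR. The names `read_ready` and `demo_select`, the loopback addresses, and the 9000-byte read size are all assumptions made for illustration, using only the standard library.

```python
import select
import socket
from typing import List


def read_ready(sockets: List[socket.socket], timeout: float = 0.5) -> int:
    """Read one datagram from each socket that select() reports readable."""
    readable, _, _ = select.select(sockets, [], [], timeout)
    for sock in readable:
        sock.recvfrom(9000)  # a real handler would parse the packet here
    return len(readable)


def demo_select() -> int:
    # Hypothetical stand-ins for one listen socket and two respond sockets.
    listen_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    respond_sockets = [
        socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)
    ]
    watched = [listen_socket] + respond_sockets
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for sock in watched:
            sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
            sender.sendto(b"ping", sock.getsockname())
        handled = 0
        for _ in range(5):  # a few rounds in case delivery is not instant
            handled += read_ready(watched)
            if handled == len(watched):
                break
        return handled
    finally:
        for sock in watched + [sender]:
            sock.close()


if __name__ == "__main__":
    print(demo_select())
```

Because every socket that can receive data is passed to select(), each one gets drained, which is the property the unread respond sockets were missing.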
@ioerror When I run |
Well, this took a while but it should be fixed in 0.28.0, please reopen if I'm wrong. |
In the Zeroconf class init, some sockets:
https://github.com/jstasiak/python-zeroconf/blob/c7876108150cd251786db4ab52dadd1b2283d262/zeroconf.py#L1810
are created in both the unicast and multicast case.
But here:
https://github.com/jstasiak/python-zeroconf/blob/c7876108150cd251786db4ab52dadd1b2283d262/zeroconf.py#L1858-L1862
in case of multicast, the _respond_sockets are never read. This causes a problem with the UDP Recv queue, because the OS (in my case Ubuntu) keeps all the packets in memory waiting for the socket to read them, but this never happens and the memory keeps growing "forever"!
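To illustrate the mechanism described here, a minimal standard-library sketch (not zeroconf code; the function name `count_queued_datagrams` and the loopback setup are assumptions for illustration). A UDP socket is bound but never read while another socket sends to it; reading it afterwards shows that the kernel had been holding the datagrams the whole time.

```python
import socket


def count_queued_datagrams(count: int = 5) -> int:
    """Send datagrams to a bound UDP socket that nobody reads, then count
    how many the kernel was still holding for it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        target = receiver.getsockname()
        for _ in range(count):
            sender.sendto(b"x" * 100, target)
        # Nothing has called recvfrom() yet, so the datagrams are sitting in
        # the receiver's kernel queue. Drain it now to see what was kept.
        receiver.settimeout(0.5)
        queued = 0
        try:
            while True:
                receiver.recvfrom(2048)
                queued += 1
        except socket.timeout:
            pass  # queue is empty
        return queued
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(count_queued_datagrams())
```

While the receiver stays unread, that queued data is what shows up in the Recv-Q column of ss for the socket.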
This is my output of sudo ss -nlpu:
To solve:
I tried running these two lines
https://github.com/jstasiak/python-zeroconf/blob/c7876108150cd251786db4ab52dadd1b2283d262/zeroconf.py#L1861-L1862
even in the case of multicast.
The problem is "solved".
I'm sure that is not the right solution, and doing that in this case may actually be logically wrong.
I'm not an expert, but maybe there is a way to actually tell the OS that those sockets are not interested in listening for packets.
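One OS-level knob that does exist is the SO_RCVBUF socket option, which caps how much data the kernel will queue for a socket. The sketch below is only an illustration of that option, not something this issue establishes as the right fix for zeroconf: it does not stop packets from being delivered, it only bounds the queue. The helper names `cap_receive_buffer` and `demo_cap` and the 4096-byte request are assumptions; the kernel may round or clamp the value it actually applies.

```python
import socket
from typing import Tuple


def cap_receive_buffer(sock: socket.socket, size: int = 4096) -> int:
    """Ask the kernel for a smaller receive buffer and return the size it
    actually applied (the kernel may adjust the requested value)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def demo_cap() -> Tuple[int, int]:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        capped = cap_receive_buffer(sock)
        return default, capped
    finally:
        sock.close()


if __name__ == "__main__":
    default_size, capped_size = demo_cap()
    print("default:", default_size, "capped:", capped_size)
```

Once the smaller buffer is full, further datagrams for that socket are dropped instead of queued, so this limits memory per socket but is no substitute for reading the socket or not opening it.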