netlink socket created with pid of the process #3
Upon further investigation (https://people.redhat.com/nhorman/papers/netlink.pdf), setting the Pid to the process's own pid may be correct and actually a good thing, and the article I mentioned earlier might be wrong... Not sure, digging in...
Good question. I have read a lot of documentation about the Pid (port ID) field in the SockaddrNetlink struct, and the meaning (the usage) of this field is a little bit opaque to the netlink core. That is also what the netlink manual page (http://man7.org/linux/man-pages/man7/netlink.7.html) says about it. I had also read this paper (netlink.pdf), more precisely this part:
But the netlink manual says:
So, as you can see, the answer is not so simple. And until your request, the project had no need to subscribe to the netlink channel multiple times. Best
Thanks for the reply. Yeah, I see the answer may be complicated, so I think it's wise not to rush any changes in this area; perhaps it is OK the way it is now. This problem currently only affects our tests. I have no use case for multiple netlink sockets, it just turned out to be hard to prevent them in the tests. I'm sure I can tweak our test code; I will keep investigating.
Did you manage to re-implement your tests or find another solution to avoid this race in your unit tests? For now, based on your answer, I will not change this behavior, because I'm not convinced that it makes sense to have many subscribers at the same time inside the same executable. But I will keep this limitation in mind and may improve it if future unit tests need it.
Yes, I did manage to overcome the problem by mocking my monitor code in the tests, which makes sense anyway. I think you can close the issue for now, thank you!
Thanks for the reply, I'm thrilled you have found a way to solve this problem.
Please reconsider fixing this.
This affects not just NETLINK_KOBJECT_UEVENT but also makes it very hard to use other kinds of netlink sockets in the same process. The kernel documentation is quite clear about the nl_pid field. From netlink(7):
The kernel can take care of assigning a unique nl_pid, so I don't see the advantage of the current approach.
Fixed thanks to @debfx
Hi! I have an issue when using go-udev in the unit tests of our project. We have multiple tests that instantiate our components and then destroy them, and every time a go-udev monitor is created internally and then destroyed. The problem is that even though I stop the monitor via the "quit" channel and then close the netlink socket via its Close() method, I cannot create a new instance of UEventConn: Connect gives me an "address already in use" error. I'm not too familiar with netlink sockets, but by doing some experiments I found out that the problem is in assigning the Pid of the process when creating the socket:
This effectively prevents your application (at the kernel level) from having more than one netlink socket open at a time. I think that what I'm seeing in my tests is a race where, even though the structures are destroyed in the tests, the socket is still reserved for that pid in the kernel when I'm already trying to create and connect a new one. Does that make sense?
FWIW, here is an article https://medium.com/@mdlayher/linux-netlink-and-go-part-1-netlink-4781aaeeaca8 that says: "In my experience, it is much easier to just let netlink assign all PIDs itself, and make sure you keep track of which numbers it assigns for each socket."
So as a quick test I removed the Pid assignment from the above call and voilà, everything works. However, if this explanation sounds viable, then I suspect there is some more work needed to prevent Connect from being called multiple times on the same UEventConn instance.
What do you think? Any idea how to avoid the above race, or am I missing something more obvious?
Thanks!