libpcap blocking, ignoring interrupts, after a973128 #899
Comments
So does that test program catch SIGINT?
Not explicitly, but strace shows that someone is (maybe Python's KeyboardInterrupt handler). I'm ok with punting on that for now and focusing on the blocking behavior change. (I don't know if the signal problem happens with the old code, because the old code never hangs long enough for me to send a SIGINT.) But I suppose the answer there is just that the program is depending on behavior that the man page says not to depend upon.
So perhaps it's just worth being aware that people who just code to observed behavior might run into this when upgrading. I think the fix here is to either use
OK, I've added some flags to the capturetest test program to make it capture SIGINT and, optionally, call
so that it captures SIGINT and calls
However, if I ran it without
so there was never a guarantee that, if SIGINT is caught and the handler doesn't call
So is this an event manager that has an event loop using If so, yes, you should call Note also that if you have Also, the return value of
Correct. But the problem that arose is that:
Ok.
This is not at all straightforward to arrange given our event loop (it's more focused around "what time would you like something to happen" than "how much longer this time around the loop"), but luckily in this use case this code will know if the interface is going away and shut down the pcap itself.
With that program, on Linux, libpcap used not to ever block if the
The issue is that an interface going from IFF_UP to IFF_DOWN causes a single call to If the interface goes away completely (e.g., unplugging a USB adapter), no error is provided by the kernel, and no wakeup is even provided by the kernel. We try to detect that by, if we've gotten an ENETDOWN and haven't seen any packets since then and haven't seen the interface up since then, periodically checking whether the interface is still there. We can do that in the loop in blocking mode, but, in non-blocking mode, we can't, so we just say "here's a timeout, try checking for packets or errors every so often". As long as you're doing that, or as long as you don't need libpcap to say "this
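For readers wiring this into their own loop, here is a minimal sketch of the "check every so often" pattern, using only POSIX calls and no libpcap. It is an illustration under assumptions, not libpcap's implementation: `wait_with_periodic_check`, `on_tick`, `slice_ms` and `max_slices` are hypothetical names, and in real code the slice length would presumably come from libpcap (for example via `pcap_get_required_select_timeout()`) rather than a plain parameter. The same loop also checks a flag set by a SIGINT handler, so a caught signal ends the wait instead of being silently absorbed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set from the signal handler; checked by the loop on every wakeup. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Install the SIGINT handler without SA_RESTART. Returns 0 on success. */
static int install_sigint_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}

/*
 * Wait on fd in bounded slices instead of blocking indefinitely.
 * on_tick (hypothetical, may be NULL) runs after each slice that times
 * out; that is where a periodic "is the interface still there?" check
 * could go.
 *
 * Returns 1 if poll() reported an event on fd, 0 if max_slices slices
 * passed without one, -1 if a stop was requested or poll() failed.
 */
static int wait_with_periodic_check(int fd, int slice_ms, int max_slices,
                                    void (*on_tick)(void *), void *arg)
{
    struct pollfd pfd;
    int i;

    assert(slice_ms > 0);
    pfd.fd = fd;
    pfd.events = POLLIN;

    for (i = 0; i < max_slices; i++) {
        int n;

        if (stop_requested)
            return -1;
        pfd.revents = 0;
        n = poll(&pfd, 1, slice_ms);
        if (n > 0)
            return 1;               /* fd has something to report */
        if (n < 0 && errno != EINTR)
            return -1;              /* real poll() failure */
        if (n == 0 && on_tick != NULL)
            on_tick(arg);           /* slice expired: periodic check */
        /* n < 0 with EINTR: loop around and re-check stop_requested */
    }
    return stop_requested ? -1 : 0;
}
```

Because each `poll()` call is bounded by `slice_ms`, a signal that arrives just before the call delays shutdown by at most one slice. A caller would typically run `install_sigint_handler()` once at startup and then call `wait_with_periodic_check()` with the capture descriptor.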
Same test program, with libpcap 1.9.1 - strace says that poll() times out:
and debugging says that (I do realize that the documented API says "you have to be able to handle blocking in this scenario", so libpcap's new behavior is not wrong - just worried that if this code made this assumption, then others might too and the new behavior might be surprising)
It's worth considering how this API will fit into an event system where the main loop isn't user-accessible - e.g., what if you are fitting into a system that uses libevent, and your API to the event system is just (I suppose that this is not that hard: the callback for the absolute timeout can just consult a flag that the "fd was readable" handler sets, and if it's set, get a new timeout. This would be ok, right - if
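The flag idea in the parenthetical can be sketched without any particular event library. This is only an illustration: `struct capture_watch`, `on_fd_readable` and `on_timeout` are hypothetical names, and the "no activity" branch assumes that this is the point where the application would give libpcap a chance to run its periodic check, which the comment above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-capture bookkeeping shared by the two callbacks (hypothetical). */
struct capture_watch {
    bool saw_readable;  /* set by the "fd was readable" handler */
    unsigned rearms;    /* timeouts that only needed re-arming */
    unsigned checks;    /* timeouts with no activity since the last one */
};

/* Called by the event system when the capture fd is readable. */
static void on_fd_readable(struct capture_watch *w)
{
    assert(w != NULL);
    w->saw_readable = true;
    /* ... read packets here ... */
}

/*
 * Called by the event system when the absolute timeout fires.
 * Returns true if no readable event arrived since the previous timeout,
 * i.e. the caller should do the periodic check (assumption: by calling
 * into libpcap). Returns false if the fd was active, in which case the
 * caller only needs to fetch a new timeout and re-arm.
 */
static bool on_timeout(struct capture_watch *w)
{
    assert(w != NULL);
    if (w->saw_readable) {
        w->saw_readable = false;    /* consume the flag */
        w->rearms++;
        return false;
    }
    w->checks++;
    return true;
}
```

Either way the timeout is re-armed afterwards; the flag only decides whether the extra check is needed on this expiry.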
@fenner @guyharris hello, three years have passed, are there really any bugs after a973128?
I have no further followup.
I've been updating my internal version of libpcap regularly. After updating this weekend, one of my tests completely hangs, and is unresponsive to SIGINT. This is just a preliminary report, in case @guyharris is working on a followup commit already - but it looks like poll() inside pcap_wait_for_frames_mmap is eating signals and not returning, and is blocking in a case that did not block before. I will gather more info but am providing this in case it rings a bell.
(This use case may be calling pcap_next() when there is no packet to be read. Ignoring error handling code, the code is:
The next thing it does is register fd in our event manager, and it's plausible that this (perhaps incorrectly) causes a call to pcap_next() - but given that there's a timeout, this shouldn't be fatal, maybe just cause an unnecessary delay.
I hit ^C, which strace shows was received but ignored, and then hit ^\ to generate a SIGQUIT.
strace says:
The traceback is: