Infinite loop issue with new EOF handling code in eatline #2350
Comments
These are the backtraces from the attached backtraces.txt, for my own convenience mostly
I had a long reply typed up in which I analysed a bunch of wrong things and was just about to ping someone for help when I suddenly realised what the problem probably is 🙄 I think this quick-and-dirty fix should sort this out:
A better fix would be to either:
I'm not sure which I prefer. Just adding it to the prot_IS_EOF macro would be trivial, but I've just had a look at the FILE * api and it has distinct feof() and ferror() functions, so maybe we should do the same to keep the prot interface unsurprising. @ksmurchison, any thoughts?
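For reference, here is a minimal sketch of the stdio convention mentioned above: `fgetc()` returns `EOF` for both end-of-file and read errors, and the caller tells them apart with `feof()` and `ferror()`. The `drain()` helper and the `stop_reason` enum are made up for illustration; only the stdio calls are real, and none of this is Cyrus code.

```c
#include <stdio.h>

/* Hypothetical result type, for illustration only. */
enum stop_reason { STOP_EOF, STOP_ERROR, STOP_UNKNOWN };

/* Read and discard everything left on fp, then report why reading stopped.
 * The number of bytes discarded is stored in *count when count is non-NULL. */
static enum stop_reason drain(FILE *fp, size_t *count)
{
    size_t n = 0;

    /* fgetc() returns EOF on end-of-file AND on error, so the return
     * value alone does not say which of the two happened. */
    while (fgetc(fp) != EOF)
        n++;

    if (count)
        *count = n;

    /* The two conditions are tracked by separate indicators. */
    if (ferror(fp))
        return STOP_ERROR;
    if (feof(fp))
        return STOP_EOF;
    return STOP_UNKNOWN;
}
```

A caller would run `drain(fp, &n)` and branch on the result. The point is that "no more input" has two distinct causes, each with its own query function.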
Actually I've just decided to do it the FILE way and then it's done, commits incoming.
I'd appreciate feedback if this fixes the issue!
This fix looks sane, but I'm curious as to what is causing pin->error without also causing pin->eof.
This fix looks like it'll work. I rechecked the core files and they all had the same pin->error value: "Connection reset by peer". We'll integrate it over the weekend and report if that does fix the problem.
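To illustrate the failure mode being discussed, here is a toy model of a stream that can end up with an error set but no EOF flag, and a line-skipping loop that stops in both cases. This is only a sketch under assumptions: `struct toy_stream`, `toy_getc()`, `toy_is_eof()`, `toy_is_error()` and `toy_eatline()` are hypothetical names invented for this example, not the real prot API or the actual eatline code.

```c
#include <stdio.h>
#include <stddef.h>

/* Toy stand-in for a buffered input stream; NOT the real Cyrus struct. */
struct toy_stream {
    const char *buf;     /* remaining bytes to read */
    size_t len;          /* number of bytes left in buf */
    int eof;             /* set when the input ends cleanly */
    const char *error;   /* set on a read error, NULL otherwise */
    int fail_with_error; /* when input runs out: 1 = error, 0 = clean EOF */
};

/* Return the next byte, or EOF once the stream has hit eof or an error. */
static int toy_getc(struct toy_stream *s)
{
    if (s->eof || s->error)
        return EOF;
    if (s->len == 0) {
        if (s->fail_with_error)
            s->error = "Connection reset by peer"; /* error WITHOUT eof */
        else
            s->eof = 1;
        return EOF;
    }
    s->len--;
    return (unsigned char) *s->buf++;
}

static int toy_is_eof(const struct toy_stream *s)   { return s->eof; }
static int toy_is_error(const struct toy_stream *s) { return s->error != NULL; }

/* Discard input up to and including the next '\n'.
 * Returns the number of bytes discarded. */
static size_t toy_eatline(struct toy_stream *s)
{
    size_t eaten = 0;

    for (;;) {
        int c = toy_getc(s);

        if (c == EOF) {
            /* Testing only toy_is_eof() here would never break out once
             * the stream has an error but no eof flag, which gives a
             * busy loop with no system calls. Checking both terminates. */
            if (toy_is_eof(s) || toy_is_error(s))
                break;
            continue;
        }
        eaten++;
        if (c == '\n')
            break;
    }
    return eaten;
}
```

With a stream initialised as `{ "abc", 3, 0, NULL, 1 }`, `toy_eatline()` discards three bytes and returns with the error set and `eof` still 0, which is the state described in the core files above.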
@ksmurchison looking at Usually one just checks for the EOF return value and things work out fine, but I wonder if there are other places that need the same sort of fix? Though I expect they would have broken, been discovered, and fixed long ago, given how quickly the new eatline issue was discovered.
@QuadrantAndrew FYI I'm planning to release a 3.0.7 as soon as I have confirmation that this issue is fixed. :)
I wasn't able to get the fix in over the weekend. It's in now. I'll let it run during the busy time tomorrow and report back.
Great, thanks :)
After about 19h running cyrus-imapd with this patch, we have had no more problems.
It's been running all day without any issues. Before the fix we'd get at least 3-4 runaway processes per hour. Looks like the issue is solved! Thank you so much for the quick response to this issue.
3.0.7 has just been released with the fix in it :)
Since upgrading to 3.0.6 we've been having issues with some imap processes taking 100% CPU. All of the processes get stuck in the eatline function when disconnecting from an append command. Once a process is in this state it generates no network traffic and makes no system calls, which leads me to believe it's stuck in the for(;;) loop in eatline.
Attached are some backtraces we were able to get by dumping the core of the imap processes before killing them.
backtraces.txt