IMAP container error on reboot #2917
Comments
I set up a test environment and was able to reproduce the issue. Below is the full scrubbed IMAP container log. This time around, restarting the IMAP container alone did not remedy the problem. However, restarting the entire compose stack did fix it.
Same sort of issue here, but the bug report was closed. They claimed the Dovecot error is unimportant: #2886. For the sake of completeness I'm attaching my
Thanks @geckolinux. I searched through the issues, but did not run across the one you referenced. I'm not sure why it was closed. There is definitely a bug, most likely in the imap container.
I think this could possibly be the fix: #2913. I have disabled ipv6 on my server. I added the dovecot override and did a reboot without issues. But again, this one has been elusive to report since it doesn't happen every time.
Thanks a lot @DrDoug88 for confirming that. I'm not sure if IMAP is also part of the problem, but in my case I can confirm that POP3 was failing when my third party webapp tried to access it. I also saw the redirection error that prevented the web interface from working. Thanks for finding #2913. In that case, is the problem triggered when IPv6 is enabled or when it is disabled on the server? And at what level: the container level or the Linux kernel level? In my case the server doesn't have a public IPv6 address, but I also haven't explicitly disabled the IPv6 stack.
Yes, definitely this is a challenge. The randomness makes me think it's some kind of a race condition or missing service dependency. In my case the bug doesn't usually seem to occur unless the server was up for more than about 24h before the reboot.
I disabled IPv6 on the host via |
I am not sure why you think it is related to ipv6 and the other ticket... Next time you are able to reproduce the issue please verify/confirm:
I am not sure why you think it is not related to ipv6. There seems to be a myriad of issues related to the imap/dovecot container. Other issues on this topic have been raised and closed without proper resolution. I will provide the information that you have requested. I know you are very actively working on this project @nextgens, I see you responding to nearly every issue. And I have nothing but gratitude for your hard work. But please read the room. We are just trying to get to the bottom of the problem man.
There is nothing in the logs you have provided that suggests it is: all the IP addresses appearing in your logs are ipv4. geckolinux keeps unhelpfully adding noise to existing tickets... that in all likelihood and by his own admission are unrelated. I suggest you do like me and ignore him until he bothers to put a description of the symptoms, the relevant config and logs on the same ticket.
@nextgens So please confirm, you would like duplicate tickets to be opened when the same symptoms are present?
@geckolinux here you are adding noise about your config while complaining about POP3 and webmail while saying "I'm not sure if IMAP is also part of the problem". How is that related to a problem with the PID file (which is the Fatal error reported here)? On #2886 you were complaining about a log message that I told you is expected and harmless. The ticket has been closed because the reporter provided no logs nor config file. On the face of it the symptoms exhibited are not the same.
They are entirely related. @geckolinux thankfully pointed me to that ticket, not noise, very helpful. In #2886, the OP says that the webmail is not working. The next responses seem to indicate IMAP/dovecot, and "Fatal: Dovecot is already running with PID 9". This is what I was running into. @geckolinux responded that he too was having the same issues, but only after the server was up for more than 24 hours. Then the ticket was closed without explanation. The objective of the forum concept is for others to find related issues and engage on those topics rather than creating duplicates. We are CLEARLY experiencing the same symptoms. On my ticket, I confirmed that it was the imap container, since on one instance when this occurred, I noticed the errors in the imap container, then I restarted only that container, and the issue was resolved, i.e. everything worked. This has happened randomly a handful of times to me over the past few months, but it only occurs on server reboot. I had originally thought it was webmail related too. I didn't have the time to investigate and report the bug properly, but now I do. I love Mailu, it's awesome, and I want to do my part to help from the user side. I spun up a test machine solely because of this issue, to test and find it, so that I could get the right kind of information/logs together to report it properly. In fact, I will even spin up a second test machine, one with both ipv4 and ipv6, and the other with only ipv4, to see if I can get the issue to reoccur. I have yet to replicate the issue since adding the listening interface override in dovecot.conf.
I'm afraid I honestly don't understand. Is not the error from my logs At any rate, I will report back with the requested information from #2917 (comment) the next time this happens. |
Okay, I created a fresh VM, performed the installation, did a reboot. This one is with ipv4 and ipv6, i.e. no dovecot override. Upon reboot, the issue has occurred. Here are the details that you have requested:
Here is the accompanying log:
I can leave this machine in this state as long as necessary and perform any other requested actions.
@DrDoug88 thank you for proving my point. I will leave the two of you to investigate v6 related issues while I work on the PR to fix the issue you have reported.
I have no idea what you are talking about. It was me that suggested the idea of it being ipv6. I thought maybe since I had disabled it on the host side, but the container might be trying to bind to it and causing problems. I've seen networking do weird things.
@DrDoug88 Thanks very much for setting up the test instance and providing the details in #2917 (comment) . Apparently the root cause of this bug is what I theorized in #2917 (comment):
I imagine that the additional complexity of the enabled IPv6 stack was changing the timing of some OS events, making it more likely for stale PID files to prevent some Mailu components from starting up, including IMAP, POP3, and possibly the webmail redirect issue described in #2886.
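To illustrate the stale PID file theory, here is a minimal Python sketch. It is not the actual Mailu fix and not Dovecot's own logic; the function names are hypothetical, and it assumes a POSIX system. The idea is that inside a container PIDs are small and get reused, so a leftover PID file can point at a live but unrelated process, and a plain liveness check then wrongly concludes the daemon is "already running". Under the assumption that no legitimate instance can exist when the container starts, removing the leftover file before launching the daemon sidesteps that.

```python
import os
from pathlib import Path
from typing import Optional


def read_pid(pid_file: Path) -> Optional[int]:
    """Return the PID stored in pid_file, or None if missing or unparsable."""
    try:
        return int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def pid_is_alive(pid: int) -> bool:
    """Report whether some process currently holds this PID (POSIX only).

    This says nothing about WHICH process it is, which is why a reused
    PID can make a stale PID file look valid.
    """
    try:
        os.kill(pid, 0)  # signal 0 tests for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clear_pid_file_at_startup(pid_file: Path) -> bool:
    """Remove a leftover PID file; return True if a file was removed.

    Assumption: this runs once at container start, before the daemon is
    launched, so any existing PID file must be left over from a previous run.
    """
    try:
        pid_file.unlink()
        return True
    except FileNotFoundError:
        return False
```

With the path from the error message, an entrypoint script could call `clear_pid_file_at_startup(Path("/run/dovecot/master.pid"))` right before starting the daemon. Whether that matches what the fix PR does is something I have not verified.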
Thank you @nextgens for fixing the bug, and more generally for all that you do. I haven't had a chance to pull the changes and test them. But from your response, it seems like you were able to specifically identify the bug and address it accordingly. I hope that this entire exchange amongst the three of us can at least convey the idea that the reporting process is not as clean and perfect as we would all like. When it comes to something that is not easy to replicate, a little bit of back and forth discussion is not the end of the world. Anyhow, thanks again.
Environment & Version
Environment
Version
2.0
Description
Occasionally, when performing a server reboot for routine maintenance, Mailu does not start correctly. I have narrowed it down to the imap container. The only error I see in the imap container logs is: Fatal: Dovecot is already running with PID 9 (read from /run/dovecot/master.pid). This is on my production machine, so I cannot keep the service down too long to troubleshoot. Before, I would restart the entire stack, but since I looked into it more on this last occurrence, I simply restarted the IMAP container and everything works fine. I am apprehensive about even posting this bug, since I do not have more information to offer. I can spin up a test environment and keep rebooting until the error occurs if necessary. I was running 1.9 for about a year, and it seems like this only started once I moved to 2.0 back in April, but I'm not sure.
Thanks for all of the hard work you all do.
Replication Steps
It is very difficult to reproduce. It happens maybe 10% of the time.
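As a rough guide to how many reboots a test environment might need, here is a small Python sketch. It assumes each reboot fails independently with the same probability, which is only an approximation (the thread suggests uptime before the reboot may matter), and `reboots_needed` is a hypothetical helper name. It solves 1 - (1 - p)^n >= confidence for the smallest whole n.

```python
import math


def reboots_needed(p_fail: float, confidence: float) -> int:
    """Smallest n such that at least one failure in n reboots has
    probability >= confidence, assuming independent reboots."""
    if not (0.0 < p_fail < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p_fail and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


# With the roughly 10% failure rate estimated above:
print(reboots_needed(0.10, 0.95))  # 29
```

So at a 10% rate, about 29 reboots give a 95% chance of seeing the error at least once, under the independence assumption.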
Logs