In some cases, changing a vserver/alias' domain to a fake domain might break multi-versions PHP FPM #649
Comments
Thank you @nekohayo for the investigation. We at @EvoluData / @WikiSuite have experienced these mysterious HTTP 503 "Service Unavailable" errors every few months, on various servers, but we were unable to pinpoint the cause. On production servers, we often have to fix the problem quickly and can't keep the server as-is as evidence for a senior sysadmin to investigate. We have noticed, however, that it seems to happen after a Virtualmin Virtual Server is deleted. As to your clues: recently, there has been some work on the PHP-FPM code in Virtualmin, and I am hopeful it will take us a step closer to a resolution.
We expect those to be fixed in 7.8.2!
We did recently fix a bug in 7.8.2 that could cause the PHP FPM server not to restart properly when changing the PHP version for a domain - but this only happens when using a TCP port, not a socket file. The fix will be included in the 7.8.3 release.
I have encountered a probable bug with Virtualmin 7.7 (which was recently auto-upgraded to 7.8.2) on my Debian 11 server at home, where I am hosting some of my personal websites.
I'm not entirely sure how this bug was triggered, but I have spent some weeks investigating and have arrived at a reasonable guess.
Summary
The bug's symptom:
The possible cause I'm suspecting:
using fake domain names (like "exampleplaceholder" instead of "example.com") in your vservers, and then deleting those vservers, might confuse Virtualmin and leave some cruft in the PHP FPM configuration on the filesystem that is invisible to Virtualmin/Webmin's UI… and then those conflicting configs prevent PHP from starting.
The server's specs and usage type
This Debian 11 + Virtualmin 7.8.2 server has both PHP 7.4 and 8.2, installed from Sury's Debian repositories. Both PHP versions have the same modules installed, with this command:
apt install php{7.4,8.2}-{cgi,cli,fpm,pdo,gd,mbstring,mysqlnd,opcache,xml,zip,curl,imagick,intl,bcmath,sqlite3}
This Debian 11 server is pretty much the same thing as my other home server that runs Debian 12 + Virtualmin 7.8.2; I just haven't upgraded it to Debian 12 yet.
So in theory this server shouldn't misbehave, but it does. Investigation details below.
Symptoms in detail / troubleshooting so far
I first noticed the issue when some of my personal websites started randomly failing, showing only:
This HTTP/Apache error page is the result of PHP not working at all on those vservers. Showing a plain index.html page (instead of some index.php) works fine.
Doing
tail -f logs/error_log
in such a vserver's home, I get things like this whenever a visitor tries to hit the website:
The number of that socket might vary across different failing vservers; for example, I also have the same error with:
Searching for those errors, I saw this forum thread which led me to #96 (comment), which led me nowhere. Indeed, I did all the updates, reinstalled the various PHP 7.4 and 8.2 debian packages, etc etc.
Curiously, the package installations would now fail because they would be unable to start the services, too:
...eventually the only thing that made a visible difference was a full reboot, but the full reboot just shuffled my luck in whether it's PHP 7.4 failing, or PHP 8.2 failing. So you have a different set of websites failing to load, depending on which PHP version they use... Flip a coin!
As per my friend @s3phy's recommendation, I looked at
grep "listen" /etc/php/*/fpm/pool.d/www.conf
to see if things were using separate sockets, and they were:
Alright then, could it be some stale socket file thing going on? Let's look at the dates in
ls -l /var/php-fpm/
(some characters replaced by "*"):
Currently, both the "tab**" and "opensource" websites are failing, and both happen to be configured to use PHP 7.4 instead of 8.2.
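The two manual checks above (do pools share a listen address, and are there stale sockets?) could be scripted roughly as follows. This is only a sketch: the helper names `duplicate_listens` and `orphan_sockets` are made up for illustration, and it assumes the pool files use the standard PHP-FPM `listen = ...` directive and that sockets live in a single directory such as /var/php-fpm.

```shell
# Hypothetical helpers; not part of Virtualmin, Webmin or PHP.

# Print every "listen" value that appears in more than one pool file.
duplicate_listens() {
    # Keep only uncommented "listen = value" lines (not listen.owner etc.),
    # strip the key and surrounding whitespace, then report repeated values.
    cat -- "$@" 2>/dev/null \
        | sed -n 's/^[[:space:]]*listen[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' \
        | sort | uniq -d
}

# List entries in SOCK_DIR whose path is not mentioned in any file
# under the given pool directories.
orphan_sockets() {
    [ "$#" -ge 2 ] || { echo "usage: orphan_sockets SOCK_DIR POOL_DIR..." >&2; return 2; }
    sock_dir=$1
    shift
    for sock in "$sock_dir"/*; do
        [ -e "$sock" ] || continue   # the glob matched nothing
        # -r recurse, -q quiet, -s ignore unreadable paths, -F fixed string
        if ! grep -rqsF -- "$sock" "$@"; then
            echo "$sock"
        fi
    done
}
```

Possible usage, run as root: `duplicate_listens /etc/php/*/fpm/pool.d/*.conf` and `orphan_sockets /var/php-fpm /etc/php/7.4/fpm/pool.d /etc/php/8.2/fpm/pool.d`. The second check is a plain substring match, so it would also count a path that only appears in a commented-out line.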
Make a mental note of the "exampleplaceholder" fake domain / vserver here. It was there in that folder when I first encountered this issue around September 10th, but it's absent today. I think it might be key to the problem. More on that in a minute.
Currently, the Webmin dashboard "Servers status" indicates that the PHP FPM 7.4 service is down, and PHP FPM 8.2 is up. If you try to restart it, it fails, and suggests you look at
systemctl status php7.4-fpm.service
, in which case you see:
...okay then, we have a much better clue what might be the problem: it mentions "exampleplaceholder". That's the fake domain name I had renamed a vserver to (or was it maybe an alias server, child of some other vserver? I don't remember, but I certainly had used the "Change domain name" feature in the Virtualmin menus), before recently deleting it via the Virtualmin web interface. That's suspicious: why is it referencing something that doesn't even exist anymore?
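Besides reading the systemctl status output, PHP-FPM can be asked to parse its configuration without starting the service, using its `-t` (test) flag. A minimal sketch, assuming the Debian binary for this version is named `php-fpm7.4` (the report does not state the binary name); `test_fpm_config` is a hypothetical wrapper:

```shell
# Run a PHP-FPM binary in config-test mode and report the result.
# Usage (as root): test_fpm_config php-fpm7.4
# The binary name is an assumption; adjust it per PHP version.
test_fpm_config() {
    bin=$1
    # "-t" only parses the configuration, including every pool.d file;
    # messages are merged into stdout so they show up in one place.
    if "$bin" -t 2>&1; then
        echo "OK: $bin accepted its configuration"
    else
        echo "FAIL: $bin rejected its configuration" >&2
        return 1
    fi
}
```

If the test fails, the messages printed by the binary should point at the offending pool file, which would narrow things down faster than searching all of /etc.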
Unfortunately, Virtualmin and Webmin don't seem to have any way for me to search for that string, so I went Conan-the-barbarian-style and did
grep -Rn "exampleplaceholder" /etc/*
as root, and the only mentions are in /etc/php/7.4/fpm/pool.d/16801964152855.conf. Here are the contents of that file:
But wait, does exampleplaceholder still exist elsewhere? Doesn't seem so:
Suspected causes
In conclusion, this leads me to the following hypothesis: there probably is some scenario, somehow, where Virtualmin 7.7 (and maybe 7.8.2) fails to delete some PHP configuration thing when deleting a renamed vserver/alias, maybe when:
a fake domain name is used (exampleplaceholder instead of example.com)?
I wish I had better clues as to the exact cause or steps to trigger this issue, but I'm hoping that the info and thought process so far here might ring a bell. Otherwise, I am happy to provide further information you may request; for now I haven't deleted the sock or conf files or anything like that, to ensure I can still provide information for you, until you tell me it's not needed.
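If the hypothesis holds, other servers might carry similar leftovers. One way to look for them, sketched under the assumption (not confirmed in this report) that a stale pool file still names a Unix user that was removed together with its vserver, is to check each pool file's `user =` directive against the system's accounts. `pools_with_missing_user` is a hypothetical helper:

```shell
# Print "file<TAB>user" for each pool file whose "user =" value
# is not a known account on this system.
# Usage: pools_with_missing_user /etc/php/*/fpm/pool.d/*.conf
pools_with_missing_user() {
    for conf in "$@"; do
        [ -f "$conf" ] || continue
        # Take the first uncommented "user = name" line in the file.
        user=$(sed -n 's/^[[:space:]]*user[[:space:]]*=[[:space:]]*\([^[:space:];]*\).*$/\1/p' "$conf" | head -n 1)
        [ -n "$user" ] || continue
        # "id -u" exits non-zero when the account does not exist.
        if ! id -u "$user" >/dev/null 2>&1; then
            printf '%s\t%s\n' "$conf" "$user"
        fi
    done
}
```

An empty result would not rule out the hypothesis; it would only mean that this particular kind of leftover is not present.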
Besides, I don't know if it would be good/safe for me to just
rm /etc/php/7.4/fpm/pool.d/16801964152855.conf
or if webmin/virtualmin/PHP/Apache would get even more confused 😅
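A more cautious alternative to rm would be to move the file out of pool.d so it can be put back if anything else turns out to depend on it. The sketch below is not a fix confirmed by the maintainers; `quarantine_pool_file` and the backup directory /root/fpm-pool-backup are hypothetical names chosen for illustration.

```shell
# Move a suspect pool file out of pool.d instead of deleting it,
# and print where it went so it can be restored later.
# Usage (as root):
#   quarantine_pool_file /etc/php/7.4/fpm/pool.d/16801964152855.conf /root/fpm-pool-backup
quarantine_pool_file() {
    conf=$1
    backup_dir=$2
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    mkdir -p -- "$backup_dir" || return 1
    # Timestamp the copy so repeated runs never overwrite an older backup.
    dest="$backup_dir/$(basename -- "$conf").$(date +%Y%m%d-%H%M%S).bak"
    mv -- "$conf" "$dest" && echo "$dest"
}
```

After moving the file, restarting php7.4-fpm.service and checking its status would show whether that file alone was blocking startup. Moving the file back to its original path undoes the change.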