FPM: For pm = ondemand, don't respawn processes that reached pm.max_requests #4101

mpdude · 2019-05-02T19:59:49Z

This addresses https://bugs.php.net/bug.php?id=77959.

In short: When using pm = ondemand and there is a short peak in load, many worker processes will be spawned. As workers will be used in a round-robin fashion, even for a lightly loaded server it may happen that every of those workers will be used before pm.process_idle_timeout is reached. In this case, the numer of workers will never decrease again as every process reaching pm.max_requests will immediately be replaced by a new one.

With this PR, for ondemand workers that reached pm.max_requests will not automatically be replaced, but a new process will only be forked if deemed necessary by the process management.

Please be lenient with me – this is my first php-src contribution.

…equests This addresses https://bugs.php.net/bug.php?id=77959.

KalleZ · 2019-05-02T20:23:55Z

cc @bukka

mpdude · 2019-05-02T21:32:19Z

Could someone please point me to the place where the worker is chosen for an incoming request? Or does that happen somewhere in a syscall (select(), epoll?) where the kernel chooses one of several processes waiting for a file descriptor or so?

php-pulls · 2019-05-03T04:52:51Z

On Thu, May 2, 2019 at 2:32 PM Matthias Pigulla ***@***.***> wrote: Could someone please point me to the place where the worker is chosen for an incoming request? Or does that happen somewhere in a syscall (select()) where the kernel chooses one of several processes waiting for a file descriptor or so?

It is mostly epoll() these days, not select(), but yes, whether it is select, poll or epoll, we give the kernel a set of file descriptors and it chooses one.

…

-Rasmus

mpdude · 2019-05-03T09:51:43Z

It is mostly epoll() these days, not select(), but yes, whether it is select, poll or epoll, we give the kernel a set of file descriptors and it chooses one.

With my very limited C and Linux programming knowledge, I attached strace to a few running worker processes and found that they block in the accept() syscall until a connection comes in.

So, am I right assuming that the fact incoming requests are handled by workers in a round-robin fashion is an implementation detail of this syscall and nothing we could easily change?

bukka · 2019-05-03T18:43:53Z

Yeah basically all workers (children in the pool) share the open socket for the pool. They all call accept so they are in the same wait queue which is handled by kernel. There is no event loop for workers as we are talking about multiple independent processes. The event pool is used only in master for workers management and other stuff.

I'm not sure about this PR. I see what you are trying to do - basically using max_requests as a reducer but it will happen independently on the current load which I don't think is the right solution. Also pm.max_request should not be used unless you have a leaking application so it's not really meant for this purpose. I think we might agree that it's kind of a hack... :)

Having a good scaling down mechanism is not easy though. It should be based on the actual load which might be quite difficult to implement right in FPM...

mpdude · 2019-05-04T21:32:26Z

@bukka Thank you for your detailed explanation. And yes, I agree it’s not a good solution.

Regarding the ondemand model, the only mechanism to scale down is the idle timeout. But workers run round-robin. So as long as your system receives number_of_current_workers requests per max_idle_time seconds, they don’t time out.

If there has been a spike in load, a large number of workers may have been forked that will never go away – even tough if just very few of them would be able to handle the load.

https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/ suggests that epoll() can be used on the listening socket to make the last busy worker be the preferred one to receive the next request, which I think could help here.

What do you think about that?

Am I right that the accept() in question is in the FCGI code? Will epoll() cause portability problems?

bukka · 2019-05-05T19:15:12Z

Yeah it looks that current process idle logic for ondemand would need a FIFO (epoll) mechanism to work correctly as it will be picked up in round robin fashion. The only question is how it effects user with a big load - it can potentially lead to overloading single worker. So I think that behavior should be initially optional.

It needs to be portable as FPM is also used in FreeBSD, OpenBSD and possibly other platform that don't have epoll. So ideally it should use fpm events that would need to run in the worker. The accept is in fcgi_accept_request which is in FCGI code but the loop is in fpm_main.c so it should be possible to implement it without touching FCGI code as the listening socket is available (fcgi_fd).

mpdude · 2019-05-05T21:11:48Z

Why would it make a difference if workers accept() round-robin or FIFO? I mean, the first idle one should pick the connection. So does it make a difference if the oldest worker gets to handle most of the connections?

bukka · 2019-05-26T19:33:18Z

I guess some users might want to spread load more evenly between cores especially for other modes than ondemand. The blog article that you linked mentions such cases (a little bit more info is in discussion) so some people might prefer it. As I said I think that FIFO definitely makes much more sense for ondemand and I think it should be default. But it would be good to have an option to switch to the current way (accept only).

All of this is something that I would consider only for the next minor as it's a bigger change (currently 7.4) that I wouldn't risk in a bug fixing version.

mpdude · 2020-01-09T12:36:33Z

I have little experience in C programming, in particular at the lower level relevant here and when it comes to portability. I would, however, see it as a challenge and see if I can learn what is necessary.

The question is – would anyone here mentor and guide me and maybe show the first steps that seem appropriate?

bukka · 2020-06-09T21:51:30Z

@mpdude Sorry completely missed the previous comment and was also a bit busy at that time. I'd be happy to help with reviews and suggesting things if you are still interested.

ramsey · 2022-05-27T16:37:48Z

@mpdude Is this something you're still interested in pursuing? If so, it will need to be updated to target PHP-8.0. If not, please feel free to close this PR. Thanks!

iluuu1994 · 2022-06-27T09:04:13Z

Closing this due to a lack of response. Feel free to reopen if work continues.

mpdude · 2022-06-28T07:52:03Z

Thanks for all your comments.

I am afraid I am unable to bring this forward, since I have only basic C skills and no experience in programming on such a (for me) low level, let alone a development environment where I could work and debug this effectively. As for everyone in open source, my time available is limited and so I'll have to pick other items where I can contribute better. Sorry!

iluuu1994 · 2022-06-28T08:43:36Z

@mpdude No need to apologize, thanks for your efforts!

FPM: For pm = ondemand, don't respawn processes that reached pm.max_r…

0552dd0

…equests This addresses https://bugs.php.net/bug.php?id=77959.

KalleZ requested a review from bukka May 2, 2019 20:24

petk added the Bug label May 17, 2019

bukka mentioned this pull request May 26, 2019

FPM: Change last_idle_child lookup to fix idle time computation #4104

Closed

bukka added the SAPI: fpm label Jun 9, 2020

bukka mentioned this pull request Mar 16, 2021

Add FPM early bootstrapping mode #6772

Closed

ramsey added the Waiting on Author label May 27, 2022

iluuu1994 closed this Jun 27, 2022

bukka mentioned this pull request Dec 1, 2023

PHP-FPM: Killing idle child issue using pm=ondemand #12798

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

FPM: For pm = ondemand, don't respawn processes that reached pm.max_requests #4101

FPM: For pm = ondemand, don't respawn processes that reached pm.max_requests #4101

Uh oh!

mpdude commented May 2, 2019 •

edited

Loading

Uh oh!

KalleZ commented May 2, 2019

Uh oh!

mpdude commented May 2, 2019 •

edited

Loading

Uh oh!

php-pulls commented May 3, 2019 via email

Uh oh!

mpdude commented May 3, 2019

Uh oh!

bukka commented May 3, 2019

Uh oh!

mpdude commented May 4, 2019

Uh oh!

bukka commented May 5, 2019

Uh oh!

mpdude commented May 5, 2019

Uh oh!

bukka commented May 26, 2019

Uh oh!

mpdude commented Jan 9, 2020

Uh oh!

bukka commented Jun 9, 2020

Uh oh!

ramsey commented May 27, 2022

Uh oh!

iluuu1994 commented Jun 27, 2022

Uh oh!

mpdude commented Jun 28, 2022

Uh oh!

iluuu1994 commented Jun 28, 2022

Uh oh!

Uh oh!

FPM: For pm = ondemand, don't respawn processes that reached pm.max_requests #4101

FPM: For pm = ondemand, don't respawn processes that reached pm.max_requests #4101

Uh oh!

Conversation

mpdude commented May 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

KalleZ commented May 2, 2019

Uh oh!

mpdude commented May 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

php-pulls commented May 3, 2019 via email

Uh oh!

mpdude commented May 3, 2019

Uh oh!

bukka commented May 3, 2019

Uh oh!

mpdude commented May 4, 2019

Uh oh!

bukka commented May 5, 2019

Uh oh!

mpdude commented May 5, 2019

Uh oh!

bukka commented May 26, 2019

Uh oh!

mpdude commented Jan 9, 2020

Uh oh!

bukka commented Jun 9, 2020

Uh oh!

ramsey commented May 27, 2022

Uh oh!

iluuu1994 commented Jun 27, 2022

Uh oh!

mpdude commented Jun 28, 2022

Uh oh!

iluuu1994 commented Jun 28, 2022

Uh oh!

Uh oh!

mpdude commented May 2, 2019 •

edited

Loading

mpdude commented May 2, 2019 •

edited

Loading