performance issues (pcntl_fork overhead?) #34

Closed
kballenegger opened this issue Jan 6, 2012 · 16 comments

@kballenegger

So I've started playing with a deployment of this in production and seem to be having performance issues with pcntl_fork. Processing an empty job (one that just contains an error_log statement) takes over 50ms, and our queue needs to be able to process more than 1000 items per second.

I'm thinking I might have to fork the project and change the behavior so that instead of forking on every job, it forks once every X jobs (where X > 100). My concern is that I haven't figured out a way to communicate between processes in PHP that would let me pass the job object about to be processed back from the child to the parent. Ideas?

@chrisboulton
Owner

I know it's not really a solution, but running multiple workers checking the same queues might be beneficial for you to get higher throughput.

Forking is by its very nature "slow", so I'm not too shocked to hear the performance isn't where you need it to be. Forking every X jobs might be a nice idea and something that could be built into the core.

In terms of communication between the processes, you could use socket_create_pair to create a pair of connected sockets shared by the parent and child processes. You'll want to look into IPC (inter-process communication).
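
A minimal sketch of that socket_create_pair approach (not php-resque code; the payload and job names are just illustrative): the child reports the job it is about to run over the socket, and the parent reads it before reaping the child.

```php
<?php
// Minimal parent/child IPC sketch with socket_create_pair (illustrative only).
$pair = array();
if (!socket_create_pair(AF_UNIX, SOCK_STREAM, 0, $pair)) {
    die('socket_create_pair failed: ' . socket_strerror(socket_last_error()));
}

$pid = pcntl_fork();
if ($pid === -1) {
    die('fork failed');
}

if ($pid === 0) {
    // Child: tell the parent which job we're about to process, then do the work.
    socket_close($pair[0]);
    $payload = json_encode(array('class' => 'My_Job', 'args' => array('id' => 42)));
    socket_write($pair[1], $payload . "\n");
    socket_close($pair[1]);
    exit(0);
}

// Parent: read the child's report, then reap it.
socket_close($pair[1]);
$line = socket_read($pair[0], 4096, PHP_NORMAL_READ);
socket_close($pair[0]);
pcntl_waitpid($pid, $status);
echo 'child reported: ', trim($line), PHP_EOL;
```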

Let me know what you're thinking.

@kballenegger
Author

With per-job forking, each job takes about 50ms, which caps a single worker at roughly 20 jobs per second. To handle our queue we would need 50-100 workers, which is INSANE. Because of this, I'm leaning towards a fix that doesn't fork on every job.

I noticed that Redis keeps a record of what each worker is working on. I'm working on a fix that will catch failures in a child that processes multiple items, and use that Redis record to re-create the job object and fail it. This would be optional, which means it could play nicely with the current implementation.
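
For reference, Resque's convention is to store the in-progress job (queue, start time, payload) under a worker:<id> key. A rough sketch of reading that record back and recording a failure with phpredis; the key prefix and the shape of the failure entry follow Resque conventions but should be checked against the php-resque version in use.

```php
<?php
// Sketch only: recover the job a dead child was working on and fail it.
// Assumes phpredis and the default "resque:" key prefix.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$workerId = 'hostname:1234:default';            // illustrative worker id
$record   = $redis->get("resque:worker:$workerId");

if ($record) {
    $job = json_decode($record, true);           // array('queue' => ..., 'run_at' => ..., 'payload' => ...)
    $redis->rpush('resque:failed', json_encode(array(
        'failed_at' => date('D M d H:i:s T Y'),
        'payload'   => $job['payload'],
        'exception' => 'Resque_Job_DirtyExitException',   // exception name assumed, adjust as needed
        'error'     => 'Child died while processing this job',
        'worker'    => $workerId,
        'queue'     => $job['queue'],
    )));
    $redis->del("resque:worker:$workerId");
}
```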

If this doesn't work out so well, I'll see what I can do w/ socket_create_pair. Maybe I could serialize the job and pass it through the socket to the parent before each job is attempted… but this seems like a wonkier solution.

@kballenegger
Author

See pull request here: #35

@kballenegger
Author

Any news on this or my pull request above?

@roynasser

I'd be interested in hearing more about the pros, and especially the cons, of this.

@danhunsaker
Contributor

Cons include the loss of process-level isolation of jobs. In the current model (one fork per job), each job operates in complete and total isolation (memory-wise) from every other job. This is a very good idea for failure tolerance - especially when any given job could spontaneously encounter a fatal error, or even segfault. It is possible to figure out some kind of failure detection mechanism and work around this, but the options are each more hackish and/or unreliable than the last - and you still have to replace the now-dead worker and figure out what job(s) to hand it - a process which is likely to take longer than the original fork itself.

I want to also point out, albeit rather belatedly, that error_log() has a certain amount of overhead all its own, namely that it writes to the disk. Solid state drives and RAM disks are about the only places where this overhead isn't going to be particularly noticeable, but even then I wouldn't rely on that for performance. My point being that error_log() is probably a bad function to use to gauge the speed of anything other than your target disk and its accompanying filesystem. The best bet, in fact, is to use a truly empty perform() method, one with literally nothing in it, as this will most accurately gauge the speed of PHP-Resque itself.
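
A benchmark job along those lines might look something like this (class and queue names are placeholders; the enqueue call assumes php-resque's Resque::enqueue API):

```php
<?php
// A do-nothing job for measuring php-resque's own per-job overhead.
class Benchmark_NoopJob
{
    public function perform()
    {
        // intentionally empty: any time spent here is framework/fork cost
    }
}

// Enqueue a batch, then time how long a single worker takes to drain it.
for ($i = 0; $i < 1000; $i++) {
    Resque::enqueue('benchmark', 'Benchmark_NoopJob', array());
}
```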

Something else that will help is having a PHP binary that doesn't contain (and a config that doesn't load) any features you aren't going to use in your workers. So will anything else that limits the amount of memory consumed per process. Each fork duplicates the parent process's memory space (copy-on-write on modern kernels, but there is still per-page bookkeeping), so the less you load, the less the OS needs to copy, and the less time it takes to do so.
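
One quick way to sanity-check that footprint before tuning anything (nothing php-resque specific here):

```php
<?php
// Print what the parent worker is carrying around before any fork happens;
// this is roughly what each fork has to duplicate (copy-on-write aside).
printf(
    "extensions loaded: %d, peak memory: %.1f MB\n",
    count(get_loaded_extensions()),
    memory_get_peak_usage(true) / 1048576
);
```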

I'm guessing you already knew most of that, though, so I hope you take the bits you knew as advice for others who didn't. :)

@Dynom

Dynom commented Aug 30, 2013

I know that your initial post is about two years old, but since it's still open... I'm curious what you came up with since then, @kballenegger.

Another possibility I'm considering is to not fork at all and do some alternate process control (like you can with Supervisord). That removes a tremendous amount of overhead. I have jobs of around 100ms, which is just too slow; shaving 50ms off would save a couple of servers.

@kballenegger
Author

@Dynom I kind of gave up; making nice things in PHP is too hard / impossible. These days I build everything in Ruby or C.

@Rockstar04
Contributor

I agree with @Dynom, as far as Resque itself goes, since it was written in Ruby first. I am going to try to run my daemon and jobs in Ruby, but leave the app code itself in PHP, since we would never get approval to port all of our apps to Ruby.

@Dynom

Dynom commented Aug 31, 2013

An alternative approach might be a solution like this: https://github.com/salimane/php-resque-jobs-per-fork. This idea introduces (at least) the following:

Cons:

  • It introduces potential problems with context, memory, etc., as will any implementation that doesn't fork for every job.

Pros:

  • It can significantly speed up jobs by spreading the fork overhead over N jobs, while still allowing a "clean up" after every N jobs (see the sketch after this list).
  • It's also still fairly "robust", since the workers still fork: on a fatal error only the fork dies, not the main process. A mild "pro", but still :-)
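
The core of the jobs-per-fork idea is roughly the loop below (a sketch only, not the actual php-resque-jobs-per-fork code; reserve_job and process_job are placeholders for the real queue calls):

```php
<?php
// Sketch of a fork-every-N-jobs worker loop (illustrative only).
$jobsPerFork = 100;

while (true) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die('fork failed');
    }

    if ($pid === 0) {
        // Child: process up to N jobs, then exit so memory/state is reclaimed.
        for ($i = 0; $i < $jobsPerFork; $i++) {
            $job = reserve_job();    // placeholder: pop the next job off the queue
            if ($job === null) {
                break;               // queue is empty
            }
            process_job($job);       // placeholder: run the job's perform()
        }
        exit(0);
    }

    // Parent: wait for the child; if it dies badly, only that fork is lost.
    pcntl_waitpid($pid, $status);
}
```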

@Dynom

Dynom commented Aug 31, 2013

I've done some tests and I have very promising results. Heavy jobs that used to finish in 200ms now finish in 30ms, and other jobs have very similar results. As soon as I find some time I will make a PR, and I hope that @chrisboulton will merge it in asap.

@Dynom

Dynom commented Aug 31, 2013

I've done some tests and I've seen some promising results. Large jobs taking 200ms now take 35ms. I've created PR #130, which is an up-to-date implementation of Salimane's work here: https://github.com/salimane/php-resque-jobs-per-fork

This PR introduces an environment variable, JOBS_PER_FORK, that defines the number of jobs processed per fork. A larger value introduces slightly more risk (in case of errors or unexpected dependencies), but also reduces overhead quite a bit. I'm not sure yet what the optimum value is, but that will be different for each job.
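
As a usage sketch (assuming the JOBS_PER_FORK variable from the PR and php-resque's usual environment-variable style of configuration):

```php
// e.g. started as: QUEUE=default JOBS_PER_FORK=100 php resque.php
$jobsPerFork = (int) getenv('JOBS_PER_FORK');
if ($jobsPerFork < 1) {
    $jobsPerFork = 1; // fall back to the original fork-per-job behaviour
}
```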

@kballenegger
Author

@Dynom - In case you're interested, I have a fork / pull request for this already (#35). Never got merged in though:

#35

@Dynom

Dynom commented Aug 31, 2013

Hi @kballenegger, sorry I did not see your PR. Is it still running successfully in production?

@kballenegger
Author

@Dynom - We don't use Resque in our production systems any longer; we moved to more durable queuing & event stream systems as we grew even further in scale (combination of SQS, Rabbit, & Kafka, nowadays).

From what I recall, however, the fork worked great. We ran it with quite a bit of traffic for a while.

@Dynom

Dynom commented Aug 31, 2013

OK. I actually think this won't hold for long, and I'm already looking at alternatives. But we need some improvements now, so I'll switch to this fix using either PR and make some time to investigate alternatives. SQS is not an option, and I'm unsure about Rabbit.

I liked the Resque approach because it gives me great reliability in the queue, instead of having SPOFs with mindless brokers and the like. How do you handle that?
