Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC: Use pcntl_fork to run save command from different thread #18

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ivankristianto
Copy link

@ivankristianto ivankristianto commented Apr 18, 2022

This PR is a refresh from the original PR #1 from Joe.

Like the issue mentioned in previous PR from Joe:

the "include_once" flag on locate_template is a blocker for this. get_header etc are called with this, so it's not actually possible right now to call get_header twice in one PHP execution.

And I tried to fork the main process to avoid multiple php include. And it does pretty well.
Currently only apply this to wpcli command to avoid php max execution timeout. And in production environment, mostly this will be done from cavalcade which run on wpcli.

The other challenge i have in mind is how to handle the fallback when the main process get killed?

@joehoyle what is your opinion on this PR?

@ivankristianto ivankristianto changed the title POC: Use pcnt_fork to run save command from different thread POC: Use pcntl_fork to run save command from different thread Apr 18, 2022
@ivankristianto
Copy link
Author

Tested with one of the project on my local with 378 pages.

With remote request:

real    1m49.870s
user    0m5.912s
sys 0m1.308s

With process forking:

real    0m41.125s
user    0m18.312s
sys 0m15.750s

With process forking like this, is faster by ~60%.
But it comes with a lot of database errors like:

[18-Apr-2022 04:20:42 UTC] WordPress database error 2022-04-18 04:20:42 Can't select global__r

Also all custom export doesn't work, like sitemaps and wp-json api json file. Will look into this deeper.

Copy link
Member

@joehoyle joehoyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some thoughts

}
$contents = array_map( __NAMESPACE__ . '\\replace_urls', $contents );

array_map( function( $content, $url ) use ( $args_assoc ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this was added? This CLI command is to just show the HTML output, not save it too.

die( 'could not fork' );
} else if ( $pid ) {
// This is the main process. Put it to wait till all the child process finished.
\pcntl_wait( $status ); //Protect against Zombie children
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to design this loop such that we could spawn multiple children at once? It might be that we can do 4 on the CPU ok, so doing them in parallel could give a good perf speedup.

// Fire 3rd party actions.
$this->do_actions( $url );

save_contents_for_url( $content, $url, $config );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also managed to run this in parallel, it would mean per the above we'd also get some parrallel uploading to netstorage which is nice. It might be technically better to de-couple the capturing output, and saving to netstorage so each can run at maximum page, but for now this should be ok.

die( 'could not fork' );
} else if ( $pid ) {
// This is the main process. Put it to wait till all the child process finished.
\pcntl_wait( $status ); //Protect against Zombie children
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, it seems currently you are just using forking when using CLI commands, as opposed to the cron tasks too. Is that intentional, or just for the POC?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants