Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use posix_spawn for proc_open when supported by OS #7933

Closed
wants to merge 1 commit into from

Conversation

crrodriguez
Copy link
Contributor

@crrodriguez crrodriguez commented Jan 11, 2022

Hello folks: long time no see.. I have some local patches that I wish to get off my sleeve..

  • proc_open([""], $descriptorspec, $pipes, $cwd, $env) should result in an error, there must be at least one non-empty element as a program name to execute

  • proc_open should use posix_spawn on newish operating system versions where it is significantly fast and uses less memory than the fork/exec path, this is probably going to be a highly noticeable imrpovement on systems where posix_spawn is a syscall.

@crrodriguez
Copy link
Contributor Author

crrodriguez commented Jan 17, 2022

Anyone can ack/nack this request or give feedback? both changes either include a test or in the case of the proc_open change the whole PHP testsuite counts as a test, as it is built on top of proc_open.

Copy link
Contributor

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Sorry, I can't comment on the actual functionality, but left some mostly CS related comments.

ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/tests/general_functions/proc_open_array.phpt Outdated Show resolved Hide resolved
ext/standard/tests/general_functions/proc_open_array.phpt Outdated Show resolved Hide resolved
@crrodriguez
Copy link
Contributor Author

It would be great if someone can tests this on recent macOS which should see the biggest benefit. I have tested it on linux only with great results.

@crrodriguez
Copy link
Contributor Author

Merging this will not affect old systems and it something breaks the test suite itself will fail on the target platform and everything will crumble down. It will be cool if someone can merge this soon.
Cheers.

@bukka bukka self-assigned this May 7, 2022
@bukka
Copy link
Member

bukka commented May 7, 2022

I will need to look more to the code but could you share those "great results". I mean have you got some benchmark showing noticable difference?

@crrodriguez
Copy link
Contributor Author

I will need to look more to the code but could you share those "great results". I mean have you got some benchmark showing noticable difference?

at least the PHP test suite itself runs several seconds faster with this patch applied..I will have to try a different benchmark or tool to prove it to you.

I also only did this for proc_open because glibc already implements popen(),(used by the mail(), popen(), passthru() and so on) system() , wordexp() with posix_spawn this very same way for the very same reason. Also Java, python and so on have done so.

@bukka
Copy link
Member

bukka commented May 7, 2022

Just to clarify, I'm trying to figure out the main advantages of this approach and if it's something measurable especially on Linux. To be honest I have never used posix spawn and from reading the Linux manual pages, it doesn't seem like a recommended approach but I might be missing something. If we really need to make it fast, we could just use clone which is used underneath those functions directly but I wonder if perf really matter that much here..?

@bukka
Copy link
Member

bukka commented May 7, 2022

at least the PHP test suite itself runs several seconds faster with this patch applied.

Ok that's interesting and might be certainly worth it. I will give it a try and take a proper look on the changes.

@crrodriguez
Copy link
Contributor Author

Just to clarify, I'm trying to figure out the main advantages of this approach and if it's something measurable especially on Linux. To be honest I have never used posix spawn and from reading the Linux manual pages, it doesn't seem like a recommended approach but I might be missing something. If we really need to make it fast, we could just use clone which is used underneath those functions directly but I wonder if perf really matter that much here..?

I strongly suggest against going the clone() way.. that will just make everything more complicated.. if the C library has a syscall wrappers to use it, it is not what you really want.. so you have to use the raw syscall, depending which version do you want use, calling conventions vary between architectures or kernel versions.

Other than been faster as the process size increases, it will also use less memory... oh and all the parent runs first races are gone.. since posix_spawn is always child_first..

@crrodriguez
Copy link
Contributor Author

https://bugs.python.org/issue35537 about the tests done by python developers (the comment that posix_spawn on linux uses vfork is wrong.. it uses an implementation defined clone3 CLONE_VFORK|CLONE_VM and needs to be updated to CLONE_CLEAR_SIGHAND as well to skip some extra work)

@bukka
Copy link
Member

bukka commented May 11, 2022

Thanks for the link - very useful reading. Perf difference is quite impressive. As you are aware there are some subtle differences that we need to properly consider. Especially I would like to know if the signal handling can somehow impact FPM so this needs to be properly analysed.

ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
@crrodriguez
Copy link
Contributor Author

Thanks for the link - very useful reading. Perf difference is quite impressive. As you are aware there are some subtle differences that we need to properly consider. Especially I would like to know if the signal handling can somehow impact FPM so this needs to be properly analysed.

I am not aware of any issue you don't already have, as any other code using system(), popen() , wordexp() will use this interface from within the C library and will be subject to the same or similar signal handling rules.
On OSs where this is a syscall like freebsd or MacosX the kernel will take whatever shortcut it sees fit to run the child as fast as possible with minimal overhead

@devnexen
Copy link
Member

It would be great if someone can tests this on recent macOS which should see the biggest benefit. I have tested it on linux only with great results.

I gave a shot at your branch on my macOs, it works fine but did not notice much differences perf wise.

@crrodriguez
Copy link
Contributor Author

It would be great if someone can tests this on recent macOS which should see the biggest benefit. I have tested it on linux only with great results.

I gave a shot at your branch on my macOs, it works fine but did not notice much differences perf wise.

did the configure test succeded though.. grep HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCHDIR_NP main/php_config.h ?

@devnexen
Copy link
Member

absolutely.

ext/standard/proc_open.c Outdated Show resolved Hide resolved
@crrodriguez
Copy link
Contributor Author

I'll make some mods to this as looking elsewhere where this is used , (rust, haskell..pretty much everywhere else) reveals that POSIX_SPAWN_FILE_ACTIONS_ADDCHDIR_NP is available in macos catalina or later..but it does not actually work correctly so it has to be excluded, apparently apple has not fixed it yet.

@ramsey ramsey added this to the PHP 8.2 milestone May 27, 2022
@ramsey
Copy link
Member

ramsey commented May 27, 2022

Hi! What is the status of this PR?

@crrodriguez
Copy link
Contributor Author

Hi! What is the status of this PR?

I'll get back at this in upcoming days

@crrodriguez
Copy link
Contributor Author

crrodriguez commented May 30, 2022

absolutely.

So, what does this return in your macos installation?

<?php
$descriptorspec = array(
   1 => array("pipe", "w"),  // stdout is a pipe that the child will write to
   2 => array("file", "/tmp/error-output.txt", "a") // stderr is a file to write to
);

$cwd = '/tmp';
$process = proc_open('./test.sh', $descriptorspec, $pipes, $cwd);

if (is_resource($process)) {
    echo stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $return_value = proc_close($process);
    echo "command returned $return_value\n";
}
?>

You have to create a valid shell script at /tmp/test,sh and make it executable, it doesn't have to do anything other than returining true..

This is supposed to be broken in MacOS ..and it supposedly returns ENOENT when posix_spawn_file_actions_addchdir_np is used, due to a bug, errno is clobbered by a previous stat call, the program is executed correctly yet it return error. I wasn't able to figure out if this erros has been fixed or not.
It only happens if the command to be executed is a relative path..

@bukka
Copy link
Member

bukka commented Jun 30, 2022

Maybe @devnexen can check this.

@devnexen
Copy link
Member

devnexen commented Jul 1, 2022

absolutely.

So, what does this return in your macos installation?

it s defined

<?php
$descriptorspec = array(
   1 => array("pipe", "w"),  // stdout is a pipe that the child will write to
   2 => array("file", "/tmp/error-output.txt", "a") // stderr is a file to write to
);

$cwd = '/tmp';
$process = proc_open('./test.sh', $descriptorspec, $pipes, $cwd);

if (is_resource($process)) {
    echo stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $return_value = proc_close($process);
    echo "command returned $return_value\n";
}
?>

You have to create a valid shell script at /tmp/test,sh and make it executable, it doesn't have to do anything other than returining true..

This is supposed to be broken in MacOS ..and it supposedly returns ENOENT when posix_spawn_file_actions_addchdir_np is used, due to a bug, errno is clobbered by a previous stat call, the program is executed correctly yet it return error. I wasn't able to figure out if this erros has been fixed or not. It only happens if the command to be executed is a relative path..

Your test case works locally.

@bukka
Copy link
Member

bukka commented Jul 7, 2022

@crrodriguez just a reminder that the feature freeze for PHP 8.2 is approaching fast (it is on 19th July). I think it would be great to get it in once you address all the comments.

@crrodriguez
Copy link
Contributor Author

OK, I believe I have addressed all comments for now.

@github-actions
Copy link

There has not been any recent activity in this PR. It will automatically be closed in 7 days if no further action is taken.

@github-actions github-actions bot added the Stale label Apr 23, 2023
@crrodriguez
Copy link
Contributor Author

It needs new review.

@devnexen devnexen removed the Stale label Apr 23, 2023
@devnexen
Copy link
Member

seems you did indeed, @bukka does it looks good to you now (when you have time) ?

Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went again through all docs and discussions and except few minor things, the code looks good to me.

I have also done some testing on macOS (12.3 Monterey) and tried the previously provided script:

<?php
$descriptorspec = array(
   1 => array("pipe", "w"),  // stdout is a pipe that the child will write to
   2 => array("file", "/tmp/error-output.txt", "a") // stderr is a file to write to
);

$cwd = '/tmp';
$process = proc_open('./test.sh', $descriptorspec, $pipes, $cwd);

if (is_resource($process)) {
    echo stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $return_value = proc_close($process);
    echo "command returned $return_value\n";
}

It worked for me and I saw it worked previously for @devnexen (what version was that actually?). Not sure though if we tested on Catalina (10.15) and Big Sur (11). Our pipeline is on 11 and I see one failure there (it is openssl test that actually uses proc_open IIRC) so it would be good to re-run it and check it properly there. If you are aware there was a bug on previous versions, it might make sense to enable it only on newer versions potentially. It would be also good to create the the test for the script above in slightly modified way so it can be checked in the pipeline.

ext/standard/proc_open.c Outdated Show resolved Hide resolved
ext/standard/proc_open.c Outdated Show resolved Hide resolved
…able

As the size of the PHP process increases, forking gets slower and memory
consumption increases, degrading the performance in varying degrees.

This patch makes proc_open use posix_spawn only on systems which is known to be
safe, faster than the HAVE_FORK path and have posix_spawn_file_actions_addchdir_np(3)
action.
Non scientific benchmark shows running php own's test suite on linux completes
dozens of seconds faster, the impact is probably higher on systems where
posix_spawn is a syscall.
@bukka bukka changed the title Proc open fixes Use posix_spawn for proc_open when supported by OS Jul 13, 2023
@bukka bukka closed this in 5572975 Jul 13, 2023
@crrodriguez crrodriguez deleted the proc_open_fixes branch July 21, 2023 23:33
@ausi
Copy link

ausi commented Nov 2, 2023

This pull request caused a behavior change in proc_open() that was probably not intended (I think): #12589

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants