sh_diropenat(): General improvements#859
Merged
Conversation
- Avoid duplicating the file descriptor from openat when it's already >9 to increase performance and reliability. - Add a feature test which tests the errno resulting from using openat on a non-directory with O_DIRECTORY|O_SEARCH. On illumos (and presumably Solaris) openat can misleadingly fail with EACCES rather than ENOTDIR, so account for that by testing the given directory in question with stat(2). - Add a feature test to see if file descriptors obtained via O_SEARCH or O_PATH can actually be used with fchdir(2). This is also relevant for open(2) because of O_SEARCH's usage in the fallback code in sh_subshell(). - Ensure the close-on-exec bit is set for the file descriptor returned by sh_diropenat(). These changes allow for the elimination of the double openat and helps mitigate the performance loss for cd's performance as observed after the merge of 3d7b7e1: vide ksh93#816 (comment) Results obtained on Linux amd64 using CCFLAGS=-Os. Results for: 'typeset -i i; time for((i=0;i<500000;i++));do cd /dev; cd /;done' ksh-no-openat-9b0b2603: real 0m01.72s ksh-dev-de530132: real 0m03.55s ksh-this-commit: real 0m02.91s ksh-this-commit-no-subshell-inode-test: (same due to no subshells) real 0m02.90s ksh93u+: real 0m02.02s I'll admit cd still isn't as fast as before, yet the trade off is still worthwhile because ksh93 scripts are bound to use subshells (particulary command substitutions) far more often than cd. In the contrived benchmark cd is called 1,000,000 times, which hardly reflects real world scripts. This also doesn't account for the practice of calling cd from within a subshell, which is common. Results for: 'typeset -i i; time for((i=0;i<500000;i++));do (cd /dev); (cd /);done' ksh-no-openat-9b0b2603: real 0m06.73s ksh-dev-de530132: real 0m07.18s ksh-this-commit: real 0m06.34s ksh-this-commit-no-subshell-inode-test: real 0m04.77s ksh93u+: real 0m05.19s Here a small uplift can be seen relative to both of the previous 93u+m iterations, but because of the inode sanity check present since 5ee290c cd is still a bit slower in subshells relative to ksh93u+. (Perhaps in a future commit a faster alternative could be looked into.) Meanwhile the relevant shbench results are about the same (-l 10): ----------------------------------------------------------------------------------------------------------------------------------- name ksh-no-openat-9b0b2603 ksh-dev-de530132 ksh-this-commit ksh-this-commit-no-inode-test ksh93u+ ----------------------------------------------------------------------------------------------------------------------------------- fibonacci.ksh 0.247 [0.240-0.258] 0.227 [0.225-0.233] 0.235 [0.227-0.241] 0.235 [0.232-0.240] 0.258 [0.251-0.264] parens.ksh 0.191 [0.185-0.197] 0.097 [0.095-0.098] 0.097 [0.095-0.100] 0.099 [0.095-0.101] 0.227 [0.221-0.234] -----------------------------------------------------------------------------------------------------------------------------------
|
Looks good to me. The performance improvement for |
McDutchie
pushed a commit
that referenced
this pull request
May 31, 2025
- Avoid duplicating the file descriptor from openat when it's already >9 to increase performance and reliability. - Add a feature test which tests the errno resulting from using openat on a non-directory with O_DIRECTORY|O_SEARCH. On illumos (and presumably Solaris) openat can misleadingly fail with EACCES rather than ENOTDIR, so account for that by testing the given directory in question with stat(2). - Add a feature test to see if file descriptors obtained via O_SEARCH or O_PATH can actually be used with fchdir(2). This is also relevant for open(2) because of O_SEARCH's usage in the fallback code in sh_subshell(). - Ensure the close-on-exec bit is set for the file descriptor returned by sh_diropenat(). These changes allow for the elimination of the double openat and helps mitigate the performance loss for cd's performance as observed after the merge of 3d7b7e1: vide #816 (comment) Results obtained on Linux amd64 using CCFLAGS=-Os. Results for: 'typeset -i i; time for((i=0;i<500000;i++));do cd /dev; cd /;done' ksh-no-openat-9b0b2603: real 0m01.72s ksh-dev-de530132: real 0m03.55s ksh-this-commit: real 0m02.91s ksh-this-commit-no-subshell-inode-test: (same due to no subshells) real 0m02.90s ksh93u+: real 0m02.02s I'll admit cd still isn't as fast as before, yet the trade off is still worthwhile because ksh93 scripts are bound to use subshells (particulary command substitutions) far more often than cd. In the contrived benchmark cd is called 1,000,000 times, which hardly reflects real world scripts. This also doesn't account for the practice of calling cd from within a subshell, which is common. Results for: 'typeset -i i; time for((i=0;i<500000;i++));do (cd /dev); (cd /);done' ksh-no-openat-9b0b2603: real 0m06.73s ksh-dev-de530132: real 0m07.18s ksh-this-commit: real 0m06.34s ksh-this-commit-no-subshell-inode-test: real 0m04.77s ksh93u+: real 0m05.19s Here a small uplift can be seen relative to both of the previous 93u+m iterations, but because of the inode sanity check present since 5ee290c cd is still a bit slower in subshells relative to ksh93u+. (Perhaps in a future commit a faster alternative could be looked into.) Meanwhile the relevant shbench results are about the same (-l 10): ----------------------------------------------------------------------------------------------------------------------------------- name ksh-no-openat-9b0b2603 ksh-dev-de530132 ksh-this-commit ksh-this-commit-no-inode-test ksh93u+ ----------------------------------------------------------------------------------------------------------------------------------- fibonacci.ksh 0.247 [0.240-0.258] 0.227 [0.225-0.233] 0.235 [0.227-0.241] 0.235 [0.232-0.240] 0.258 [0.251-0.264] parens.ksh 0.191 [0.185-0.197] 0.097 [0.095-0.098] 0.097 [0.095-0.100] 0.099 [0.095-0.101] 0.227 [0.221-0.234] -----------------------------------------------------------------------------------------------------------------------------------
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
openatwhen it's already >9 to increase performance and reliability.errnoresulting from usingopenaton a non-directory withO_DIRECTORY|O_SEARCH. On illumos (and presumably Solaris)openatcan misleadingly fail withEACCESrather thanENOTDIR, so account for that by testing the given directory in question withstat(but sincestatis hardly performance-friendly, the feature test is used to ensure this is only compiled in when absolutely necessary).O_SEARCHorO_PATHcan actually be used withfchdir. This is also relevant foropenbecause ofO_SEARCH's usage in the fallback code insh_subshell().sh_diropenat().These changes allow for the elimination of the double
openatand should help mitigate the performance loss for cd's performance as observed after the merge of 3d7b7e1 (assuming the platform supportsF_DUPFD_CLOEXECandO_CLOEXEC):vide #816 (comment)
Results obtained on Linux amd64 using
CCFLAGS=-Os.Results for
typeset -i i; time for((i=0;i<500000;i++));do cd /dev; cd /;doneI'll admit
cdstill isn't as fast as before, yet the trade off is still worthwhile because ksh93 scripts are bound to use subshells (particulary command substitutions) far more often thancd. In the contrived benchmarkcdis called 1,000,000 times, which hardly reflects real world scripts. This also doesn't account for the practice of callingcdfrom within a subshell, which is common.Results for
typeset -i i; time for((i=0;i<500000;i++));do (cd /dev); (cd /);doneHere a small uplift can be seen relative to both of the previous 93u+m iterations, but because of the inode sanity check present since 5ee290c
cdis still a bit slower in subshells relative to ksh93u+. (Perhaps in a future commit a faster alternative could be looked into.)Meanwhile the relevant shbench results are about the same (
-l 10):