Optimize File.dirname for common encodings#15907
Merged
byroot merged 6 commits intoruby:masterfrom Jan 20, 2026
Merged
Conversation
- `str_null_check` was performed twice, once by `FilePathStringValue` and a second time by `StringValueCStr`. - `StringValueCStr` was checking for the terminator presence, but we don't care about that. - `FilePathStringValue` calls `rb_str_new_frozen` to ensure `fname` isn't mutated, but that's costly for such a check. Instead we can do it in debug mode only. - `rb_enc_get` is slow because it accepts arbitrary objects, even immediates, so it has to do numerous type checks. Add a much faster `rb_str_enc_get` when we know we're dealing with a string. - `rb_enc_copy` is slow for the same reasons, since we already have the encoding, we can use `rb_enc_str_new` instead.
784d01a to
6d20a5f
Compare
…dings `strrdirsep` quite innficiently search for the last separator from the front of the string. This is surprising but necessary because in Shift-JS, `0x5c` can be the second byte of some multi-byte characters, as such it's not possible to do a pure ASCII search. And it's even more costly because for each character we need to do expensive checks to handle this possibility. However in the overwhelming majority of cases, paths are encoded in UTF-8 or ASCII, so for these common encodings we can use the more logical and efficient algorithm. ``` compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71) +PRISM [arm64-darwin25] built-ruby: ruby 4.1.0dev (2026-01-19T07:43:57Z file-dirname-lower.. a8d3535e5b) +PRISM [arm64-darwin25] ``` | |compare-ruby|built-ruby| |:------|-----------:|---------:| |long | 3.974M| 23.674M| | | -| 5.96x| |short | 15.281M| 29.034M| | | -| 1.90x|
It's both simpler and faster. | |compare-ruby|built-ruby| |:------|-----------:|---------:| |long | 3.960M| 24.072M| | | -| 6.08x| |short | 15.417M| 29.841M| | | -| 1.94x| |n_4 | 3.858M| 18.415M| | | -| 4.77x|
6d20a5f to
b0f4cbf
Compare
Member
Author
|
urgh, damn windows. The crash doesn't give any info whatsoever. I might have to bisect the changes in some way :/ |
b0f4cbf to
75eeff6
Compare
❌ 1/67299 Tests Failedtest/ruby/test_file.rb#test_stat 🛡️ never-failing, but failed now |
75eeff6 to
3bdc7fe
Compare
nobu
reviewed
Jan 19, 2026
8f908c3 to
7861afb
Compare
`rb_encoding *` is defined as `nonnull` so `if (enc)` is optimized out by the compiler. We have to pass a boolean alongside it to avoid crashes.
7861afb to
aab0b13
Compare
nobu
approved these changes
Jan 20, 2026
Member
nobu
left a comment
There was a problem hiding this comment.
Will rb_enc_path_last_separator also be faster by scanning backward with rb_enc_prev_char?
Member
Author
Oh, I didn't know that was a thing, I saw the algorithm I assume it was because scanning backward wasn't possible. The PR is already pretty complex, and multibyte paths are rare, so I will merge like this, but I do plan to give the same treatment to other path methods. |
byroot
added a commit
to byroot/ruby
that referenced
this pull request
Jan 20, 2026
Similar optimizations to the ones performed in rubyGH-15907. - Skip the expensive multi-byte encoding handling for the common encodings that are known to be safe. - Use `CheckPath` to save on copying the argument and only scan it for NULL bytes once. - Create the return string with rb_enc_str_new instead of rb_str_subseq as it's going to be a very small string anyway. This could be optimized a little bit further by searching for both `.` and `dirsep` in one pass, ``` compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19) +PRISM [arm64-darwin25] built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb5043) +PRISM [arm64-darwin25] ``` | |compare-ruby|built-ruby| |:------|-----------:|---------:| |long | 3.655M| 22.749M| | | -| 6.22x| |short | 16.193M| 30.557M| | | -| 1.89x|
byroot
added a commit
to byroot/ruby
that referenced
this pull request
Jan 20, 2026
Similar optimizations to the ones performed in rubyGH-15907. - Skip the expensive multi-byte encoding handling for the common encodings that are known to be safe. - Use `CheckPath` to save on copying the argument and only scan it for NULL bytes once. - Create the return string with rb_enc_str_new instead of rb_str_subseq as it's going to be a very small string anyway. This could be optimized a little bit further by searching for both `.` and `dirsep` in one pass, ``` compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19) +PRISM [arm64-darwin25] built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb5043) +PRISM [arm64-darwin25] ``` | |compare-ruby|built-ruby| |:----------|-----------:|---------:| |long | 3.606M| 22.229M| | | -| 6.17x| |long_name | 2.254M| 13.416M| | | -| 5.95x| |short | 16.488M| 29.969M| | | -| 1.82x|
byroot
added a commit
that referenced
this pull request
Jan 20, 2026
Similar optimizations to the ones performed in GH-15907. - Skip the expensive multi-byte encoding handling for the common encodings that are known to be safe. - Use `CheckPath` to save on copying the argument and only scan it for NULL bytes once. - Create the return string with rb_enc_str_new instead of rb_str_subseq as it's going to be a very small string anyway. This could be optimized a little bit further by searching for both `.` and `dirsep` in one pass, ``` compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19) +PRISM [arm64-darwin25] built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb5043) +PRISM [arm64-darwin25] ``` | |compare-ruby|built-ruby| |:----------|-----------:|---------:| |long | 3.606M| 22.229M| | | -| 6.17x| |long_name | 2.254M| 13.416M| | | -| 5.95x| |short | 16.488M| 29.969M| | | -| 1.82x|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes: #15902
strrdirsepquite ineficiently search for the last separator from the front of the string.This is surprising but necessary because in Shift-JS,
0x5c(backslash) can be the second byte of some multi-byte characters, (e.g."ソ".encode(Encoding::SHIFT_JIS).bis"\x83\\"), as such it's not possible to do a pure ASCII search.And it's even more costly because for each character we need to do expensive checks to handle this possibility, so the
Incmacro end up being a major hotspot.However in the overwhelming majority of cases, paths are encoded in UTF-8 or ASCII, so for these common encodings we can use the more logical and efficient algorithm.
This change also make
strrdirsepsuitable whenn > 1, so it helps simplify thedirname_nfunction.This PR also reduce the cost of
dirnamesanity checks of the provided path:str_null_checkwas performed twice, once byFilePathStringValueand a second time byStringValueCStr.StringValueCStrwas checking for the terminator presence, but we don't care about that.FilePathStringValuecallsrb_str_new_frozento ensurefnameisn't mutated, but that's costly for such a check. Instead we can do it in debug mode only.rb_enc_getis slow because it accepts arbitrary objects, even immediates, so it has to do numerous type checks. Add a much fasterrb_str_enc_getwhen we know we're dealing with a string.rb_enc_copyis slow for the same reasons, since we already have the encoding, we can userb_enc_str_newinstead.Many of these optimization could be applied to other path manipulation methods, but I'd rather do that in a followup.