Optimize File.dirname for common encodings #15907

Merged
byroot merged 6 commits into ruby:master from byroot:file-dirname-opt-2
Jan 20, 2026

@byroot byroot commented Jan 19, 2026

Closes: #15902

strrdirsep quite inefficiently searches for the last separator from the front of the string.

This is surprising but necessary because in Shift_JIS, 0x5c (backslash) can be the second byte of some multi-byte characters (e.g. "ソ".encode(Encoding::SHIFT_JIS).b is "\x83\\"), so a pure ASCII search isn't possible.
It's even more costly because each character needs expensive checks to handle this possibility, so the Inc macro ends up being a major hotspot.
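
To make the Shift_JIS problem concrete, here is a small Ruby sketch (an illustration only, not part of the patch). The variable names are made up. It shows a byte-level search for a backslash reporting a false positive, while a character-aware search does not:

```ruby
# "\u30BD" is "ソ" (katakana SO); in Shift_JIS it is the two bytes 0x83 0x5c.
so = "\u30BD".encode(Encoding::SHIFT_JIS)
backslash = "\\".encode(Encoding::SHIFT_JIS)

trail_byte = so.bytes.last               # 0x5c, the same value as an ASCII backslash
byte_level_hit = so.b.include?("\\")     # true: the raw bytes contain 0x5c
char_level_hit = so.include?(backslash)  # false: no backslash *character* is present
```

This is why the existing code has to decode characters one by one instead of just looking for the byte.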

However in the overwhelming majority of cases, paths are encoded in UTF-8 or ASCII, so for these common encodings we can use the more logical and efficient algorithm.
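
One way to picture the two code paths is the Ruby sketch below. It is a hypothetical illustration, not the C code in file.c: the helper name, the `BYTE_SAFE_ENCODINGS` list and the `sep` parameter are all assumptions made for the example.

```ruby
# Hypothetical illustration; names and the encoding list are assumptions,
# not the C implementation in file.c.
BYTE_SAFE_ENCODINGS = [Encoding::UTF_8, Encoding::US_ASCII].freeze

# Returns the byte offset of the last separator, or nil if there is none.
def last_separator_offset(path, sep = "/")
  if BYTE_SAFE_ENCODINGS.include?(path.encoding)
    # Fast path: in these encodings an ASCII byte never appears inside a
    # multi-byte character, so a plain byte search from the end is correct.
    path.b.rindex(sep)
  else
    # Slow path: walk whole characters from the front so that a trail byte
    # equal to the separator is never mistaken for one.
    last = nil
    offset = 0
    path.each_char do |char|
      last = offset if char == sep
      offset += char.bytesize
    end
    last
  end
end

utf8_path = "/usr/lib/ruby"
sjis_path = "dir\\\u30BD.txt".encode(Encoding::SHIFT_JIS) # "dir\ソ.txt"

last_separator_offset(utf8_path)        # => 8
last_separator_offset(sjis_path, "\\")  # => 3
sjis_path.b.rindex("\\")                # => 5, inside "ソ": the wrong answer
```

The slow path costs work per character, which matches the hotspot described above; the fast path only touches bytes.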

This change also makes strrdirsep suitable when n > 1, so it helps simplify the dirname_n function.
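
For context, `n` here is assumed to correspond to the optional level argument of `File.dirname` (available since Ruby 3.1). A short usage sketch with a made-up path:

```ruby
path = "/usr/local/lib/ruby/gems"

File.dirname(path)     # => "/usr/local/lib/ruby"
File.dirname(path, 2)  # => "/usr/local/lib"

# Stripping n levels at once matches n successive single-level calls.
same = File.dirname(path, 3) == File.dirname(File.dirname(File.dirname(path)))
```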

This PR also reduces the cost of the sanity checks dirname performs on the provided path:

  • str_null_check was performed twice, once by FilePathStringValue and a second time by StringValueCStr.
  • StringValueCStr was checking for the terminator presence, but we don't care about that.
  • FilePathStringValue calls rb_str_new_frozen to ensure fname isn't mutated, but that's costly for such a check. Instead we can do it in debug mode only.
  • rb_enc_get is slow because it accepts arbitrary objects, even immediates, so it has to do numerous type checks. Add a much faster rb_str_enc_get when we know we're dealing with a string.
  • rb_enc_copy is slow for the same reasons, since we already have the encoding, we can use rb_enc_str_new instead.
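
From the Ruby side, the checks in the list above are visible as the following behaviour. This is a sketch with made-up paths; it shows what the checks guarantee, not how the C code performs them:

```ruby
# A path containing a NUL byte is rejected with ArgumentError.
nul_error = nil
begin
  File.dirname("foo\0bar/baz")
rescue ArgumentError => e
  nul_error = e
end

# The result carries the encoding of the argument.
sjis_path = "dir/file.txt".encode(Encoding::SHIFT_JIS)
result = File.dirname(sjis_path)
result_encoding = result.encoding  # Encoding::SHIFT_JIS
```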

Many of these optimizations could be applied to other path manipulation methods, but I'd rather do that in a followup.

compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71eaf) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-19T08:03:42Z file-dirname-opt-2 784d01ad99) +PRISM [arm64-darwin25]

|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long   |      4.027M|   24.204M|
|       |           -|     6.01x|
|short  |     15.548M|   28.812M|
|       |           -|     1.85x|
|n_4    |      3.845M|   19.359M|
|       |           -|     5.03x|
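
A rough way to reproduce this kind of measurement without extra tooling is sketched below. The helper and the paths are assumptions for illustration; they are not the benchmark inputs behind the numbers above, so absolute figures will differ.

```ruby
# Rough iterations-per-second measurement. The paths below are made up
# and are not the inputs used for the published numbers.
def iterations_per_second(duration: 0.2, batch: 1_000)
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  count = 0
  start = clock.call
  while clock.call - start < duration
    batch.times { yield }
    count += batch
  end
  count / (clock.call - start)
end

long_path  = "/Users/example/projects/ruby/lib/rubygems/core_ext/kernel_require.rb"
short_path = "foo/bar"

puts format("long:  %.3fM i/s", iterations_per_second { File.dirname(long_path) } / 1e6)
puts format("short: %.3fM i/s", iterations_per_second { File.dirname(short_path) } / 1e6)
puts format("n_4:   %.3fM i/s", iterations_per_second { File.dirname(long_path, 4) } / 1e6)
```

Running the same script under both builds gives a comparable ratio, although a dedicated benchmark harness will be more accurate.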

- `str_null_check` was performed twice, once by `FilePathStringValue`
  and a second time by `StringValueCStr`.
- `StringValueCStr` was checking for the terminator presence, but we
  don't care about that.
- `FilePathStringValue` calls `rb_str_new_frozen` to ensure `fname`
  isn't mutated, but that's costly for such a check. Instead we
  can do it in debug mode only.
- `rb_enc_get` is slow because it accepts arbitrary objects, even immediates,
  so it has to do numerous type checks. Add a much faster `rb_str_enc_get`
  when we know we're dealing with a string.
- `rb_enc_copy` is slow for the same reasons, since we already have the
  encoding, we can use `rb_enc_str_new` instead.
…dings

`strrdirsep` quite inefficiently searches for the last separator from the front
of the string.

This is surprising but necessary because in Shift_JIS, `0x5c` can
be the second byte of some multi-byte characters, as such it's
not possible to do a pure ASCII search. And it's even more costly
because for each character we need to do expensive checks to
handle this possibility.

However in the overwhelming majority of cases, paths are encoded
in UTF-8 or ASCII, so for these common encodings we can use the
more logical and efficient algorithm.

```
compare-ruby: ruby 4.1.0dev (2026-01-17T14:40:03Z master 00a3b71) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-19T07:43:57Z file-dirname-lower.. a8d3535e5b) +PRISM [arm64-darwin25]
```

|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long   |      3.974M|   23.674M|
|       |           -|     5.96x|
|short  |     15.281M|   29.034M|
|       |           -|     1.90x|

It's both simpler and faster.

|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long   |      3.960M|   24.072M|
|       |           -|     6.08x|
|short  |     15.417M|   29.841M|
|       |           -|     1.94x|
|n_4    |      3.858M|   18.415M|
|       |           -|     4.77x|

@byroot byroot force-pushed the file-dirname-opt-2 branch from 6d20a5f to b0f4cbf on January 19, 2026 08:49

byroot commented Jan 19, 2026

urgh, damn windows. The crash doesn't give any info whatsoever. I might have to bisect the changes in some way :/

@byroot byroot force-pushed the file-dirname-opt-2 branch from b0f4cbf to 75eeff6 on January 19, 2026 10:42

launchable-app bot commented Jan 19, 2026

1/67299 Tests Failed

test/ruby/test_file.rb#test_stat 🛡️ never-failing, but failed now
Failure:
TestFile#test_stat [/tmp/_actions-runner-working-dir/ruby/ruby/src/test/ruby/test_file.rb:412]:
Expected |1768840877.9465432 - 1768840874.3029788| (3.643564462661743) to be <= 1.


@byroot byroot force-pushed the file-dirname-opt-2 branch from 75eeff6 to 3bdc7fe on January 19, 2026 12:08

@byroot byroot force-pushed the file-dirname-opt-2 branch 3 times, most recently from 8f908c3 to 7861afb, on January 19, 2026 16:19

`rb_encoding *` is defined as `nonnull` so `if (enc)` is optimized
out by the compiler. We have to pass a boolean alongside it to
avoid crashes.

@byroot byroot force-pushed the file-dirname-opt-2 branch from 7861afb to aab0b13 on January 19, 2026 16:34
@byroot byroot requested a review from nobu January 19, 2026 17:14

@nobu nobu left a comment

Will rb_enc_path_last_separator also be faster by scanning backward with rb_enc_prev_char?

byroot commented Jan 20, 2026

> Will rb_enc_path_last_separator also be faster by scanning backward with rb_enc_prev_char?

Oh, I didn't know that was a thing. When I saw the algorithm, I assumed it was written that way because scanning backward wasn't possible.

The PR is already pretty complex, and multibyte paths are rare, so I will merge like this, but I do plan to give the same treatment to other path methods.

@byroot byroot merged commit 6fb5043 into ruby:master Jan 20, 2026
100 of 101 checks passed
byroot added a commit to byroot/ruby that referenced this pull request Jan 20, 2026
Similar optimizations to the ones performed in GH-15907.

- Skip the expensive multi-byte encoding handling for the common
  encodings that are known to be safe.
- Use `CheckPath` to save on copying the argument and only scan it for
  NULL bytes once.
- Create the return string with `rb_enc_str_new` instead of `rb_str_subseq`
  as it's going to be a very small string anyway.

This could be optimized a little bit further by searching for both `.` and `dirsep`
in one pass.

```
compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb5043) +PRISM [arm64-darwin25]
```

|       |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|long   |      3.655M|   22.749M|
|       |           -|     6.22x|
|short  |     16.193M|   30.557M|
|       |           -|     1.89x|

byroot added a commit that referenced this pull request Jan 20, 2026
Similar optimizations to the ones performed in GH-15907.

- Skip the expensive multi-byte encoding handling for the common
  encodings that are known to be safe.
- Use `CheckPath` to save on copying the argument and only scan it for
  NULL bytes once.
- Create the return string with `rb_enc_str_new` instead of `rb_str_subseq`
  as it's going to be a very small string anyway.

This could be optimized a little bit further by searching for both `.` and `dirsep`
in one pass.
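
The commit message does not say how a single-pass search would look. One possible shape is sketched below in Ruby as a hypothetical helper. It assumes `/` is the only separator, is only valid for encodings where ASCII bytes never occur inside a multi-byte character, and ignores edge cases such as dotfiles:

```ruby
# Hypothetical helper: one backward pass over the bytes that stops at the
# last directory separator and remembers the last "." seen after it.
def last_dot_after_separator(path, sep = "/")
  sep_byte = sep.ord
  dot_byte = ".".ord
  dot = nil
  (path.bytesize - 1).downto(0) do |i|
    byte = path.getbyte(i)
    break if byte == sep_byte    # nothing before the separator matters
    dot ||= i if byte == dot_byte # first dot met from the end is the last dot
  end
  dot
end

last_dot_after_separator("lib.d/file.tar.gz")  # => 14
last_dot_after_separator("lib.d/file")         # => nil
```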

```
compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb5043) +PRISM [arm64-darwin25]
```

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long       |      3.606M|   22.229M|
|           |           -|     6.17x|
|long_name  |      2.254M|   13.416M|
|           |           -|     5.95x|
|short      |     16.488M|   29.969M|
|           |           -|     1.82x|