Optimize File.extname for common encodings #15912

byroot merged 1 commit into ruby:master from byroot:opt-file-extname on Jan 20, 2026

byroot commented Jan 20, 2026

Similar optimizations to the ones performed in GH-15907.

  • Skip the expensive multi-byte encoding handling for common encodings that are known to be safe.
  • Use `CheckPath` to avoid copying the argument and to scan it for NUL bytes only once.
  • Create the return string with `rb_enc_str_new` instead of `rb_str_subseq`, since it is going to be a very small string anyway.
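
To illustrate the first point, here is a small Ruby sketch (not code from this PR) of why a plain byte scan is safe in some encodings but not in others. The PR text does not list which encodings count as "safe"; UTF-8 and Windows-31J are used below purely as assumed examples.

```ruby
# In UTF-8, every byte of a multi-byte character is >= 0x80, so an ASCII
# byte such as "." (0x2E) or "/" (0x2F) can never appear inside one.
utf8_path = "r\u00E9sum\u00E9.txt"
multibyte_chars = utf8_path.each_char.select { |c| c.bytesize > 1 }
utf8_is_byte_safe = multibyte_chars.all? { |c| c.bytes.all? { |b| b >= 0x80 } }

# In Windows-31J (a Shift_JIS variant), the second byte of a character can
# fall in the ASCII range: U+8868 encodes as 0x95 0x5C, and 0x5C is a backslash.
sjis_char = "\u8868".encode("Windows-31J")
sjis_trail_is_backslash = sjis_char.bytes.last == 0x5C

puts "UTF-8 byte scan safe for this path: #{utf8_is_byte_safe}"
puts "Windows-31J trail byte is a backslash: #{sjis_trail_is_backslash}"
```

A byte-wise search for a separator would misfire on the second string, which is the kind of input the slower multi-byte handling exists for.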

This could be optimized a little bit further by searching for both `.` and `dirsep` in one pass.

compare-ruby: ruby 4.1.0dev (2026-01-19T03:51:30Z master 631bf19b37) +PRISM [arm64-darwin25]
built-ruby: ruby 4.1.0dev (2026-01-20T07:33:42Z master 6fb50434e3) +PRISM [arm64-darwin25]

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long       |      3.606M|   22.229M|
|           |           -|     6.17x|
|long_name  |      2.254M|   13.416M|
|           |           -|     5.95x|
|short      |     16.488M|   29.969M|
|           |           -|     1.82x|
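
For anyone wanting a rough local comparison, a minimal timing sketch follows. It is not the benchmark used for the numbers above: the three paths are hypothetical stand-ins for the `long`, `long_name`, and `short` cases (the actual inputs are not shown in this PR), and the helper `iterations_per_second` is made up for illustration.

```ruby
# Hypothetical inputs; the real benchmark paths are not given in the PR.
PATHS = {
  "long"      => "/" + (["directory"] * 10).join("/") + "/file.rb",
  "long_name" => "/tmp/" + ("x" * 200) + ".rb",
  "short"     => "foo.rb",
}

# Time a fixed number of File.extname calls with the monotonic clock
# and return the resulting calls-per-second rate.
def iterations_per_second(path, iterations: 200_000)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { File.extname(path) }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  iterations / elapsed
end

PATHS.each do |label, path|
  rate = iterations_per_second(path)
  puts format("%-10s %10.3fM calls/s", label, rate / 1_000_000.0)
end
```

Running the same script under two Ruby builds gives a crude before/after comparison; a dedicated benchmarking tool will produce steadier numbers than this loop.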


byroot commented Jan 20, 2026

> This could be optimized a little bit further by searching for both `.` and `dirsep` in one pass.

I tried it:

|           |compare-ruby|built-ruby|
|:----------|-----------:|---------:|
|long       |      3.544M|   26.261M|
|           |           -|     7.41x|
|long_name  |      2.286M|   27.529M|
|           |           -|    12.04x|
|short      |     16.385M|   29.478M|
|           |           -|     1.80x|

However, there are many corner cases to consider, some of them Windows-only, so I'd rather ship the sure thing now and potentially come back to it later.
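
As a rough picture of what "one pass" means, here is a simplified Ruby sketch, not the C code that was tried. The method name `extname_one_pass` is hypothetical, and the sketch assumes `/` is the only separator and the path is in an ASCII-compatible encoding where a byte scan is safe. It deliberately ignores the corner cases mentioned above (trailing separators, names made only of dots, Windows drive letters and alternate separators).

```ruby
# Scan backwards once, remembering the last "." seen and stopping at the
# last "/", instead of searching for the separator and the dot separately.
def extname_one_pass(path)
  bytes = path.b
  i = bytes.bytesize - 1
  dot = nil
  while i >= 0
    byte = bytes.getbyte(i)
    break if byte == 0x2F                 # "/" ends the last path component
    dot = i if byte == 0x2E && dot.nil?   # remember only the rightmost "."
    i -= 1
  end
  base_start = i + 1
  # No dot at all, or the dot is the first character (a dotfile): no extension.
  return "" if dot.nil? || dot == base_start
  path.byteslice(dot, path.bytesize - dot)
end

%w[test.rb a/b/d/test.rb .a/b/d/test test .profile .profile.sh].each do |path|
  puts "#{path.inspect} -> #{extname_one_pass(path).inspect}"
end
```

On these simple inputs the sketch agrees with `File.extname`; the difficulty lies in the inputs it does not handle.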

byroot enabled auto-merge (rebase) January 20, 2026 08:58
byroot merged commit 53fe993 into ruby:master Jan 20, 2026
94 of 96 checks passed
byroot deleted the opt-file-extname branch January 20, 2026 09:22