String#chars is slow #2670

Open
lopopolo opened this issue Dec 26, 2023 · 0 comments
Labels
A-performance Area: Performance improvements and optimizations. A-ruby-core Area: Ruby Core types.

Comments

@lopopolo

String#chars is slow for both ASCII-only UTF-8 Strings and binary Strings.

artichoke@0dd6a602bac4:/$ time artichoke -e 'x = "abc" * 10_000_000' -e 'puts x.chars.length'
30000000

real    0m5.372s
user    0m4.749s
sys     0m0.622s
artichoke@0dd6a602bac4:/$ time artichoke -e 'x = "abc".b * 10_000_000' -e 'puts x.chars.length'
30000000

real    0m5.405s
user    0m4.765s
sys     0m0.641s

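On MRI the binary String case uses a little over half the user CPU time (2.75s vs. 4.77s), though total wall-clock time is closer (4.59s vs. 5.41s):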

$ time ruby -e 'x = "abc".b * 10_000_000' -e 'puts x.chars.length'
30000000
ruby -e 'x = "abc".b * 10_000_000' -e 'puts x.chars.length'  2.75s user 1.83s system 99% cpu 4.585 total

For cases where a String is ASCII compatible, this operation doesn't need to go through the iterator (and especially doesn't need to go through UTF-8 decoding). It should be quicker to do a copy and map over individual bytes to allocate the many String objects.
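A minimal sketch of that fast path, under stated assumptions: the `Encoding` enum and the `chars` and `chars_utf8` functions below are hypothetical names for illustration, not the existing `spinoso_string` API, and each character is modeled as a plain `Vec<u8>` rather than a Ruby String object. Emitting invalid UTF-8 bytes one at a time in the slow path is a simplification and may not match MRI in every case.

```rust
/// Hypothetical encoding tag for illustration; not the real `spinoso_string` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Utf8,
    Ascii,
    Binary,
}

/// Split `bytes` into one owned buffer per character.
pub fn chars(bytes: &[u8], encoding: Encoding) -> Vec<Vec<u8>> {
    let single_byte = match encoding {
        // ASCII and binary Strings always have one byte per character.
        Encoding::Ascii | Encoding::Binary => true,
        // A UTF-8 String qualifies only if every byte is ASCII.
        Encoding::Utf8 => bytes.is_ascii(),
    };
    if single_byte {
        // Fast path: map over the bytes directly, with no UTF-8 decoding.
        return bytes.iter().map(|&b| vec![b]).collect();
    }
    chars_utf8(bytes)
}

/// Slow path: decode UTF-8, emitting each invalid byte as its own character.
fn chars_utf8(mut bytes: &[u8]) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    while !bytes.is_empty() {
        match std::str::from_utf8(bytes) {
            Ok(s) => {
                out.extend(s.chars().map(|c| c.to_string().into_bytes()));
                break;
            }
            Err(err) => {
                let (valid, rest) = bytes.split_at(err.valid_up_to());
                let s = std::str::from_utf8(valid).expect("prefix was validated");
                out.extend(s.chars().map(|c| c.to_string().into_bytes()));
                // `error_len` is `None` when the input ends mid-sequence.
                let bad = err.error_len().unwrap_or(rest.len());
                out.extend(rest[..bad].iter().map(|&b| vec![b]));
                bytes = &rest[bad..];
            }
        }
    }
    out
}
```

The single `is_ascii` scan over a UTF-8 String is cheap relative to the per-character allocations, so checking it on every call is likely acceptable. Caching the result on the String would be a further, separate optimization.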

This probably requires a new API on spinoso_string::String, and for the split between owned and borrowed encoded strings to land for the binary and ASCII encodings.

@lopopolo added the A-ruby-core and A-performance labels on Dec 26, 2023