Lazily cache UTF-8 character length on RString #4531
Closed
Store the UTF-8 length as computed by `utf8_strlen` on `RString` when `MRB_UTF8_STRING` is defined. This significantly (3200x) speeds up `String` operations at the cost of one `mrb_int` of memory per `RString`.

The UTF-8 `char_len` is computed on the first call to `utf8_strlen` and is cached until `mrb_str_modify` is called, at which point it is zeroed out. `utf8_strlen` gets a new fast path for when the character length is cached and retains the fast path for ASCII strings, which is used as a fallback.

Benchmark
/usr/bin/time -l ./bin/mruby -e 'def str(len); ("💪" * len); end; s = str(100000); puts s.length; 1000000.times { s.length }'
This PR @ ca692e6: completes in 0.30 seconds.
master: completes in 963 seconds.
The caching implementation is 3210x faster than master.
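
For readers who want to see the shape of the caching scheme described in this PR, here is a minimal standalone sketch. It is not the actual mruby code: `demo_str`, `demo_count_chars`, `demo_utf8_strlen`, `demo_str_modify`, and the `scans` counter are hypothetical names made up for illustration, and the ASCII fast path is left out. It assumes, as the description states, that a zeroed `char_len` means "not cached".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RString; this is NOT mruby's real layout. */
typedef struct {
  const char *ptr;      /* UTF-8 encoded bytes */
  size_t len;           /* length in bytes */
  size_t char_len;      /* cached character count; 0 means "not cached" */
  unsigned long scans;  /* number of full scans, kept only for demonstration */
} demo_str;

/* Slow path: count every byte that is not a UTF-8 continuation byte
 * (continuation bytes have the bit pattern 10xxxxxx). */
static size_t demo_count_chars(const char *p, size_t len) {
  size_t n = 0;
  for (size_t i = 0; i < len; i++) {
    if (((unsigned char)p[i] & 0xC0) != 0x80) n++;
  }
  return n;
}

/* Return the character length, using the cached value when present. */
static size_t demo_utf8_strlen(demo_str *s) {
  assert(s != NULL);
  if (s->char_len != 0) return s->char_len;  /* fast path: cache hit */
  s->scans++;
  s->char_len = demo_count_chars(s->ptr, s->len);
  return s->char_len;
}

/* Call before any mutation: drops the cached length by zeroing it. */
static void demo_str_modify(demo_str *s) {
  assert(s != NULL);
  s->char_len = 0;
}
```

With this layout, repeated length queries on an unmodified string cost a single comparison after the first scan, which is the effect the benchmark above measures. One consequence of using zero as the sentinel is that an empty string is never cached, but rescanning zero bytes is free.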