New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up String::from_utf16 #55530

Merged
merged 1 commit into from Nov 15, 2018

Conversation

Projects
None yet
10 participants
@ljedrz
Contributor

ljedrz commented Oct 31, 2018

Collecting into a Result is idiomatic, but not necessarily fast due to rustc not being able to preallocate for the resulting collection. This is fine in case of an error, but IMO we should optimize for the common case, i.e. a successful conversion.

This changes the behavior of String::from_utf16 from collecting into a Result to pushing to a preallocated String in a loop.

According to my simple benchmark this change makes String::from_utf16 around twice as fast.

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Oct 31, 2018

r? @aidanhs

(rust_highfive has picked a reviewer for you, use r? to override)

@ljedrz

This comment has been minimized.

Contributor

ljedrz commented Oct 31, 2018

I created more benchmarks to check which initial capacity provides the best results in different scenarios. The results aren't too surprising:

test bench_short_old        ... bench:         953 ns/iter (+/- 364)
test bench_short_new_len    ... bench:         428 ns/iter (+/- 226)
test bench_short_new_len15  ... bench:         420 ns/iter (+/- 226)
test bench_short_new_len2   ... bench:         340 ns/iter (+/- 91)
test bench_short_new_len3   ... bench:         296 ns/iter (+/- 20)

test bench_medium_old       ... bench:       1,153 ns/iter (+/- 407)
test bench_medium_new_len   ... bench:         558 ns/iter (+/- 145)
test bench_medium_new_len15 ... bench:         528 ns/iter (+/- 195)
test bench_medium_new_len2  ... bench:         479 ns/iter (+/- 78)
test bench_medium_new_len3  ... bench:         444 ns/iter (+/- 46)

test bench_long_old         ... bench:       1,862 ns/iter (+/- 180)
test bench_long_new_len     ... bench:         860 ns/iter (+/- 467)
test bench_long_new_len15   ... bench:         840 ns/iter (+/- 50)
test bench_long_new_len2    ... bench:         774 ns/iter (+/- 292)
test bench_long_new_len3    ... bench:         661 ns/iter (+/- 26)

The good news is that any choice is at least twice as fast as the current solution, so it boils down to how often we can expect to see non-ASCII characters and code units above 0x0800 or how much extra capacity we can provide without being wasteful.

The conservative approach would be to just go with .len() - this incurs zero wasted capacity while still being twice as fast as the current code.

@TimNN

This comment has been minimized.

Contributor

TimNN commented Nov 13, 2018

Ping from triage @aidanhs / @rust-lang/libs: This PR requires your review.

@SimonSapin

This comment has been minimized.

Contributor

SimonSapin commented Nov 13, 2018

@bors r+

Thanks!

@bors

This comment has been minimized.

Contributor

bors commented Nov 13, 2018

📌 Commit 19aa101 has been approved by SimonSapin

kennytm added a commit to kennytm/rust that referenced this pull request Nov 14, 2018

Rollup merge of rust-lang#55530 - ljedrz:speed_up_String_from_utf16, …
…r=SimonSapin

Speed up String::from_utf16

Collecting into a `Result` is idiomatic, but not necessarily fast due to rustc not being able to preallocate for the resulting collection. This is fine in case of an error, but IMO we should optimize for the common case, i.e. a successful conversion.

This changes the behavior of `String::from_utf16` from collecting into a `Result` to pushing to a preallocated `String` in a loop.

According to [my simple benchmark](https://gist.github.com/ljedrz/953a3fb74058806519bd4d640d6f65ae) this change makes `String::from_utf16` around **twice** as fast.

bors added a commit that referenced this pull request Nov 14, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 12 pull requests

Successful merges:

 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55527 (Implement checked_add_duration for SystemTime)
 - #55530 (Speed up String::from_utf16)
 - #55767 (Disable some pretty-printers when gdb is rust-enabled)
 - #55827 (A few tweaks to iterations/collecting)
 - #55834 (Forward the ABI of the non-zero sized fields of an union if they have the same ABI)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55934 (Switch AppVeyor builds to the GCE cloud)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)

Failed merges:

r? @ghost

bors added a commit that referenced this pull request Nov 14, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 13 pull requests

Successful merges:

 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55527 (Implement checked_add_duration for SystemTime)
 - #55530 (Speed up String::from_utf16)
 - #55767 (Disable some pretty-printers when gdb is rust-enabled)
 - #55827 (A few tweaks to iterations/collecting)
 - #55834 (Forward the ABI of the non-zero sized fields of an union if they have the same ABI)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55934 (Switch AppVeyor builds to the GCE cloud)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)

bors added a commit that referenced this pull request Nov 14, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 13 pull requests

Successful merges:

 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55527 (Implement checked_add_duration for SystemTime)
 - #55530 (Speed up String::from_utf16)
 - #55767 (Disable some pretty-printers when gdb is rust-enabled)
 - #55827 (A few tweaks to iterations/collecting)
 - #55834 (Forward the ABI of the non-zero sized fields of an union if they have the same ABI)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55934 (Switch AppVeyor builds to the GCE cloud)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)

kennytm added a commit to kennytm/rust that referenced this pull request Nov 14, 2018

Rollup merge of rust-lang#55530 - ljedrz:speed_up_String_from_utf16, …
…r=SimonSapin

Speed up String::from_utf16

Collecting into a `Result` is idiomatic, but not necessarily fast due to rustc not being able to preallocate for the resulting collection. This is fine in case of an error, but IMO we should optimize for the common case, i.e. a successful conversion.

This changes the behavior of `String::from_utf16` from collecting into a `Result` to pushing to a preallocated `String` in a loop.

According to [my simple benchmark](https://gist.github.com/ljedrz/953a3fb74058806519bd4d640d6f65ae) this change makes `String::from_utf16` around **twice** as fast.
let mut ret = String::with_capacity(v.len());
for c in decode_utf16(v.iter().cloned()) {
if let Ok(c) = c {
ret.push(c);

This comment has been minimized.

@RalfJung

RalfJung Nov 14, 2018

Member

Would be nice to have a comment explaining why the code works this way instead of the more "obvious" collect call.

Basically, what you wrote in the PR should be in the code.

This comment has been minimized.

@ljedrz

ljedrz Nov 14, 2018

Contributor

Good idea; since it is already rolled up, I can add this comment afterwards, along with some other assorted code adjustments.

bors added a commit that referenced this pull request Nov 14, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 12 pull requests

Successful merges:

 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55767 (Disable some pretty-printers when gdb is rust-enabled)
 - #55827 (A few tweaks to iterations/collecting)
 - #55834 (Forward the ABI of the non-zero sized fields of an union if they have the same ABI)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)

bors added a commit that referenced this pull request Nov 14, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 15 pull requests

Successful merges:

 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55767 (Disable some pretty-printers when gdb is rust-enabled)
 - #55827 (A few tweaks to iterations/collecting)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)
 - #55947 (xLTO: Don't pass --plugin-opt=thin to LLD. That's not supported anymore.)
 - #55948 (submodules: update clippy from d8b42690 to 7e0ddef4)

bors added a commit that referenced this pull request Nov 15, 2018

Auto merge of #55943 - kennytm:rollup, r=kennytm
Rollup of 16 pull requests

Successful merges:

 - #54906 (Reattach all grandchildren when constructing specialization graph.)
 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55827 (A few tweaks to iterations/collecting)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
 - #55935 (appveyor: Use VS2017 for all our images)
 - #55936 (save-analysis: be even more aggressive about ignorning macro-generated defs)
 - #55948 (submodules: update clippy from d8b42690 to 7e0ddef4)
 - #55956 (add tests for some fixed ICEs)

pietroalbini added a commit to pietroalbini/rust that referenced this pull request Nov 15, 2018

Rollup merge of rust-lang#55530 - ljedrz:speed_up_String_from_utf16, …
…r=SimonSapin

Speed up String::from_utf16

Collecting into a `Result` is idiomatic, but not necessarily fast due to rustc not being able to preallocate for the resulting collection. This is fine in case of an error, but IMO we should optimize for the common case, i.e. a successful conversion.

This changes the behavior of `String::from_utf16` from collecting into a `Result` to pushing to a preallocated `String` in a loop.

According to [my simple benchmark](https://gist.github.com/ljedrz/953a3fb74058806519bd4d640d6f65ae) this change makes `String::from_utf16` around **twice** as fast.

bors added a commit that referenced this pull request Nov 15, 2018

Auto merge of #55974 - pietroalbini:rollup, r=pietroalbini
Rollup of 17 pull requests

Successful merges:

 - #55182 (Redox: Update to new changes)
 - #55211 (Add BufWriter::buffer method)
 - #55507 (Add link to std::mem::size_of to size_of intrinsic documentation)
 - #55530 (Speed up String::from_utf16)
 - #55556 (Use `Mmap` to open the rmeta file.)
 - #55622 (NetBSD: link libstd with librt in addition to libpthread)
 - #55750 (Make `NodeId` and `HirLocalId` `newtype_index`)
 - #55778 (Wrap some query results in `Lrc`.)
 - #55781 (More precise spans for temps and their drops)
 - #55785 (Add mem::forget_unsized() for forgetting unsized values)
 - #55852 (Rewrite `...` as `..=` as a `MachineApplicable` 2018 idiom lint)
 - #55865 (Unix RwLock: avoid racy access to write_locked)
 - #55901 (fix various typos in doc comments)
 - #55926 (Change sidebar selector to fix compatibility with docs.rs)
 - #55930 (A handful of hir tweaks)
 - #55932 (core/char: Speed up `to_digit()` for `radix <= 10`)
 - #55956 (add tests for some fixed ICEs)

Failed merges:

r? @ghost

@bors bors merged commit 19aa101 into rust-lang:master Nov 15, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@ljedrz ljedrz deleted the ljedrz:speed_up_String_from_utf16 branch Nov 15, 2018

@arthurprs

This comment has been minimized.

Contributor

arthurprs commented Nov 21, 2018

@ljedrz Is this a size_hint limitation when collecting into Result<,>? Otherwise I see no reason why the slice iterator couldn't propagate the hint to String::from_iter.

@ljedrz

This comment has been minimized.

Contributor

ljedrz commented Nov 21, 2018

@arthurprs Yes; the implementation of FromIterator for Result sets the lower bound to zero. The same happens with Option::from_iter. Related: #52910.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment