ARROW-11100: [Rust] Speed up numeric to string cast using lexical_core #9068

Closed
wants to merge 7 commits into from

Dandandan
Contributor

@Dandandan Dandandan commented Jan 1, 2021

Uses lexical_core to speed up numeric-to-string casts.

This gives a nice speedup, especially for floats:

cast i64 to string 512  time:   [22.209 us 22.462 us 22.885 us]
                        change: [-38.438% -37.979% -37.154%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  11 (11.00%) low mild
  3 (3.00%) high severe

Benchmarking cast f32 to string 512: Collecting 100 samples in estimated 5.0698
cast f32 to string 512  time:   [25.587 us 25.692 us 25.786 us]
                        change: [-62.364% -62.215% -62.076%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe
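
To illustrate why buffer-based formatting can beat `to_string()`, here is a minimal stdlib-only sketch. It is not the code from this PR and does not call lexical_core; `write_i64` and `cast_to_strings` are hypothetical names used only for illustration. The idea is to write the digits into a reusable stack buffer and append them to one contiguous byte buffer with offsets (a simplified picture of a string array), so that no `String` is allocated per value.

```rust
/// Hypothetical helper (not part of Arrow or lexical_core): writes `value`
/// as decimal ASCII into the tail of `buf` and returns the written part.
/// 20 bytes is enough for i64::MIN (19 digits plus the sign).
fn write_i64(value: i64, buf: &mut [u8; 20]) -> &str {
    let mut pos = buf.len();
    // unsigned_abs avoids overflow when value == i64::MIN.
    let mut n = value.unsigned_abs();
    // Emit digits from least to most significant, filling the buffer backwards.
    loop {
        pos -= 1;
        buf[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    if value < 0 {
        pos -= 1;
        buf[pos] = b'-';
    }
    std::str::from_utf8(&buf[pos..]).expect("only ASCII digits and '-' are written")
}

/// Hypothetical cast loop: formats every value into one shared byte buffer
/// and records the end offset of each string, reusing one scratch buffer.
fn cast_to_strings(values: &[i64]) -> (Vec<u8>, Vec<i32>) {
    let mut data = Vec::new();
    let mut offsets = vec![0i32];
    let mut scratch = [0u8; 20];
    for &v in values {
        data.extend_from_slice(write_i64(v, &mut scratch).as_bytes());
        offsets.push(data.len() as i32);
    }
    (data, offsets)
}
```

For example, `cast_to_strings(&[1, -20, 300])` produces the bytes of `1-20300` with offsets `[0, 1, 4, 7]`. Float formatting is considerably harder to do quickly and correctly, which is a good reason to rely on a dedicated library there rather than hand-rolled code like this.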

github-actions bot commented Jan 1, 2021

codecov-io commented Jan 1, 2021

Codecov Report

Merging #9068 (48f5522) into master (cc0ee5e) will increase coverage by 0.05%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #9068      +/-   ##
==========================================
+ Coverage   82.55%   82.61%   +0.05%     
==========================================
  Files         203      204       +1     
  Lines       50043    49942     -101     
==========================================
- Hits        41313    41259      -54     
+ Misses       8730     8683      -47     
| Impacted Files | Coverage Δ |
|---|---|
| rust/arrow/src/compute/kernels/cast.rs | 96.99% <100.00%> (ø) |
| rust/arrow/src/csv/writer.rs | 78.82% <100.00%> (-0.56%) ⬇️ |
| rust/arrow/src/util/serialization.rs | 100.00% <100.00%> (ø) |
| ...datafusion/src/optimizer/hash_build_probe_order.rs | 54.73% <0.00%> (-3.70%) ⬇️ |
| rust/arrow/src/array/array_string.rs | 90.16% <0.00%> (-0.11%) ⬇️ |
| rust/datafusion/src/scalar.rs | 58.99% <0.00%> (-0.08%) ⬇️ |
| rust/arrow/src/array/array_binary.rs | 90.54% <0.00%> (-0.07%) ⬇️ |
| rust/parquet/src/arrow/schema.rs | 90.93% <0.00%> (+0.31%) ⬆️ |
| rust/arrow/src/array/array_primitive.rs | 92.32% <0.00%> (+0.68%) ⬆️ |

... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc0ee5e...48f5522.

Member

@jorgecarleitao jorgecarleitao left a comment

LGTM. Thanks @Dandandan !

Left a minor optional comment.

rust/arrow/src/util/mod.rs Outdated
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
Uses lexical_core to speed up num to string casts.

This gets a nice speed up, especially for floats:
```
cast i64 to string 512  time:   [22.209 us 22.462 us 22.885 us]
                        change: [-38.438% -37.979% -37.154%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  11 (11.00%) low mild
  3 (3.00%) high severe

Benchmarking cast f32 to string 512: Collecting 100 samples in estimated 5.0698
cast f32 to string 512  time:   [25.587 us 25.692 us 25.786 us]
                        change: [-62.364% -62.215% -62.076%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe
```

Closes apache#9068 from Dandandan/numeric_cast_lexical

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Jorge C. Leitao <jorgecarleitao@gmail.com>