This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Poor performance in sort::sort_to_indices with limit option in arrow2 #245

Closed
Tracked by #1170
sundy-li opened this issue Aug 2, 2021 · 3 comments
Labels
enhancement An improvement to an existing feature

Comments

@sundy-li
Collaborator

sundy-li commented Aug 2, 2021

Hello, I ran a benchmark comparing arrow2 against arrow1.

It shows that arrow2 is much faster for min, sum, max, etc.

However, the sort::sort_to_indices function with a limit of 100 is about 2x slower in arrow2.

Benchmark results:

arrow2-sort 2^13 f32    time:   [67.831 us 67.978 us 68.144 us]   

arrow1-sort 2^13 f32    time:   [34.306 us 34.405 us 34.521 us] 

Code: https://github.com/sundy-li/learn/tree/master/arrow-vs-arrow2/benches

Scripts:
cargo bench -- "arrow1-sort 2\^13 f32"
cargo bench -- "arrow2-sort 2\^13 f32"

@sundy-li
Collaborator Author

sundy-li commented Aug 2, 2021

image

as_slice is the bottleneck.

I think we can replace the get: G parameter with values: &[T], so the values are read directly from the slice.
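To illustrate the idea, here is a minimal sketch, not arrow2's actual code. It assumes plain `u32` indices instead of arrow2's generic index type, ignores validity bitmaps and sort options, and uses the standard library's `select_nth_unstable_by` as a stand-in for the partial sort. The names `limited_sort`, `sort_to_indices_get`, and `sort_to_indices_slice` are hypothetical.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: returns the first `limit` indices of `0..len`,
/// ordered by `cmp_idx`. Only the selected prefix is fully sorted.
fn limited_sort<C>(len: usize, limit: usize, cmp_idx: C) -> Vec<u32>
where
    C: Fn(&u32, &u32) -> Ordering,
{
    let limit = limit.min(len);
    if limit == 0 {
        return Vec::new();
    }
    // Assumption: `len` fits in `u32`.
    let mut indices: Vec<u32> = (0..len as u32).collect();
    if limit < len {
        // Move the `limit` smallest entries to the front, then drop the rest.
        indices.select_nth_unstable_by(limit - 1, &cmp_idx);
        indices.truncate(limit);
    }
    indices.sort_unstable_by(&cmp_idx);
    indices
}

/// Getter-based variant: every comparison calls the closure `get` twice.
fn sort_to_indices_get<T, G, F>(len: usize, get: G, cmp: F, limit: usize) -> Vec<u32>
where
    G: Fn(usize) -> T,
    F: Fn(&T, &T) -> Ordering,
{
    limited_sort(len, limit, |a, b| cmp(&get(*a as usize), &get(*b as usize)))
}

/// Slice-based variant: comparisons index straight into `values`.
fn sort_to_indices_slice<T, F>(values: &[T], cmp: F, limit: usize) -> Vec<u32>
where
    F: Fn(&T, &T) -> Ordering,
{
    limited_sort(values.len(), limit, |a, b| {
        cmp(&values[*a as usize], &values[*b as usize])
    })
}
```

Both variants return the same indices; the slice version just avoids going through a getter on every comparison. Whether that matters in practice depends on what the getter does, so it is worth measuring with the benchmark above.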

Improved in commit: datafuse-extras#1

Now:

arrow2-sort-limit 2^13 f32                                                                             
  time:   [46.601 us 46.665 us 46.728 us]
  change: [-24.847% -24.656% -24.483%] (p = 0.00 < 0.05)
  Performance has improved.

Another bottleneck is from_usize().unwrap() in the loop.
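One possible way to avoid a checked conversion on every iteration is to validate the length once and then iterate directly in the index type. The sketch below is an assumption-laden illustration: it uses the standard library's `u32::try_from` as a stand-in for `from_usize`, and the function names are hypothetical rather than part of arrow2.

```rust
use std::convert::TryFrom;

/// Per-element conversion: one checked `usize -> u32` conversion
/// (and one `unwrap`) for every index produced.
fn indices_checked_each(len: usize) -> Vec<u32> {
    (0..len).map(|i| u32::try_from(i).unwrap()).collect()
}

/// Hoisted check: convert `len` once, then iterate in `u32` directly,
/// so the loop body contains no fallible conversion.
fn indices_checked_once(len: usize) -> Vec<u32> {
    let len = u32::try_from(len).expect("length does not fit in u32");
    (0..len).collect()
}
```

The two functions produce identical output for any length that fits in `u32`. The compiler may already optimize the first form well, so any gain should be confirmed by benchmarking.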

@jorgecarleitao
Owner

Great finding; thanks for sharing!

Would you like me to backport it here?

@sundy-li
Collaborator Author

sundy-li commented Aug 2, 2021

> Great finding; thanks for sharing!
>
> Would you like me to backport it here?

Sure, after I finish the migration, I'll create a PR to arrow2.

@sundy-li sundy-li mentioned this issue Aug 3, 2021
@sundy-li sundy-li closed this as completed Aug 4, 2021
@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Aug 11, 2021