Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10609: [Rust] Optimize min/max of non null strings #8673

Closed
wants to merge 6 commits into from

Conversation

Dandandan
Copy link
Contributor

Applies the same optimization as in ARROW-10595. Difference is smaller, but still there:

min string 512          time:   [3.4096 us 3.4378 us 3.4683 us]                            
                        change: [-13.563% -13.111% -12.628%] (p = 0.00 < 0.05)
                        Performance has improved.

@github-actions
Copy link

rust/arrow/benches/aggregate_kernels.rs Outdated Show resolved Hide resolved
Comment on lines +39 to 44
for i in 1..data.len() {
let item = array.value(i);
if !has_value || cmp(&n, item) {
has_value = true;
if cmp(&n, item) {
n = item;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like this code can use fold with your changes now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I translated the other min_max_helper into using fold. couldn't figure out to convert min_max_string to use fold here.

@vertexclique
Copy link
Contributor

Liked it, can you address the comments, that will make some improvements on the benches indirectly.

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Cool improvement.

@alamb
Copy link
Contributor

alamb commented Nov 16, 2020

CI failure seems to be infrastructure related:

C:\windows\System32\tar.exe -cz -f D:\a\_temp\a71516c7-d573-4e09-b21a-03fde6e0e547\cache.tgz -C D:\a\arrow\arrow\rust\target .
Warning: Cache service responded with 503 during chunk upload.
events.js:187
      throw er; // Unhandled 'error' event
      ^

https://github.com/apache/arrow/pull/8673/checks?check_run_id=1408484265

I re-triggered the job in the github CI and will hope it will pass this time

@alamb
Copy link
Contributor

alamb commented Nov 16, 2020

The TravisCI builds are not affected by Rust, so merging this one in

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- I read this carefully

@alamb alamb closed this in e5fce7f Nov 16, 2020
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
Applies the same optimization as in ARROW-10595. Difference is smaller, but still there:

```
min string 512          time:   [3.4096 us 3.4378 us 3.4683 us]
                        change: [-13.563% -13.111% -12.628%] (p = 0.00 < 0.05)
                        Performance has improved.
```

Closes apache#8673 from Dandandan/min_max_string

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants