Skip to content

ARROW-10682: [Rust] Improve sort kernel performance by enabling inlining of is_valid calls#8736

Closed
jhorstmann wants to merge 1 commit intoapache:masterfrom
jhorstmann:ARROW-10682-improve-sort-kernel-performance
Closed

ARROW-10682: [Rust] Improve sort kernel performance by enabling inlining of is_valid calls#8736
jhorstmann wants to merge 1 commit intoapache:masterfrom
jhorstmann:ARROW-10682-improve-sort-kernel-performance

Conversation

@jhorstmann
Copy link
Copy Markdown
Contributor

Benchmark results on an aws t3 instance (cascadelake):

$ cargo bench --bench sort_kernel
sort 2^10               time:   [168.90 us 168.98 us 169.08 us]                      
                        change: [-59.182% -59.041% -58.824%] (p = 0.00 < 0.05)
                        Performance has improved.

sort 2^12               time:   [839.88 us 840.15 us 840.46 us]                      
                        change: [-57.095% -57.040% -56.947%] (p = 0.00 < 0.05)
                        Performance has improved.

sort nulls 2^10         time:   [148.69 us 148.75 us 148.82 us]                            
                        change: [-66.823% -66.727% -66.633%] (p = 0.00 < 0.05)
                        Performance has improved.

sort nulls 2^12         time:   [721.79 us 721.95 us 722.12 us]                            
                        change: [-65.966% -65.920% -65.845%] (p = 0.00 < 0.05)
                        Performance has improved.

/// ```
fn is_null(&self, index: usize) -> bool {
self.data().is_null(index)
self.data_ref().is_null(index)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This alone improved the performance a lot, but directly using ArrayData was even better. Should also improve some other kernels.

I think the data method is a real performance foot-gun and would be in favor of removing it completely. That makes it obvious to every user whether an Arc gets cloned. The cloning itself might not be a problem, but I think it sometimes prohibits other compiler optimizations.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great finding!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I agree with you: require an explicit .clone from the user (also broadly speaking, not just for .data() in particular. We should audit the code base for similar occurrences.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created https://issues.apache.org/jira/browse/ARROW-10683 about removing data method.

@jhorstmann
Copy link
Copy Markdown
Contributor Author

@jorgecarleitao I see you're currently also working on sorting, this might be of interest to you.

@github-actions
Copy link
Copy Markdown

@alamb
Copy link
Copy Markdown
Contributor

alamb commented Nov 21, 2020

👍 thank you @jhorstmann and @jorgecarleitao

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants