Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-11048: [Rust] Added bench to MutableBuffer #9032

Closed
wants to merge 1 commit into from
Closed

ARROW-11048: [Rust] Added bench to MutableBuffer #9032

wants to merge 1 commit into from

Conversation

jorgecarleitao
Copy link
Member

This bench compares the behavior of MutableBuffer vs allocating Vec<u8> and using extend_from_slice.

On my computer:

mutable                 time:   [579.24 us 580.21 us 581.34 us]                    
mutable prepared        time:   [614.98 us 616.15 us 617.42 us]                             
from_slice              time:   [1.2945 ms 1.3262 ms 1.3607 ms]                        
from_slice prepared     time:   [1.0161 ms 1.0472 ms 1.0881 ms]                                 

I.e. growing a MutableBuffer seems to be 2x faster than creating a Vec and using Buffer::from to convert it to a Buffer.

It is odd that creating a buffer with the correct capacity takes longer, though. Any ideas @jhorstmann , @Dandandan ?

@github-actions
Copy link

@Dandandan
Copy link
Contributor

I am not sure what is causing the difference, I would expect it to be faster just as from_slice prepared is faster.
The main difference is that initializing with capacity 0 uses memory::allocate_aligned while the other uses a few calls to memory::reallocate. I guess the first is (quite) a bit slower somehow?

@jhorstmann
Copy link
Contributor

I think Buffer::from currently does another copy of the vec contents because it does not take ownership and because of the current alignment and padding requirements. We probably could add a zero-copy Buffer::from_vec method if we loosen those restrictions. Another interesting benchmark would be another version of from_slice that returns a Vec instead of Buffer, to see whether the extend/reallocate logic itself could be optimized.

@jorgecarleitao
Copy link
Member Author

jorgecarleitao commented Dec 29, 2020

just FYI, I found one reason. alloc_zeroed is expensive. I am preparing a PR where this is addressed. There is some code atm that relies on MutableBuffer::reserve and MutableBuffer::new to allocate zeros (and not undefined), which I need to handle.

Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice to be able to measure this / remove inefficiencies 👍

@alamb
Copy link
Contributor

alamb commented Dec 31, 2020

The full set of Rust CI tests did not run on this PR :(

Can you please rebase this PR against apache/master to pick up the changes in #9056 so that they do?

I apologize for the inconvenience.

@codecov-io
Copy link

Codecov Report

Merging #9032 (7a0fa18) into master (51672b2) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #9032   +/-   ##
=======================================
  Coverage   82.61%   82.61%           
=======================================
  Files         202      202           
  Lines       50048    50048           
=======================================
  Hits        41347    41347           
  Misses       8701     8701           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 51672b2...7a0fa18. Read the comment docs.

@alamb
Copy link
Contributor

alamb commented Jan 1, 2021

I believe these benches were already added as part of #8997 (which I found out while trying to merge this PR -- LOL)

@jorgecarleitao
Copy link
Member Author

Closing as this code was merged as part of another PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants