ARROW-5557: [C++] Add VisitBits benchmark #4550

Closed
pitrou wants to merge 1 commit from pitrou:ARROW-5557-visit-bits-benchmark

Contributor

pitrou commented Jun 13, 2019

No description provided.

Contributor Author

pitrou commented Jun 13, 2019

Results on Linux:

-----------------------------------------------------------------------
Benchmark                                Time           CPU Iterations
-----------------------------------------------------------------------
ReferenceNaiveBitmapReader/8192     123334 ns     123309 ns      11344   126.714MB/s
BitmapReader/8192                   110950 ns     110926 ns      12544    140.86MB/s
VisitBits/8192                      113867 ns     113842 ns      12290   137.252MB/s
VisitBitsUnrolled/8192               54291 ns      54280 ns      25955    287.86MB/s
ReferenceNaiveBitmapWriter/8192     110612 ns     110590 ns      12635   70.6437MB/s
BitmapWriter/8192                   111624 ns     111600 ns      12183   70.0044MB/s
FirstTimeBitmapWriter/8192           67885 ns      67871 ns      20632   115.108MB/s
GenerateBits/8192                    72072 ns      72057 ns      19008    108.42MB/s
GenerateBitsUnrolled/8192            42734 ns      42725 ns      32885   182.856MB/s
CopyBitmapWithoutOffset/8192           176 ns        176 ns    7943701   43.2839GB/s
CopyBitmapWithOffset/8192             7814 ns       7812 ns     179221   1000.06MB/s
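
For readers unfamiliar with what the VisitBits and VisitBitsUnrolled rows contrast, here is a minimal sketch of the general idea, assuming an LSB-first bitmap. The names `VisitBitsNaive`, `VisitBitsUnrolledSketch`, `CountSetNaive` and `CountSetUnrolled` are hypothetical and written for illustration only; this is not the code benchmarked in this PR.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; not Arrow's actual implementation.

// Visit bits one at a time, recomputing the byte index and shift
// on every iteration (bit i lives in byte i / 8, position i % 8).
template <typename Visitor>
void VisitBitsNaive(const uint8_t* bitmap, int64_t length, Visitor&& visit) {
  for (int64_t i = 0; i < length; ++i) {
    visit(((bitmap[i / 8] >> (i % 8)) & 1) != 0);
  }
}

// Load each full byte once and visit its 8 bits with straight-line code,
// then fall back to the simple loop for any trailing bits.
template <typename Visitor>
void VisitBitsUnrolledSketch(const uint8_t* bitmap, int64_t length,
                             Visitor&& visit) {
  const int64_t full_bytes = length / 8;
  for (int64_t b = 0; b < full_bytes; ++b) {
    const uint8_t byte = bitmap[b];
    visit((byte & 0x01) != 0);
    visit((byte & 0x02) != 0);
    visit((byte & 0x04) != 0);
    visit((byte & 0x08) != 0);
    visit((byte & 0x10) != 0);
    visit((byte & 0x20) != 0);
    visit((byte & 0x40) != 0);
    visit((byte & 0x80) != 0);
  }
  for (int64_t i = full_bytes * 8; i < length; ++i) {
    visit(((bitmap[i / 8] >> (i % 8)) & 1) != 0);
  }
}

// Example visitors: count the set bits among the first `length` bits.
inline int64_t CountSetNaive(const std::vector<uint8_t>& bitmap,
                             int64_t length) {
  int64_t count = 0;
  VisitBitsNaive(bitmap.data(), length,
                 [&count](bool is_set) { count += is_set; });
  return count;
}

inline int64_t CountSetUnrolled(const std::vector<uint8_t>& bitmap,
                                int64_t length) {
  int64_t count = 0;
  VisitBitsUnrolledSketch(bitmap.data(), length,
                          [&count](bool is_set) { count += is_set; });
  return count;
}
```

Both variants visit the same bits in the same order; the unrolled form only changes how much per-bit bookkeeping the compiler has to eliminate, which is why its benefit depends on the optimizer.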

@fsaintjacques

Contributor Author

pitrou commented Jun 13, 2019

Is there anyone who can benchmark on MSVC? I'm getting mediocre numbers here so something may be wrong with my Windows setup.

Member

wesm commented Jun 13, 2019

I can try the benchmarks on my local bare-metal Windows box.

Member

wesm commented Jun 13, 2019

Here are the results on my Windows machine:

(arrow-dev) λ release\Release\arrow-bit-util-benchmark.exe                        
06/13/19 12:27:25                                                                 
Running release\Release\arrow-bit-util-benchmark.exe                              
Run on (8 X 3096 MHz CPU s)                                                       
CPU Caches:                                                                       
  L1 Data 32K (x4)                                                                
  L1 Instruction 32K (x4)                                                         
  L2 Unified 262K (x4)                                                            
  L3 Unified 8388K (x1)                                                           
--------------------------------------------------------------------              
Benchmark                             Time           CPU Iterations               
--------------------------------------------------------------------              
BitmapReader/8192                150894 ns     149972 ns       4480   104.186MB/s 
VisitBits/8192                   150735 ns     149972 ns       4480   104.186MB/s 
VisitBitsUnrolled/8192           192623 ns     192540 ns       3733   81.1522MB/s 
BitmapWriter/8192                 96463 ns      96257 ns       7467    81.163MB/s 
FirstTimeBitmapWriter/8192        79010 ns      78474 ns       8960   99.5556MB/s 
GenerateBits/8192                 68788 ns      68359 ns      11200   114.286MB/s 
GenerateBitsUnrolled/8192         80782 ns      81961 ns       8960   95.3191MB/s 
CopyBitmapWithoutOffset/8192         90 ns         91 ns    8960000   84.1346GB/s 
CopyBitmapWithOffset/8192         10239 ns      10045 ns      74667   777.781MB/s 

i7-8809G 3.1 GHz on Windows 10

Contributor Author

pitrou commented Jun 13, 2019

OK. My results are similar here. It is surprising that MSVC isn't able to optimize the unrolled versions better...
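
To put a number on that observation, the sketch below computes the unrolled-versus-plain ratio from the CPU times reported in the two runs quoted in this thread. The helper `Speedup` and the constant names are hypothetical; the values are copied from the tables. The two runs come from different machines, so only the ratios within a single run are meaningful.

```cpp
// Ratio of baseline CPU time to candidate CPU time.
// A value above 1 means the candidate (unrolled) version is faster.
constexpr double Speedup(double baseline_ns, double candidate_ns) {
  return baseline_ns / candidate_ns;
}

// CPU times in nanoseconds, copied from the Linux run in this thread.
constexpr double kLinuxVisitBits = 113842;
constexpr double kLinuxVisitBitsUnrolled = 54280;
constexpr double kLinuxGenerateBits = 72057;
constexpr double kLinuxGenerateBitsUnrolled = 42725;

// CPU times in nanoseconds, copied from the Windows (MSVC) run in this thread.
constexpr double kMsvcVisitBits = 149972;
constexpr double kMsvcVisitBitsUnrolled = 192540;
constexpr double kMsvcGenerateBits = 68359;
constexpr double kMsvcGenerateBitsUnrolled = 81961;

constexpr double kLinuxVisitSpeedup =
    Speedup(kLinuxVisitBits, kLinuxVisitBitsUnrolled);
constexpr double kLinuxGenerateSpeedup =
    Speedup(kLinuxGenerateBits, kLinuxGenerateBitsUnrolled);
constexpr double kMsvcVisitSpeedup =
    Speedup(kMsvcVisitBits, kMsvcVisitBitsUnrolled);
constexpr double kMsvcGenerateSpeedup =
    Speedup(kMsvcGenerateBits, kMsvcGenerateBitsUnrolled);
```

On the Linux run the unrolled variants come out roughly 2.1x (VisitBits) and 1.7x (GenerateBits) faster than the plain ones, while on the Windows run both ratios fall below 1, meaning the unrolled variants are slower there.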

wesm force-pushed the pitrou:ARROW-5557-visit-bits-benchmark branch from 131200d to 9d502b2 on Jun 18, 2019
Member

wesm commented Jun 18, 2019

wesm closed this in bffe31b on Jun 18, 2019
pitrou deleted the pitrou:ARROW-5557-visit-bits-benchmark branch on Jun 18, 2019
3 participants