Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10914: [Rust] Refactor simd arithmetic kernels to use chunked iteration #8929

Conversation

jhorstmann
Copy link
Contributor

This should avoid out of bounds reads in those kernels and also removes some unsafe blocks. The performance actually seems to increase slightly, I will post the numbers later.

The division kernel looks a bit more complex now, but follows the same pattern as used in the simd aggregation functions, with two separate codepaths for the nullable/non-null cases.

The comparison kernels might also do some out of bounds reads, I will look at those in a separate ticket/PR.

@github-actions
Copy link

@jhorstmann jhorstmann marked this pull request as ready for review December 15, 2020 19:43
@jhorstmann
Copy link
Contributor Author

I'm seeing bigger variations on the benchmark numbers between runs, the results below between current master and this pr are from one of the faster runs:

~/arrow/rust/arrow$ cargo +nightly-2020-11-24 bench --features simd --bench arithmetic_kernels
   Compiling arrow v3.0.0-SNAPSHOT (/home/ubuntu/arrow/rust/arrow)
    Finished bench [optimized] target(s) in 1m 39s
     Running /home/ubuntu/arrow/rust/target/release/deps/arithmetic_kernels-4b00176b1c415f0c
Gnuplot not found, using plotters backend
add 512                 time:   [319.21 ns 319.38 ns 319.52 ns]                    
                        change: [-24.243% -24.193% -24.140%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low severe
  5 (5.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

subtract 512            time:   [387.54 ns 387.86 ns 388.40 ns]                         
                        change: [-19.909% -19.766% -19.578%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

multiply 512            time:   [515.01 ns 517.68 ns 522.09 ns]                          
                        change: [-9.9324% -9.6166% -9.1766%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

divide 512              time:   [430.74 ns 430.80 ns 430.86 ns]                       
                        change: [-41.347% -41.234% -41.143%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  3 (3.00%) high severe

limit 512, 512          time:   [107.84 ns 107.94 ns 108.12 ns]                           
                        change: [+0.6778% +0.9025% +1.2602%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe

add_nulls_512           time:   [383.53 ns 383.61 ns 383.69 ns]                          
                        change: [-20.450% -20.350% -20.270%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
  5 (5.00%) high severe

divide_nulls_512        time:   [543.48 ns 543.96 ns 544.84 ns]                              
                        change: [-23.819% -23.700% -23.587%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  5 (5.00%) high severe

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @jhorstmann !

I did not go through all the details of this PR, but what I read looks really good and makes the code much easier to read. So, 👍

GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
…teration

This should avoid out of bounds reads in those kernels and also removes some unsafe blocks. The performance actually seems to increase slightly, I will post the numbers later.

The division kernel looks a bit more complex now, but follows the same pattern as used in the simd aggregation functions, with two separate codepaths for the nullable/non-null cases.

The comparison kernels might also do some out of bounds reads, I will look at those in a separate ticket/PR.

Closes apache#8929 from jhorstmann/ARROW-10914-simd-arithmetic-read-out-of-bounds

Authored-by: Jörn Horstmann <joern.horstmann@signavio.com>
Signed-off-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants