Add SIMD helpers to speed up Rust get_sad #3050
Codecov Report
Base: 86.36% // Head: 86.32% // Decreases project coverage by -0.04%
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3050      +/-   ##
==========================================
- Coverage   86.36%   86.32%   -0.04%
==========================================
  Files          83       83
  Lines       33004    33019      +15
==========================================
  Hits        28505    28505
- Misses       4499     4514      +15
I don't think the |
Cargo.toml
@@ -108,6 +108,7 @@ new_debug_unreachable = "1.0.4"
 once_cell = "1.13.0"
 av1-grain = { version = "0.2.0", features = ["serialize"] }
 serde-big-array = { version = "0.4.1", optional = true }
+wide = "0.7.5"
We can drop this new dependency for now, as it is unused.
Don't worry, I'll pull it back in shortly 🙂
The Rust version of get_sad is still used during ME for regions that are not of an exact block size. This was consuming about 2.3% of runtime at speed 2. After the change, the time spent in Rust get_sad appears to be reduced by about 2/3, and the overall runtime of a speed 2 encode is decreased by about 1% on an AVX2 CPU.
Co-authored-by: David Michael Barr <b@rr-dav.id.au>
42cfde1 to e23c2c5
The Rust version of get_sad is still used during ME for regions that are not of an exact block size.
This was consuming about 2.3% of runtime at speed 2.
This change causes the function to iterate over chunks of 4 that can easily be vectorized,
and then only the remainder will go through the slower non-vectorized SAD process.
After the change, the time spent in Rust get_sad
appears to be reduced by about 2/3, and the overall runtime of a speed 2 encode is decreased by about 1% on an AVX2 CPU.
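
For readers unfamiliar with the chunking pattern described above, here is a minimal sketch of the idea for a single row of 8-bit pixels. It is not the code from this PR: the function name `sad_row` and the fixed `u8` pixel type are assumptions made for illustration, and it relies only on the standard library (`chunks_exact`, `u8::abs_diff`) and on the compiler's auto-vectorizer rather than on explicit SIMD types.

```rust
/// Hypothetical helper (not the PR's actual code): sum of absolute
/// differences between two equally long rows of 8-bit pixels.
fn sad_row(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "rows must have the same length");

    let mut a_chunks = a.chunks_exact(4);
    let mut b_chunks = b.chunks_exact(4);

    // Four independent accumulators, one per lane. The fixed chunk width
    // gives the optimizer a loop shape it can usually vectorize.
    let mut lanes = [0u32; 4];
    for (ca, cb) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        for i in 0..4 {
            lanes[i] += u32::from(ca[i].abs_diff(cb[i]));
        }
    }

    // Whatever does not fill a whole chunk of 4 goes through the plain
    // scalar path.
    let tail: u32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(&x, &y)| u32::from(x.abs_diff(y)))
        .sum();

    lanes.iter().sum::<u32>() + tail
}
```

With rows of length 6, for instance, the first 4 pixels go through the chunked loop and the last 2 through the scalar remainder. Whether the chunked loop is actually vectorized depends on the compiler version and target features, so any speedup should be confirmed by profiling.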