Runtime detection of SSE capabilities #29
Comments
Did you mean something like this?

pub fn accumulate(src: &[f32]) -> Vec<u8> {
    if is_x86_feature_detected!("sse") {
        unsafe { accumulate_sse(src) }
    } else {
        let mut acc = 0.0;
        src.iter()
            .map(|c| {
                // This would translate really well to SIMD
                acc += c;
                let y = acc.abs();
                let y = if y < 1.0 { y } else { 1.0 };
                (255.0 * y) as u8
            })
            .collect()
    }
}
Yes, very much like that. The comment can probably be adapted though :)
The comments can be left as bookmarks to track the code-copying patterns :D Also, two overheads keep bugging me: the one branch instruction for feature detection, and the zeroing of the result vector (which I added in accumulate_sse). Maybe something can be done about the vector zeroing. I'm not sure that building for one specific CPU feature is a good alternative to runtime feature detection. Both overheads are most likely insignificant and not worth the time unless profiling says otherwise.
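On the vector zeroing, one possible approach is to reserve capacity without initialising it, write every element through a raw pointer, and only then set the length. This is a rough sketch and not the code from this repository: the function name fill_without_zeroing is made up for illustration, and the scalar loop only stands in for whatever accumulate_sse actually writes.

```rust
/// Hypothetical helper (name invented for this sketch): builds the output
/// without zero-filling it first.
pub fn fill_without_zeroing(src: &[f32]) -> Vec<u8> {
    let len = src.len();
    // Reserve space for `len` bytes; nothing is initialised yet and len() is 0.
    let mut dst: Vec<u8> = Vec::with_capacity(len);
    let out = dst.as_mut_ptr();
    let mut acc = 0.0f32;
    for (i, &c) in src.iter().enumerate() {
        acc += c;
        let y = acc.abs().min(1.0);
        // SAFETY: i < len <= capacity, so the write stays inside the allocation.
        unsafe { out.add(i).write((255.0 * y) as u8) };
    }
    // SAFETY: the loop above initialised exactly `len` elements.
    unsafe { dst.set_len(len) };
    dst
}
```

Whether this beats a zeroed vector is something only profiling can tell; it trades a memset for two more unsafe blocks that have to be kept correct.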
Stable Rust now has an is_x86_feature_detected! macro, which should be used to switch between the SSE and fallback implementations at runtime, depending on whether the CPU supports SSE.
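For reference, a minimal sketch of how the dispatch could be laid out so that it also compiles on non-x86 targets. The names accumulate_scalar and the cfg gating are assumptions for illustration, and the body of accumulate_sse here is only a placeholder that calls the scalar path, not the real intrinsics code.

```rust
/// Scalar fallback: running sum, absolute value, clamp to 1.0, scale to u8.
fn accumulate_scalar(src: &[f32]) -> Vec<u8> {
    let mut acc = 0.0f32;
    src.iter()
        .map(|&c| {
            acc += c;
            let y = acc.abs().min(1.0);
            (255.0 * y) as u8
        })
        .collect()
}

/// Placeholder for the SSE version. A real implementation would use
/// std::arch intrinsics; this stand-in just reuses the scalar code.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse")]
unsafe fn accumulate_sse(src: &[f32]) -> Vec<u8> {
    accumulate_scalar(src)
}

/// Picks the implementation at runtime; always scalar on non-x86 targets.
pub fn accumulate(src: &[f32]) -> Vec<u8> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse") {
            // SAFETY: SSE support was just confirmed on the running CPU.
            return unsafe { accumulate_sse(src) };
        }
    }
    accumulate_scalar(src)
}
```

The cfg attributes are there because is_x86_feature_detected! only exists on x86 and x86_64, so an ungated call would fail to compile elsewhere.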