Implement tanh, fixes #225 #226
src/codegen/math/float/tanh.rs
Outdated
use libm::{
    F32Ext,
    F64Ext
};
I think it would be better to move these traits down to the functions that implement them.
Looks like it all builds and the tests pass. I have also checked with If you are happy with the changes I can squash and force-push the commits.
Cool, so the issue was the use of the aligned loads and stores?
Yep, looks like it. That was the only semantic change I made.
…On Mon, Mar 11, 2019, 16:02 gnzlbg ***@***.***> wrote:
Looks like it all builds and the tests pass.
Cool, so the issue was the use of the aligned load and stores ?
src/codegen/math/float/tanh.rs
Outdated
($name:ident, $basetype:ty, $simdtype:ty, $lanes:expr, $trait:path) => {
    fn $name(x: $simdtype) -> $simdtype {
        let mut buf = [0.0; $lanes];
        x.write_to_slice_unaligned(&mut buf);
Instead of using write_to_slice here, I think it would be better to just do a transmute:
let mut buf: [$basetype; $lanes] = unsafe { mem::transmute(x) };
// ...
unsafe { mem::transmute(buf) }
since arrays and SIMD vectors are layout-compatible.
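The transmute round trip suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: `F32x4` is a hypothetical stand-in with the same size as `[f32; 4]` (it is not a packed_simd type), and `std`'s scalar `f32::tanh` takes the place of the libm call used in the PR.

```rust
use std::mem;

/// Hypothetical stand-in for a 4-lane SIMD vector (NOT a packed_simd type).
/// It has the same size as `[f32; 4]` but a stricter, 16-byte alignment.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C, align(16))]
struct F32x4([f32; 4]);

/// Applies the scalar `tanh` to every lane via a by-value transmute.
fn tanh_lanes(x: F32x4) -> F32x4 {
    // `transmute` refuses to compile unless both types have the same size
    // (16 bytes here), so a size mismatch cannot slip through.
    let mut buf: [f32; 4] = unsafe { mem::transmute(x) };
    for lane in buf.iter_mut() {
        *lane = lane.tanh();
    }
    // Move the array back into the vector-like wrapper.
    unsafe { mem::transmute(buf) }
}
```

Because the transmute moves the value instead of reinterpreting a pointer, the stricter alignment of the vector type does not matter here; it would matter for a pointer cast from the array to the vector.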
So this looks really good and is exactly what I had in mind. I left only one nit, but otherwise I'll merge once CI is green, thank you for working on this!
This implements tanh for packed vectors. This is primarily interesting when using sleef-sys for its SIMD implementations of tanh. Since LLVM does not contain tanh intrinsics, the libm implementation is used for primitives, and packed vectors are transmuted into arrays before applying the libm tanh to each of their elements.
Alright, done. I am just a bit confused about benchmarking this. Take this library:
use packed_simd::f64x4 as f64s;

// Applies tanh chunk by chunk, one f64x4 vector per chunk.
pub fn packed_simd_tanh(src: &[f64], dst: &mut [f64]) {
    assert_eq!(src.len() % f64s::lanes(), 0);
    assert_eq!(src.len(), dst.len());
    let lanes = f64s::lanes();
    src.chunks_exact(lanes)
        .zip(dst.chunks_exact_mut(lanes))
        .for_each(|(s, d)| {
            let s_v = f64s::from_slice_unaligned(s);
            // Compute tanh on the whole vector before storing it.
            s_v.tanh().write_to_slice_unaligned(d);
        });
}

// Scalar baseline: tanh on one element at a time.
pub fn standard_tanh(src: &[f64], dst: &mut [f64]) {
    assert_eq!(src.len(), dst.len());
    src.iter()
        .zip(dst.iter_mut())
        .for_each(|(s, d)| {
            *d = s.tanh();
        });
}
And these benches:
#![feature(test)]
extern crate test;

#[bench]
fn packed_simd(bench: &mut test::Bencher) {
    let n = 2usize << 20;
    let v = vec![1.0; n];
    let mut w = vec![0.0; n];
    bench.iter(|| {
        vectorize_tanh::packed_simd_tanh(&v, &mut w);
    });
}

#[bench]
fn std(bench: &mut test::Bencher) {
    let n = 2usize << 20;
    let v = vec![1.0; n];
    let mut w = vec![0.0; n];
    bench.iter(|| {
        vectorize_tanh::standard_tanh(&v, &mut w);
    });
}
Compiling this with
Any idea why this might be?
No idea. The benchmarks have no effect in theory (writing the result of tanh but never using it, should probably use
Yeah, you are right about that. Chances are that the Some of the checks seem to be failing, but that seems to be a platform-specific issue.
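A common way to keep a compiler from discarding benchmark work whose result is never read is to pass inputs and outputs through an optimization barrier. The sketch below uses the stable `std::hint::black_box`; the helper names `tanh_into` and `run_guarded` are hypothetical, and a plain loop stands in for `test::Bencher` so the example also builds on stable Rust.

```rust
use std::hint::black_box;

/// Hypothetical workload: scalar tanh over a slice, like `standard_tanh`.
fn tanh_into(src: &[f64], dst: &mut [f64]) {
    for (s, d) in src.iter().zip(dst.iter_mut()) {
        *d = s.tanh();
    }
}

/// Runs the workload `iters` times on `n` elements (`n` must be > 0),
/// hiding both the input and the output buffer from the optimizer.
fn run_guarded(iters: usize, n: usize) -> f64 {
    let src = vec![1.0_f64; n];
    let mut dst = vec![0.0_f64; n];
    for _ in 0..iters {
        // The barrier makes the compiler assume the buffers may be
        // read or modified elsewhere, so the call cannot be elided.
        tanh_into(black_box(src.as_slice()), black_box(dst.as_mut_slice()));
    }
    // Reading one result through the barrier keeps the stores observable.
    black_box(dst[0])
}
```

Inside a `#[bench]` function the same idea applies: wrap the arguments, or the written buffer, in `black_box` within the closure passed to `bench.iter`.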
@SuperFluffy sorry about the delay here! Please feel free to ping me next time if I don't answer within a day. There is still one build job failing (the
Thanks a lot, @SuperFluffy! @hsivonen there is one build job failing (the thumbv7neon android one), see https://travis-ci.com/rust-lang-nursery/packed_simd/jobs/184006248#L4787 . Somehow it appears that the wrong
This PR tries to address #225. It currently does not compile, because it turns out that {f32,f64}::tanh are provided via cmath, i.e. via the stdlib. Any suggestions on how to implement tanh_f32 and friends?
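For reference, one possible alternative (not the approach this PR ended up taking, which calls libm per lane) is to build tanh from the exponential function via the identity tanh(a) = (1 - e^(-2a)) / (1 + e^(-2a)) for a >= 0. The helper name `tanh_via_exp` is hypothetical, and this plain formulation loses relative precision for inputs very close to zero because of cancellation in `1 - e`.

```rust
/// Hypothetical helper (not part of packed_simd): tanh built only from `exp`.
fn tanh_via_exp(x: f64) -> f64 {
    let a = x.abs();
    // e = exp(-2a) lies in [0, 1] for a >= 0, so it can never overflow.
    let e = (-2.0 * a).exp();
    // tanh(a) = (1 - e^(-2a)) / (1 + e^(-2a)); the denominator is in [1, 2].
    let t = (1.0 - e) / (1.0 + e);
    // tanh is odd, so restore the sign of the input.
    t.copysign(x)
}
```

Using the negative exponent keeps the intermediate value bounded, so large inputs saturate cleanly to plus or minus 1 without producing infinities.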