Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add classification functions #80

Merged
merged 4 commits into from
Apr 9, 2021
Merged

Add classification functions #80

merged 4 commits into from
Apr 9, 2021

Conversation

calebzulawski
Copy link
Member

@calebzulawski calebzulawski commented Mar 6, 2021

Closes #36.

@calebzulawski calebzulawski marked this pull request as ready for review March 6, 2021 14:56
@calebzulawski
Copy link
Member Author

Looks like aarch64 segfaulted during the travis build (not the test). No clue how that happened...

@workingjubilee
Copy link
Member

That's... odd. This requires rebasing anyways, can you do that and see if it repros?

@workingjubilee
Copy link
Member

Looks like we do indeed have a particularly gribbly bug!

@calebzulawski
Copy link
Member Author

I read somewhere that LLVM segfaults when an assert fails. When I run this in docker (I don't have access to an actual aarch64 machine) it succeeds, so I thought maybe we're running out of memory and a malloc assert is failing.

@Lokathor
Copy link
Contributor

I'm unclear why codegen-units=1 is in the build flags on some targets but not all targets, and I'm worried about that not being picked up by downstream, when this crate is built into core/std, or is inlined into a user's program.

If we need to reduce codegen-units for this to build without crashing LLVM, and a user of the crate leads the codegen-units at the default quantity of 256 of 16, will their build explode when they use this crate?

@calebzulawski
Copy link
Member Author

I read in an issue elsewhere that codegen-units=1 could fix aarch64 issues, but it didn't here. I just didn't remove it yet. I'm still not completely certain what is causing the issue. I can't replicate it locally with cross, so I think it's particular to the build agent in some way?

@calebzulawski
Copy link
Member Author

I think I've figured out the issue, but I'm not sure what's really happening. The problem is the simd_ne intrinsic which is only used in is_nan, is_normal, and is_subnormal. If I replace it with a naive implementation everything compiles fine on aarch64. I think this is probably a bug in LLVM?

@workingjubilee
Copy link
Member

Following up from discussion: these two examples offer repro of our bug, but only when targeting aarch64, but seemingly only when generating tests, and only in release mode.

#[test]
fn is_normal() {
    assert!(core_simd::SimdF32::<64>::splat(0.).is_normal().to_array()[0]);
}
#[test]
fn bug() {
    use core_simd::{SimdF32, SimdU32};
    let v = SimdF32::<64>::splat(0.);
    let m = v.lanes_eq(SimdF32::splat(0.0)) & v.lanes_eq(SimdF32::splat(0.0));
    assert!(m.to_array()[0]);
}

@calebzulawski
Copy link
Member Author

I reduced the maximum lane count from 64 to 32, since the error only occurred with 64-length vectors.

This prevents AVX-512 vectors of u8, but this is really only temporary until we can get an LLVM fix (and get that fixed version into rustc)

@calebzulawski calebzulawski marked this pull request as ready for review April 3, 2021 20:36
@workingjubilee
Copy link
Member

Alright, looks like this is good now with that. Have you filed the LLVM bug somewhere? I think with that posted somewhere (anywhere) this can be merged.

@calebzulawski
Copy link
Member Author

There's an LLVM bug filed (though I'm not 100% sure it's the same bug). I still need to file the rust and stdsimd bugs to track it.

@calebzulawski
Copy link
Member Author

Filed #90 and rust-lang/rust#84020.

@workingjubilee workingjubilee merged commit 0682c31 into master Apr 9, 2021
This was referenced Apr 9, 2021
@workingjubilee workingjubilee deleted the feature/comparisons branch April 14, 2021 02:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Comparison functions
3 participants