Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upIntrinsics get compiled even if appropriate target_features are not enabled #323
Comments
This comment has been minimized.
This comment has been minimized.
|
This is currently intended, to allow compiling binaries that use features that aren't available at runtime (aka turns out to be dead code at runtime). It's true though that consumers who which for desired performance will need to enable the correct features as well. In any case though I think this is working as intended, so closing. |
alexcrichton
closed this
Feb 17, 2018
This comment has been minimized.
This comment has been minimized.
|
@alexcrichton are you sure that this is intended? the AES intrinsics have |
This comment has been minimized.
This comment has been minimized.
|
A comment by @newpavlov in the inline assembly tracking issue seems to expand on this issue, but IMO the issue as filled its pretty much ambiguous: @newpavlov where is the problem? Are the intrinsics from If its the first case, then its a bug here. If it is the second, then that code is probably invoking undefined behavior and you are lucky that it works at all. |
This comment has been minimized.
This comment has been minimized.
|
This function looks like undefined behavior to me: https://github.com/RustCrypto/block-ciphers/blob/c3619695bac12e03f61a5eafd08d900d4575bfb6/aesni/src/aes128/expand.rs#L28 (if you remove the global |
This comment has been minimized.
This comment has been minimized.
|
I was under the wrong impression about how stdsimd supposed to work which was based on how LLVM intrinsics operate. (i.e. compilation fails if intrinsic feature is not enabled) But for runtime detection and switching it's indeed required to have ability to enable feature locally for intrinsic through |
This comment has been minimized.
This comment has been minimized.
The only mistake I see in that library is opening |
This comment has been minimized.
This comment has been minimized.
|
I mean for |
This comment has been minimized.
This comment has been minimized.
|
The problem is that the compiler cannot know on which CPUs you might intend to run your binaries. If you are always going to run the binaries on some sets of CPUs you might not need to do (and pay) for the checks. There is an RFC about allowing calling the intrinsics without
You don't need to do that. You can use |
This comment has been minimized.
This comment has been minimized.
In that case your can just use I think I've seen somewhere proposal for introducing something like this (not precise, just a coarse strawman draft): #[target_features(feature1)]
fn foo1() {
// safe to use intrinsics dependent on "feature1"
}
#[target_features(feature2)]
fn foo2() {
// safe to use intrinsics dependent on "feature2"
}
fn foo_fallback() {}
// will not compile unless `feature1` is enabled for the whole crate
// through `#![cfg(..)]` or something similar
fn func1() {
foo1()
}
fn func2() {
// this macro allows to call feature-gated functions using runtime detection
runtime_match_feature!{
"feature1" => foo1(),
"feature2" => foo2(),
_ => foo_fallback(),
}
}
// macro will refuse to compile without crate-wise enabled `any(feature1, feature2)` or
// by marking this function `#[target_features(any(feature1, feature2))]`
fn func3() {
runtime_match_feature!{
"feature1" => foo1(),
"feature2" => foo2(),
}
}
// behaves same as the `runtime_match_feature`, but performs switching at compile time
// depending on the enabled features.
fn func4() {
compile_match_feature!{
"feature1" => foo1(),
"feature2" => foo2(),
_ => foo_fallback(),
}
}Here compiler build sort-of tree of feature restrictions and
Personally I strongly dislike this approach, as it introduces runtime errors for cases which can be detected at compile time, and strong compile time checks is one of the main selling points of Rust for me. |
This comment has been minimized.
This comment has been minimized.
|
Something like that is what rust-lang/rfcs#2212 proposes. But
You can also use |
This comment has been minimized.
This comment has been minimized.
|
Thank you for the link! Yes, I know about |
This comment has been minimized.
This comment has been minimized.
Yeah the codegen differences happen because LLVM can't inline across functions which enable different sets of target featues. If the AES feature isn't enabled then no calls to the AES intrinsics can get inlined, which likely destroys performance. It's also technically not undefined behavior I think if you run it on the right CPU, so I think LLVM can't do too much as it has to ensure the program works in that case at least. |
newpavlov commentedFeb 16, 2018
•
edited
If we'll take
aesnicrate and will remove the following lines fromlib.rs:Crate will compile even without
RUSTFLAGS="-C target_feature=+aes". (although resulting code will be 4 times slower) Is it intended behavior?