std: optimize `dlsym!` macro and add a test for it #146019
The `dlsym!` macro always ensures that the name string is nul-terminated, so there is no need to perform the check at runtime. Also, an acquire load is generally faster than a load followed by a barrier, so use acquire loads. The one exception is the case where the symbol is missing, where no acquire ordering is needed at all, but that shouldn't matter too much.
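For illustration, the two load patterns compared in the description can be sketched as follows. This is a minimal sketch, not the std code: `load_then_fence`, `load_acquire`, and `publish` are hypothetical names, and a bare `AtomicPtr<c_void>` stands in for the macro's cached symbol address.

```rust
use std::ffi::c_void;
use std::sync::atomic::{fence, AtomicPtr, Ordering};

/// Pattern being replaced (sketch): a relaxed load followed by a
/// separate acquire fence before the pointer is handed out.
pub fn load_then_fence(slot: &AtomicPtr<c_void>) -> *mut c_void {
    let p = slot.load(Ordering::Relaxed);
    fence(Ordering::Acquire);
    p
}

/// Pattern being introduced (sketch): a single acquire load.
pub fn load_acquire(slot: &AtomicPtr<c_void>) -> *mut c_void {
    slot.load(Ordering::Acquire)
}

/// Writer side (hypothetical): publish the resolved address with release
/// ordering so that either reader above synchronizes with this store.
pub fn publish(slot: &AtomicPtr<c_void>, p: *mut c_void) {
    slot.store(p, Ordering::Release);
}
```

Both readers return the same pointer and give the same ordering guarantee for a non-null result; the difference is only in how the ordering is requested.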
Just curious... @bors2 try @rust-timer queue
pub(crate) const fn new(name: &'static str) -> Self {
    let Ok(name) = CStr::from_bytes_with_nul(name.as_bytes()) else {
        panic!("not a nul-terminated string")
    };
It may be a bit more accurate to take a `&'static CStr` in the function signature and then move the `CStr::from_bytes_with_nul(...).unwrap()` to `macro dlsym`, so this function isn't ever called with invalid inputs (not that it's likely to be used anywhere else).
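A rough sketch of what that suggestion could look like, assuming a stand-in type and macro: `SymbolName` and `symbol_name!` are hypothetical names, not the std-internal ones. It uses a `match` instead of `.unwrap()` so the validation also compiles in a `const` on stable Rust.

```rust
use std::ffi::CStr;

/// Hypothetical stand-in for the type built by `dlsym!`.
pub struct SymbolName {
    name: &'static CStr,
}

impl SymbolName {
    /// Taking `&'static CStr` means this constructor cannot be called
    /// with a string that lacks a nul terminator.
    pub const fn new(name: &'static CStr) -> Self {
        SymbolName { name }
    }

    pub fn as_c_str(&self) -> &'static CStr {
        self.name
    }
}

/// Hypothetical macro: the nul terminator is appended and validated here,
/// inside a `const`, so a bad name is a compile-time error.
macro_rules! symbol_name {
    ($name:ident) => {{
        const NAME: &std::ffi::CStr = match std::ffi::CStr::from_bytes_with_nul(
            concat!(stringify!($name), "\0").as_bytes(),
        ) {
            Ok(name) => name,
            Err(_) => panic!("symbol name is not a valid C string"),
        };
        SymbolName::new(NAME)
    }};
}
```

With this shape, `symbol_name!(abs)` yields a `SymbolName` wrapping the C string `abs`, and no validation runs when the value is used.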
match self.func.load(Ordering::Acquire) {
    func if func.addr() == 1 => self.initialize(),
    func if func.is_null() => None,
    func => Some(unsafe { mem::transmute_copy::<*mut c_void, F>(&func) }),
While you're here, mind adding a safety comment?
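To make the three states in that `match` concrete, here is a simplified, self-contained sketch. It is not the std implementation: `LazySymbol` is a hypothetical type, a plain `resolve` function pointer stands in for the real `dlsym` call, and the transmute to `F` is left out, so `get` returns the raw address.

```rust
use std::ffi::c_void;
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical cache with three states, as in the diff:
/// address 1 = not looked up yet, null = symbol missing, else = resolved.
pub struct LazySymbol {
    func: AtomicPtr<c_void>,
    resolve: fn() -> *mut c_void, // stand-in for the dlsym lookup
}

impl LazySymbol {
    pub const fn new(resolve: fn() -> *mut c_void) -> Self {
        LazySymbol { func: AtomicPtr::new(ptr::without_provenance_mut(1)), resolve }
    }

    pub fn get(&self) -> Option<*mut c_void> {
        match self.func.load(Ordering::Acquire) {
            func if func.addr() == 1 => self.initialize(),
            func if func.is_null() => None,
            func => Some(func),
        }
    }

    /// Slow path: run the lookup once and cache the result, null included.
    #[cold]
    fn initialize(&self) -> Option<*mut c_void> {
        let func = (self.resolve)();
        self.func.store(func, Ordering::Release);
        if func.is_null() { None } else { Some(func) }
    }
}

/// Hypothetical resolvers for demonstration; the address is never dereferenced.
pub fn present() -> *mut c_void {
    ptr::without_provenance_mut(0x1000)
}

pub fn missing() -> *mut c_void {
    ptr::null_mut()
}
```

The first `get` on a fresh `LazySymbol` takes the slow path; later calls hit either the cached address or the cached null without calling `resolve` again.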
use crate::ffi::c_int;

#[test]
fn dlsym() {
Since we have that other branch, maybe also want to add a test for:
dlsym! {
    #[link_name = "abs"]
    fn definitely_not_abs(i: c_int) -> c_int;
}
const {
    if size_of::<F>() != size_of::<*mut libc::c_void>() {
        panic!("not a function pointer")
    }
}
Optional, but this could probably be enforced a bit stronger with the bound `F: crate::marker::FnPtr`. Even if that bound is added, we should likely avoid using its `.addr()` method for now until it gets the planned updates (maybe worth a FIXME).
Also, it's preexisting but any idea why `transmute_copy` is used rather than `transmute`? That would enforce the size constraint.
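One relevant language rule: `mem::transmute` needs both sizes to be known when the call is type-checked, so it is rejected when the target is a generic parameter (error E0512). That may be why the generic code pairs `transmute_copy` with an explicit size check, though this is an assumption and not something confirmed in this thread. The sketch below shows both cases; `AbsFn`, `sample_abs`, `sample_addr`, `cast_concrete`, and `cast_generic` are hypothetical names.

```rust
use std::ffi::c_void;
use std::mem;

pub type AbsFn = unsafe extern "C" fn(i32) -> i32;

/// Stand-in for a function that would normally be found through dlsym.
pub extern "C" fn sample_abs(i: i32) -> i32 {
    i.wrapping_abs()
}

/// Address of `sample_abs` as an untyped pointer, like a dlsym result.
pub fn sample_addr() -> *mut c_void {
    let f: extern "C" fn(i32) -> i32 = sample_abs;
    (f as *const c_void).cast_mut()
}

/// Concrete target type: `transmute` checks the sizes at compile time.
///
/// Safety: `p` must be the address of a function with the `AbsFn` signature.
pub unsafe fn cast_concrete(p: *mut c_void) -> AbsFn {
    unsafe { mem::transmute::<*mut c_void, AbsFn>(p) }
}

/// Generic target type: `transmute` would not compile here, so the size
/// check is spelled out in a const block and `transmute_copy` is used.
///
/// Safety: `F` must be a function pointer type matching the function at `p`.
pub unsafe fn cast_generic<F>(p: *mut c_void) -> F {
    const {
        if mem::size_of::<F>() != mem::size_of::<*mut c_void>() {
            panic!("not a function pointer")
        }
    }
    unsafe { mem::transmute_copy::<*mut c_void, F>(&p) }
}
```

Instantiating `cast_generic` with a type of the wrong size, such as `u8`, fails at compile time because of the const block, which recovers the size guarantee that `transmute` gives for concrete types.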
Finished benchmarking commit (e2bfd7f): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive - we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)
Results (primary -1.0%, secondary -1.9%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles
Results (primary 1.9%, secondary 2.2%)
A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 467.034s -> 466.989s (-0.01%)