-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize is_main_thread by using a libc call #616
Conversation
iOS failure is not this PR's fault, I think |
Thanks for the PR! Though did you actually measure this? The implementation of In any case, I suspect there's an even better solution here: #[cfg(feature = "libc")]
use libc::pthread_main_np;
#[link(name = "c", kind = "dylib")]
#[cfg(not(feature = "libc"))]
extern "C" {
// Avoid a dependency on `libc` if possible
fn pthread_main_np() -> std::os::raw::c_int;
}
#[inline]
fn is_main_thread() -> bool {
// SAFETY: TODO
unsafe { pthread_main_np() } != 0
} |
I haven't measured this, I was going to once I had access to a Darwin machine (or more likely a VM). |
I threw one together: use criterion::{criterion_group, criterion_main, Criterion};
use objc2_foundation::NSThread;
extern "C" {
fn pthread_main_np() -> std::os::raw::c_int;
}
pub fn is_main_thread() -> bool {
unsafe { pthread_main_np() != 0 }
}
pub fn is_main_thread_cached() -> bool {
std::thread_local! {
static IS_MAIN_THREAD: bool = is_main_thread();
}
IS_MAIN_THREAD
.try_with(|x| *x)
.unwrap_or_else(|_| is_main_thread())
}
pub fn is_main_thread_nsthread() -> bool {
NSThread::isMainThread_class()
}
fn criterion_benchmark(c: &mut Criterion) {
let mut group = c.benchmark_group("main");
group.bench_function("is_main_thread", |b| b.iter(|| is_main_thread()));
group.bench_function("is_main_thread_cached", |b| {
b.iter(|| is_main_thread_cached())
});
group.bench_function("is_main_thread_nsthread", |b| {
b.iter(|| is_main_thread_nsthread())
});
group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches); And got the following result (on a Macbook Pro M2): main/is_main_thread time: [1.6967 ns 1.6992 ns 1.7015 ns]
main/is_main_thread_cached time: [2.5315 ns 2.5363 ns 2.5412 ns]
main/is_main_thread_nsthread time: [4.2286 ns 4.2408 ns 4.2590 ns] If I enable LTO, the result is quite different: main/is_main_thread time: [1.6856 ns 1.6886 ns 1.6913 ns]
main/is_main_thread_cached time: [461.23 ps 461.97 ps 462.81 ps]
main/is_main_thread_nsthread time: [3.7592 ns 3.7614 ns 3.7640 ns] Not sure what to make of this - is there a way to specialize the implementation based on whether LTO is enabled or not? Are there factors here that I'm not considering? |
I don't think there's a way, and even if there was I would classify it as "hacky". I have no idea why this would happen; is there something weird on macOS with LTO/dynamic linking? I would expect this to be a bug in macOS's LTO, so my way forwards to be to use |
Hmm, what do you mean? It's faster using LTO, as you'd expect? |
@@ -41,7 +41,22 @@ pub fn is_multi_threaded() -> bool { | |||
/// Whether the current thread is the main thread. | |||
#[cfg(feature = "NSThread")] | |||
pub fn is_main_thread() -> bool { | |||
NSThread::isMainThread_class() | |||
#[cfg(feature = "libc")] | |||
use libc::pthread_main_np; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not always available on GNUStep, so I suspect you might need to keep using NSThread::isMainThread_class()
on feature = "gnustep-1-7"
to make CI happy?
@@ -41,7 +41,33 @@ pub fn is_multi_threaded() -> bool { | |||
/// Whether the current thread is the main thread. | |||
#[cfg(feature = "NSThread")] | |||
pub fn is_main_thread() -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Could you #[inline]
this and the imp
functions, so that they can be inlined even without LTO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also, I'm doing merge commits in this repo, so if you could squash, that'd be nice!
Previous, the "is_main_thread" function would always make a call to the "NSThread::isMainThread_class" function to determine if the current thread is the main thread. This requires an objc dispatch call every time it is called. As this function or MainThreadMarker is used frequently throughout the objc2 ecosystem it takes a lot of time. The NSThread function just calls libc::pthread_main_np under the hood. Benchmarks show that calling the libc function instead is four times faster. So on platforms that have it we just call that instead. Signed-off-by: John Nunley <dev@notgull.net>
Previous, the "is_main_thread" function would always make a call to the
"NSThread::isMainThread_class" function to determine if the current thread is
the main thread. This requires an objc dispatch call every time it is called. As
this function or MainThreadMarker is used frequently throughout the objc2
ecosystem it takes a lot of time.
Crucially, the main-threadedness of a thread will never change. So we can avoid
making this objc call every time if we cache this state in a thread-local
variable. This commit does that for is_main_thread, and also makes it so
MainThreadMarker::new() uses the is_main_thread function.
Inspired by the code in async-task here:
https://github.com/smol-rs/async-task/blob/3065c372e1ef1611230195ad7f3aae80ffde8261/src/runnable.rs#L421-L428