Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::threads: revisit stack address calculation on netbsd. #122002

Merged
merged 2 commits into from Mar 11, 2024

Conversation

devnexen
Copy link
Contributor

@devnexen devnexen commented Mar 4, 2024

like older linux glibc versions, we need to get the guard size
and increasing the stack's bottom address accordingly.

like older linux glibc versions, we need to get the guard size
 and increasing the stack's bottom address accordingly.
@rustbot
Copy link
Collaborator

rustbot commented Mar 4, 2024

r? @workingjubilee

rustbot has assigned @workingjubilee.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added O-unix Operating system: Unix-like S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 4, 2024
Copy link
Contributor

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR fixing this for NetBSD! I may be being a bit thick, so my apologies, but it would help if you could explain some context here.

library/std/src/sys/pal/unix/thread.rs Outdated Show resolved Hide resolved
library/std/src/sys/pal/unix/thread.rs Outdated Show resolved Hide resolved
library/std/src/sys/pal/unix/thread.rs Outdated Show resolved Hide resolved
@workingjubilee
Copy link
Contributor

cc @he32 It's not clear to me which versions of NetBSD we support based on our platform support docs for NetBSD. Can you offer some guidance on what we should do here, in light of things like https://gnats.netbsd.org/57721

@he32
Copy link
Contributor

he32 commented Mar 5, 2024

cc @he32 It's not clear to me which versions of NetBSD we support based on our platform support docs for NetBSD. Can you offer some guidance on what we should do here, in light of things like https://gnats.netbsd.org/57721

The TL;DR; version is "NetBSD 9.0 or newer".

We used to support rust on NetBSD/8.0, but recent-ish bumps of LLVM have upped the requirement for the C++ compiler to such an extent that supporting 8.0 was no longer feasible using the "native" C++ compiler.

There are platform-dependent exceptions, e.g. the riscv64 target is so new that it requires ~10.0 or newer -- the port was only recently added, and I think it still only runs in an emulator (qemu).

That said, I have sadly not yet been able to translate Taylor's investigations in the quoted NetBSD PR back into a workable platform-dependent patch for rust. However, rust 1.75.0 on NetBSD/amd64 builds a working firefox and thunderbird, and I have no reason to believe 1.76.0 is any worse in that regard.

@riastradh
Copy link

Is there an explanation somewhere of why Rust is querying the stack parameters, and what Rust does with them? Programs don't normally need to do this, so it would be helpful to know what Rust is trying to achieve.

The layout of the stack in NetBSD is documented here: https://man.NetBSD.org/stack.7

Note that it is different for the main thread and for threads created with pthread_create. Also, the usable size of the main thread's stack -- and thus the exact least/greatest address of the stack, depending on whether it grows down or up -- may vary at runtime depending on the stack size rlimit. So if you ask for the least/greatest usable stack address at two different times, you may get two different answers, if the process called setrlimit in the middle. That's why it would be helpful to know what Rust wants this for before committing to a way to do it.

@workingjubilee
Copy link
Contributor

workingjubilee commented Mar 6, 2024

@riastradh This information is used to make decisions on how to set up our guard pages, if we need to set them up, which itself is a very OS-specific decision. It is also used to inform error reporting on whether the reason we hit a signal handler is stack overflow or what. This function will be called at program startup:

pub unsafe fn init() -> Option<Guard> {
let page_size = os::page_size();
PAGE_SIZE.store(page_size, Ordering::Relaxed);
if cfg!(all(target_os = "linux", not(target_env = "musl"))) {
// Linux doesn't allocate the whole stack right away, and
// the kernel has its own stack-guard mechanism to fault
// when growing too close to an existing mapping. If we map
// our own guard, then the kernel starts enforcing a rather
// large gap above that, rendering much of the possible
// stack space useless. See #43052.

@workingjubilee
Copy link
Contributor

workingjubilee commented Mar 6, 2024

Programs don't normally need to do this, so it would be helpful to know what Rust is trying to achieve.

Programs, strictly speaking, do not "need" to do anything useful, yet we demand they be useful anyways. int main() { while (true) {} } is a valid C program. So I'd ask that we focus on what we can do, rather than jump to assumptions about necessity, as asking too many questions about necessity quickly lands in rather existentialist territory about why anything instead of nothing? And because I am happy to entertain "anything" instead of "nothing", I would suggest that detecting and reporting stack overflow, or indeed any useful error data at all, instead of doing nothing and contenting ourselves with Segmentation fault (core dumped), is in fact preferable.

Anyways, I don't think anything in std calls pthread_attr_setstack, only pthread_attr_setstacksize upon spawning a thread.

@workingjubilee
Copy link
Contributor

workingjubilee commented Mar 6, 2024

@devnexen Hmm. Is this fallthrough implementation correct for NetBSD? We don't have to also fix this in the same PR, but I'm feeling somewhat doubtful that the code becomes meaningfully more correct if we don't touch both places.

} else {
// Reallocate the last page of the stack.
// This ensures SIGBUS will be raised on
// stack overflow.
// Systems which enforce strict PAX MPROTECT do not allow
// to mprotect() a mapping with less restrictive permissions
// than the initial mmap() used, so we mmap() here with
// read/write permissions and only then mprotect() it to
// no permissions at all. See issue #50313.
let stackptr = get_stack_start_aligned()?;
let result = mmap64(
stackptr,
page_size,
PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANON | MAP_FIXED,
-1,
0,
);
if result != stackptr || result == MAP_FAILED {
panic!("failed to allocate a guard page: {}", io::Error::last_os_error());
}
let result = mprotect(stackptr, page_size, PROT_NONE);
if result != 0 {
panic!("failed to protect the guard page: {}", io::Error::last_os_error());
}
let guardaddr = stackptr.addr();
Some(guardaddr..guardaddr + page_size)
}

@devnexen
Copy link
Contributor Author

devnexen commented Mar 6, 2024

@devnexen Hmm. Is this fallthrough implementation correct for NetBSD? We don't have to also fix this in the same PR, but I'm feeling somewhat doubtful that the code becomes meaningfully more correct if we don't touch both places.

I m not sure it s incorrect but more do we really to do this with NetBSD ? I m fine reverting back this change if needed.

@riastradh
Copy link

So I'd ask that we focus on what we can do, rather than jump to assumptions about necessity, as asking too many questions about necessity quickly lands in rather existentialist territory about why anything instead of nothing?

Sorry, didn't mean to get philosophical here, just wanted to know what this query is in the service of. I only said that to explain it's such an unusual thing to do I didn't have a guess about why Rust might be doing it.

Anyways, I don't think anything in std calls pthread_attr_setstack, only pthread_attr_setstacksize upon spawning a thread.

Great, so that means you don't have to worry about the pthread_attr_setstack bug with the guard region.

@riastradh This information is used to make decisions on how to set up our guard pages, if we need to set them up, which itself is a very OS-specific decision.

If you want to guarantee there is a guard region of at least a particular size in threads created with pthread_create, you can do that portably with pthread_attr_setguardsize, so there's no need to query it for that purpose.

For the main thread, it may be a different story on other operating systems. But on NetBSD, there is always a guard region of a number of bytes given by sysctl vm.guard_size -- default one megabyte (1048576 bytes), and I don't think there's any supported way to change it. So unless you want a guard region larger than that, there's no need to query or do anything. (That's separate from the pages allocated for the difference between the soft and hard stack size rlimits on exec, which is what https://man.netbsd.org/stack.7 calls the inaccessible pages.)

It is also used to inform error reporting on whether the reason we hit a signal handler is stack overflow or what.

In that case:

  • For the main thread, any SIGSEGV with .si_code = SEGV_ACCERR and .si_addr anywhere between the process's stack base (obtained from the ELF auxiliary vector AT_STACKBASE entry) and the hard rlimit, up or down depending on the direction of stack growth, is probably a stack overflow. (It could also be an attempt to execute code in the stack, though, and I'm not sure offhand if there's an easy way to distinguish that case.)

    I think you may also be able to narrow it down to stack overflow vs some other abuse of stack address space by examining getrusage .ru_isrss to see whether .si_addr has gone beyond the actual stack utilization of the process.

    (One can recover the stack base from pthread_attr_getstack in the main thread, but it's a little circuitous: if the stack grows up, you get the stack base directly; if the stack grows down, you have to add the (process-initial) soft rlimit to the address returned by pthread_attr_getstack to get the stack base. If you never call setrlimit(RLIMIT_STACK) to change the soft limit, you can just use the main thread's pthread_attr_getstack region directly for this, but I would guess Rust does provide some way to call setrlimit(RLIMIT_STACK) so you can't rely on it.)

  • For a thread created with pthread_create, any SIGSEGV with .si_code = SEGV_ACCERR and .si_addr in the region returned by pthread_attr_getstack is probably a stack overflow (with the same caveat about distinguishing execute access).

This function will be called at program startup:

pub unsafe fn init() -> Option<Guard> {
let page_size = os::page_size();
PAGE_SIZE.store(page_size, Ordering::Relaxed);
if cfg!(all(target_os = "linux", not(target_env = "musl"))) {
// Linux doesn't allocate the whole stack right away, and
// the kernel has its own stack-guard mechanism to fault
// when growing too close to an existing mapping. If we map
// our own guard, then the kernel starts enforcing a rather
// large gap above that, rendering much of the possible
// stack space useless. See #43052.

So I gathered, but what uses the results of that function?

@workingjubilee
Copy link
Contributor

workingjubilee commented Mar 7, 2024

(One can recover the stack base from pthread_attr_getstack in the main thread, but it's a little circuitous: if the stack grows up, you get the stack base directly; if the stack grows down, you have to add the (process-initial) soft rlimit to the address returned by pthread_attr_getstack to get the stack base. If you never call setrlimit(RLIMIT_STACK) to change the soft limit, you can just use the main thread's pthread_attr_getstack region directly for this, but I would guess Rust does provide some way to call setrlimit(RLIMIT_STACK) so you can't rely on it.)

We do not "provide some way to call setrlimit" per se, however I should note that it is incorrect to reason about what std does as definitive for a process, because we do "provide a way" to call arbitrary extern "C" fn, by allowing people to write

extern "C" {
    fn setrlimit(resource: i32, rlim: *const rlimit) -> i32;
}

And a Rust-defined extern "C" fn can be called by a C program. All this is why I was thinking about the pthread_attr_setstack bug anyways.

@riastradh
Copy link

We do not "provide some way to call setrlimit" per se, however I should note that it is incorrect to reason about what std does as definitive for a process, because we do "provide a way" to call arbitrary extern "C" fn, by allowing people to write

extern "C" {
    fn setrlimit(resource: i32, rlim: *const rlimit) -> i32;
}

And a Rust-defined extern "C" fn can be called by a C program. All this is why I was thinking about the pthread_attr_setstack bug anyways.

There's a difference between these two issues:

  • This logic doesn't need to care if some other Rust code calls pthread_attr_setstack and triggers the netbsd<=9 bug. (The other Rust code needs to do pthread_attr_setguardsize(A, 0) and it'll be fine.)
  • This logic does need to care if some other Rust code changes the soft stack rlimit. If this logic caches the results of pthread_attr_getstack without any adjustment for the main thread, the other code may invalidate that cache and break whatever relies on the cache. (And I don't think this is peculiar to NetBSD -- I expect it's the same deal on most operating systems.)

@workingjubilee
Copy link
Contributor

So I gathered, but what uses the results of that function?

To answer this, it is called in the main runtime startup:

rust/library/std/src/rt.rs

Lines 71 to 107 in 1c580bc

// One-time runtime initialization.
// Runs before `main`.
// SAFETY: must be called only once during runtime initialization.
// NOTE: this is not guaranteed to run, for example when Rust code is called externally.
//
// # The `sigpipe` parameter
//
// Since 2014, the Rust runtime on Unix has set the `SIGPIPE` handler to
// `SIG_IGN`. Applications have good reasons to want a different behavior
// though, so there is a `#[unix_sigpipe = "..."]` attribute on `fn main()` that
// can be used to select how `SIGPIPE` shall be setup (if changed at all) before
// `fn main()` is called. See <https://github.com/rust-lang/rust/issues/97889>
// for more info.
//
// The `sigpipe` parameter to this function gets its value via the code that
// rustc generates to invoke `fn lang_start()`. The reason we have `sigpipe` for
// all platforms and not only Unix, is because std is not allowed to have `cfg`
// directives as this high level. See the module docs in
// `src/tools/tidy/src/pal.rs` for more info. On all other platforms, `sigpipe`
// has a value, but its value is ignored.
//
// Even though it is an `u8`, it only ever has 4 values. These are documented in
// `compiler/rustc_session/src/config/sigpipe.rs`.
#[cfg_attr(test, allow(dead_code))]
unsafe fn init(argc: isize, argv: *const *const u8, sigpipe: u8) {
unsafe {
sys::init(argc, argv, sigpipe);
let main_guard = sys::thread::guard::init();
// Next, set up the current Thread with the guard information we just
// created. Note that this isn't necessary in general for new threads,
// but we just do this to name the main thread and to give it correct
// info about the stack bounds.
let thread = Thread::new(Some(rtunwrap!(Ok, CString::new("main"))));
thread_info::set(main_guard, thread);
}
}

Which then uses it to set these:

thread_local! {
static THREAD_INFO: ThreadInfo = const { ThreadInfo {
stack_guard: OnceCell::new(),
thread: OnceCell::new()
} };
}

Which finally becomes an answer for a question way over here:

unsafe extern "C" fn signal_handler(
signum: libc::c_int,
info: *mut libc::siginfo_t,
_data: *mut libc::c_void,
) {
let guard = thread_info::stack_guard().unwrap_or(0..0);

Wow that's a lot of hops to trace this logic.

@riastradh
Copy link

Wow that's a lot of hops to trace this logic.

OK, cool, thanks for tracking this down!

Is there another call for threads created with pthread_create, or is it only called for the main thread?

If the code you quoted covers all uses of the outputs, then it looks like this logic is used only to determine a region of memory in which a SIGSEGV probably indicates a stack overflow.

In that case: I suggest you just use pthread_attr_getstack to get p and n, and pthread_attr_getguardsize to get g (or use g = PAGE_SIZE), and use [p - g, p) on machines where the stack grows down, or [p + n, p + n + g) on machines where the stack grows up, as the guard region for the SIGSEGV handler's stack overflow detection.

This is not quite accurate for the main thread, but the harm of the inaccuracy is just that some SIGSEGVs which actually are stack overflows -- in programs that have raised the soft rlimit -- may not be reported as stack overflows. Nothing else bad will happen.

So these existing fragments are probably fine to use for NetBSD, on machines where stacks grow down:

#[cfg(any(
target_os = "android",
target_os = "freebsd",
target_os = "hurd",
target_os = "linux",
target_os = "netbsd",
target_os = "l4re"
))]
unsafe fn get_stack_start() -> Option<*mut libc::c_void> {
let mut ret = None;
let mut attr: libc::pthread_attr_t = crate::mem::zeroed();
#[cfg(target_os = "freebsd")]
assert_eq!(libc::pthread_attr_init(&mut attr), 0);
#[cfg(target_os = "freebsd")]
let e = libc::pthread_attr_get_np(libc::pthread_self(), &mut attr);
#[cfg(not(target_os = "freebsd"))]
let e = libc::pthread_getattr_np(libc::pthread_self(), &mut attr);
if e == 0 {
let mut stackaddr = crate::ptr::null_mut();
let mut stacksize = 0;
assert_eq!(libc::pthread_attr_getstack(&attr, &mut stackaddr, &mut stacksize), 0);
ret = Some(stackaddr);
}
if e == 0 || cfg!(target_os = "freebsd") {
assert_eq!(libc::pthread_attr_destroy(&mut attr), 0);
}
ret
}

if cfg!(all(target_os = "linux", not(target_env = "musl"))) {
// Linux doesn't allocate the whole stack right away, and
// the kernel has its own stack-guard mechanism to fault
// when growing too close to an existing mapping. If we map
// our own guard, then the kernel starts enforcing a rather
// large gap above that, rendering much of the possible
// stack space useless. See #43052.
//
// Instead, we'll just note where we expect rlimit to start
// faulting, so our handler can report "stack overflow", and
// trust that the kernel's own stack guard will work.
let stackptr = get_stack_start_aligned()?;
let stackaddr = stackptr.addr();
Some(stackaddr - page_size..stackaddr)

Or the OpenBSD case below, which is identical:

} else if cfg!(target_os = "openbsd") {
// OpenBSD stack already includes a guard page, and stack is
// immutable.
//
// We'll just note where we expect rlimit to start
// faulting, so our handler can report "stack overflow", and
// trust that the kernel's own stack guard will work.
let stackptr = get_stack_start_aligned()?;
let stackaddr = stackptr.addr();
Some(stackaddr - page_size..stackaddr)

Side notes about this code:

  • You don't need to mmap or mprotect guard pages. NetBSD already does this for you.
  • You don't need to use pthread_attr_setguardsize. It's only relevant when creating threads with pthread_create; it doesn't serve any purpose in querying the stack parameters of existing threads.

The above assumes you don't create pthreads with pthread_attr_setstack. If you do, then it is your responsibility to allocate guard pages and tell the signal handler what guard pages you allocated for it. (That's true for all operating systems; it's not NetBSD-specific.)

@riastradh
Copy link

@devnexen Why are you still defining a NetBSD-specific get_stack_start, when it looks like the existing one that is currently shared with android/freebsd/hurd/linux/l4re should work just fine (or, no worse than it already works for those other platforms, anyway)?

You seem to be trying to work around the pthread_attr_setstack bug, but that bug doesn't apply here (and adjusting the stack address you get out of pthread_attr_getstack doesn't help with that bug). The pthread_attr_setstack bug applies only when you create a thread with pthread_attr_setstack, which this logic is not doing; it doesn't apply when you query an existing thread, which this logic is doing.

@workingjubilee
Copy link
Contributor

Wow that's a lot of hops to trace this logic.

OK, cool, thanks for tracking this down!

Is there another call for threads created with pthread_create, or is it only called for the main thread?

If the code you quoted covers all uses of the outputs, then it looks like this logic is used only to determine a region of memory in which a SIGSEGV probably indicates a stack overflow.

Almost. The sys::thread::guard::init fn is only called on the first-time startup there, which does whatever guard mapping it needs. There is one other place that calls thread_info::set, which takes the stack guard argument, and it is thread::Builder::spawn_unchecked:

unsafe fn spawn_unchecked_<'a, 'scope, F, T>(
self,
f: F,
scope_data: Option<Arc<scoped::ScopeData>>,
) -> io::Result<JoinInner<'scope, T>>
where
F: FnOnce() -> T,
F: Send + 'a,
T: Send + 'a,
'scope: 'a,
{
let Builder { name, stack_size } = self;
let stack_size = stack_size.unwrap_or_else(thread::min_stack);
let my_thread = Thread::new(name.map(|name| {
CString::new(name).expect("thread name may not contain interior null bytes")
}));
let their_thread = my_thread.clone();
let my_packet: Arc<Packet<'scope, T>> = Arc::new(Packet {
scope: scope_data,
result: UnsafeCell::new(None),
_marker: PhantomData,
});
let their_packet = my_packet.clone();
let output_capture = crate::io::set_output_capture(None);
crate::io::set_output_capture(output_capture.clone());
// Pass `f` in `MaybeUninit` because actually that closure might *run longer than the lifetime of `F`*.
// See <https://github.com/rust-lang/rust/issues/101983> for more details.
// To prevent leaks we use a wrapper that drops its contents.
#[repr(transparent)]
struct MaybeDangling<T>(mem::MaybeUninit<T>);
impl<T> MaybeDangling<T> {
fn new(x: T) -> Self {
MaybeDangling(mem::MaybeUninit::new(x))
}
fn into_inner(self) -> T {
// SAFETY: we are always initialized.
let ret = unsafe { self.0.assume_init_read() };
// Make sure we don't drop.
mem::forget(self);
ret
}
}
impl<T> Drop for MaybeDangling<T> {
fn drop(&mut self) {
// SAFETY: we are always initialized.
unsafe { self.0.assume_init_drop() };
}
}
let f = MaybeDangling::new(f);
let main = move || {
if let Some(name) = their_thread.cname() {
imp::Thread::set_name(name);
}
crate::io::set_output_capture(output_capture);
// SAFETY: we constructed `f` initialized.
let f = f.into_inner();
// SAFETY: the stack guard passed is the one for the current thread.
// This means the current thread's stack and the new thread's stack
// are properly set and protected from each other.
thread_info::set(unsafe { imp::guard::current() }, their_thread);
let try_result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
crate::sys_common::backtrace::__rust_begin_short_backtrace(f)
}));
// SAFETY: `their_packet` as been built just above and moved by the
// closure (it is an Arc<...>) and `my_packet` will be stored in the
// same `JoinInner` as this closure meaning the mutation will be
// safe (not modify it and affect a value far away).
unsafe { *their_packet.result.get() = Some(try_result) };
// Here `their_packet` gets dropped, and if this is the last `Arc` for that packet that
// will call `decrement_num_running_threads` and therefore signal that this thread is
// done.
drop(their_packet);
// Here, the lifetime `'a` and even `'scope` can end. `main` keeps running for a bit
// after that before returning itself.
};

From guard::current(), which is like this for NetBSD:

#[cfg(any(
target_os = "android",
target_os = "freebsd",
target_os = "hurd",
target_os = "linux",
target_os = "netbsd",
target_os = "l4re"
))]
pub unsafe fn current() -> Option<Guard> {
let mut ret = None;
let mut attr: libc::pthread_attr_t = crate::mem::zeroed();
#[cfg(target_os = "freebsd")]
assert_eq!(libc::pthread_attr_init(&mut attr), 0);
#[cfg(target_os = "freebsd")]
let e = libc::pthread_attr_get_np(libc::pthread_self(), &mut attr);
#[cfg(not(target_os = "freebsd"))]
let e = libc::pthread_getattr_np(libc::pthread_self(), &mut attr);
if e == 0 {
let mut guardsize = 0;
assert_eq!(libc::pthread_attr_getguardsize(&attr, &mut guardsize), 0);
if guardsize == 0 {
if cfg!(all(target_os = "linux", target_env = "musl")) {
// musl versions before 1.1.19 always reported guard
// size obtained from pthread_attr_get_np as zero.
// Use page size as a fallback.
guardsize = PAGE_SIZE.load(Ordering::Relaxed);
} else {
panic!("there is no guard page");
}
}
let mut stackptr = crate::ptr::null_mut::<libc::c_void>();
let mut size = 0;
assert_eq!(libc::pthread_attr_getstack(&attr, &mut stackptr, &mut size), 0);
let stackaddr = stackptr.addr();
ret = if cfg!(any(target_os = "freebsd", target_os = "netbsd", target_os = "hurd")) {
Some(stackaddr - guardsize..stackaddr)
} else if cfg!(all(target_os = "linux", target_env = "musl")) {
Some(stackaddr - guardsize..stackaddr)
} else if cfg!(all(target_os = "linux", any(target_env = "gnu", target_env = "uclibc")))
{
// glibc used to include the guard area within the stack, as noted in the BUGS
// section of `man pthread_attr_getguardsize`. This has been corrected starting
// with glibc 2.27, and in some distro backports, so the guard is now placed at the
// end (below) the stack. There's no easy way for us to know which we have at
// runtime, so we'll just match any fault in the range right above or below the
// stack base to call that fault a stack overflow.
Some(stackaddr - guardsize..stackaddr + guardsize)
} else {
Some(stackaddr..stackaddr + guardsize)
};
}
if e == 0 || cfg!(target_os = "freebsd") {
assert_eq!(libc::pthread_attr_destroy(&mut attr), 0);
}
ret
}
}

We may wish to adjust how we set up threads, in light of this information (this has been quite illuminating, thank you! I wish the man pages for the pthread headers were as clear...) but I believe it's currently harmless, because this information still is only consumed by the signal handler.

Side notes about this code:

* You don't need to mmap or mprotect guard pages.  NetBSD already does this for you.

Right, it sounds like for sys::thread::guard::init the OpenBSD branch here is the one that NetBSD should be using? Though it sounds like the stack isn't immutable, exactly, on NetBSD, but we "don't care" because, thinking about it more clearly now, anyone running pthread_attr_setstack has to be running it after sys::thread::guard::init, in which case we can't exactly interdict their syscalls, as far as I know, and they're taking their life into their own hands, OR we never get to run this function anyways.

} else if cfg!(target_os = "openbsd") {
// OpenBSD stack already includes a guard page, and stack is
// immutable.
//
// We'll just note where we expect rlimit to start
// faulting, so our handler can report "stack overflow", and
// trust that the kernel's own stack guard will work.
let stackptr = get_stack_start_aligned()?;
let stackaddr = stackptr.addr();
Some(stackaddr - page_size..stackaddr)

@workingjubilee
Copy link
Contributor

Based on my understanding of how things are so far, I think this is the correct change, and nothing in my last message actually changes things, so we can ship this, assuming @riastradh or @he32 also agrees. Thank you for bearing with us as we hashed this out, @devnexen. You do a lot of work for the BSD targets and it's nice to see.

@bors delegate=@riastradh rollup=always

@bors
Copy link
Contributor

bors commented Mar 10, 2024

✌️ @riastradh, you can now approve this pull request!

If @workingjubilee told you to "r=me" after making some further change, please make that change, then do @bors r=@workingjubilee

@riastradh
Copy link

Is there another call for threads created with pthread_create, or is it only called for the main thread?
If the code you quoted covers all uses of the outputs, then it looks like this logic is used only to determine a region of memory in which a SIGSEGV probably indicates a stack overflow.

Almost. The sys::thread::guard::init fn is only called on the first-time startup there, which does whatever guard mapping it needs.

OK, so it sounds like there is nothing else that uses sys::thread::guard::init, then? In other words: this is used exclusively to find the main thread's stack guard region, in order to recognize stack overflows in the SIGSEGV handler?

There is one other place that calls thread_info::set, which takes the stack guard argument, and it is thread::Builder::spawn_unchecked:
[...]
From guard::current(), which is like this for NetBSD:

Interesting, so there's separate logic to find the stack guards of non-main threads? That logic looks fine too.

For NetBSD (as well as Linux and probably every other system) it might be worthwhile, in the main thread's stack overflow recognition on SIGSEGV, to use the combination of what is called the guard and the inaccessible pages at https://man.NetBSD.org/stack.7 -- that is, the region between the soft rlimit and the hard rlimit, plus the guard.

But that's a little more work -- and it's not really NetBSD-specific because other OSes also grow the stack automatically, so there's no need to put it in this PR, which, as I understand it, is really only needed to omit Rust's attempt to mmap/mprotect its own guard page at the process's minimum possible stack size when NetBSD already put one at the maximum possible stack size.

(this has been quite illuminating, thank you! I wish the man pages for the pthread headers were as clear...)

If you can point out any parts you found unclear in the pthread man pages in NetBSD, I'd be happy to take a look and see if we can improve them!

Side notes about this code:

  • You don't need to mmap or mprotect guard pages. NetBSD already does this for you.

Right, it sounds like for sys::thread::guard::init the OpenBSD branch here is the one that NetBSD should be using?

Yes, that looks good.

Though it sounds like the stack isn't immutable, exactly, on NetBSD,

On most systems, I expect the main thread's stack will grow automatically up to the soft rlimit (which can, in turn, be grown up to the hard rlimit). So if that's what you mean by 'the stack isn't immutable', that's not NetBSD-specific.

but we "don't care" because, thinking about it more clearly now, anyone running pthread_attr_setstack has to be running it after sys::thread::guard::init, in which case we can't exactly interdict their syscalls, as far as I know, and they're taking their life into their own hands, OR we never get to run this function anyways.

Note that pthread_attr_setstack doesn't change any existing threads. It only sets up configuration (pthread attributes) that can be passed to pthread_create when creating new threads. If you do pthread_getattr_np(T, &A) and then pthread_attr_setstack(&A, ...), that doesn't change anything about the existing thread T -- pthread_getattr_np just makes a copy of T's configuration; the resulting pthread_attr_t A has no lingering connection to T.

And if someone does write Rust code to create a pthread with a custom stack using pthread_attr_setstack and pthread_create, they will have to arrange their own guard pages (mapping them and informing the SIGSEGV handler of them) on every OS, not just NetBSD.

Copy link

@riastradh riastradh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current state of this PR -- with a one-line change to use the existing OpenBSD code for sys::thread::guard::init on NetBSD too -- looks good to me by code inspection, assuming you've tested it and verified it works (which I have not done).

@@ -911,9 +911,10 @@ pub mod guard {
}
}) * page_size;
Some(guard)
} else if cfg!(target_os = "openbsd") {
} else if cfg!(any(target_os = "openbsd", target_os = "netbsd")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was being a bit dense earlier, it seems!

@workingjubilee
Copy link
Contributor

workingjubilee commented Mar 11, 2024

honestly no idea what the comment that says "the stack is immutable" really means.

And while I said that OpenBSD and NetBSD should share the logic in question for guard::init(), that was in fact introduced a revision ago and I was just looking at the wrong commit. So yeah! Good to go.

@bors rollup=always r=workingjubilee,riastradh

@bors
Copy link
Contributor

bors commented Mar 11, 2024

📌 Commit ffdd97f has been approved by workingjubilee,riastradh

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 11, 2024
jhpratt added a commit to jhpratt/rust that referenced this pull request Mar 11, 2024
…r=workingjubilee,riastradh

std::threads: revisit stack address calculation on netbsd.

like older linux glibc versions, we need to get the guard size
 and increasing the stack's bottom address accordingly.
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Rollup of 8 pull requests

Successful merges:

 - rust-lang#121148 (Add slice::try_range)
 - rust-lang#121573 (unix_sigpipe: Add test for SIGPIPE disposition in child processes)
 - rust-lang#121633 (Win10: Use `GetSystemTimePreciseAsFileTime` directly)
 - rust-lang#121840 (Expose the Freeze trait again (unstably) and forbid implementing it manually)
 - rust-lang#121907 (skip sanity check for non-host targets in `check` builds)
 - rust-lang#122002 (std::threads: revisit stack address calculation on netbsd.)
 - rust-lang#122108 (Add `target.*.runner` configuration for targets)
 - rust-lang#122298 (RawVec::into_box: avoid unnecessary intermediate reference)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Rollup of 9 pull requests

Successful merges:

 - rust-lang#121148 (Add slice::try_range)
 - rust-lang#121633 (Win10: Use `GetSystemTimePreciseAsFileTime` directly)
 - rust-lang#121840 (Expose the Freeze trait again (unstably) and forbid implementing it manually)
 - rust-lang#121907 (skip sanity check for non-host targets in `check` builds)
 - rust-lang#122002 (std::threads: revisit stack address calculation on netbsd.)
 - rust-lang#122108 (Add `target.*.runner` configuration for targets)
 - rust-lang#122298 (RawVec::into_box: avoid unnecessary intermediate reference)
 - rust-lang#122315 (Allow multiple `impl Into<{D,Subd}iagMessage>` parameters in a function.)
 - rust-lang#122326 (Optimize `process_heap_alloc`)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit b3ac6fa into rust-lang:master Mar 11, 2024
11 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Mar 11, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 11, 2024
Rollup merge of rust-lang#122002 - devnexen:thread_stack_netbsd_fix, r=workingjubilee,riastradh

std::threads: revisit stack address calculation on netbsd.

like older linux glibc versions, we need to get the guard size
 and increasing the stack's bottom address accordingly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-unix Operating system: Unix-like S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants