Description
I'm currently stuck on a very obscure edge case: the first call to recv_timeout panics the program with an assertion failure if no message arrives before the timeout. If a message is sent before that first timeout, the program doesn't crash, even if the second call to recv_timeout does hit the timeout:
This does work
- recv_timeout(100ms) // message is sent within that time
- message is processed
- recv_timeout(100ms) // no message within timeout
- program doesn't crash and tries again
This doesn't work
- recv_timeout(100ms) // no message within timeout
- assertion failure in stdlib panics the program
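The two sequences above can be written out as a minimal sketch. The `u32` payload, the 500ms sleep, and the helper name `run_sequence` are placeholders for illustration, not taken from the real program. On its own this sketch is expected to report `Timeout` rather than panic.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: performs two `recv_timeout(100ms)` calls and
/// returns what each one produced. `send_first` selects the sequence.
fn run_sequence(send_first: bool) -> [Result<u32, RecvTimeoutError>; 2] {
    let (tx, rx) = mpsc::channel::<u32>();
    let timeout = Duration::from_millis(100);

    let sender = thread::spawn(move || {
        if send_first {
            // "This does work": a message arrives within the first timeout.
            tx.send(1).unwrap();
        }
        // Keep the sender alive past both timeouts so the receiver sees
        // `Timeout` rather than `Disconnected`.
        thread::sleep(Duration::from_millis(500));
    });

    let first = rx.recv_timeout(timeout);
    let second = rx.recv_timeout(timeout);
    sender.join().unwrap();
    [first, second]
}

fn main() {
    println!("message first: {:?}", run_sequence(true));
    println!("timeout first: {:?}", run_sequence(false));
}
```

Calling `run_sequence(false)` corresponds to the failing case: the very first call runs into the timeout.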
Backtrace
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `112055772029824`,
right: `0`', libstd/sync/mpsc/shared.rs:253:13
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::print
at libstd/sys_common/backtrace.rs:71
at libstd/sys_common/backtrace.rs:59
2: std::panicking::default_hook::{{closure}}
at libstd/panicking.rs:211
3: std::panicking::default_hook
at libstd/panicking.rs:227
4: std::panicking::rust_panic_with_hook
at libstd/panicking.rs:475
5: std::panicking::continue_panic_fmt
at libstd/panicking.rs:390
6: std::panicking::begin_panic_fmt
at libstd/panicking.rs:345
7: <std::sync::mpsc::shared::Packet<T>>::decrement
at /checkout/src/libstd/macros.rs:78
8: <std::sync::mpsc::shared::Packet<T>>::recv
at /checkout/src/libstd/sync/mpsc/shared.rs:232
9: <std::sync::mpsc::Receiver<T>>::recv_deadline
at /checkout/src/libstd/sync/mpsc/mod.rs:1387
10: <std::sync::mpsc::Receiver<T>>::recv_timeout
at /checkout/src/libstd/sync/mpsc/mod.rs:1300
[my code starts here]
Code from stdlib
This is the code in question. The first assert fails for unknown reasons (not sure how to_wake is used):
    // Essentially the exact same thing as the stream decrement function.
    // Returns true if blocking should proceed.
    fn decrement(&self, token: SignalToken) -> StartResult {
        unsafe {
            assert_eq!(self.to_wake.load(Ordering::SeqCst), 0);
            let ptr = token.cast_to_usize();
            self.to_wake.store(ptr, Ordering::SeqCst);

            let steals = ptr::replace(self.steals.get(), 0);

            match self.cnt.fetch_sub(1 + steals, Ordering::SeqCst) {
                DISCONNECTED => { self.cnt.store(DISCONNECTED, Ordering::SeqCst); }
                // If we factor in our steals and notice that the channel has no
                // data, we successfully sleep
                n => {
                    assert!(n >= 0);
                    if n - steals <= 0 { return Installed }
                }
            }

            self.to_wake.store(0, Ordering::SeqCst);
            drop(SignalToken::cast_from_usize(ptr));
            Abort
        }
    }
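One possible reading of the assert, stated as an assumption and not as a confirmed description of the stdlib: `to_wake` looks like a slot that holds the blocked receiver's wake-up token as a `usize`, with `0` meaning that nobody is waiting. Under that reading the assert says that no token may be left over from an earlier blocking attempt. The toy model below illustrates only that invariant. The names `Slot`, `install` and `clear` are made up for the sketch and do not exist in the stdlib.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Toy stand-in for the `to_wake` field; not the real stdlib type.
struct Slot {
    to_wake: AtomicUsize,
}

impl Slot {
    fn new() -> Self {
        Slot { to_wake: AtomicUsize::new(0) }
    }

    /// Mirrors the first two statements of `decrement`: check that the slot
    /// is empty, then store the token. Returns the stale value as an error
    /// instead of panicking like `assert_eq!` would.
    fn install(&self, token: usize) -> Result<(), usize> {
        let stale = self.to_wake.load(Ordering::SeqCst);
        if stale != 0 {
            return Err(stale);
        }
        self.to_wake.store(token, Ordering::SeqCst);
        Ok(())
    }

    /// What has to happen after a wake-up or timeout before blocking again.
    fn clear(&self) {
        self.to_wake.store(0, Ordering::SeqCst);
    }
}

fn main() {
    let slot = Slot::new();

    // Block, get cleaned up, block again: fine.
    assert_eq!(slot.install(0x1000), Ok(()));
    slot.clear();
    assert_eq!(slot.install(0x2000), Ok(()));

    // Block again without the cleanup: the old token is still in the slot,
    // which is the situation the stdlib assert would reject.
    assert_eq!(slot.install(0x3000), Err(0x2000));
}
```

If this reading is right, the non-zero `left` value in the panic message would be such a leftover token, which would also fit with the number changing from run to run.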
Failed attempt to reproduce
The issue is 100% reproducible in my codebase (the number changes, but the panic is always the same, even after full rebuilds). I've tried to build a test case that is structured the same way as my program, but it fails to reproduce the issue. The full codebase isn't public yet.
    use std::thread;
    use std::sync::mpsc;
    use std::time::Duration;

    enum Event {
        Tick,
        Done,
    }

    fn main() {
        let (tx, rx) = mpsc::channel();

        thread::spawn(move || {
            thread::sleep(Duration::from_secs(3));
            tx.send(Event::Tick).unwrap();
            thread::sleep(Duration::from_secs(3));
            tx.send(Event::Done).unwrap();
        });

        loop {
            match rx.recv_timeout(Duration::from_secs(100)) {
                Ok(Event::Tick) => println!("tick"),
                Ok(Event::Done) => break,
                Err(mpsc::RecvTimeoutError::Timeout) => (),
                Err(mpsc::RecvTimeoutError::Disconnected) => break,
            }
        }
    }
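Two differences between this test and the failing program may be worth checking. First, the 100 second timeout here never expires before the first message arrives after 3 seconds. Second, the backtrace goes through `shared::Packet`, and a plausible assumption about the stdlib internals (not verified here) is that a channel only uses that flavor once its `Sender` has been cloned, which this test never does. The variant below is a sketch under those assumptions: the first `recv_timeout` is allowed to expire, and the sender is cloned only afterwards. The durations and the helper name `run_variant` are placeholders, and it is not confirmed that this reproduces the panic.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Event {
    Tick,
    Done,
}

/// Hypothetical variant of the test case: returns every result the
/// receiver observed, in order.
fn run_variant() -> Vec<Result<Event, RecvTimeoutError>> {
    let (tx, rx) = mpsc::channel();

    let sender = thread::spawn(move || {
        // Stay silent long enough for the first recv_timeout to expire.
        thread::sleep(Duration::from_millis(300));
        // Clone the Sender only now, after a timeout has already happened.
        let tx2 = tx.clone();
        tx2.send(Event::Tick).unwrap();
        tx.send(Event::Done).unwrap();
    });

    let mut seen = Vec::new();
    loop {
        let result = rx.recv_timeout(Duration::from_millis(100));
        // Stop on the final event or when all senders are gone.
        let stop = match result {
            Ok(Event::Done) | Err(RecvTimeoutError::Disconnected) => true,
            _ => false,
        };
        seen.push(result);
        if stop {
            break;
        }
    }
    sender.join().unwrap();
    seen
}

fn main() {
    for result in run_variant() {
        println!("{:?}", result);
    }
}
```

On a standard library without the problem, this should record at least one `Timeout` followed by `Tick` and `Done`. A panic in `shared.rs` instead would point at the stdlib and not at a dependency.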
Random thoughts
I suspect a dependency with unsound unsafe code might be at fault, but I'm running out of ideas for how to debug this (I had some issues with valgrind and am still working on getting it to run). Some pointers would be appreciated.
System info
Arch Linux with stable rustc from rustup:
rustc 1.29.0 (aa3ca1994 2018-09-11)
cargo 1.29.0 (524a578d7 2018-08-05)