cpp_supervised_app crashes sometimes during startup #175

@NicolasFussberger

Description

While executing the demo scenario described here, cpp_supervised_app sometimes crashes during startup.

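The crash comes from a panic in the health monitoring library (health_monitoring_lib in the backtrace). The logger's write to stdout fails with os error 11 (EAGAIN), and because the panic occurs inside health_monitor_start, which cannot unwind, the process aborts: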

thread '<unnamed>' (110) panicked at library/std/src/io/stdio.rs:1166:9:
failed printing to stdout: Resource temporarily unavailable (os error 11)
stack backtrace:
2026/04/24 11:16:23.9383844 214149635 000 DEMO NONE DFLT log info verbose 1 Reporting kRunning to Launch Manager
2026/04/24 11:16:23.9383844 214149635 000 DEMO NONE DFLT log info verbose 1 Signal handler started
   0:           0x481225 - <<std[ea9f4e93d12430d]::sys::backtrace::BacktraceLock>::print::DisplayBacktrace as core[d1bb96a9607206a1]::fmt::Display>::fmt
   1:           0x49b997 - core[d1bb96a9607206a1]::fmt::write
   2:           0x435cd6 - std[ea9f4e93d12430d]::io::default_write_fmt::<std[ea9f4e93d12430d]::sys::stdio::unix::Stderr>
   3:           0x480102 - <std[ea9f4e93d12430d]::sys::backtrace::BacktraceLock>::print
   4:           0x452d0c - std[ea9f4e93d12430d]::panicking::default_hook::{closure#0}
   5:           0x45538b - std[ea9f4e93d12430d]::panicking::default_hook
   6:           0x45554e - std[ea9f4e93d12430d]::panicking::panic_with_hook
   7:           0x47fc68 - std[ea9f4e93d12430d]::panicking::panic_handler::{closure#0}
   8:           0x47c9e9 - std[ea9f4e93d12430d]::sys::backtrace::__rust_end_short_backtrace::<std[ea9f4e93d12430d]::panicking::panic_handler::{closure#0}, !>
   9:           0x45345d - __rustc[43d5200f6d4d6223]::rust_begin_unwind
  10:           0x407d3c - core[d1bb96a9607206a1]::panicking::panic_fmt
  11:           0x4462c8 - std[ea9f4e93d12430d]::io::stdio::_print
  12:           0x4f68e3 - <stdout_logger[132f9830714abd2d]::StdoutLogger as score_log[627c137e28d9586]::Log>::log::{closure#0}
  13:           0x4f60a8 - <std[ea9f4e93d12430d]::thread::local::LocalKey<core[d1bb96a9607206a1]::cell::RefCell<stdout_logger[132f9830714abd2d]::FixedBufWriter<2048usize>>>>::with_borrow_mut::<<stdout_logger[132f9830714abd2d]::StdoutLogger as score_log[627c137e28d9586]::Log>::log::{closure#0}, ()>::{closure#0}
[2026/04/24 11:16:23.8457025][111][HMON][INFO] Monitoring thread started.
  14:           0x4f54ac - <std[ea9f4e93d12430d]::thread::local::LocalKey<core[d1bb96a9607206a1]::cell::RefCell<stdout_logger[132f9830714abd2d]::FixedBufWriter<2048usize>>>>::try_with::<<std[ea9f4e93d12430d]::thread::local::LocalKey<core[d1bb96a9607206a1]::cell::RefCell<stdout_logger[132f9830714abd2d]::FixedBufWriter<2048usize>>>>::with_borrow_mut<<stdout_logger[132f9830714abd2d]::StdoutLogger as score_log[627c137e28d9586]::Log>::log::{closure#0}, ()>::{closure#0}, ()>
  15:           0x4f53d7 - <std[ea9f4e93d12430d]::thread::local::LocalKey<core[d1bb96a9607206a1]::cell::RefCell<stdout_logger[132f9830714abd2d]::FixedBufWriter<2048usize>>>>::with::<<std[ea9f4e93d12430d]::thread::local::LocalKey<core[d1bb96a9607206a1]::cell::RefCell<stdout_logger[132f9830714abd2d]::FixedBufWriter<2048usize>>>>::with_borrow_mut<<stdout_logger[132f9830714abd2d]::StdoutLogger as score_log[627c137e28d9586]::Log>::log::{closure#0}, ()>::{closure#0}, ()>
  16:           0x4f54f6 - <std[ea9f4e93d12430d]::thread::local::LocalKey<core[d1bb96a9607206a1]::cell::RefCell<stdout_logger[132f9830714abd2d]::FixedBufWriter<2048usize>>>>::with_borrow_mut::<<stdout_logger[132f9830714abd2d]::StdoutLogger as score_log[627c137e28d9586]::Log>::log::{closure#0}, ()>
  17:           0x4f97ca - <stdout_logger[132f9830714abd2d]::StdoutLogger as score_log[627c137e28d9586]::Log>::log
  18:           0x431655 - <alloc[1af107bdbdd7bf3b]::boxed::Box<dyn score_log[627c137e28d9586]::Log> as score_log[627c137e28d9586]::Log>::log
  19:           0x4217ad - <health_monitoring_lib[db032d45c09879ab]::supervisor_api_client::score_supervisor_api_client::ScoreSupervisorAPIClient>::new
  20:           0x4c1349 - <health_monitoring_lib[db032d45c09879ab]::health_monitor::HealthMonitor>::start_internal
  21:           0x41b83f - health_monitor_start
  22:           0x40eb81 - _ZN5score2hm13HealthMonitor5startEv
  23:           0x40b4ee - main
  24:     0x7f13986651ca - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  25:     0x7f139866528b - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:360:3
  26:           0x40a79a - _start
  27:                0x0 - <unknown>
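
The first panic above is raised by std's `_print` (the function behind `println!`), which panics when the write to stdout fails. On Linux, os error 11 is EAGAIN, which std reports as `ErrorKind::WouldBlock`; this typically means stdout was in non-blocking mode and could not accept more data at that moment, though whether that is the case in this setup is an assumption. A minimal sketch of a write path that returns the error instead of panicking could look like the following. `write_log_line` and `max_retries` are hypothetical names for illustration and are not taken from stdout_logger.

```rust
use std::io::{self, ErrorKind, Write};

/// Hypothetical helper (not part of stdout_logger): writes one log line and
/// returns an error to the caller instead of panicking like `println!`.
fn write_log_line<W: Write>(out: &mut W, line: &str, max_retries: u32) -> io::Result<()> {
    let mut buf = line.as_bytes();
    let mut retries = 0;
    while !buf.is_empty() {
        match out.write(buf) {
            // A zero-length write would loop forever, so treat it as an error.
            Ok(0) => return Err(ErrorKind::WriteZero.into()),
            // Partial write: continue with the remaining bytes.
            Ok(n) => buf = &buf[n..],
            // Interrupted by a signal: simply try again.
            Err(e) if e.kind() == ErrorKind::Interrupted => {}
            // EAGAIN: retry a bounded number of times, then give up.
            Err(e) if e.kind() == ErrorKind::WouldBlock && retries < max_retries => {
                retries += 1;
                std::thread::yield_now();
            }
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

fn main() {
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    // On failure the line is dropped; a real logger might count dropped lines.
    if let Err(e) = write_log_line(&mut handle, "Monitoring thread started.\n", 3) {
        let _ = writeln!(io::stderr(), "log line dropped: {e}");
    }
}
```

Dropping a log line after a few retries is a design choice made for this sketch; blocking until the write succeeds, or buffering the line for later, would be alternatives with different trade-offs.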


thread '<unnamed>' (110) panicked at library/core/src/panicking.rs:255:5:
panic in a function that cannot unwind
stack backtrace:
   0:           0x481225 - <<std[ea9f4e93d12430d]::sys::backtrace::BacktraceLock>::print::DisplayBacktrace as core[d1bb96a9607206a1]::fmt::Display>::fmt
   1:           0x49b997 - core[d1bb96a9607206a1]::fmt::write
   2:           0x435cd6 - std[ea9f4e93d12430d]::io::default_write_fmt::<std[ea9f4e93d12430d]::sys::stdio::unix::Stderr>
   3:           0x480102 - <std[ea9f4e93d12430d]::sys::backtrace::BacktraceLock>::print
   4:           0x452d0c - std[ea9f4e93d12430d]::panicking::default_hook::{closure#0}
   5:           0x45538b - std[ea9f4e93d12430d]::panicking::default_hook
   6:           0x45554e - std[ea9f4e93d12430d]::panicking::panic_with_hook
   7:           0x47fc9a - std[ea9f4e93d12430d]::panicking::panic_handler::{closure#0}
   8:           0x47c9e9 - std[ea9f4e93d12430d]::sys::backtrace::__rust_end_short_backtrace::<std[ea9f4e93d12430d]::panicking::panic_handler::{closure#0}, !>
   9:           0x45345d - __rustc[43d5200f6d4d6223]::rust_begin_unwind
  10:           0x407b2d - core[d1bb96a9607206a1]::panicking::panic_nounwind_fmt
  11:           0x407a6b - core[d1bb96a9607206a1]::panicking::panic_nounwind
  12:           0x407c37 - core[d1bb96a9607206a1]::panicking::panic_cannot_unwind
  13:           0x41b78a - health_monitor_start
  14:           0x40eb81 - _ZN5score2hm13HealthMonitor5startEv
  15:           0x40b4ee - main
  16:     0x7f13986651ca - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  17:     0x7f139866528b - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:360:3
  18:           0x40a79a - _start
  19:                0x0 - <unknown>
thread caused non-unwinding panic. aborting
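
The second panic ("panic in a function that cannot unwind") is what turns the failed print into a process abort: the original panic reaches `health_monitor_start`, which is called from C++ and cannot unwind. One common way to avoid the abort is to catch the panic on the Rust side and return an error code. The sketch below shows this under stated assumptions: `demo_monitor_start`, `start_internal(fail)` and the `HM_*` codes are hypothetical stand-ins, not the actual health_monitoring_lib API, and the approach only works when the crate is built with the default `panic = "unwind"` strategy.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical status codes returned to the C++ caller.
const HM_OK: i32 = 0;
const HM_ERR_PANIC: i32 = -1;

/// Stand-in for the Rust-side start logic; `fail` simulates a panicking
/// logger call such as the failed print in the backtrace.
fn start_internal(fail: bool) -> i32 {
    if fail {
        panic!("failed printing to stdout");
    }
    HM_OK
}

/// Hypothetical C entry point. `catch_unwind` stops the panic before it
/// reaches the `extern "C"` frame, so the caller sees an error code
/// instead of the process being aborted.
pub extern "C" fn demo_monitor_start(fail: bool) -> i32 {
    match panic::catch_unwind(AssertUnwindSafe(|| start_internal(fail))) {
        Ok(code) => code,
        Err(_) => HM_ERR_PANIC,
    }
}

fn main() {
    // The panic hook still reports the panic on stderr for the failing call.
    println!("ok path: {}", demo_monitor_start(false));
    println!("panic path: {}", demo_monitor_start(true));
}
```

This only contains the failure at the boundary; the underlying stdout write error would still need to be handled in the logger itself.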

Analysis results

No response

Solution

No response

Error Occurrence Rate

Sporadic

How to reproduce

The issue was reproduced with the current lifecycle main (7fc47fb).

1. Start a tmux session.
2. Run the demo: bazel run //examples:run_examples --config=x86_64-linux -- tmux
3. Click through the demo scenario.
4. Within the right tmux pane, switch between RunTargets repeatedly to trigger a stop and restart of cpp_supervised_app:

lmcontrol Running
lmcontrol Startup

Supporting Information

The crash occurs while starting the cpp_supervised_app example.

Classification

Minor

First Affected Release

not released (main)

Last Affected Release

not released (main)

Expected Fixed Release

before release (main)

Category

  • Safety Relevant
  • Security Relevant

Metadata

Assignees

Labels

No labels

Type

Projects

Status

Ready

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
