Discovered while tackling PR #72.
Summary
`daqiri_bench_rdma --mode both` (and likely any bench using the RDMA / socket / DPDK managers) segfaults during teardown when invoked from a bash-parent shell. The segfault fires after the legitimate shutdown completes, inside C++ `__cxa_finalize`.
Repro
On DGX Spark with `scripts/setup_rdma_loopback.sh` applied:
```bash
bash -c './build/examples/daqiri_bench_rdma examples/daqiri_bench_rdma_tx_rx_spark.yaml --seconds 30 --mode both; echo "bench exit: $?"'
```
Pre-fix: SIGSEGV, 100% reproducible. The bench emits output normally for the full 30s, then crashes during exit with `bench exit: 139` (128 + 11, i.e. terminated by SIGSEGV).
Root cause
Each `Manager::shutdown()` implementation begins with a `DAQIRI_LOG_INFO(...)` and is invoked twice in the typical bench lifecycle:
- Explicitly from `main()` via `daqiri::shutdown()` — body runs, `initialized_` cleared.
- Again from the manager's destructor during C++ `__cxa_finalize` (for `SocketMgr` in RoCE mode, the cascade is `~SocketMgr → SocketMgr::shutdown → RdmaMgr::shutdown`).
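By the time the destructor cascade fires, spdlog's default logger (a function-local static created lazily on the first `DAQIRI_LOG_INFO`) has already been destroyed. Static objects are destroyed in reverse order of construction, so a logger first touched after the manager was constructed is torn down before the manager's destructor runs. The log call at the top of the second `shutdown()` dereferences the freed sink and crashes inside `spdlog::logger::sink_it_`.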
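The ordering described above can be modelled with a minimal sketch. Everything in it is hypothetical (`ToyLogger`, `ToyManager`, `log_info`, `g_logger_state`); it does not use spdlog or daqiri. A plain state flag stands in for the real dangling access, so the sketch reports the problem instead of crashing.

```cpp
#include <cstdio>

// Hypothetical stand-ins for illustration only; this is not spdlog or daqiri code.
enum class LoggerState { NotCreated, Alive, Destroyed };

// Constant-initialized flag, so it is safe to read at any point in the process lifetime.
LoggerState g_logger_state = LoggerState::NotCreated;

struct ToyLogger {
    ToyLogger()  { g_logger_state = LoggerState::Alive; }
    ~ToyLogger() { g_logger_state = LoggerState::Destroyed; }
    void info(const char* msg) { std::fprintf(stderr, "[info] %s\n", msg); }
};

// Function-local static: constructed on first call, destroyed during exit.
ToyLogger& default_logger() {
    static ToyLogger logger;
    return logger;
}

// Returns false when the logger is already gone. In the real crash there is
// no such check and the call dereferences a destroyed object.
bool log_info(const char* msg) {
    if (g_logger_state == LoggerState::Destroyed) {
        std::fprintf(stderr, "[unsafe] logger already destroyed, dropped: %s\n", msg);
        return false;
    }
    default_logger().info(msg);
    return true;
}

// Toy manager whose destructor re-enters shutdown(), mirroring the backtrace.
struct ToyManager {
    void shutdown() { log_info("ToyManager::shutdown"); }
    ~ToyManager() { shutdown(); }
};

// Set up before main(), i.e. before the logger's first use,
// so it is destroyed AFTER the logger.
ToyManager g_manager;
```

Calling `g_manager.shutdown()` from `main()` logs normally and creates the logger. The destructor-driven second call at exit then takes the `Destroyed` branch, which corresponds to the point where the real code faults.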
Backtrace
```
spdlog::logger::sink_it_ ← SIGSEGV here
spdlog::logger::log_
daqiri::log_formatted_message
RdmaMgr::shutdown +0x70 ← 2nd call's first DAQIRI_LOG_INFO
SocketMgr::shutdown +0x214
SocketMgr::~SocketMgr (D1)
SocketMgr deleting destructor (D0)
__cxa_finalize +0x16c
```
Diagnostic notes
- The bash-vs-tini asymmetry initially looked like a process-group / signal issue; it turned out to be a red herring caused by different static destruction order under different `LD_LIBRARY_PATH` setups. The container's `LD_LIBRARY_PATH=/opt/daqiri/lib` overrides the rebuilt library at `build/src/`, so a freshly built manager isn't always exercised at runtime. Pass `-e LD_LIBRARY_PATH=/work/build/src:/opt/daqiri/lib` on `docker run` to load the rebuild.
- The spdlog logger in daqiri (`src/logging.cpp:62`) uses `stderr_color_mt` without `flush_on()` configured. For shutdown-path debugging, prefer `fprintf(stderr, …); fflush(stderr);` over `DAQIRI_LOG_INFO` so messages survive past spdlog finalization.
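A minimal sketch of the kind of marker helper the last note suggests. The name `shutdown_trace` and the message format are assumptions for illustration, not part of daqiri. It relies only on C stdio, which stays usable while static destructors run because `exit()` closes C streams after those destructors have finished.

```cpp
#include <cstdio>

// Hypothetical helper: writes a marker straight to a C stream and flushes
// immediately, so it does not depend on any logger object still being alive.
// Returns the number of characters written, or a negative value on error.
int shutdown_trace(std::FILE* out, const char* where) {
    const int written = std::fprintf(out, "[shutdown-trace] %s\n", where);
    std::fflush(out);
    return written;
}
```

A call such as `shutdown_trace(stderr, "~SocketMgr enter");` at the top of a destructor gives an ordered trace of the teardown path without touching spdlog.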
Fix
Small idempotency guard at the top of each manager's `shutdown()`:
```cpp
if (!initialized_) { return; }
```
Applied symmetrically to `RdmaMgr::shutdown()`, `DpdkMgr::shutdown()`, and `SocketMgr::shutdown()`, since the log-first, body-second pattern is identical in all three. `DpdkMgr`'s existing `num_init` reference-counted body is preserved; the guard only activates after the body has cleared `initialized_` on the final shutdown.
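To show the guard in context, here is a minimal sketch with a hypothetical `GuardedManager` class. Only `initialized_` and the guard line come from the fix; the class, `initialize()`, and `body_runs()` are invented for illustration, and `DpdkMgr`'s reference counting is not modelled.

```cpp
#include <cstdio>

// Hypothetical manager used only to illustrate the guard; not daqiri code.
class GuardedManager {
public:
    void initialize() { initialized_ = true; }

    void shutdown() {
        if (!initialized_) { return; }  // idempotency guard: must come before any logging

        // Stands in for the DAQIRI_LOG_INFO(...) at the top of the real body.
        std::fprintf(stderr, "GuardedManager: shutting down\n");
        ++body_runs_;
        initialized_ = false;  // later calls now return at the guard
    }

    // Destructor re-enters shutdown(), as in the backtrace; with the guard
    // this is a no-op once an explicit shutdown has already happened.
    ~GuardedManager() { shutdown(); }

    int body_runs() const { return body_runs_; }

private:
    bool initialized_ = false;
    int body_runs_ = 0;
};
```

With this shape, an explicit `shutdown()` followed by the destructor runs the body exactly once, and the second entry returns before reaching the log call.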
Post-fix: clean exit 0, all destructor markers run in order.
PR forthcoming.