moq-native(jemalloc): drop runtime activation; fixes moq-boy startup crash#1509
Conversation
Heap profiling is a debug aid, but jemalloc::run() returned anyhow::Result<()> and both moq-relay and moq-boy treated a returned Err as fatal via `Err(err) = jemalloc => return Err(err).context(...)` arms in their top-level select. The result: an unrelated jemalloc setup hiccup (prof.active unwritable, SIGUSR1 handler install failing) could take down a running relay. Convert all `?` paths inside run() to `tracing::warn!` + `return Ok(())`. The existing `Err(err) = jemalloc` select arms in callers stay (signature is unchanged) but the pattern simply never matches, so the binaries keep running. Closes #1507. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause of moq-boy crashing on first startup with "name or mib specifies an unknown/invalid value": the `Ok(false) => write(prof.active, true)` branch tries to flip profiling on at runtime, but jemalloc only initializes the profiling backend when `opt.prof=true` at init. Writing prof.active later returns EINVAL. moq-relay's nix module sets MALLOC_CONF=prof:true,prof_active:true at the systemd unit, so it hits the `Ok(true)` branch and never noticed. moq-boy has no such wrapper, takes the dead `Ok(false)` path, and crashes. Drop the activation attempt entirely. If prof.active reads false, log a hint to set MALLOC_CONF=prof:true,prof_active:true at startup and return Ok(()). Reverts the warn-on-EINVAL coverup from the previous commit on this branch in favor of fixing the actual broken path.
|
Caution Review failedFailed to post review comments WalkthroughThe PR makes jemalloc profiling activation in run() best-effort: read failures or a compiled-but-inactive 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
rs/moq-native/src/jemalloc.rs (1)
8-9: 💤 Low valueClarify the early-return behavior in the docstring.
The phrase "Returns when profiling can't be set up" could be clearer. The function returns early with
Ok(())on setup failure, but runs indefinitely in a signal-handling loop on success.📝 Suggested docstring improvement
-/// Profiling is a debug aid, so any setup failure is logged and swallowed -/// rather than propagated. Returns when profiling can't be set up. +/// Profiling is a debug aid, so any setup failure is logged and the function +/// returns early with Ok(()). On successful setup, loops indefinitely handling SIGUSR1.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rs/moq-native/src/jemalloc.rs` around lines 8 - 9, Clarify the docstring for the profiling setup function in rs/moq-native/src/jemalloc.rs: state explicitly that on setup failure the function returns early with Ok(()) (logging the error and swallowing it), but on successful setup it does not return and instead runs indefinitely in the signal-handling loop; update the sentence "Returns when profiling can't be set up" to something like "Returns Ok(()) if profiling setup fails (error logged and swallowed); on success the function enters the signal-handling loop and does not return."
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@rs/moq-native/src/jemalloc.rs`:
- Around line 8-9: Clarify the docstring for the profiling setup function in
rs/moq-native/src/jemalloc.rs: state explicitly that on setup failure the
function returns early with Ok(()) (logging the error and swallowing it), but on
successful setup it does not return and instead runs indefinitely in the
signal-handling loop; update the sentence "Returns when profiling can't be set
up" to something like "Returns Ok(()) if profiling setup fails (error logged and
swallowed); on success the function enters the signal-handling loop and does not
return."
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2e6a1a97-f909-422a-8456-1a441937c2dd
📒 Files selected for processing (1)
rs/moq-native/src/jemalloc.rs
|
@coderabbitai the docstring nit is moot — commit 3b8c269 (after the review snapshot at 124b97f) rewrote the docstring to describe the new contract ("Profiling must be enabled at startup via MALLOC_CONF=prof:true ... jemalloc only initializes the profiling backend when opt.prof is set at init; toggling prof.active later returns EINVAL"), since I also dropped the runtime-activation branch entirely as the real root cause fix. (Written by Claude) |
|
Tip For best results, initiate chat on the files or code changes.
|
Summary
Root cause of moq-boy crashing on first startup with
jemalloc profiler failed: name or mib specifies an unknown/invalid value:rs/moq-native/src/jemalloc.rs had a branch that tried to flip on profiling at runtime:
But jemalloc only initializes the profiling backend if
opt.prof=trueat init. Writingprof.activelater returns EINVAL. The branch was dead code in the success case and a crash in the failure case.moq-relay's nix module sets
MALLOC_CONF=prof:true,prof_active:trueat the systemd unit, so it hits theOk(true)branch and never noticed. moq-boy compiles in the same jemalloc feature but is launched without any MALLOC_CONF wrapper, takes the brokenOk(false)path, and crashes. systemd restarts it, restart works fine if the env got applied elsewhere, masking the bug.This drops the activation attempt entirely. If
prof.activereads false, log a hint to setMALLOC_CONF=prof:true,prof_active:trueat startup and return Ok(()).Closes #1507.
Test plan
cargo check --features jemalloconrs/moq-nativebuilds cleanlyMALLOC_CONF(previously crashed; now logs a hint and continues without profiling)Ok(true)branch unchanged)MALLOC_CONF=prof:true,prof_active:true,prof_prefix=/tmp/heapand send SIGUSR1 — should still dumpFollow-up (separate PR)
moq-boy probably wants a launcher / nix module mirroring the relay's
MALLOC_CONFenv so heap profiling is opt-in via the same surface as the relay, instead of relying on the caller to set it manually.(Written by Claude)