Skip to content

security: sanitize panic recovery to avoid leaking internal state#85

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 4 commits intomainfrom
fix/panic-leak
Mar 15, 2026
Merged

security: sanitize panic recovery to avoid leaking internal state#85
gh-worker-dd-mergequeue-cf854d[bot] merged 4 commits intomainfrom
fix/panic-leak

Conversation

@julesmcrt
Copy link
Collaborator

Summary

  • recover() handlers in Run() and the pipe goroutine returned fmt.Errorf("internal error: %v", rec) — the raw panic value (which may include memory addresses or internal state) was exposed to callers
  • Now writes the panic value and full stack trace to os.Stderr for forensic use
  • Returns a generic "internal error" string to callers, with no panic details
  • Updates the existing TestRunRecoversPanic test to assert the new behaviour

Security finding

SECURITY_FINDINGS.md — Panic recovery leaks internal state (MEDIUM)

Test plan

  • All existing tests pass (including updated TestRunRecoversPanic)
  • Panic details are no longer returned in the error value

🤖 Generated with Claude Code

The recover() handlers in Run() and the pipe goroutine included the raw panic
value verbatim in the returned error via fmt.Errorf("internal error: %v", rec).
This could expose memory addresses, internal struct contents, or other state
to untrusted callers.

Write the panic value and full stack trace to os.Stderr for forensic purposes,
and return a generic "internal error" to callers. Update the existing panic
recovery test to assert the new behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bb0f260c37

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Use the runner's configured stderr writer (r.stderr / rLeft.stderr) for
panic diagnostic output in both Run's defer/recover and the pipe goroutine
recover, falling back to os.Stderr only if nil. This keeps panic details
within the caller-configured I/O boundary instead of leaking them to the
process-global stderr.

Update TestRunRecoversPanic to assert that panic details appear in the
runner's stderr buffer rather than only checking the returned error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 873578b979

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

interp/api.go Outdated
defer func() {
if rec := recover(); rec != nil {
retErr = fmt.Errorf("internal error: %v", rec)
panicOut := r.stderr

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid nil dereference in panic recovery

When Run is called on a nil *Runner, the first panic (at if !r.didReset) is caught by the deferred recover, but this new recovery path immediately dereferences r again via r.stderr. That causes a second panic ([recovered, repanicked]) and crashes the caller instead of returning an error, which is a regression from the previous implementation that converted such panics into an internal error return value.

Useful? React with 👍 / 👎.

When Run is called on a nil *Runner, the nil dereference at r.didReset
panics and is caught by the deferred recover. However, the previous
recovery path immediately dereferenced r again via r.stderr, causing a
second panic (recovered, repanicked) instead of returning an error.

Fix by checking r != nil before accessing r.stderr.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 11f6967dfc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

interp/api.go Outdated
if panicOut == nil {
panicOut = os.Stderr
}
fmt.Fprintf(panicOut, "rshell: internal panic: %v\n%s\n", rec, debug.Stack())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prevent secondary panic in Run panic-recovery logging

The new fmt.Fprintf inside the recover() block can itself panic when r.stderr is a bad io.Writer (for example, a typed-nil writer or a writer whose Write panics), which turns an internal panic into an unrecovered crash instead of returning the intended generic internal error. This only occurs when an internal panic happens, but in that path the recovery handler should be panic-safe; consider guarding the logging write with its own recover or using a non-panicking best-effort sink.

Useful? React with 👍 / 👎.

if panicOut == nil {
panicOut = os.Stderr
}
fmt.Fprintf(panicOut, "rshell: internal panic: %v\n%s\n", rec, debug.Stack())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Harden pipe-goroutine recover logger against writer panics

The left-side pipe goroutine now logs panic details via fmt.Fprintf inside its recover() defer, but this write can panic if rLeft.stderr ultimately wraps a panicking/typed-nil writer, causing the goroutine to panic again and potentially terminate the process. This requires both an internal panic and a problematic stderr writer, but recovery code should not introduce a new panic path; use a panic-safe best-effort write here as well.

Useful? React with 👍 / 👎.

If r.stderr (or its underlying writer) is itself a panicking writer,
fmt.Fprintf inside the recover() block would cause a secondary panic
([recovered, repanicked]) instead of returning internal error.

Guard both the Run handler and the left-pipe-goroutine handler with a
nested best-effort recover so the write is silenced on failure and the
outer recovery path always completes cleanly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julesmcrt
Copy link
Collaborator Author

@codex review

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Bravo.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@julesmcrt julesmcrt marked this pull request as ready for review March 13, 2026 14:36
@julesmcrt
Copy link
Collaborator Author

/merge

@gh-worker-devflow-routing-ef8351
Copy link

gh-worker-devflow-routing-ef8351 bot commented Mar 15, 2026

View all feedbacks in Devflow UI.

2026-03-15 14:14:56 UTC ℹ️ Start processing command /merge


2026-03-15 14:15:01 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 1m (p90).


2026-03-15 14:15:54 UTC ℹ️ MergeQueue: This merge request was merged

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d bot merged commit 16b393f into main Mar 15, 2026
13 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d bot deleted the fix/panic-leak branch March 15, 2026 14:15
julesmcrt added a commit that referenced this pull request Mar 16, 2026
PRs #78 and #85 introduced panic recovery in interp/api.go and
interp/runner_exec.go that use runtime/debug.Stack() and os.Stderr,
but those symbols were not added to interpAllowedSymbols, causing
TestInterpAllowedSymbols and TestVerificationInterpCleanPass to fail
on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
julesmcrt added a commit that referenced this pull request Mar 16, 2026
Drop the debug.Stack() call from the panic recovery handlers in
api.go and runner_exec.go — emitting a raw stack trace leaks internal
implementation details and contradicts the sanitization goal of #85.
The panic value alone is sufficient for diagnosis.

This also removes the need to add runtime/debug and runtime/debug.Stack
to the interp allowlist; only os.Stderr remains as a new entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gh-worker-dd-mergequeue-cf854d bot pushed a commit that referenced this pull request Mar 16, 2026
…96)

fix: add runtime/debug.Stack and os.Stderr to interp allowlist

PRs #78 and #85 introduced panic recovery in interp/api.go and
interp/runner_exec.go that use runtime/debug.Stack() and os.Stderr,
but those symbols were not added to interpAllowedSymbols, causing
TestInterpAllowedSymbols and TestVerificationInterpCleanPass to fail
on all platforms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix: remove runtime/debug.Stack from panic recovery in interp

Drop the debug.Stack() call from the panic recovery handlers in
api.go and runner_exec.go — emitting a raw stack trace leaks internal
implementation details and contradicts the sanitization goal of #85.
The panic value alone is sufficient for diagnosis.

This also removes the need to add runtime/debug and runtime/debug.Stack
to the interp allowlist; only os.Stderr remains as a new entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix: use io.Discard instead of os.Stderr as panic fallback in interp

When the runner has no stderr configured, the panic recovery handlers
were falling back to os.Stderr, causing rshell to write directly to
the process stderr — undesirable for embedders like datadog-agent.

Replace the fallback with io.Discard so the message is silently
dropped; the caller still receives "internal error" via the returned
error. This also removes the need to add os.Stderr to the interp
allowlist, leaving it unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-authored-by: jules.macret <jules.macret@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants