fix: propagate child process exit code by mtojek · Pull Request #202 · coder/boundary

mtojek · 2026-05-19T11:30:30Z

Fixes #190

Both LandJail.Run() and NSJailManager.Run() always returned nil, discarding the child process exit code. The boundary process exited 0 regardless of what the target command returned.

Changes

landjail/manager.go, nsjail_manager/manager.go: Capture the child process error via a buffered channel instead of discarding it in the goroutine. Run() now blocks on the channel after the select and returns the error (which includes *exec.ExitError with the correct code).
landjail/child.go, nsjail_manager/child.go: Return the raw *exec.ExitError from cmd.Run() instead of calling os.Exit() or wrapping it in fmt.Errorf(). Serpent's RunCommandError has Unwrap(), so errors.As can find *exec.ExitError through the entire chain.
cli/cli.go: The handler calls os.Exit(exitCode) when the child process exits with a non-zero code. All cleanup (proxy stop, etc.) has already happened inside Run() via defers. This ensures the correct exit code is propagated regardless of how the caller handles errors, both as a standalone binary and when embedded as a coder boundary subcommand (no changes needed in coder/coder).
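As a rough illustration of why the wrapping layers do not hide the exit code, the sketch below runs a child command, wraps its error with %w, wraps it again in a type that implements Unwrap(), and then recovers the code with errors.As. The commandError type is a hypothetical stand-in for a framework wrapper such as serpent's RunCommandError, and runChild and exitCode are illustrative names, not the functions from this PR. The example assumes an sh binary is available on PATH.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// commandError is a hypothetical stand-in for a framework wrapper such as
// serpent's RunCommandError. Only the Unwrap method matters for errors.As.
type commandError struct {
	err error
}

func (e *commandError) Error() string { return "running command: " + e.err.Error() }
func (e *commandError) Unwrap() error { return e.err }

// runChild runs the command and wraps any failure with %w, so the original
// error (for example *exec.ExitError) stays reachable through the chain.
func runChild(name string, args ...string) error {
	if err := exec.Command(name, args...).Run(); err != nil {
		return fmt.Errorf("command execution failed: %w", err)
	}
	return nil
}

// exitCode maps an error chain to a process exit code: 0 for nil, the
// child's own code when an *exec.ExitError is in the chain, 1 otherwise.
func exitCode(err error) int {
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return 1
}

func main() {
	// Two layers of wrapping: fmt.Errorf with %w, then commandError.
	wrapped := &commandError{err: runChild("sh", "-c", "exit 42")}
	fmt.Println("exit code:", exitCode(wrapped))
}
```

A command that cannot be started at all produces an error that is not an *exec.ExitError, so this sketch falls back to 1, which is consistent with the "command not found → 1" case in the test instructions.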

How to test

Build and run:

go build -o ./boundary ./cmd/boundary/

# success → 0
./boundary --jail-type landjail -- true; echo $?

# false → 1
./boundary --jail-type landjail -- false; echo $?

# arbitrary exit code → 42
./boundary --jail-type landjail -- bash -c 'exit 42'; echo $?

# arbitrary exit code → 127
./boundary --jail-type landjail -- bash -c 'exit 127'; echo $?

# command not found → 1
./boundary --jail-type landjail -- no-such-cmd; echo $?

landjail requires kernel 6.7+ (Landlock V4). Use --jail-type nsjail with appropriate privileges on older kernels.

Generated by Coder Agents

Both LandJail.Run() and NSJailManager.Run() always returned nil, discarding the child process exit code. The landjail child also wrapped exit codes in fmt.Errorf() instead of calling os.Exit().

Changes:
- Add exitcode.Error type to carry exit codes through the error chain
- Fix landjail child to call os.Exit(exitCode), matching nsjail behavior
- Fix both managers to capture child errors via a channel and return exitcode.Error from Run()
- Fix main.go to extract exitcode.Error before defaulting to os.Exit(1)
- Change NSJailManager.RunChildProcess to return error (was void)

Fixes #190

SasSwart

Nothing blocking given this is easy to patch and a minor fix. But a few comments to consider nonetheless.

SasSwart · 2026-05-20T12:28:53Z

+
+			// If the child process exited with a non-zero code, exit
+			// with the same code directly. All cleanup (proxy, etc.)
+			// has already happened inside Run(). Exiting here ensures
+			// the correct code is propagated regardless of how the
+			// calling framework handles errors (standalone binary or
+			// embedded as a coder subcommand).
+			var exitErr *exec.ExitError
+			if errors.As(err, &exitErr) {
+				os.Exit(exitErr.ExitCode())
+			}
+			return err


This is a convenient solution, but it makes me nervous to have an os.Exit that is more than a single level of indirection from the entrypoint like this. Looking at this bit of code here, we don't know what cleanup would have happened on the return path between here and the entry point.

I think the proper solution is to do the error checking near the entry point both here and in the coder subcommand.

In practice, the risk is low and it's easy to patch later, so I don't think this blocks the PR. It's worth a mention though.

Agreed it's not ideal. The problem is the embedded mode (coder boundary ...): we don't control coder's entrypoint, and serpent wraps our returned error in RunCommandError, so coder's main() just does os.Exit(1), losing the actual code.

To do it "properly" we'd need changes in coder/coder or serpent. This is a conscious tradeoff: all cleanup (proxy, iptables) already ran inside Run() via defers before the error returns, so the os.Exit is safe.
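The ordering argument in this reply can be checked with a small sketch: defers registered inside a function run before that function returns, so by the time the handler sees the error the cleanup has already happened. Everything below is hypothetical. run stands in for Run(), handler for the CLI handler, and the exit function is injected so the example can record events instead of terminating; real code would pass os.Exit. The sketch says nothing about defers registered by callers further up the stack, which is the concern raised in the review.

```go
package main

import (
	"errors"
	"fmt"
)

// errChild is a placeholder for the error returned by the child process.
var errChild = errors.New("child exited with code 42")

// run is a hypothetical stand-in for Run(): it registers cleanup with defer
// and returns the child's error.
func run(events *[]string) error {
	defer func() { *events = append(*events, "cleanup") }()
	*events = append(*events, "child finished")
	return errChild
}

// handler mimics the CLI handler: it calls exit only after run has returned,
// which means run's deferred cleanup has already executed.
func handler(events *[]string, exit func(code int)) error {
	err := run(events)
	if err != nil {
		*events = append(*events, "exit")
		exit(42)
	}
	return err
}

// trace runs the handler with a no-op exit function and returns the
// recorded order of events.
func trace() []string {
	var events []string
	_ = handler(&events, func(int) {})
	return events
}

func main() {
	fmt.Println(trace())
}
```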

Let me know your thoughts!

SasSwart · 2026-05-20T12:35:03Z

-		// This is an unexpected error
-		logger.Error("Command execution failed", "error", err)
-		return fmt.Errorf("command execution failed: %v", err)
+		return err


nit: I feel like the error wrapping here was useful. It provides a better single description of the failure path than disjoint debug logs that might be filtered out.

Ok, brought it back 👍

SasSwart · 2026-05-20T12:40:10Z

+	// error is already buffered. In the signal path the child may still
+	// be running; return nil so deferred cleanup (iptables, proxy) can
+	// proceed before the process exits.
+	select {


Asking for clarity:

Why do we need a second select here instead of a three-case select above?

	select {
	case sig := <-sigChan:
		// ...
	case err := <-childErr:
		// ...
	case <-ctx.Done():
		// ...
	}

When the child finishes, the goroutine sends on childErr AND calls defer cancel(), which closes ctx.Done(). So both channels are ready at roughly the same time. Go picks randomly between ready cases - if we land in ctx.Done() instead of childErr, we lose the exit code error and return nil.

Two selects avoid that: first one waits for signal or context cancellation, second one (non-blocking) drains the child result. In the ctx.Done path the error is already buffered so we always get it. In the signal path the child may still be running, so default: return nil lets deferred cleanup proceed.
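The two-select pattern described above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the code from manager.go: waitForChild, simulateChildExit, simulateSignal, and errChildFailed are hypothetical names, and the child process is replaced by a goroutine that sends a fixed error on a buffered channel and then cancels the context.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
)

// errChildFailed is a placeholder for the error a real child would return.
var errChildFailed = errors.New("child exited with code 42")

// waitForChild waits for a signal or for context cancellation, then drains
// the child result without blocking.
func waitForChild(ctx context.Context, sigChan <-chan os.Signal, childErr <-chan error) error {
	// First select: childErr is deliberately not a case here, so a ready
	// ctx.Done() can never win a random pick against it.
	select {
	case <-sigChan:
	case <-ctx.Done():
	}

	// Second select: non-blocking. If the child finished, its error is
	// already buffered. If a signal arrived first, the child may still be
	// running, so return nil and let deferred cleanup proceed.
	select {
	case err := <-childErr:
		return err
	default:
		return nil
	}
}

// simulateChildExit mimics the goroutine in Run(): it sends the result on a
// buffered channel and only then cancels the context.
func simulateChildExit(result error) error {
	ctx, cancel := context.WithCancel(context.Background())
	sigChan := make(chan os.Signal, 1)
	childErr := make(chan error, 1)
	go func() {
		defer cancel()
		childErr <- result
	}()
	return waitForChild(ctx, sigChan, childErr)
}

// simulateSignal mimics a signal arriving while the child is still running.
func simulateSignal() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	sigChan := make(chan os.Signal, 1)
	childErr := make(chan error, 1)
	sigChan <- os.Interrupt
	return waitForChild(ctx, sigChan, childErr)
}

func main() {
	lost := 0
	for i := 0; i < 1000; i++ {
		if simulateChildExit(errChildFailed) != errChildFailed {
			lost++
		}
	}
	fmt.Println("lost child errors:", lost)
	fmt.Println("signal path result:", simulateSignal())
}
```

The sketch relies on the send to the buffered channel completing before cancel() runs in the same goroutine, so once ctx.Done() is observed the error is already available to the non-blocking receive.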


mtojek force-pushed the mtojek/fix-exit-code-propagation branch from 02622ca to 61d23c4 Compare May 19, 2026 11:34

mtojek force-pushed the mtojek/fix-exit-code-propagation branch from 61d23c4 to cb56ca1 Compare May 19, 2026 11:38

mtojek requested a review from SasSwart May 19, 2026 11:42

mtojek marked this pull request as ready for review May 19, 2026 11:42

SasSwart approved these changes May 20, 2026


fix: wrap exit error with descriptive message using %w

8b03bb7

Addresses review feedback: re-add error wrapping for ExitError with fmt.Errorf and %w verb so the error message is descriptive while preserving the *exec.ExitError type for errors.As().

mtojek force-pushed the mtojek/fix-exit-code-propagation branch from 7b2e933 to 8b03bb7 Compare May 20, 2026 13:48

mtojek merged commit 3e1e57b into main May 21, 2026
5 checks passed

mtojek deleted the mtojek/fix-exit-code-propagation branch May 21, 2026 07:14

mtojek merged 2 commits into main from mtojek/fix-exit-code-propagation
