fix(lxc): resolve unprivileged lxcpath instead of hard-coding /var/lib/lxc by caarlos0 · Pull Request #275 · microsoft/mxc

caarlos0 · 2026-05-11T20:49:07Z

I have signed the Contributor License Agreement.
This pull request is related to an issue.
If this PR changes build commands, project architecture, or key conventions, I have updated .github/copilot-instructions.md.

Summary

Fixes the LXC unprivileged-mode failure reported in #274:

Error: Failed to configure filesystem: Failed to set config item lxc.mount.entry: No such file or directory (os error 2)

Root cause

LxcContainer::config_file_path() hard-coded /var/lib/lxc, but every lxc-* shell-out (lxc-create, lxc-info, lxc-start, ...) was invoked without -P. As a non-root user, liblxc auto-routes the container to ~/.local/share/lxc/<name>/config (per XDG_DATA_HOME / HOME); our wrapper then tried to append lxc.mount.entry to /var/lib/lxc/<name>/config, which doesn't exist → ENOENT.

The same hard-coded path bug also affected the denied-paths rootfs probe in filesystem_mounts.rs, where is_file() would always return false for unprivileged users and silently mask files with tmpfs instead of /dev/null.

Fix

New resolve_default_lxcpath() helper that mimics liblxc's resolution: LXC_PATH → /var/lib/lxc for root → $XDG_DATA_HOME/lxc → $HOME/.local/share/lxc → /var/lib/lxc last-resort fallback. Extracted into a parameterized resolve_lxcpath_with_env(get_env, geteuid) so every branch is covered by deterministic unit tests.
LxcContainer now stores the resolved lxc_path (always populated, no Option) and passes -P <resolved> to every lxc-* invocation, so behavior is identical regardless of how the binary is launched (cron, systemd unit with non-default HOME, etc.).
filesystem_mounts.rs denied-path rootfs probe now uses the resolved container.lxc_path() instead of the hard-coded fallback.
set_config_item's error now includes the key, value, and target config-file path, addressing the diagnostic gap also called out in #274.

Tests

Added 12 unit tests in lxc_common::lxc_bindings::tests covering:

All resolve_lxcpath_with_env branches (LXC_PATH precedence even as root, root default, XDG_DATA_HOME wins over HOME, HOME fallback, trailing-slash normalization, empty-env handling, last-resort fallback when neither env var is set).
LxcContainer::new(..., None) always yields a non-empty resolved path.
LxcContainer::new(..., Some(...)) honors explicit overrides and config_file_path() derives from the resolved path.
set_config_item error message contains key, value, and the container config path.

cargo test -p lxc_common: 25 passed (12 new). cargo build -p lxc and cargo clippy -p lxc_common --no-deps -- -D warnings clean.

Closes #274
Refs https://github.com/github/copilot-agent-runtime/issues/164

Microsoft Reviewers: Open in CodeFlow

…b/lxc When run as a non-root user, liblxc stores containers under $XDG_DATA_HOME/lxc (default ~/.local/share/lxc) rather than the system-wide /var/lib/lxc. Our LxcContainer wrapper hard-coded /var/lib/lxc when computing the config file path, while shelling out to the lxc-* CLIs without -P at all — so liblxc auto-routed container creation to the user path, then set_config_item tried to append to /var/lib/lxc/<name>/config and failed with ENOENT. Resolve the lxcpath up front using the same precedence liblxc applies (LXC_PATH > /var/lib/lxc for root > $XDG_DATA_HOME/lxc > $HOME/.local/share/lxc), store it on LxcContainer, and pass -P <resolved> to every lxc-* invocation so behavior is identical regardless of how the binary is launched. Also fix the same hard-coded path bug in filesystem_mounts.rs's denied-path rootfs probe. Improve set_config_item's error message to include the key, value, and config-file path so diagnostics like the one in #274 immediately point at the failing entry. Fixes #274 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

The previous commit unconditionally called libc::geteuid in resolve_default_lxcpath, which broke the workspace build on Windows targets (libc has no geteuid on non-Unix platforms). Wrap the call in cfg(unix) and provide a non-root fallback for non-Unix targets — the function is never invoked there in production (lxc-exec is Linux-only) but the crate must still compile workspace-wide. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

bbonaby

thanks!

bbonaby · 2026-05-11T21:30:07Z

+    // `geteuid` only exists on Unix; on other targets the function is never
+    // invoked in production (lxc-exec is Linux-only) but the crate still
+    // needs to compile workspace-wide, so fall back to a non-root EUID.
+    #[cfg(unix)]


suggestion: could use #[cfg(target_os = "linux")] for linux only

bbonaby · 2026-05-11T21:31:08Z

+    fn current_euid() -> u32 {
+        unsafe { libc::geteuid() as u32 }
+    }
+    #[cfg(not(unix))]


suggestion: could use #[cfg(not(target_os = "linux"))]

bbonaby · 2026-05-11T21:32:14Z

+    /// Build a `Command` for an `lxc-*` tool with `-P <lxc_path> -n <name>`
+    /// already populated. Centralizes the argv prefix so we can't accidentally
+    /// drop `-P` again (see #274).
+    fn lxc_command(&self, tool: &str) -> std::process::Command {


praise: good call

caarlos0 · 2026-05-11T23:17:42Z

Thanks @bbonaby - do you have plans for next release?

bbonaby · 2026-05-11T23:49:00Z

Hey @caarlos0 so we'll likely publish our new 0.2.0 npm package either tomorrow or Wednesday. Will keep you posted.

* fix(lxc): inherit stdio in lxc-attach for interactive shells Wire lxc-attach's stdin/stdout/stderr to the parent process so the inner process shares the pty driving lxc-exec. Without this, stdin is closed and interactive shells exit immediately on EOF. Mirrors the AppContainer runner on Windows, which shares the parent's ConPTY with the sandboxed child. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): skip redundant lxc-stop in destroy() `lxc-destroy -f` already force-stops a running container, but `destroy()` also called `lxc-stop` first. Plain `lxc-stop` waits up to 60 seconds for a graceful shutdown, which is fatal for distros that run systemd as PID 1 in unprivileged userns: init never cleanly responds to SIGPWR, so the wait times out before the force-stop step. Removing the prior stop drops cleanup from ~60s to ~0.2s on ubuntu and is a no-op on alpine (no init to wait on). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): bridge stdio via inner pty with detached ctty The inherit-stdio approach from the previous commit got `lxc-attach` running but produced silent shells: bash booted, printed its prompt, and then ignored everything the host wrote. Two compounding problems: 1. **Pre-buffered input was discarded.** Bash's readline init calls `tcsetattr` on its stdin, which can flush bytes the parent buffered into the pty before bash got there. Anything the CLI pre-sent (e.g. a shell wrapper writing `PS1=""; PS2=""; unset HISTFILE`) was lost. 2. **`lxc-attach` bypassed our slave fds.** When it inherits a controlling terminal from the parent, it forwards the inner container pty's I/O straight to `/dev/tty` instead of using the stdio slots we set. Bash's prompt reached our screen but the master fd we kept stayed empty, so there was no way to forward host stdin into bash either. Fix: - Allocate an inner pty pair via `nix::pty::openpty`. Give the slave to `lxc-attach` (and thus bash) for stdin/stdout/stderr; keep the master. - In `pre_exec`: `setsid()` to drop the inherited controlling terminal, then `ioctl(TIOCSCTTY)` on fd 0 so `lxc-attach` adopts our slave as its new ctty. This is what makes the slave fds actually carry the data. - Output thread copies master to host stdout and signals "ready" on the first byte from inside the container, which marks bash as past its readline init and safe to feed. - Input thread waits for "ready" (capped at 5s so a wedged shell doesn't block forever, since the higher-level command timeout takes over) before forwarding host stdin to the master. Adds the `term` feature to the workspace `nix` dependency so `pty::openpty` is available. Validated end-to-end on Ubuntu Resolute (arm64, OrbStack) running the CLI's sandboxed shell tool; bash receives wrapper text, runs commands, and exits cleanly. Supersedes the inherit-stdio approach in a3f2a66. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): gate Unix-only attach_run for Windows clippy The new pty-based attach_run uses pre_exec, openpty, TIOCSCTTY, and nix::unistd::setsid — none of which exist on Windows. Workspace clippy runs on windows-latest and was failing to compile lxc_common. Gate the Unix implementation behind cfg(unix) and provide a stub for non-Unix targets so the crate keeps compiling workspace-wide. lxc-exec itself is still Linux-only at runtime; the stub just lets the workspace build on Windows. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): use cfg(target_os = "linux") instead of cfg(unix) The codebase consistently gates platform-specific code with explicit target_os checks (target_os = "windows" / "linux" / "macos") rather than the broader unix/not(unix) family. cfg(unix) also covers macOS, where lxc isn't available either, so the explicit Linux gate is more accurate. Convert the existing current_euid gate (added in #275) and the new attach_run gate to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): destroy active container on SIGHUP/SIGTERM/SIGINT `LxcScriptRunner::run_internal` only calls `container.destroy()` after `attach_run` returns. When the runner is terminated by a fatal signal (its parent exited or sent a SIGHUP/SIGTERM/SIGINT), the in-flight `attach_run` is interrupted and the destroy block is never reached, leaving the container running indefinitely — `lxc-ls` lists it long after the CLI session that spawned it is gone. Install a `signal_cleanup` watchdog in `lxc-exec`'s `main` that: - Blocks SIGHUP/SIGTERM/SIGINT in the calling thread before any other thread is spawned, so all threads inherit the blocked mask. - Spawns a dedicated thread that synchronously `sigwait`s for any of those signals, destroys the currently active container, and exits with `128 + signo` so the parent observes a normal signal-style exit. `LxcScriptRunner::run_internal` registers the active container name with the watchdog right after resolving it, so the cleanup window covers create/start/attach as well as the normal post-attach destroy path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): gate signal_cleanup install on Linux only The new signal_cleanup module pulls in nix::sys::signal::{SigSet, Signal} and nix::Result, which are POSIX-only and don't compile on Windows. The wxc-exec-lint CI job runs cargo clippy --workspace on windows-latest, so lxc_common has to compile workspace-wide. Gate the nix imports, the watchdog thread, and the install() body behind cfg(target_os = "linux"); provide a no-op install() stub for non-Linux that returns Ok(()). Signal-driven cleanup only matters at runtime where lxc-exec actually runs, which is Linux. set_active stays cross-platform — it's a plain Mutex<Option<String>> with no syscall dependencies — so callers don't need their own cfg gates. Also switches the install() return type from nix::Result<()> to Result<(), String> so the signature exists on every target. The single caller in lxc/src/main.rs already formats the error with Display. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): cargo fmt attach_run Pre-existing formatting drift from commit 49fbbb9 (the inner-pty attach_run rewrite) — surfaced now that clippy passes and the lint workflow reaches its 'Fail if formatting check failed' step. No behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix: review Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): gate INSTALLED OnceLock to Linux The INSTALLED guard added in the review commit was placed at module scope without a cfg gate, but its only reader — install() — is cfg(target_os = "linux"). On Windows clippy that makes INSTALLED dead code: error: static `INSTALLED` is never used --> lxc_common\src\signal_cleanup.rs:29:8 Gate the static behind cfg(target_os = "linux") to match install(). ACTIVE_CONTAINER stays cross-platform — set_active is callable on every target. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix: merge conflict Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix: drop cmd Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): unblock signals in attached child shell The signal mask installed by signal_cleanup::install() blocks SIGHUP, SIGTERM, and SIGINT in the runner so the watchdog thread can sigwait on them. That mask is inherited across fork+exec, so without an explicit reset the inner shell launched by lxc-attach inherits the blocked set and silently ignores Ctrl-C and termination signals. Restore the default mask in the existing pre_exec closure so the inner process behaves like any other interactive shell. Also tighten the comment on the 5-second readiness gate: it only caps how long the stdin forwarder waits before kicking in, it is not a substitute for the (still unimplemented) script_timeout. The original wording overstated the protection it provides. Addresses Copilot review feedback on PR #283. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): only register containers for signal cleanup when destroyOnExit is true LxcScriptRunner unconditionally called signal_cleanup::set_active so the watchdog thread would lxc-destroy the container on a fatal signal. That is the right behavior for the default case, but with `lifecycle.destroyOnExit = false` the normal completion path deliberately preserves the container. A SIGTERM during such a run would silently delete a container the caller asked to keep. Gate the registration on self.destroy_on_exit so the signal path mirrors the normal-completion behavior. Addresses Copilot review feedback on PR #283. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): abort lxc-exec when signal-cleanup install fails Previously, install() called thread_block() to mask SIGHUP/SIGTERM/SIGINT before spawning the watchdog thread. If the spawn failed, the function returned an error but the signals stayed blocked. main() only logged a warning and continued, leaving the process in a state where those termination signals were silently ignored — exactly the failure mode the watchdog exists to handle. Two fixes: 1. signal_cleanup::install() now restores the original mask via thread_unblock() when the watchdog spawn fails, so callers see a clean error rather than a permanently broken signal disposition. It also defers the INSTALLED flag until the watchdog is actually running, so a retry after a transient spawn failure works. 2. lxc-exec's main() now exits with status 1 on install() failure instead of warning and continuing. Containers that leak on Ctrl-C are exactly what this code prevents; we should not run without it. Addresses Copilot review feedback on PR #283. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> * fix(lxc): clean up iptables FORWARD hook in signal-cleanup watchdog On a fatal signal the watchdog destroyed the active container but bypassed the network-policy cleanup that the normal completion path runs. Any iptables MXC-* chain and FORWARD hook the runner installed for that container leaked, breaking later runs that reused the same container name (or just slowly accumulating dead chains). Track the host-side veth interface alongside the container name in the watchdog's active-sandbox slot, register it from LxcScriptRunner::run_internal right after veth discovery, and have the watchdog tear down iptables before destroying the container so the FORWARD hook is removed while the veth name still exists. Adds NetworkIptablesManager::force_cleanup() so the watchdog can trigger the same teardown path without owning the original manager instance. Addresses Copilot review feedback on PR #283. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> --------- Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

caarlos0 closed this May 11, 2026

caarlos0 reopened this May 11, 2026

caarlos0 and others added 2 commits May 11, 2026 18:09

chore: simplify code

eb8eb69

Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

bbonaby reviewed May 11, 2026

View reviewed changes

bbonaby approved these changes May 11, 2026

View reviewed changes

bbonaby merged commit c6edccb into microsoft:main May 11, 2026
7 checks passed

caarlos0 deleted the fix/lxc-unprivileged-lxcpath branch May 11, 2026 23:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(lxc): resolve unprivileged lxcpath instead of hard-coding /var/lib/lxc#275

fix(lxc): resolve unprivileged lxcpath instead of hard-coding /var/lib/lxc#275
bbonaby merged 3 commits into
microsoft:mainfrom
caarlos0:fix/lxc-unprivileged-lxcpath

caarlos0 commented May 11, 2026 •

edited

Loading

Uh oh!

bbonaby left a comment

Uh oh!

bbonaby May 11, 2026

Uh oh!

bbonaby May 11, 2026

Uh oh!

bbonaby May 11, 2026

Uh oh!

Uh oh!

caarlos0 commented May 11, 2026 •

edited

Loading

Uh oh!

bbonaby commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

caarlos0 commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause

Fix

Tests

Microsoft Reviewers: Open in CodeFlow

Uh oh!

bbonaby left a comment

Choose a reason for hiding this comment

Uh oh!

bbonaby May 11, 2026

Choose a reason for hiding this comment

Uh oh!

bbonaby May 11, 2026

Choose a reason for hiding this comment

Uh oh!

bbonaby May 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

caarlos0 commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bbonaby commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

caarlos0 commented May 11, 2026 •

edited

Loading

caarlos0 commented May 11, 2026 •

edited

Loading