Skip to content

fix(lxc): resolve unprivileged lxcpath instead of hard-coding /var/lib/lxc#275

Merged
bbonaby merged 3 commits into
microsoft:mainfrom
caarlos0:fix/lxc-unprivileged-lxcpath
May 11, 2026
Merged

fix(lxc): resolve unprivileged lxcpath instead of hard-coding /var/lib/lxc#275
bbonaby merged 3 commits into
microsoft:mainfrom
caarlos0:fix/lxc-unprivileged-lxcpath

Conversation

@caarlos0

@caarlos0 caarlos0 commented May 11, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes the LXC unprivileged-mode failure reported in #274:

Error: Failed to configure filesystem: Failed to set config item lxc.mount.entry: No such file or directory (os error 2)

Root cause

LxcContainer::config_file_path() hard-coded /var/lib/lxc, but every lxc-* shell-out (lxc-create, lxc-info, lxc-start, ...) was invoked without -P. As a non-root user, liblxc auto-routes the container to ~/.local/share/lxc/<name>/config (per XDG_DATA_HOME / HOME); our wrapper then tried to append lxc.mount.entry to /var/lib/lxc/<name>/config, which doesn't exist → ENOENT.

The same hard-coded path bug also affected the denied-paths rootfs probe in filesystem_mounts.rs, where is_file() would always return false for unprivileged users and silently mask files with tmpfs instead of /dev/null.

Fix

  • New resolve_default_lxcpath() helper that mimics liblxc's resolution: LXC_PATH/var/lib/lxc for root → $XDG_DATA_HOME/lxc$HOME/.local/share/lxc/var/lib/lxc last-resort fallback. Extracted into a parameterized resolve_lxcpath_with_env(get_env, geteuid) so every branch is covered by deterministic unit tests.
  • LxcContainer now stores the resolved lxc_path (always populated, no Option) and passes -P <resolved> to every lxc-* invocation, so behavior is identical regardless of how the binary is launched (cron, systemd unit with non-default HOME, etc.).
  • filesystem_mounts.rs denied-path rootfs probe now uses the resolved container.lxc_path() instead of the hard-coded fallback.
  • set_config_item's error now includes the key, value, and target config-file path, addressing the diagnostic gap also called out in #274.

Tests

Added 12 unit tests in lxc_common::lxc_bindings::tests covering:

  • All resolve_lxcpath_with_env branches (LXC_PATH precedence even as root, root default, XDG_DATA_HOME wins over HOME, HOME fallback, trailing-slash normalization, empty-env handling, last-resort fallback when neither env var is set).
  • LxcContainer::new(..., None) always yields a non-empty resolved path.
  • LxcContainer::new(..., Some(...)) honors explicit overrides and config_file_path() derives from the resolved path.
  • set_config_item error message contains key, value, and the container config path.

cargo test -p lxc_common: 25 passed (12 new). cargo build -p lxc and cargo clippy -p lxc_common --no-deps -- -D warnings clean.

Closes #274
Refs https://github.com/github/copilot-agent-runtime/issues/164

Microsoft Reviewers: Open in CodeFlow

…b/lxc

When run as a non-root user, liblxc stores containers under
$XDG_DATA_HOME/lxc (default ~/.local/share/lxc) rather than the
system-wide /var/lib/lxc. Our LxcContainer wrapper hard-coded
/var/lib/lxc when computing the config file path, while shelling out to
the lxc-* CLIs without -P at all — so liblxc auto-routed container
creation to the user path, then set_config_item tried to append to
/var/lib/lxc/<name>/config and failed with ENOENT.

Resolve the lxcpath up front using the same precedence liblxc applies
(LXC_PATH > /var/lib/lxc for root > $XDG_DATA_HOME/lxc >
$HOME/.local/share/lxc), store it on LxcContainer, and pass -P
<resolved> to every lxc-* invocation so behavior is identical regardless
of how the binary is launched. Also fix the same hard-coded path bug in
filesystem_mounts.rs's denied-path rootfs probe.

Improve set_config_item's error message to include the key, value, and
config-file path so diagnostics like the one in #274 immediately point
at the failing entry.

Fixes #274

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
@caarlos0 caarlos0 closed this May 11, 2026
@caarlos0 caarlos0 reopened this May 11, 2026
caarlos0 and others added 2 commits May 11, 2026 18:09
The previous commit unconditionally called libc::geteuid in
resolve_default_lxcpath, which broke the workspace build on Windows
targets (libc has no geteuid on non-Unix platforms). Wrap the call in
cfg(unix) and provide a non-root fallback for non-Unix targets — the
function is never invoked there in production (lxc-exec is Linux-only)
but the crate must still compile workspace-wide.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

@bbonaby bbonaby left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

// `geteuid` only exists on Unix; on other targets the function is never
// invoked in production (lxc-exec is Linux-only) but the crate still
// needs to compile workspace-wide, so fall back to a non-root EUID.
#[cfg(unix)]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: could use #[cfg(target_os = "linux")] for linux only

fn current_euid() -> u32 {
unsafe { libc::geteuid() as u32 }
}
#[cfg(not(unix))]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: could use #[cfg(not(target_os = "linux"))]

/// Build a `Command` for an `lxc-*` tool with `-P <lxc_path> -n <name>`
/// already populated. Centralizes the argv prefix so we can't accidentally
/// drop `-P` again (see #274).
fn lxc_command(&self, tool: &str) -> std::process::Command {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: good call

@bbonaby bbonaby merged commit c6edccb into microsoft:main May 11, 2026
7 checks passed
@caarlos0

caarlos0 commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

Thanks @bbonaby - do you have plans for next release?

@caarlos0 caarlos0 deleted the fix/lxc-unprivileged-lxcpath branch May 11, 2026 23:17
@bbonaby

bbonaby commented May 11, 2026

Copy link
Copy Markdown
Collaborator

Hey @caarlos0 so we'll likely publish our new 0.2.0 npm package either tomorrow or Wednesday. Will keep you posted.

bbonaby pushed a commit that referenced this pull request May 14, 2026
* fix(lxc): inherit stdio in lxc-attach for interactive shells

Wire lxc-attach's stdin/stdout/stderr to the parent process so the inner
process shares the pty driving lxc-exec. Without this, stdin is closed
and interactive shells exit immediately on EOF.

Mirrors the AppContainer runner on Windows, which shares the parent's
ConPTY with the sandboxed child.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): skip redundant lxc-stop in destroy()

`lxc-destroy -f` already force-stops a running container, but `destroy()`
also called `lxc-stop` first. Plain `lxc-stop` waits up to 60 seconds for
a graceful shutdown, which is fatal for distros that run systemd as PID 1
in unprivileged userns: init never cleanly responds to SIGPWR, so the wait
times out before the force-stop step.

Removing the prior stop drops cleanup from ~60s to ~0.2s on ubuntu and is
a no-op on alpine (no init to wait on).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): bridge stdio via inner pty with detached ctty

The inherit-stdio approach from the previous commit got `lxc-attach` running
but produced silent shells: bash booted, printed its prompt, and then ignored
everything the host wrote. Two compounding problems:

1. **Pre-buffered input was discarded.** Bash's readline init calls
   `tcsetattr` on its stdin, which can flush bytes the parent buffered into
   the pty before bash got there. Anything the CLI pre-sent (e.g. a shell
   wrapper writing `PS1=""; PS2=""; unset HISTFILE`) was lost.
2. **`lxc-attach` bypassed our slave fds.** When it inherits a controlling
   terminal from the parent, it forwards the inner container pty's I/O
   straight to `/dev/tty` instead of using the stdio slots we set. Bash's
   prompt reached our screen but the master fd we kept stayed empty, so
   there was no way to forward host stdin into bash either.

Fix:

- Allocate an inner pty pair via `nix::pty::openpty`. Give the slave to
  `lxc-attach` (and thus bash) for stdin/stdout/stderr; keep the master.
- In `pre_exec`: `setsid()` to drop the inherited controlling terminal,
  then `ioctl(TIOCSCTTY)` on fd 0 so `lxc-attach` adopts our slave as its
  new ctty. This is what makes the slave fds actually carry the data.
- Output thread copies master to host stdout and signals "ready" on the
  first byte from inside the container, which marks bash as past its
  readline init and safe to feed.
- Input thread waits for "ready" (capped at 5s so a wedged shell doesn't
  block forever, since the higher-level command timeout takes over) before
  forwarding host stdin to the master.

Adds the `term` feature to the workspace `nix` dependency so `pty::openpty`
is available.

Validated end-to-end on Ubuntu Resolute (arm64, OrbStack) running the CLI's
sandboxed shell tool; bash receives wrapper text, runs commands, and exits
cleanly.

Supersedes the inherit-stdio approach in a3f2a66.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): gate Unix-only attach_run for Windows clippy

The new pty-based attach_run uses pre_exec, openpty, TIOCSCTTY, and
nix::unistd::setsid — none of which exist on Windows. Workspace clippy
runs on windows-latest and was failing to compile lxc_common.

Gate the Unix implementation behind cfg(unix) and provide a stub for
non-Unix targets so the crate keeps compiling workspace-wide. lxc-exec
itself is still Linux-only at runtime; the stub just lets the workspace
build on Windows.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): use cfg(target_os = "linux") instead of cfg(unix)

The codebase consistently gates platform-specific code with explicit
target_os checks (target_os = "windows" / "linux" / "macos") rather
than the broader unix/not(unix) family. cfg(unix) also covers macOS,
where lxc isn't available either, so the explicit Linux gate is more
accurate. Convert the existing current_euid gate (added in #275) and
the new attach_run gate to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): destroy active container on SIGHUP/SIGTERM/SIGINT

`LxcScriptRunner::run_internal` only calls `container.destroy()` after
`attach_run` returns. When the runner is terminated by a fatal signal
(its parent exited or sent a SIGHUP/SIGTERM/SIGINT), the in-flight
`attach_run` is interrupted and the destroy block is never reached,
leaving the container running indefinitely — `lxc-ls` lists it long
after the CLI session that spawned it is gone.

Install a `signal_cleanup` watchdog in `lxc-exec`'s `main` that:

  - Blocks SIGHUP/SIGTERM/SIGINT in the calling thread before any other
    thread is spawned, so all threads inherit the blocked mask.
  - Spawns a dedicated thread that synchronously `sigwait`s for any of
    those signals, destroys the currently active container, and exits
    with `128 + signo` so the parent observes a normal signal-style
    exit.

`LxcScriptRunner::run_internal` registers the active container name with
the watchdog right after resolving it, so the cleanup window covers
create/start/attach as well as the normal post-attach destroy path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): gate signal_cleanup install on Linux only

The new signal_cleanup module pulls in nix::sys::signal::{SigSet,
Signal} and nix::Result, which are POSIX-only and don't compile on
Windows. The wxc-exec-lint CI job runs cargo clippy --workspace on
windows-latest, so lxc_common has to compile workspace-wide.

Gate the nix imports, the watchdog thread, and the install() body
behind cfg(target_os = "linux"); provide a no-op install() stub for
non-Linux that returns Ok(()). Signal-driven cleanup only matters at
runtime where lxc-exec actually runs, which is Linux. set_active stays
cross-platform — it's a plain Mutex<Option<String>> with no syscall
dependencies — so callers don't need their own cfg gates.

Also switches the install() return type from nix::Result<()> to
Result<(), String> so the signature exists on every target. The single
caller in lxc/src/main.rs already formats the error with Display.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): cargo fmt attach_run

Pre-existing formatting drift from commit 49fbbb9 (the inner-pty
attach_run rewrite) — surfaced now that clippy passes and the lint
workflow reaches its 'Fail if formatting check failed' step.

No behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix: review

Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): gate INSTALLED OnceLock to Linux

The INSTALLED guard added in the review commit was placed at module
scope without a cfg gate, but its only reader — install() — is
cfg(target_os = "linux"). On Windows clippy that makes INSTALLED
dead code:

    error: static `INSTALLED` is never used
      --> lxc_common\src\signal_cleanup.rs:29:8

Gate the static behind cfg(target_os = "linux") to match install().
ACTIVE_CONTAINER stays cross-platform — set_active is callable on every
target.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix: merge conflict

Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix: drop cmd

Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): unblock signals in attached child shell

The signal mask installed by signal_cleanup::install() blocks SIGHUP,
SIGTERM, and SIGINT in the runner so the watchdog thread can sigwait
on them. That mask is inherited across fork+exec, so without an explicit
reset the inner shell launched by lxc-attach inherits the blocked set
and silently ignores Ctrl-C and termination signals.

Restore the default mask in the existing pre_exec closure so the inner
process behaves like any other interactive shell.

Also tighten the comment on the 5-second readiness gate: it only caps
how long the stdin forwarder waits before kicking in, it is not a
substitute for the (still unimplemented) script_timeout. The original
wording overstated the protection it provides.

Addresses Copilot review feedback on PR #283.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): only register containers for signal cleanup when destroyOnExit is true

LxcScriptRunner unconditionally called signal_cleanup::set_active so
the watchdog thread would lxc-destroy the container on a fatal signal.
That is the right behavior for the default case, but with
`lifecycle.destroyOnExit = false` the normal completion path
deliberately preserves the container. A SIGTERM during such a run would
silently delete a container the caller asked to keep.

Gate the registration on self.destroy_on_exit so the signal path mirrors
the normal-completion behavior.

Addresses Copilot review feedback on PR #283.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): abort lxc-exec when signal-cleanup install fails

Previously, install() called thread_block() to mask SIGHUP/SIGTERM/SIGINT
before spawning the watchdog thread. If the spawn failed, the function
returned an error but the signals stayed blocked. main() only logged a
warning and continued, leaving the process in a state where those
termination signals were silently ignored — exactly the failure mode
the watchdog exists to handle.

Two fixes:

1. signal_cleanup::install() now restores the original mask via
   thread_unblock() when the watchdog spawn fails, so callers see a
   clean error rather than a permanently broken signal disposition.
   It also defers the INSTALLED flag until the watchdog is actually
   running, so a retry after a transient spawn failure works.

2. lxc-exec's main() now exits with status 1 on install() failure
   instead of warning and continuing. Containers that leak on Ctrl-C
   are exactly what this code prevents; we should not run without it.

Addresses Copilot review feedback on PR #283.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

* fix(lxc): clean up iptables FORWARD hook in signal-cleanup watchdog

On a fatal signal the watchdog destroyed the active container but
bypassed the network-policy cleanup that the normal completion path
runs. Any iptables MXC-* chain and FORWARD hook the runner installed
for that container leaked, breaking later runs that reused the same
container name (or just slowly accumulating dead chains).

Track the host-side veth interface alongside the container name in
the watchdog's active-sandbox slot, register it from
LxcScriptRunner::run_internal right after veth discovery, and have
the watchdog tear down iptables before destroying the container so
the FORWARD hook is removed while the veth name still exists.

Adds NetworkIptablesManager::force_cleanup() so the watchdog can
trigger the same teardown path without owning the original manager
instance.

Addresses Copilot review feedback on PR #283.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>

---------

Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants