Checkpointing of Wasm container with podman+crun fails : Can't lookup mount #2170

Closed
mh4ck-Thales opened this issue May 5, 2023 · 4 comments

Comments

@mh4ck-Thales

Description

When trying to checkpoint a wasm container started with podman + crun with wasmedge support, the checkpointing fails with an error like:

Error (criu/files-reg.c:1710): Can't lookup mount=476 for fd=-3 path=/
Error (criu/cr-dump.c:1524): Collect mappings (pid: 5571) failed with -1

This happens on both Fedora 38 (btrfs) and Debian 11 (ext4), both up to date. On both OSes the error at the end of the dump.log file is the same, except for the mount number and pid. The wasm container I tried to checkpoint simply prints prime numbers to stdout.
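To narrow down an error like this, it can help to check whether the mount ID that CRIU complains about appears in the mountinfo of the process being dumped. The following is a minimal sketch, not part of the original report: the helper names `mount_id_from_error` and `find_mount` are made up for illustration, and it relies only on the documented layout of /proc/<pid>/mountinfo, where the first field is the mount ID.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of CRIU or crun).

# Extract the mount ID from a CRIU "Can't lookup mount=N" error line.
mount_id_from_error() {
    printf '%s\n' "$1" | sed -n 's/.*mount=\([0-9][0-9]*\).*/\1/p'
}

# Print the mountinfo entry whose first field (the mount ID) matches.
# $1: mount ID, $2: path to a mountinfo file such as /proc/<pid>/mountinfo
find_mount() {
    awk -v id="$1" '$1 == id { print }' "$2"
}

# Example usage with the values from the log above (the pid and mount ID
# change on every run, so substitute the ones from your own dump.log):
#   id=$(mount_id_from_error "Error (criu/files-reg.c:1710): Can't lookup mount=476 for fd=-3 path=/")
#   find_mount "$id" /proc/5571/mountinfo
```

If `find_mount` prints nothing, the ID is not listed in that process's mountinfo, which may be a useful detail to include when reporting the problem.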

Steps to reproduce the issue:

  1. Create a wasm app. The easiest way is to create a Rust app with a simple infinite loop and compile it for wasm:
cargo new app && cd app
rustup target add wasm32-wasi
echo 'fn main() { loop { println!("Hello Wasm");}}' > src/main.rs
cargo build --target wasm32-wasi
  2. Create the wasm container from this Containerfile:
FROM scratch
COPY target/wasm32-wasi/debug/app.wasm /app.wasm
CMD ["/app.wasm"]

And build with

podman build -t demo-wasm --platform wasi/wasm .
  3. Start this container in the background:
podman run --platform wasi/wasm --name demo-wasm-1 -d localhost/demo-wasm

You can check that it is running with podman logs demo-wasm-1. You should see a lot of "Hello Wasm" printed.

  4. Try to checkpoint this container with
podman container checkpoint demo-wasm-1

And notice that it fails.

Describe the results you received:
The checkpointing of the container fails

Describe the results you expected:
The checkpointing succeeds

Additional information you deem important (e.g. issue happens only occasionally):

The issue happens with even the simplest Wasm container. I was able to checkpoint and restore normal containers (debian and others) on the same machine without any issue.

CRIU logs and information:

Output of the `podman container checkpoint` command:

2023-05-05T14:21:43.243762Z: CRIU checkpointing failed -52.  Please check CRIU logfile /var/lib/containers/storage/overlay-containers/ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e/userdata/dump.log
Error: `/usr/bin/crun-wasm checkpoint --image-path /var/lib/containers/storage/overlay-containers/ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e/userdata/checkpoint --work-path /var/lib/containers/storage/overlay-containers/ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e/userdata ec5ef8e9db19f3840bfc9357687935de4f7610448552a2be9ab611f2cbd3742e` failed: exit status 1
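The failure message points at a per-container dump.log. As a rough sketch (the function names `criu_log_path` and `criu_errors` are invented here for illustration), the path can be pulled out of the message and the log reduced to its error lines:

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative only).

# Extract the log path from a "Please check CRIU logfile <path>" message.
criu_log_path() {
    printf '%s\n' "$1" | sed -n 's/.*Please check CRIU logfile \([^ ]*\).*/\1/p'
}

# Print only the lines of a CRIU log that contain "Error (", with line numbers.
# No anchoring at line start, since log lines may carry a timestamp prefix.
criu_errors() {
    grep -n 'Error (' "$1"
}

# Example usage (reading the log usually requires root for rootful containers):
#   log=$(criu_log_path "$(podman container checkpoint demo-wasm-1 2>&1)")
#   criu_errors "$log"
```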

The dump.log file is attached:

dump.log

Output of `criu --version`:

Version: 3.17.1

Output of `criu check --all`:

Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.

Additional environment details:

Tried on both Fedora 38 (btrfs) and Debian 11 (ext4) in VMs. CRIU was installed from the respective package managers. Outputs are from the Fedora machine.

@adrianreber
Member

Thanks for the report. As non-wasm apps are working, this seems to be a crun problem.

From a quick look, it seems the mounts are configured in a way that CRIU cannot handle.

On the CRIU side we cannot do much for now. This needs to be fixed on the crun side. It will probably also be assigned to me 😂

@mh4ck-Thales
Author

I don't know if it is crun-related or if it depends on the wasm runtime used (crun can be built with different wasm runtimes, like wasmedge, wasmtime or wasmer). Fedora's crun is shipped with wasmedge support, and I built crun on Debian with wasmedge too. I'll check on my side if changing the wasm runtime changes anything, and I'll open a crun issue about this.
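When comparing runtimes, it helps to confirm which wasm handler a given crun binary was built with. The sketch below assumes that the version output lists build features as space-separated tokens and marks the wasm handler with a token of the form `+WASM:<name>`; that format is an assumption to verify against your own crun build, and `wasm_handler` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the handler name from a "+WASM:<name>" token.
# Assumes features are space-separated tokens in the given text.
wasm_handler() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^+WASM:\(.*\)$/\1/p'
}

# Example usage with the binary named in the error message above:
#   wasm_handler "$(/usr/bin/crun-wasm --version)"
```

An empty result would suggest either a build without wasm support or a version output in a different format than assumed here.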

@adrianreber
Member

@Snorch can you take a look at the logs to see if something is wrong with the way the mounts are configured? That is my first assumption, but I am not sure if that is the reason.

@adrianreber
Member

Closing in favour of containers/crun#1204
