$ kubectl ate suspend actor ctr1
Error: failed to suspend actor: rpc error: code = Unknown desc = while calling ateom.CheckpointWorkload: rpc error: code = Unknown desc = while deleting pause container: while running `runsc delete`: exit status 128
$ kubectl ate suspend actor ctr1
Error: failed to suspend actor: rpc error: code = Unknown desc = while calling ateom.CheckpointWorkload: rpc error: code = Unknown desc = while checkpointing pause: while running `runsc checkpoint`: exit status 128
$ kubectl ate get workers
NAMESPACE          POOL      POD                                   STATUS     ASSIGNED ACTOR
ate-demo-counter   counter   counter-deployment-585c7fc5cd-x26b5   FREE       <none>
ate-demo-counter   counter   counter-deployment-585c7fc5cd-rrtqm   FREE       <none>
ate-demo-counter   counter   counter-deployment-585c7fc5cd-rgrb9   FREE       <none>
ate-demo-counter   counter   counter-deployment-585c7fc5cd-zt776   FREE       <none>
ate-demo-counter   counter   counter-deployment-585c7fc5cd-4r9p9   ASSIGNED   ate-demo-counter/counter/ctr1
$ kubectl ate get actors
NAMESPACE          TEMPLATE   ID     STATUS              ATEOM POD                                              ATEOM IP      VERSION
ate-demo-counter   counter    ctr6   STATUS_SUSPENDED    <none>                                                               1
ate-demo-counter   counter    ctr2   STATUS_SUSPENDED    <none>                                                               5
ate-demo-counter   counter    ctr4   STATUS_SUSPENDED    <none>                                                               5
ate-demo-counter   counter    ctr3   STATUS_SUSPENDED    <none>                                                               5
ate-demo-counter   counter    ctr5   STATUS_SUSPENDED    <none>                                                               5
ate-demo-counter   counter    ctr1   STATUS_SUSPENDING   ate-demo-counter/counter-deployment-585c7fc5cd-4r9p9   10.244.0.29   4
$ kubectl ate suspend actor ctr1
Error: failed to suspend actor: rpc error: code = Unknown desc = while calling ateom.CheckpointWorkload: rpc error: code = Unknown desc = while checkpointing pause: while running `runsc checkpoint`: exit status 128
{"time":"2026-05-21T21:46:15.90675658Z","level":"INFO","msg":"Actor checkpointing","labels":{"ate.dev/actor_id":"ctr1","ate.dev/actor_template":"counter","ate.dev/actor_namespace":"ate-demo-counter"}}
{"time":"2026-05-21T21:46:15.906784733Z","level":"INFO","msg":"About to run runsc checkpoint","container":"pause"}
I0521 21:46:15.940484 140 cli.go:271] **************** gVisor ****************
I0521 21:46:15.940520 140 cli.go:272] Version release-20260511.0-42-ga7924c4ef10d-dirty, go1.25.5, amd64, 12 CPUs, linux, PID 140, PPID 1, UID 0, GID 0
I0521 21:46:15.940530 140 cli.go:274] Args: [/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63 -log-format json --alsologtostderr -root /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state checkpoint -image-path /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/checkpoint pause]
I0521 21:46:15.940544 140 config.go:487] Platform: systrap
I0521 21:46:15.940555 140 config.go:488] RootDir: /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state
I0521 21:46:15.940560 140 config.go:489] FileAccess: exclusive / Directfs: true / Overlay: root:self
I0521 21:46:15.940568 140 config.go:490] Network: sandbox
I0521 21:46:15.940574 140 config.go:491] UseCPUNums: false
W0521 21:46:15.940579 140 config.go:496] --allow-suid is disabled, SUID/SGID bits on executables will be ignored.
I0521 21:46:15.940584 140 cli.go:283] **************** gVisor ****************
I0521 21:46:15.951081 140 cli.go:310] Exiting with status: 0
W0521 21:46:15.980427 145 maincli.go:38] Cannot find if container pause exists, checking if sandbox pause is running, err: getting container state (CID: "pause"): connecting to control server at PID 34: connection refused
I0521 21:46:15.980466 145 cli.go:271] **************** gVisor ****************
I0521 21:46:15.980490 145 cli.go:272] Version release-20260511.0-42-ga7924c4ef10d-dirty, go1.25.5, amd64, 12 CPUs, linux, PID 145, PPID 1, UID 0, GID 0
I0521 21:46:15.980499 145 cli.go:274] Args: [/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63 -log-format json --alsologtostderr -root /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state state pause]
I0521 21:46:15.980511 145 config.go:487] Platform: systrap
I0521 21:46:15.980523 145 config.go:488] RootDir: /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state
I0521 21:46:15.980528 145 config.go:489] FileAccess: exclusive / Directfs: true / Overlay: root:self
I0521 21:46:15.980534 145 config.go:490] Network: sandbox
I0521 21:46:15.980540 145 config.go:491] UseCPUNums: false
W0521 21:46:15.980545 145 config.go:496] --allow-suid is disabled, SUID/SGID bits on executables will be ignored.
{
"ociVersion": "1.2.1",
"id": "pause",
"status": "running",
"pid": 34,
"bundle": "/run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/bundles/pause",
"annotations": {
"io.kubernetes.cri.container-name": "pause",
"io.kubernetes.cri.container-type": "sandbox"
}
}
I0521 21:46:15.980550 145 cli.go:283] **************** gVisor ****************
I0521 21:46:15.980576 145 cli.go:310] Exiting with status: 0
W0521 21:46:16.032647 150 maincli.go:38] Cannot find if container counter exists, checking if sandbox pause is running, err: getting container state (CID: "counter"): connecting to control server at PID 34: connection refused
W0521 21:46:16.032650 150 maincli.go:38] Sandbox isn't running anymore, marking container counter as stopped:
I0521 21:46:16.032696 150 cli.go:271] **************** gVisor ****************
I0521 21:46:16.032722 150 cli.go:272] Version release-20260511.0-42-ga7924c4ef10d-dirty, go1.25.5, amd64, 12 CPUs, linux, PID 150, PPID 1, UID 0, GID 0
I0521 21:46:16.032730 150 cli.go:274] Args: [/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63 -log-format json --alsologtostderr -root /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state state counter]
I0521 21:46:16.032742 150 config.go:487] Platform: systrap
I0521 21:46:16.032754 150 config.go:488] RootDir: /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state
I0521 21:46:16.032759 150 config.go:489] FileAccess: exclusive / Directfs: true / Overlay: root:self
I0521 21:46:16.032766 150 config.go:490] Network: sandbox
I0521 21:46:16.032772 150 config.go:491] UseCPUNums: false
W0521 21:46:16.032777 150 config.go:496] --allow-suid is disabled, SUID/SGID bits on executables will be ignored.
I0521 21:46:16.032782 150 cli.go:283] **************** gVisor ****************
I0521 21:46:16.032809 150 cli.go:310] Exiting with status: 0
{
"ociVersion": "1.2.1",
"id": "counter",
"status": "stopped",
"pid": -1,
"bundle": "/run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/bundles/counter",
"annotations": {
"io.kubernetes.cri.container-name": "counter",
"io.kubernetes.cri.container-type": "container",
"io.kubernetes.cri.sandbox-id": "pause"
}
}
W0521 21:46:16.080113 155 maincli.go:38] Cannot find if container counter exists, checking if sandbox pause is running, err: getting container state (CID: "counter"): connecting to control server at PID 34: connection refused
W0521 21:46:16.080116 155 maincli.go:38] Sandbox isn't running anymore, marking container counter as stopped:
I0521 21:46:16.080156 155 cli.go:271] **************** gVisor ****************
I0521 21:46:16.080179 155 cli.go:272] Version release-20260511.0-42-ga7924c4ef10d-dirty, go1.25.5, amd64, 12 CPUs, linux, PID 155, PPID 1, UID 0, GID 0
I0521 21:46:16.080188 155 cli.go:274] Args: [/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63 -log-format json --alsologtostderr -root /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state delete -force counter]
I0521 21:46:16.080199 155 config.go:487] Platform: systrap
I0521 21:46:16.080215 155 config.go:488] RootDir: /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state
I0521 21:46:16.080220 155 config.go:489] FileAccess: exclusive / Directfs: true / Overlay: root:self
I0521 21:46:16.080226 155 config.go:490] Network: sandbox
I0521 21:46:16.080233 155 config.go:491] UseCPUNums: false
W0521 21:46:16.080238 155 config.go:496] --allow-suid is disabled, SUID/SGID bits on executables will be ignored.
I0521 21:46:16.080244 155 cli.go:283] **************** gVisor ****************
W0521 21:46:16.080491 155 container.go:1821] Process (34) not found setting oom_score_adj
I0521 21:46:16.080522 155 cli.go:310] Exiting with status: 0
W0521 21:46:16.143706 160 maincli.go:38] Cannot find if container pause exists, checking if sandbox pause is running, err: getting container state (CID: "pause"): connecting to control server at PID 34: connection refused
W0521 21:46:16.143708 160 maincli.go:38] Sandbox isn't running anymore, marking container pause as stopped:
I0521 21:46:16.143741 160 cli.go:271] **************** gVisor ****************
I0521 21:46:16.143761 160 cli.go:272] Version release-20260511.0-42-ga7924c4ef10d-dirty, go1.25.5, amd64, 12 CPUs, linux, PID 160, PPID 1, UID 0, GID 0
I0521 21:46:16.143771 160 cli.go:274] Args: [/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63 -log-format json --alsologtostderr -root /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state delete -force pause]
I0521 21:46:16.143783 160 config.go:487] Platform: systrap
I0521 21:46:16.143795 160 config.go:488] RootDir: /run/ateom-gvisor/actors/ate-demo-counter:counter:ctr1/runsc-state
I0521 21:46:16.143799 160 config.go:489] FileAccess: exclusive / Directfs: true / Overlay: root:self
I0521 21:46:16.143806 160 config.go:490] Network: sandbox
I0521 21:46:16.143811 160 config.go:491] UseCPUNums: false
W0521 21:46:16.143816 160 config.go:496] --allow-suid is disabled, SUID/SGID bits on executables will be ignored.
I0521 21:46:16.143821 160 cli.go:283] **************** gVisor ****************
W0521 21:46:21.053287 160 container.go:930] stopping container: removing cgroup path "/sys/fs/cgroup/pause": device or resource busy
W0521 21:46:21.053432 160 util.go:107] FATAL ERROR: destroying container: stopping container: removing cgroup path "/sys/fs/cgroup/pause": device or resource busy
destroying container: stopping container: removing cgroup path "/sys/fs/cgroup/pause": device or resource busy
{"time":"2026-05-21T21:46:21.054283909Z","level":"INFO","msg":"Handle RPC","method":"/ateom.Ateom/CheckpointWorkload","req":{"actor_template_namespace":"ate-demo-counter","actor_template_name":"counter","actor_id":"ctr1","runsc_path":"/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63","spec":{"containers":[{"name":"counter"}]}},"resp":null,"err":"while deleting pause container: while running `runsc delete`: exit status 128","elapsed-time":"5.14752792s"}
{"time":"2026-05-21T21:46:30.220620132Z","level":"INFO","msg":"Actor checkpointing","labels":{"ate.dev/actor_id":"ctr1","ate.dev/actor_template":"counter","ate.dev/actor_namespace":"ate-demo-counter"}}
{"time":"2026-05-21T21:46:30.220643698Z","level":"INFO","msg":"About to run runsc checkpoint","container":"pause"}
FetchSpec failed: loading container: file does not exist
{"time":"2026-05-21T21:46:30.272075347Z","level":"INFO","msg":"Handle RPC","method":"/ateom.Ateom/CheckpointWorkload","req":{"actor_template_namespace":"ate-demo-counter","actor_template_name":"counter","actor_id":"ctr1","runsc_path":"/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63","spec":{"containers":[{"name":"counter"}]}},"resp":null,"err":"while checkpointing pause: while running `runsc checkpoint`: exit status 128","elapsed-time":"51.455806ms"}
{"time":"2026-05-21T21:47:13.504365359Z","level":"INFO","msg":"Actor checkpointing","labels":{"ate.dev/actor_id":"ctr1","ate.dev/actor_template":"counter","ate.dev/actor_namespace":"ate-demo-counter"}}
{"time":"2026-05-21T21:47:13.50439249Z","level":"INFO","msg":"About to run runsc checkpoint","container":"pause"}
FetchSpec failed: loading container: file does not exist
{"time":"2026-05-21T21:47:13.552016869Z","level":"INFO","msg":"Handle RPC","method":"/ateom.Ateom/CheckpointWorkload","req":{"actor_template_namespace":"ate-demo-counter","actor_template_name":"counter","actor_id":"ctr1","runsc_path":"/run/ateom-gvisor/static-files/runsc-a397be1abc2420d26bce6c70e6e2ff96c73aaaab929756c56f5e2089ea842b63","spec":{"containers":[{"name":"counter"}]}},"resp":null,"err":"while checkpointing pause: while running `runsc checkpoint`: exit status 128","elapsed-time":"47.651721ms"}
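For triage, the failed CheckpointWorkload attempts can be pulled out of the mixed ateom/runsc output with a small filter. This is only a sketch: the function name `summarize_rpc_errors` and the file name `worker.log` are hypothetical, and it assumes the ateom records are single-line JSON objects with the `msg`, `method`, `err`, `time` and `elapsed-time` fields seen in the log above.

```python
import json


def summarize_rpc_errors(lines):
    """Collect (time, method, err, elapsed-time) for failed "Handle RPC" records.

    Assumes ateom writes one JSON object per line; runsc klog lines and the
    multi-line `runsc state` JSON output are skipped.
    """
    failures = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("{"):
            continue  # klog-style runsc lines, plain-text errors, closing braces
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. the lone "{" that opens a multi-line JSON document
        if not isinstance(record, dict):
            continue
        if record.get("msg") == "Handle RPC" and record.get("err"):
            failures.append(
                (
                    record.get("time"),
                    record.get("method"),
                    record.get("err"),
                    record.get("elapsed-time"),
                )
            )
    return failures
```

Calling it as `summarize_rpc_errors(open("worker.log"))` on the log above should list the three attempts side by side: the first spends about 5 s and fails in `runsc delete`, the later ones fail within about 50 ms in `runsc checkpoint`.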
Expected Behavior
`kubectl ate suspend actor ctr1` succeeds and ctr1 ends up in STATUS_SUSPENDED, like the other actors.
Actual Behavior
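ctr1 is stuck in STATUS_SUSPENDING and its worker stays ASSIGNED. The first attempt fails in `runsc delete -force pause` (removing cgroup path "/sys/fs/cgroup/pause": device or resource busy). Every retry then fails in `runsc checkpoint` with "FetchSpec failed: loading container: file does not exist".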
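Steps to Reproduce the Problem
1. With actor ctr1 assigned to a worker, run `kubectl ate suspend actor ctr1`.
2. Run the same command again; each retry fails (see the transcript at the top of this report).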
Version: 18c86
Worker Logs: