
criu: show excerpt from log file on c/r error #2030

Merged: giuseppe merged 2 commits into containers:main from kolyshkin:debug-criu on Mar 5, 2026
@kolyshkin (Collaborator) commented Feb 28, 2026

  1. criu: simplify criu_check_mem_track error message.

    In case criu_check_mem_track failed, or returned that kernel memory
    tracking is not supported, don't refer to CRIU log file.

    Kernel memory tracking is supported since kernel v3.11 (and further
    improved in v3.18) and so we can assume it's supported. If not, a
    simple message should be sufficient, and CRIU log probably does
    not contain any further details.

  2. criu: show excerpt from log file on c/r error

    This is a brute way to see some CRIU logs in CI when C/R fails.

    The output looks like this:

    $ sudo /home/kir//git/crun/crun  checkpoint 123
    2026-03-04T22:28:17.146850Z: --- excerpt from CRIU log `/home/kir/git/runc-tst/checkpoint/dump.log`
    2026-03-04T22:28:17.147043Z: (00.138480) Error (criu/files-reg.c:1790): Can't lookup mount=40 for fd=0 path=/dev/pts/5 (deleted)
    2026-03-04T22:28:17.147054Z: (00.138490) Error (criu/cr-dump.c:1698): Dump files (pid: 353704) failed with -1
    2026-03-04T22:28:17.147061Z: (00.211227) Error (criu/cr-dump.c:2128): Dumping FAILED.
    2026-03-04T22:28:17.147074Z: --- end of excerpt
    2026-03-04T22:28:17.147222Z: CRIU checkpointing failed: -52: Invalid exchange

@packit-as-a-service

Ephemeral COPR build failed. @containers/packit-build please check.

@kolyshkin kolyshkin force-pushed the debug-criu branch 2 times, most recently from 119f93e to 7b81b77 Compare March 1, 2026 21:17
@kolyshkin kolyshkin changed the title [debug] show dump.log on criu_dump error [debug/DNM] show dump.log on criu_dump error Mar 1, 2026
@kolyshkin kolyshkin force-pushed the debug-criu branch 2 times, most recently from 308cde1 to 22993af Compare March 2, 2026 20:14
@kolyshkin kolyshkin force-pushed the debug-criu branch 2 times, most recently from f3dce20 to d7c4713 Compare March 3, 2026 18:20
@packit-as-a-service

TMT tests failed. @containers/packit-build please check.

@kolyshkin
Collaborator Author

Ah, finally got it!

# [18:34:40.955891847] # podman  container checkpoint f7083b2ea1869dcbb97239b0c9fb4d02d270fddec13701c04a30a8305390c43c
# [18:34:41.043687221] 144-(00.026346) Add mnt ns 13 pid 32790
# 145-(00.026364) Will take cgroup namespace in the image
# 146-(00.026366) Add cgroup ns 14 pid 32790
# 147-(00.026439) net: Lock network
# 148-(00.026442) Running network-lock scripts
# 149:Error (criu/util.c:627): execvp("iptables-restore", ...) failed: No such file or directory
# 150:(00.027125) Error (criu/util.c:642): exited, status=1
# 151:Error (criu/util.c:627): execvp("ip6tables-restore", ...) failed: No such file or directory
# 152:(00.027775) Error (criu/util.c:642): exited, status=1
# 153:(00.027788) Error (criu/net.c:3124): net: Locking network failed: iptables-restore returned -1. This may be connected to disabled CONFIG_NETFILTER_XT_MARK kernel build config option.
# 154-(00.027806) net: Unlock network
# 155-(00.027809) Running network-unlock scripts
# 156:Error (criu/util.c:627): execvp("iptables-restore", ...) failed: No such file or directory
# 157:(00.028459) Error (criu/util.c:642): exited, status=1
# 158:Error (criu/util.c:627): execvp("ip6tables-restore", ...) failed: No such file or directory
# 159:(00.030103) Error (criu/util.c:642): exited, status=1
# 160-(00.030126) Unfreezing tasks into 1
# 161-(00.030130) 	Unseizing 32790 into 1
# 162-(00.030141) 	Unseizing 32830 into 1
# 163:(00.030158) Error (criu/cr-dump.c:2098): Dumping FAILED.
# CRIU checkpointing failed -52.  Please check CRIU logfile /var/lib/containers/storage/overlay-containers/f7083b2ea1869dcbb97239b0c9fb4d02d270fddec13701c04a30a8305390c43c/userdata/dump.log: Invalid exchange
# Error: `/usr/bin/crun checkpoint --image-path /var/lib/containers/storage/overlay-containers/f7083b2ea1869dcbb97239b0c9fb4d02d270fddec13701c04a30a8305390c43c/userdata/checkpoint --work-path /var/lib/containers/storage/overlay-containers/f7083b2ea1869dcbb97239b0c9fb4d02d270fddec13701c04a30a8305390c43c/userdata f7083b2ea1869dcbb97239b0c9fb4d02d270fddec13701c04a30a8305390c43c` failed: exit status 1
# [18:34:41.046892601] [ rc=125 (** EXPECTED 0 **) ]

@kolyshkin kolyshkin force-pushed the debug-criu branch 3 times, most recently from 474d829 to 9c8fab4 Compare March 4, 2026 01:17
@kolyshkin kolyshkin changed the title [debug/DNM] show dump.log on criu_dump error criu: show excerpt from log file on c/r error Mar 4, 2026
@kolyshkin kolyshkin marked this pull request as ready for review March 4, 2026 01:26
      return;
    if (pid == 0)
      {
        char *const args[] = { "grep", "-n", "-B5", "Error", log_path, NULL };
Member

How much information is in the -B5 lines? Without this flag, we could fairly easily rewrite this as a combination of fopen/fgets/has_prefix. The -B5 makes it a bit more difficult, because the ring buffer implementation we have accounts for bytes, not lines.
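
For illustration only, here is a minimal sketch of what a line-oriented ring buffer for `-B5`-style context could look like. Everything in it (`struct line_ring`, `ring_push`, `ring_flush`, `print_errors_with_context`, and the fixed `MAX_LINE` limit) is a hypothetical name chosen for this example and is not taken from crun; unlike `grep -n -B5`, it prints neither line numbers nor `--` group separators.

```c
#include <stdio.h>
#include <string.h>

#define CONTEXT_LINES 5   /* same amount of context as grep -B5 */
#define MAX_LINE 1024     /* assumption: longer lines are handled in chunks */

/* Hypothetical ring that remembers the last CONTEXT_LINES lines. */
struct line_ring
{
  char lines[CONTEXT_LINES][MAX_LINE];
  size_t next;  /* slot the next line is written to */
  size_t count; /* number of valid lines, at most CONTEXT_LINES */
};

static void
ring_push (struct line_ring *r, const char *line)
{
  snprintf (r->lines[r->next], MAX_LINE, "%s", line);
  r->next = (r->next + 1) % CONTEXT_LINES;
  if (r->count < CONTEXT_LINES)
    r->count++;
}

/* Write the buffered lines oldest-first, then empty the ring. */
static void
ring_flush (struct line_ring *r, FILE *out)
{
  size_t start = (r->next + CONTEXT_LINES - r->count) % CONTEXT_LINES;
  for (size_t i = 0; i < r->count; i++)
    fputs (r->lines[(start + i) % CONTEXT_LINES], out);
  r->count = 0;
}

/* Print every line containing "Error", preceded by up to CONTEXT_LINES
   earlier lines.  Returns the number of matching lines.  */
static int
print_errors_with_context (FILE *in, FILE *out)
{
  struct line_ring ring;
  char line[MAX_LINE];
  int errors = 0;

  memset (&ring, 0, sizeof ring);
  while (fgets (line, sizeof line, in) != NULL)
    {
      if (strstr (line, "Error") != NULL)
        {
          ring_flush (&ring, out);
          fputs (line, out);
          errors++;
        }
      else
        ring_push (&ring, line);
    }
  return errors;
}
```

The sketch trades memory for simplicity: it keeps a fixed array of line slots instead of reusing a byte-oriented buffer, which is the extra machinery the comment above refers to.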

Collaborator Author

Not all CRIU errors are self-explanatory, so -B5 adds some context about where the error happened. Still, I guess just printing the errors is better than referring to the log file, so let me work on it.

has_prefix won't work, though, as CRIU log lines usually start with a timestamp.
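
To make the timestamp point concrete, here is a small sketch of the fopen/fgets-style approach using a substring match instead of a prefix match. It is not the code that was merged; `starts_with` and `print_error_lines` are hypothetical helpers written for this example, and the 4096-byte line buffer is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a prefix check. */
static int
starts_with (const char *s, const char *prefix)
{
  return strncmp (s, prefix, strlen (prefix)) == 0;
}

/* Copy every line of `in` that contains "Error" anywhere to `out`.
   Returns the number of lines copied.  Lines longer than the buffer are
   read in chunks, so a match split across two chunks would be missed.  */
static int
print_error_lines (FILE *in, FILE *out)
{
  char line[4096];
  int matched = 0;

  while (fgets (line, sizeof line, in) != NULL)
    {
      if (strstr (line, "Error") == NULL)
        continue;
      fputs (line, out);
      matched++;
    }
  return matched;
}
```

A line such as `(00.138490) Error (criu/cr-dump.c:1698): ...` fails `starts_with (line, "Error")` because of the leading timestamp, while the `strstr` check still finds it, which is why the sketch searches the whole line.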

Collaborator Author

Done

In case criu_check_mem_track failed, or returned that kernel memory
tracking is not supported, don't refer to CRIU log file.

Kernel memory tracking is supported since kernel v3.11 (and further
improved in v3.18) and so we can assume it's supported. If not, a
simple message should be sufficient, and CRIU log _probably_ does
not contain any further details.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a brute way to see some CRIU logs in CI when C/R fails.

The output looks like this:

$ sudo /home/kir//git/crun/crun  checkpoint 123
2026-03-04T22:28:17.146850Z: --- excerpt from CRIU log `/home/kir/git/runc-tst/checkpoint/dump.log`
2026-03-04T22:28:17.147043Z: (00.138480) Error (criu/files-reg.c:1790): Can't lookup mount=40 for fd=0 path=/dev/pts/5 (deleted)
2026-03-04T22:28:17.147054Z: (00.138490) Error (criu/cr-dump.c:1698): Dump files (pid: 353704) failed with -1
2026-03-04T22:28:17.147061Z: (00.211227) Error (criu/cr-dump.c:2128): Dumping FAILED.
2026-03-04T22:28:17.147074Z: --- end of excerpt
2026-03-04T22:28:17.147222Z: CRIU checkpointing failed: -52: Invalid exchange

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Member

@giuseppe giuseppe left a comment

LGTM

@giuseppe giuseppe merged commit 66f1405 into containers:main Mar 5, 2026
48 checks passed