Socket file missing after successful checkpoint and restore step in postgres image #1914
Comments
@Snorch could you take a look at why criu decided that it is a ghost socket?
I checked it in a Virtuozzo container, and I guess it can be due to overmounted-files support, which we haven't ported on top of mount-v2 yet. (JFYI: in mount-v1, overmounted files are restored as link-remap with rmdir on path.) I will continue investigating.
@Snorch could you write instructions on how to dump/restore with the mount-v1 engine? Maybe it will work with that.
The reproducer uses criu 3.14, so mount-v2 is definitely not involved. https://github.com/shashank-sharma/psql-testing/blob/master/Dockerfile#L11
JFYI: Found this problem in mount-v2 when trying to reproduce with a newer criu:
@shashank-sharma I've updated your reproducer to use the latest criu/docker and it works smoothly now: https://github.com/Snorch/psql-testing/tree/updated-reproduce
So I believe the problem is already solved. As you can see, there are quite a lot of unix-socket changes in criu between v3.14, which you use, and the latest criu-dev branch:
Please try the latest version; I hope it works for you too.
@Snorch Thanks for the quick update. I just tested it (3.17) and you are right, it works fine with the latest version. Closing this issue as resolved.
…arenting

find_new_reaper() assumes that "has_child_subreaper" logic is safe as long as we are not the exiting ->child_reaper and this is doubly wrong:

1. In fact it is safe if "pid_ns->child_reaper == father"; there must be no children after zap_pid_ns_processes() returns, so it doesn't matter what we return in this case and even pid_ns->child_reaper is wrong otherwise: we can't reparent to ->child_reaper == current. This is not a bug, but this is confusing.

2. It is not safe if we are not pid_ns->child_reaper but from the same thread group. We drop tasklist_lock before zap_pid_ns_processes(), so another thread can lock it and choose the new reaper from the upper namespace if has_child_subreaper == T, and this is obviously wrong. This is not that bad, zap_pid_ns_processes() won't return until the new reaper reaps all zombies, but this should be fixed anyway.

We could change for_each_thread() loop to use ->exit_state instead of PF_EXITING which we had to use until 8aac627, or we could change copy_signal() to check CLONE_NEWPID before setting has_child_subreaper, but let's change this code so that it is clear we can't look outside of our namespace, otherwise same_thread_group(reaper, child_reaper) check will look wrong and confusing anyway.

We can simply start from "father" and fix the problem. We can't wrongly return a thread from the same thread group if ->is_child_subreaper == T, we know that all threads have PF_EXITING set.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Sterling Alexander <stalexan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from ms commit 7d24e2d)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

=================
Patchset description:
vz7: fix child-reaper reparenting

The fourth patch is needed because the kernel can reparent a process to a dead thread, which is wrong.
The third patch is needed because the kernel could reparent a process from a father in one pidns to a process in a different pidns, which creates configurations not supported by CRIU.
Found it when reproducing the problem from a CRIU mainstream issue in a VZ7 ct: checkpoint-restore/criu#1914
The first and second patches are just to make it apply cleaner.

Oleg Nesterov (4):
exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting
exit: reparent: document the ->has_child_subreaper checks
exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting
Swap the "init_task" and same_thread_group() checks. This way it is more simple to document these checks and we can remove the link to the previous discussion on lkml.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Sterling Alexander <stalexan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from ms commit 175aed3)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
find_new_reaper() checks same_thread_group(reaper, child_reaper) to prevent the cross-namespace reparenting but this is not enough if the exiting parent was injected by setns() + fork().

Suppose we have a process P in the root namespace and some namespace X. P does setns() to enter the X namespace, and forks the child C. C forks a grandchild G and exits.

The grandchild G should be re-parented to X->child_reaper, but in this case the ->real_parent chain does not lead to ->child_reaper, so it will be wrongly reparented to P's sub-reaper or a global init.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
(cherry picked from ms commit c6c70f4)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
…ting

The ->has_child_subreaper code in find_new_reaper() finds alive "thread" but returns another "reaper" thread which can be dead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Sterling Alexander <stalexan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from ms commit 8a1296a)
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
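The setns() + fork() scenario from the third commit message (P, C, G and namespace X) can be illustrated with a small toy model. This is only a sketch and not kernel code: the `Task` class, the `ns_level` field, and both function names are hypothetical stand-ins, thread groups are ignored, and the "naive" variant only approximates a check that stops at the namespace's child reaper. The second variant stops walking the parent chain as soon as it leaves the father's pid namespace level, which is the idea the commit message describes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Task:
    """Hypothetical, simplified stand-in for a process (no thread groups)."""
    name: str
    ns_level: int                      # depth of the pid namespace (0 = root)
    real_parent: Optional["Task"] = None
    is_child_subreaper: bool = False   # task called PR_SET_CHILD_SUBREAPER
    has_child_subreaper: bool = False  # some ancestor is a subreaper
    alive: bool = True


def find_new_reaper_naive(father: Task, child_reaper: Task) -> Task:
    """Walk ancestors until we meet the namespace's child reaper."""
    if father.has_child_subreaper:
        reaper = father.real_parent
        while reaper is not None and reaper is not child_reaper:
            if reaper.is_child_subreaper and reaper.alive:
                return reaper
            reaper = reaper.real_parent
    return child_reaper


def find_new_reaper_ns_aware(father: Task, child_reaper: Task) -> Task:
    """Walk ancestors only while they share the father's namespace level."""
    if father.has_child_subreaper:
        reaper = father.real_parent
        while reaper is not None and reaper.ns_level == father.ns_level:
            if reaper.is_child_subreaper and reaper.alive:
                return reaper
            reaper = reaper.real_parent
    return child_reaper


# Root namespace: global init -> subreaper S -> P.
global_init = Task("init", ns_level=0)
s = Task("S", ns_level=0, real_parent=global_init, is_child_subreaper=True)
p = Task("P", ns_level=0, real_parent=s, has_child_subreaper=True)

# Namespace X (one level down) with its own init as child reaper.
x_init = Task("X-init", ns_level=1, real_parent=global_init)

# P did setns() into X and forked C; C forked G and then exited.
c = Task("C", ns_level=1, real_parent=p, has_child_subreaper=True, alive=False)
g = Task("G", ns_level=1, real_parent=c, has_child_subreaper=True)

print("naive:   ", find_new_reaper_naive(c, x_init).name)
print("ns-aware:", find_new_reaper_ns_aware(c, x_init).name)
```

In this model the naive walk never meets X's init on C's parent chain, so it picks the subreaper S from the root namespace, while the namespace-aware walk stops at P and falls back to X's child reaper.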
Description
After checkpointing and restoring a postgres image (14.3), criu does not restore one socket file. After the checkpoint the socket file is still present (I verified this by mounting the directory and checking); it goes missing only after the restore.
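A check like the one described above can be scripted. The following is a minimal sketch, assuming the socket directory is reachable from where the script runs; the directory `/var/run/postgresql` in the usage comment is an assumption and is not stated in this report, and `is_unix_socket` is a hypothetical helper name.

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and is a Unix domain socket file."""
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False
    return stat.S_ISSOCK(mode)


# Example usage (the directory is an assumption, adjust to your setup):
#   is_unix_socket("/var/run/postgresql/.s.PGSQL.5432")
```

Running the same check once after the checkpoint and once after the restore shows at which step the file goes missing.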
Steps to reproduce the issue:
`.s.PGSQL.5432` socket file
Detailed steps can be found here: https://github.com/shashank-sharma/psql-testing/blob/master/pdb/start.sh
The issue is reproducible; I have created a minimal example:
https://github.com/shashank-sharma/psql-testing
Describe the results you received:
Describe the results you expected:
Expected `.s.PGSQL.5432` to be present after restoration so that it can still be used after the restore.
Additional information you deem important (e.g. issue happens only occasionally):
CRIU logs and information:
CRIU full dump/restore logs:
Log here
Output of `criu --version`:
Output of `criu check --all`:
Additional environment details: