Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resume healthcheck when daemon restarts #41935

Conversation

m-k8s
Copy link
Contributor

@m-k8s m-k8s commented Jan 26, 2021

Call updateHealthMonitor for alive non-paused containers when daemon restarts.

What I did

Fix issue #41871 feat. @ealessandria-orange

- How I did it

While dockerd restores, call updateHealthMonitor for non paused alive containers.

- How to verify it

At dockerd restart, containers' healthcheck is still running.

- Description for the changelog

Fix issue #41871

@m-k8s m-k8s force-pushed the Issue-41871-Restore-healthcheck-at-dockerd-restart branch from 496ecb1 to 9c5f55e Compare January 26, 2021 14:18
daemon/daemon.go Outdated Show resolved Hide resolved
@thaJeztah
Copy link
Member

Probably could use a test as well; I see there's a test for restarting the daemon that's already testing some scenarios with live-restore enabled;

func TestDaemonRestartKillContainers(t *testing.T) {

@m-k8s m-k8s changed the title Update daemon/daemon.go: resume healthcheck on restore WIP: Update daemon/daemon.go: resume healthcheck on restore Jan 26, 2021
@m-k8s m-k8s changed the title WIP: Update daemon/daemon.go: resume healthcheck on restore Update daemon/daemon.go: resume healthcheck on restore Jan 26, 2021
@m-k8s m-k8s force-pushed the Issue-41871-Restore-healthcheck-at-dockerd-restart branch from 04fd49e to e1973f1 Compare January 26, 2021 19:36
@thaJeztah
Copy link
Member

Looks like CI is failing (seeing some panics)

@thaJeztah thaJeztah added the status/failing-ci Indicates that the PR in its current state fails the test suite label Jan 27, 2021
@m-k8s m-k8s marked this pull request as draft January 28, 2021 09:43
@m-k8s m-k8s changed the title Update daemon/daemon.go: resume healthcheck on restore Resume healthcheck when daemon restarts Jan 28, 2021
@m-k8s m-k8s marked this pull request as ready for review January 28, 2021 13:07
@m-k8s
Copy link
Contributor Author

m-k8s commented Jan 28, 2021

CI job "ppc64le" doesn't pass, I don't think it's coming from my code, @thaJeztah can you look at it please?

@thaJeztah
Copy link
Member

I kicked CI to run again 👍

@thaJeztah thaJeztah removed the status/failing-ci Indicates that the PR in its current state fails the test suite label Jan 28, 2021
@m-k8s
Copy link
Contributor Author

m-k8s commented Jan 28, 2021

I kicked CI to run again 👍

Thank you, but the job "ppc64le" failed again, the error says that it cannot delete some files (operation not allowed).
Can you try to manually delete this folder /home/docker/workspace/moby_PR-41935/ on the "ppc64le" jenkins node please?

@m-k8s
Copy link
Contributor Author

m-k8s commented Jan 28, 2021

@thaJeztah I think Jenkins is looking to delete the directory because I force-pushed the e1973f1 commit two days ago.
I sent this forced push because I had "amended" a commit to correct a piece of code, it's a bad habit, I should have made another commit, sorry.

daemon/daemon.go Outdated Show resolved Hide resolved
Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah
Copy link
Member

could you squash the commits? (let me know if you need help doing so)

@m-k8s
Copy link
Contributor Author

m-k8s commented Jan 29, 2021

could you squash the commits? (let me know if you need help doing so)

Thanks, I would click on the "squash and merge" button.

I cannot start the merge because the integration test did not pass.

@cpuguy83
Copy link
Member

We cannot use squash and merge due to dco signing requirements.

@cpuguy83
Copy link
Member

It also doesn't use a merge commit in that case.

@m-k8s m-k8s force-pushed the Issue-41871-Restore-healthcheck-at-dockerd-restart branch from 8b2383b to 77e809e Compare January 29, 2021 20:01
@m-k8s
Copy link
Contributor Author

m-k8s commented Feb 1, 2021

The commits squash is done.

@cpuguy83
Copy link
Member

cpuguy83 commented Feb 1, 2021

@alexisries Thanks, we'll merge once review is complete.

@cpuguy83
Copy link
Member

Note that this is fixing a bug in that one cannot restart a daemon with live-restore and also have healthchecks.

@m-k8s m-k8s force-pushed the Issue-41871-Restore-healthcheck-at-dockerd-restart branch 2 times, most recently from cef3235 to 6088690 Compare September 22, 2021 19:43
Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 😅

daemon/daemon.go Outdated Show resolved Hide resolved
Call updateHealthMonitor for alive non-paused containers

Signed-off-by: Alexis Ries <alexis.ries.ext@orange.com>
@m-k8s m-k8s force-pushed the Issue-41871-Restore-healthcheck-at-dockerd-restart branch from 6088690 to 9f39889 Compare October 7, 2021 19:23
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@thaJeztah thaJeztah added this to the 21.xx milestone Oct 15, 2021
@thaJeztah
Copy link
Member

Failures are the known failures from #42892 - ignoring those

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants