-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix container pid race condition. #3857
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3857 +/- ##
=======================================
Coverage 42.34% 42.34%
=======================================
Files 130 130
Lines 14683 14683
=======================================
Hits 6217 6217
Misses 7540 7540
Partials 926 926
Continue to review full report at Codecov.
|
Build succeeded.
|
Can we have a test? |
Done. To make it easy to reproduce the race condition, I added a Here is the test result with containerd v1.3.0:
Here is the test result with this patch:
BTW, I tried to add retry in the test to help reproducing the race condition, but it is hard. However, it can be easily reproduced by running |
8050209
to
8c8e830
Compare
Signed-off-by: Lantao Liu <lantaol@google.com>
8c8e830
to
a6b6097
Compare
Build succeeded.
|
LGTM |
containerd 1.3.2 Welcome to the v1.3.2 release of containerd! The second patch release for `containerd` 1.3 includes a fix for a race condition related to the reported pid on exit when called from Docker. ### Runtime * Fix containerd pid race condition [containerd#3857](containerd#3857) * Use cached process state to reduce exec cost [containerd#3711](containerd#3711) ### CRI * Added `insecure_skip_verify` option in the registry tls config to allow skipping registry certificate verification [containerd#3847](containerd#3847) Please try out the release binaries and report any issues at https://github.com/containerd/containerd/issues. ### Contributors * Lantao Liu * Derek McGowan * Michael Crosby * Phil Estes * Maksym Pavlenko ### Changes * [`ff48f57fc8`](containerd@ff48f57) Merge pull request [containerd#3866](containerd#3866) from dmcgowan/prepare-1.3.2 * [`99005c2647`](containerd@99005c2) Add release notes for v1.3.2 * [`e987ea3cac`](containerd@e987ea3) Merge pull request [containerd#3864](containerd#3864) from Random-Liu/update-cri-release-1.3 * [`8bbd71e560`](containerd@8bbd71e) Update cri to b1bef15fbeb6c6f0569b67322acfa74ca3597755. * [`070c27205c`](containerd@070c272) Merge pull request [containerd#3863](containerd#3863) from Random-Liu/cherrypick-#3857-release-1.3 * [`306d6d4b55`](containerd@306d6d4) Fix container pid. * [`5028701f1a`](containerd@5028701) Merge pull request [containerd#3753](containerd#3753) from thaJeztah/1.3_backport_avoid_unnecessary_runc_state * [`e3d2b608cc`](containerd@e3d2b60) Merge pull request [containerd#3855](containerd#3855) from fuweid/cp-3853 * [`04fbb97ad0`](containerd@04fbb97) Fix cleanup error on content client test * [`be24f39454`](containerd@be24f39) Merge pull request [containerd#3840](containerd#3840) from ameyag/1.3_backport_log_file_win * [`333679343b`](containerd@3336793) Add `--log-file` flag for windows service. * [`9abfc70043`](containerd@9abfc70) Use cached state instead of `runc state`. ### Changes from containerd/cri * [`b1bef15f`](containerd/cri@b1bef15) Merge pull request [containerd#1346](containerd/cri#1346) from Random-Liu/cherrypick-#1345-release-1.3 * [`27edfa06`](containerd/cri@27edfa0) Add insecure_skip_verify option. * [`e5dd8053`](containerd/cri@e5dd805) Merge pull request [containerd#1322](containerd/cri#1322) from Random-Liu/cherrypick-#1319-release-1.3 * [`c0dee957`](containerd/cri@c0dee95) Fix containerd build, use `libbtrfs-dev` when available. * [`1fb415d2`](containerd/cri@1fb415d) Merge pull request [containerd#1314](containerd/cri#1314) from Random-Liu/cherrypick-#1312-release-1.3 * [`0973459d`](containerd/cri@0973459) Update based on default xenial distro. * [`a46f6baf`](containerd/cri@a46f6ba) Configure golangci-lint ### Dependency Changes Previous release can be found at [v1.3.1](https://github.com/containerd/containerd/releases/tag/v1.3.1) * **github.com/containerd/cri** 5d49e7e51b43e36a6b9c4386257c7d08c602237f -> b1bef15fbeb6c6f0569b67322acfa74ca3597755 # gpg: directory '/c/Users/juterry/.gnupg' created # gpg: keybox '/c/Users/juterry/.gnupg/pubring.kbx' created # gpg: Signature made Tue Dec 3 11:09:10 2019 PST # gpg: using RSA key 8C7A111C21105794B0E8A27BF58C5D0A4405ACDB # gpg: Can't check signature: No public key
If you try docker with containerd 1.3, and run
docker run hello-world
, it will stuck.After #3366, we change pid after container/exec process created, this causes various issues:
task.Start
might be-1
if the container is a short running container, becauseStart
andState
are not called in a transaction: https://github.com/containerd/containerd/blob/master/services/tasks/local.go#L230.uint
, returning-1
will eventually make it4294967295
, which is not very user friendly:This PR partially reverts #3366, and use
exited
to figure out whether the process has already exited. I tested the change with Docker, anddocker run hello-world
works now.Signed-off-by: Lantao Liu lantaol@google.com