-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support restarting containerd in tests, add restart test case #1188
Conversation
Previously main.go didn't have _test suffix, so it was not considered a test file. Notably this meant that TestMain was never actually invoked because it must be in a test file. It seems we were fortunate that there was nothing in TestMain that wasn't done automatically by the Go test infrastructure. As the cri-containerd directory is all test code, it is probably safe to rename all other files in the directory to be test code as well. However, that is left for a future change. Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
This change lets the cri-containerd tests start/stop containerd as needed, rather than assuming it is always running. This is done through the addition of startContainerd/stopContainerd functions which can be called from tests. As all of the existing tests need containerd to be running, this currently is not used in any tests. Future tests can take advantage of this functionality. Tests assume that containerd is running when they start, and should not need to explicitly start containerd before calling into it. This means that if a test stops containerd, it needs to ensure containerd is started again. If containerd crashes during a test, then subsequent tests will fail, but that's the same as the current behavior. An unfortunate side effect of this change is that, due to a standing issue with Go's service support and containerd, the service can sometimes exit with ERROR_PROCESS_ABORTED when it is stopped. Combined with the fact that recovery actions are used for containerd, this can result in the service being restarted by the service control manager. To work around this, we need to first disable recovery actions for the service before running tests. This can be done with: sc failure containerd reset=0 actions= command= Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
Needs a rebase I believe. The Edit: Err not a rebase, this has the changes. Just need to update your uses of the old function name |
Adds a test case that runs a pod+container, restarts containerd, then verifies that the pod+container were terminated. This validates the change made in the CRI fork [1] to terminate containers when containerd is restarted. [1]: kevpar/cri@f8e83e6 Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
Aha, I hadn't checked back on CI yet. |
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
defer stopContainer(t, client, ctx, containerID) | ||
|
||
t.Log("Restart containerd") | ||
stopContainerd(t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually, cri-containerd.test.exe
runs 4 tests in parallel (IIRC), This would break other tests that are running in parallel, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it? I thought running tests in parallel required you to sprinkle the ones that you'd like to make eligible with t.Parallel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I thought just setting -test.parallel
(which has default value of 4) to value greater than 1 does it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is also that tests must explicitly opt into being run in parallel. I agree running in parallel would cause problems given the service is global state.
Related work items: microsoft#1067, microsoft#1097, microsoft#1119, microsoft#1170, microsoft#1176, microsoft#1180, microsoft#1181, microsoft#1182, microsoft#1183, microsoft#1184, microsoft#1185, microsoft#1186, microsoft#1187, microsoft#1188, microsoft#1189, microsoft#1191, microsoft#1193, microsoft#1194, microsoft#1195, microsoft#1196, microsoft#1197, microsoft#1200, microsoft#1201, microsoft#1202, microsoft#1203, microsoft#1204, microsoft#1205, microsoft#1206, microsoft#1207, microsoft#1209, microsoft#1210, microsoft#1211, microsoft#1218, microsoft#1219, microsoft#1220, microsoft#1223
This PR adds support in the cri-containerd test suite to allow tests to restart containerd.
A test case is also added that validates containerd properly terminates running containers
when it is restarted.
Note this test case depends on functionality in the kevpar/cri windows_port branch, and
that the
terminate_containers_on_restart
option is set on theplugins.cri
section of thecontainerd config.
See commits for more details.