Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support restarting containerd in tests, add restart test case #1188

Merged
merged 5 commits into from
Oct 19, 2021

Conversation

kevpar
Copy link
Member

@kevpar kevpar commented Oct 5, 2021

This PR adds support in the cri-containerd test suite to allow tests to restart containerd.
A test case is also added that validates containerd properly terminates running containers
when it is restarted.

Note this test case depends on functionality in the kevpar/cri windows_port branch, and
that the terminate_containers_on_restart option is set on the plugins.cri section of the
containerd config.

See commits for more details.

@kevpar kevpar requested a review from a team as a code owner October 5, 2021 23:26
Previously main.go didn't have _test suffix, so it was not considered a
test file. Notably this meant that TestMain was never actually invoked
because it must be in a test file. It seems we were fortunate that there
was nothing in TestMain that wasn't done automatically by the Go test
infrastructure.

As the cri-containerd directory is all test code, it is probably safe
to rename all other files in the directory to be test code as well.
However, that is left for a future change.

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
This change lets the cri-containerd tests start/stop containerd as
needed, rather than assuming it is always running. This is done through
the addition of startContainerd/stopContainerd functions which can be
called from tests. As all of the existing tests need containerd to be
running, this currently is not used in any tests. Future tests can take
advantage of this functionality.

Tests assume that containerd is running when they start, and should not
need to explicitly start containerd before calling into it. This means
that if a test stops containerd, it needs to ensure containerd is
started again. If containerd crashes during a test, then subsequent
tests will fail, but that's the same as the current behavior.

An unfortunate side effect of this change is that, due to a standing
issue with Go's service support and containerd, the service can
sometimes exit with ERROR_PROCESS_ABORTED when it is stopped. Combined
with the fact that recovery actions are used for containerd, this can
result in the service being restarted by the service control manager. To
work around this, we need to first disable recovery actions for the
service before running tests. This can be done with:

  sc failure containerd reset=0 actions= command=

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
@kevpar kevpar changed the title Add support for restarting containerd in tests, add restart test case Support restarting containerd in tests, add restart test case Oct 5, 2021
@dcantah
Copy link
Contributor

dcantah commented Oct 7, 2021

Needs a rebase I believe. The pullRequiredLcowImages function was renamed to have Lcow be capitalized here 8debf44#diff-a8c710332541b245cc336ff7f50e609f30213b47a1f67abe39908fe212862322

Edit: Err not a rebase, this has the changes. Just need to update your uses of the old function name

@dcantah dcantah self-assigned this Oct 7, 2021
Adds a test case that runs a pod+container, restarts containerd, then
verifies that the pod+container were terminated. This validates the
change made in the CRI fork [1] to terminate containers when containerd
is restarted.

[1]: kevpar/cri@f8e83e6

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
@kevpar
Copy link
Member Author

kevpar commented Oct 7, 2021

Needs a rebase I believe. The pullRequiredLcowImages function was renamed to have Lcow be capitalized here 8debf44#diff-a8c710332541b245cc336ff7f50e609f30213b47a1f67abe39908fe212862322

Edit: Err not a rebase, this has the changes. Just need to update your uses of the old function name

Aha, I hadn't checked back on CI yet.

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
defer stopContainer(t, client, ctx, containerID)

t.Log("Restart containerd")
stopContainerd(t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, cri-containerd.test.exe runs 4 tests in parallel (IIRC), This would break other tests that are running in parallel, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it? I thought running tests in parallel required you to sprinkle the ones that you'd like to make eligible with t.Parallel

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I thought just setting -test.parallel (which has default value of 4) to value greater than 1 does it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is also that tests must explicitly opt into being run in parallel. I agree running in parallel would cause problems given the service is global state.

@ambarve ambarve mentioned this pull request Oct 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants