Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gha: Setup nydus snapshotter for CoCo tests #8585

Merged

Conversation

ChengyuZhu6
Copy link
Member

@ChengyuZhu6 ChengyuZhu6 commented Dec 6, 2023

Setup nydus snapshotter for CoCo tests.

Fixes #8584

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few questions/comments

tests/common.bash Outdated Show resolved Hide resolved
tests/gha-run-k8s-common.sh Outdated Show resolved Hide resolved
tests/gha-run-k8s-common.sh Outdated Show resolved Hide resolved
@fidencio
Copy link
Member

fidencio commented Dec 6, 2023

In order to make things consistent for both Kata Containers and Confidential Containers, I wonder if we work with the nydus community to have a payload coming from them, that could be used both here and by the Confidential Containers CI.

This would reduce a lot the chances of misconfigurations, and that would be a "trusted" source that both projects could consume.

IOW, we'd provide them the patches, and then consume their payload here and in CoCo.

WDYT?

@ChengyuZhu6
Copy link
Member Author

In order to make things consistent for both Kata Containers and Confidential Containers, I wonder if we work with the nydus community to have a payload coming from them, that could be used both here and by the Confidential Containers CI.

This would reduce a lot the chances of misconfigurations, and that would be a "trusted" source that both projects could consume.

IOW, we'd provide them the patches, and then consume their payload here and in CoCo.

WDYT?

I think that's a great idea.

@ChengyuZhu6
Copy link
Member Author

@stevenhorsman @fidencio The PR has been merged.

@ChengyuZhu6
Copy link
Member Author

@stevenhorsman @fidencio Should we support an official release of the Nydus snapshotter for arm64?

@ChengyuZhu6 ChengyuZhu6 force-pushed the dependencies-for-guest-pull branch 4 times, most recently from d1e05a1 to 23f135b Compare December 7, 2023 06:50
Copy link
Contributor

@liudalibj liudalibj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are some typo

.github/workflows/run-kata-coco-tests.yaml Outdated Show resolved Hide resolved
.github/workflows/run-kata-coco-tests.yaml Outdated Show resolved Hide resolved
.github/workflows/run-kata-coco-tests.yaml Outdated Show resolved Hide resolved
@stevenhorsman
Copy link
Member

@stevenhorsman @fidencio Should we support an official release of the Nydus snapshotter for arm64?

I think it would be worth starting on this as in main we produce arm64 payloads, even if we don't have any CCA runtime, or tests yet AFAIK. I don't think it should block this PR though

@ChengyuZhu6
Copy link
Member Author

The failures about run-nydus are caused by the new release name, which differs from the version used by kata now. The PR is to resolve the issue.

@ChengyuZhu6 ChengyuZhu6 force-pushed the dependencies-for-guest-pull branch 5 times, most recently from 9de3605 to 027f570 Compare February 8, 2024 05:19
@ChengyuZhu6 ChengyuZhu6 self-assigned this Feb 8, 2024
@ChengyuZhu6 ChengyuZhu6 force-pushed the dependencies-for-guest-pull branch 5 times, most recently from 3cf3080 to 470d491 Compare February 8, 2024 09:11
@ChengyuZhu6
Copy link
Member Author

/test

@ChengyuZhu6
Copy link
Member Author

@fidencio @wainersm @stevenhorsman I found that the snapshotter can be deployed and cleaned up as expected. This PR does not configure the snapshotter in containerd, so it should not affect the existing tests. However, I encountered some failures that I could not explain. Could you please help me troubleshoot them? If the tests pass, can we merge the PR first?

Copy link
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments that I will address as Chengyu is on holidays.

@@ -21,7 +21,8 @@ DOCKER_REGISTRY=${DOCKER_REGISTRY:-quay.io}
DOCKER_REPO=${DOCKER_REPO:-kata-containers/kata-deploy-ci}
DOCKER_TAG=${DOCKER_TAG:-kata-containers-latest}
KATA_DEPLOY_WAIT_TIMEOUT=${KATA_DEPLOY_WAIT_TIMEOUT:-10m}
KATA_HYPERVISOR=${KATA_HYPERVISOR:-qemu}
SNAPSHOTTER_DEPLOY_WAIT_TIMEOUT=${SNAPSHOTTER_DEPLOY_WAIT_TIMEOUT:-5m}
KATA_HYPERVISOR=${KATA_HYPERVISOR-qemu}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
KATA_HYPERVISOR=${KATA_HYPERVISOR-qemu}
KATA_HYPERVISOR=${KATA_HYPERVISOR:-qemu}

echo "::endgroup::"
}

function cleanup_snapshotter() {
echo "::group::Cleanuping ${SNAPSHOTTER:-}"
#TODO Add the logic for cleaning up the snapshotter in PR https://github.com/kata-containers/kata-containers/pull/8585.
case ${SNAPSHOTTER:-} in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer having SNAPSHOTTER declared atop, instead of having SNAPSHOTTER:- everywhere.

sleep 180s
kubectl delete -f "misc/snapshotter/nydus-snapshotter-rbac.yaml"
popd
sleep 180s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rbac deletion usually is quite fast, I'd drop this to 30s or so, instead of 3 minutes.

@@ -21,7 +21,8 @@ DOCKER_REGISTRY=${DOCKER_REGISTRY:-quay.io}
DOCKER_REPO=${DOCKER_REPO:-kata-containers/kata-deploy-ci}
DOCKER_TAG=${DOCKER_TAG:-kata-containers-latest}
KATA_DEPLOY_WAIT_TIMEOUT=${KATA_DEPLOY_WAIT_TIMEOUT:-10m}
KATA_HYPERVISOR=${KATA_HYPERVISOR:-qemu}
SNAPSHOTTER_DEPLOY_WAIT_TIMEOUT=${SNAPSHOTTER_DEPLOY_WAIT_TIMEOUT:-5m}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5m seems like way too much as well, maybe 3 minutes is okay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this as 5 minutes for now, as I just realised that the kata-deploy timeout is 10 minutes.

We can use daemonset to deploy nydus snapshotter, which will decrease
one manual step both for Kata Containers and Confidential Containers CI.

Fixes: kata-containers#8584

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Cleanup nydus snapshotter by the daemonset.

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Copy link
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @ChengyuZhu6!

@fidencio
Copy link
Member

fidencio commented Feb 9, 2024

/test

Copy link
Member

@gkurz gkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still cannot tell from a nydus standpoint but LGTM from a pure scripting point of view.

Thanks @fidencio and @ChengyuZhu6 !

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @ChengyuZhu6 and reviewers!

git clone -b "${nydus_snapshotter_version}" "${nydus_snapshotter_url}" "${nydus_snapshotter_install_dir}"

pushd "$nydus_snapshotter_install_dir"
if [ "${PULL_TYPE}" == "guest-pull" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a review comment, just to remind that PULL_TYPE is set to guest-pull in the pre-PR to this that updates the workflows: a214bd8

@fidencio fidencio merged commit e78a951 into kata-containers:main Feb 9, 2024
283 of 295 checks passed
@ChengyuZhu6 ChengyuZhu6 deleted the dependencies-for-guest-pull branch February 18, 2024 04:24
c3d pushed a commit to c3d/kata-containers that referenced this pull request Feb 23, 2024
This PR is a split of kata-containers#8585.
make the changes on the Github workflows, and the skeleton to deploy_snapshotter()
and cleanup_snapshotter() in tests/integration/kubernetes/gha-run.sh in this commit.

After initially merging this patch to trigger CI jobs for CoCo, which will begin executing
the dummy functions deploy_snapshotter() and cleanup_snapshotter(), the implementation details for these functions
remain in kata-containers#8585. Our subsequent step involves transferring this logic to the PR kata-containers#8484, enabling the PR to undergo CI testing prior to its merge.

Fixes: kata-containers#8997

Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test size/medium Average sized task
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

CI: Install dependencies on testing guest pull image feature
8 participants