8337: add test cases for guest pull images #1

liudalibj · 2023-12-13T10:07:44Z

add test cases for guest pull images
need revist after we use container2.0 with 'image pull per runtime class' feature

for kata-containers#8337 and kata-containers#8407

liudalibj · 2023-12-13T10:10:53Z

This pr only include the test cases for guest pull with nydus-snapshotter
I tested those cases with my kcli cluster with nydus-snapshotter be installed and config by hand and the runtime class is "kata-qemu".
A good result looks like:

root@liudali-x86-libvirt:~/kata-containers/tests/integration/kubernetes# export KATA_HYPERVISOR=qemu
root@liudali-x86-libvirt:~/kata-containers/tests/integration/kubernetes# export PULL_TYPE=guest-pull
root@liudali-x86-libvirt:~/kata-containers/tests/integration/kubernetes# bats k8s-guest-pull-image.bats
 ✓ Test we can pull an unencrypted image (busybox) outside the guest with runc and then inside the guest successfully
 ✓ Test we can pull an unencrypted image (busybox) inside the guest twice in a row and then outside the guest successfully
 ✓ Test we can pull an other unencrypted image (alpine) outside the guest and then inside the guest successfully

3 tests, 0 failures

stevenhorsman

Hey DaLi, I've made a few comments for possible ways to make the code a bit easier to read, which @wainersm might also have input on.

The other thing that we might want to consider is the reverse case of issue 8337 (unless you have it and I've missed it), where we pull without a cri-runtime handler annotation, then pull with it and verify that it's pulled on the guest properly. You seem to have the reverse, pulled with annotation, on guest and then without and (TODO) check that it isn't pulled on the guest, but maybe both ways would be useful unless you think the runc case covers that?

I also think that for test like this one that we know will fail it's fine to write them as they should work, but then add a skip, and add a note in kata-containers#8337 to unskip them in future.

Thanks for the great work!

stevenhorsman · 2023-12-13T16:58:12Z

tests/integration/kubernetes/k8s-guest-pull-image.bats

+    runc_pod_config="$(new_pod_config "$unencrypted_image_1" "kata-${KATA_HYPERVISOR}")"
+    sed -i '/runtimeClassName:/d' $runc_pod_config


This is an idea rather than direct suggestion: Rather than setting the runtime class to kata- and then deleting it, I wonder if it would be clearer to update new_pod_config to allow it to either allow a blank runtime class, or maybe nicer create a separate to pod-config.yaml.in that doesn't have a runtimeclass and let new_pod_config optional take the base config as the final parameter? I just wonder if that would be clearer to read and understand?

On second thought (and considering some of the other suggestions I've made about extending it), maybe overloading new_pod_config with a base runtime class is too much of an edge-case and we'd be better off just adding a comment here to say that although the config is initialised with a runtimeclass it is then immediately deleted, such that it runs as runc

Thanks the comment, updated the new_pod_config function and pod-config.yaml.in template.

Sorry, my review comments were very confusing, but I left it so that people could give their input. In my first comment I suggested making runtimeclass optional, but after thinking about the idea of making metadata_annotations optional instead I decided that might be more useful, so suggested leaving as is with a comment. Not stuck with shell scripts and not being able to use multiple option values is an annoyance though.

Anyway these comments are mostly related to readability and re-use of code, so they can be refactored later as we add more tests and see the common patterns, so I don't want to hold up this PR due to these thoughts

Added comments to say we want to get runc pod here.

stevenhorsman · 2023-12-13T17:01:08Z

tests/integration/kubernetes/k8s-guest-pull-image.bats


-    # Get pod specification
-    kubectl wait --for=condition=Ready --timeout=$timeout pod "test-e2e"
+    set_node "$runc_pod_config" "$node"


If we need to do set_node for all our pod_configs, should we consider integrating it into new_pod_config and pass it as a third parameter to check the test files cleaner?

Part of me is wondering about adding metadata_annotations to the new_pod_config too, but I can see scenarios, like in this file that they will need to be optional, so that would make it need to be the last parameter. Maybe still worth considering though.

I tried to move $node to new_pod_config function but failed, because the runtimeClassName is one optional parameter too one function can't have 2+ optional parameters.
I didn't make the $node as a required parameter, I assume that we do NOT need set in for all other test cases?
So just leave the set_node call asis

@wainersm - any thoughts on this. I think you added the node call as we are running these tests on AKS now, so getting the logs requires us to know which node it's running on, so do you think it should be optional? At the moment the only usage of new_pod_config is the k8s-measured-rootfs.bats test which also calls set_node

I moved the set_node to new_pod_config function, the codes looks better than before.

liudalibj · 2023-12-14T06:06:11Z

Hey DaLi, I've made a few comments for possible ways to make the code a bit easier to read, which @wainersm might also have input on.

The other thing that we might want to consider is the reverse case of issue 8337 (unless you have it and I've missed it), where we pull without a cri-runtime handler annotation, then pull with it and verify that it's pulled on the guest properly. You seem to have the reverse, pulled with annotation, on guest and then without and (TODO) check that it isn't pulled on the guest, but maybe both ways would be useful unless you think the runc case covers that?

Both case are reversed, there are two TODO items, Test we can pull an other unencrypted image (alpine) outside the guest and then inside the guest successfully is for the case we pull without a cri-runtime handler annotation, then pull with it and verify that it's pulled on the guest properly

I also think that for test like this one that we know will fail it's fine to write them as they should work, but then add a skip, and add a note in kata-containers#8337 to unskip them in future.

Thanks for the great work!

liudalibj · 2023-12-14T08:18:08Z

@stevenhorsman @ChengyuZhu6 Many thanks for your review comments, I updated the codes, please help to take a look when you are free.

stevenhorsman · 2023-12-14T10:33:45Z

Both case are reversed, there are two TODO items

Sorry, I'm not sure how I missed this last night.
Just to ensure I completely understand the Test we can pull an other unencrypted image (alpine) outside the guest and then inside the guest successfully scenario - it is first doing a host pull (but still kata-containers, not runc) using the vanilla kata logic, as it doesn't set the cri-handler annotation, then it does a second pull with the annotation and (in future) checks that it is still pulled on the guest?
If so that works great for me, sorry I got distracted with the first two tests.

I guess something to think about here, and maybe it needs wider discussion is whether having TODOs and changing the expected behaviour is the correct thing to do. There is an argument that this is a pretty serious limitation, so having the tests correct and initially failing might drive useful conversations in the community and potentially get any container 2.0 fixes backported to 1.7 maybe, so we can start using this as soon as possible? I'm a bit worried that what you've done, although it passes the test, hides the problem a bit?

liudalibj · 2023-12-15T01:54:38Z

Sorry, I'm not sure how I missed this last night. Just to ensure I completely understand the Test we can pull an other unencrypted image (alpine) outside the guest and then inside the guest successfully scenario - it is first doing a host pull (but still kata-containers, not runc) using the vanilla kata logic, as it doesn't set the cri-handler annotation, then it does a second pull with the annotation and (in future) checks that it is still pulled on the guest? If so that works great for me, sorry I got distracted with the first two tests.

Yeah, that case try to pull image on host(without annotation) first and then try to pull image on guest(with annotation)

I guess something to think about here, and maybe it needs wider discussion is whether having TODOs and changing the expected behaviour is the correct thing to do. There is an argument that this is a pretty serious limitation, so having the tests correct and initially failing might drive useful conversations in the community and potentially get any container 2.0 fixes backported to 1.7 maybe, so we can start using this as soon as possible? I'm a bit worried that what you've done, although it passes the test, hides the problem a bit?

Yeah, agree with you @stevenhorsman that current way hides the problem.
So we should update the test case to expect all correct count of rootfs from now, and then:

option1: failed the test case
option2: skip the test case, enable it after we have containerd 2.0
Which options we should use here @stevenhorsman @ChengyuZhu6 ?

stevenhorsman · 2023-12-15T09:17:11Z

So we should update the test case to expect all correct count of rootfs from now, and then:

option1: failed the test case

option2: skip the test case, enable it after we have containerd 2.0
Which options we should use here @stevenhorsman @ChengyuZhu6 ?

So this might be kicking the can down the road a bit, but for this PR I'd recommend option 1 where we let the test case fail and then as part of the main PR 8484 (IIRC) we can test it out and have the wider discussion with the community to highlight that it is still an issue and try and get broader agreement on a path forward.

- add test cases for guest pull images - need revist after we use container2.0 with 'image pull per runtime class' feature for kata-containers#8337 and kata-containers#8407 Signed-off-by: Da Li Liu <liudali@cn.ibm.com>

liudalibj · 2023-12-15T09:39:14Z

So this might be kicking the can down the road a bit, but for this PR I'd recommend option 1 where we let the test case fail and then as part of the main PR 8484 (IIRC) we can test it out and have the wider discussion with the community to highlight that it is still an issue and try and get broader agreement on a path forward.

Updated the test cases to use option 1 with comments provided.

stevenhorsman

LGTM. Thanks DaLi for the great work!

ChengyuZhu6 · 2023-12-18T06:29:10Z

Thanks @liudalibj LGTM!

Preparation for complete GPU rootfs build step #1/#N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

liudalibj force-pushed the guest-pull branch 6 times, most recently from 9494b5b to af3ee22 Compare December 13, 2023 15:32

stevenhorsman reviewed Dec 13, 2023

View reviewed changes

liudalibj force-pushed the guest-pull branch from af3ee22 to ce076e6 Compare December 14, 2023 08:09

liudalibj force-pushed the guest-pull branch 5 times, most recently from 1ad9581 to 64b8c46 Compare December 14, 2023 08:41

liudalibj force-pushed the guest-pull branch 2 times, most recently from f7efa72 to 344aa76 Compare December 15, 2023 03:07

8337: add test cases for guest pull images

0ef0ed8

- add test cases for guest pull images - need revist after we use container2.0 with 'image pull per runtime class' feature for kata-containers#8337 and kata-containers#8407 Signed-off-by: Da Li Liu <liudali@cn.ibm.com>

liudalibj force-pushed the guest-pull branch from 344aa76 to 0ef0ed8 Compare December 15, 2023 09:37

stevenhorsman approved these changes Dec 15, 2023

View reviewed changes

ChengyuZhu6 merged commit 17970a5 into ChengyuZhu6:guest-pull Dec 18, 2023
1 check failed

liudalibj deleted the guest-pull branch December 18, 2023 06:34

ChengyuZhu6 pushed a commit that referenced this pull request May 22, 2024

gpu: Add build targets for GPU rootfs initrd/image

85374f5

Preparation for complete GPU rootfs build step #1/#N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

8337: add test cases for guest pull images #1

8337: add test cases for guest pull images #1

liudalibj commented Dec 13, 2023

liudalibj commented Dec 13, 2023 •

edited

Loading

stevenhorsman left a comment

stevenhorsman Dec 13, 2023

stevenhorsman Dec 13, 2023

liudalibj Dec 14, 2023

stevenhorsman Dec 14, 2023

liudalibj Dec 15, 2023

stevenhorsman Dec 13, 2023

stevenhorsman Dec 13, 2023

liudalibj Dec 14, 2023

stevenhorsman Dec 14, 2023

liudalibj Dec 15, 2023

liudalibj commented Dec 14, 2023

liudalibj commented Dec 14, 2023

stevenhorsman commented Dec 14, 2023

liudalibj commented Dec 15, 2023

stevenhorsman commented Dec 15, 2023

liudalibj commented Dec 15, 2023

stevenhorsman left a comment

ChengyuZhu6 commented Dec 18, 2023

		runc_pod_config="$(new_pod_config "$unencrypted_image_1" "kata-${KATA_HYPERVISOR}")"
		sed -i '/runtimeClassName:/d' $runc_pod_config

8337: add test cases for guest pull images #1

8337: add test cases for guest pull images #1

Conversation

liudalibj commented Dec 13, 2023

liudalibj commented Dec 13, 2023 • edited Loading

stevenhorsman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

liudalibj commented Dec 14, 2023

liudalibj commented Dec 14, 2023

stevenhorsman commented Dec 14, 2023

liudalibj commented Dec 15, 2023

stevenhorsman commented Dec 15, 2023

liudalibj commented Dec 15, 2023

stevenhorsman left a comment

Choose a reason for hiding this comment

ChengyuZhu6 commented Dec 18, 2023

liudalibj commented Dec 13, 2023 •

edited

Loading