PoC spike for k8s-based linkerd-cni testing #147

stevej · 2022-12-05T19:54:28Z

PoC spike for testing integration of linkerd-cni within k3s. The initial test uses flannel as that's what k3s uses by default.

This branch is intended to kick off a discussion of ways to perform integration testing of linkerd-cni with commonly shared CNIs. Do we like this approach?

GitHub workflow rules for running linkerd-cni integration tests.
Dockerfile for linkerd-cni integration tests.
Integration tests that ensure that linkerd-cni's install-cni.sh interacts nicely with flannel's conf file.
linkerd-cni.yaml generated from linkerd install-cni with arguments for running within k3d.
run.sh executes the containerized integration tests within k8s.
justfile rules for building the linkerd-cni Dockerfile and the cni-plugin-test Dockerfile for integration tests.
justfile rules for running integration tests for linkerd-cni.

Signed-off-by: Steve Jenson <stevej@buoyant.io>


stevej · 2023-01-11T21:17:24Z

justfile

+# TODO(stevej): add a k3d-create-debug
 export K3S_DISABLE := "local-storage,traefik,servicelb,metrics-server@server:*"
-export K3D_CREATE_FLAGS := '--no-lb'
+export K3D_CREATE_FLAGS := '--no-lb --k3s-arg "--debug@server:*"'


I need to move this into a flag for creating a cluster with debug logging.

Hmmm, I have in mind perhaps two options. Both of them involve creating a cluster with --debug by default. I think it makes sense to have it on by default and then override it in CI (if we want to). Maybe this particular way of doing things wouldn't be a good fit for the other linkerd repos, but in this case I think it makes sense, especially since we are going to be pretty reliant on kubelet logs for debugging. Anyway:

We could use something like:

_default_k3d_flags := '--no-lb --k3s-arg "--debug@server:*"'
export K3D_CREATE_FLAGS := env_var_or_default("K3D_CREATE_FLAGS", _default_k3d_flags)

One way of doing this is to export the environment variable based on a recipe-local temp var. Sounds like a mouthful, but it means that by default we will use --debug. We can then override this either locally:

$ just _default_k3d_flags='--no-lb' k3d-create

or in CI using the same syntax.

Another approach here is to simply hardcode the default value in K3D_CREATE_FLAGS; locally and in CI we can always override it by setting the env variable.

We could have a recipe that handles this for us, e.g. _k3d_create_init or something similar. We'd make use of a variable again to store the default config and then export K3D_CREATE_FLAGS to the rest of the environment. This is more in line with what you're suggesting, but I think it just adds a bit more indirection when we can probably solve it more simply with env_var_or_default.

I'm by no means a just or makefile expert tho so lmk what you think.

I turned this into an issue in the kanban board, number 65 there.
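For comparison, the default-with-override behaviour discussed in this thread can be sketched in plain shell. This is only an illustration, not what the justfile does: the function name `k3d_create_flags` and the variable `default_k3d_flags` are made up for the example. It uses `${VAR-default}` (without the colon) so that the default applies only when the variable is unset, which is how just documents `env_var_or_default`.

```shell
#!/bin/sh
# Hypothetical default, copied from the flags quoted in the thread above.
default_k3d_flags='--no-lb --k3s-arg "--debug@server:*"'

# Print the flags a cluster would be created with.
# ${K3D_CREATE_FLAGS-...} falls back to the default only when the
# variable is unset; an explicitly empty value is kept as-is.
k3d_create_flags() {
  printf '%s\n' "${K3D_CREATE_FLAGS-$default_k3d_flags}"
}
```

With nothing set, `k3d_create_flags` prints the debug-enabled default; running with `K3D_CREATE_FLAGS='--no-lb'` in the environment (for example in CI) prints the override instead.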


mateiidavid

Looks really good 👌🏻 tbh there's nothing really blocking this from my side. I've left a few comments and will check back in but regardless of the outcome, as a first iteration I think this is good.

.github/workflows/cni-plugin-integration.yml

cni-plugin/integration/flannel/flannel_test.go

mateiidavid · 2023-01-12T14:17:26Z

cni-plugin/integration/run.sh

+    # TODO(stevej): how can we parameterize this manifest with `version` so we
+    # can enable a testing matrix?


Something I've done to template out a node-debugger pod manifest whenever I needed to change nodeName before applying is to replace the value with a placeholder and then use sed.

nodeName: <replace-me>

cat manifest.yaml | sed -e 's/<replace-me>/node-foo/g' | kubectl apply -f -

I'm not saying it's a good solution but it might do until we move -- if we ever decide to -- to a harness that programmatically creates two resources.

I'm also trying to think about how to abstract this to test against other common CNI plugins. Does my approach scale past one plugin? I'd love your thoughts on this:

run.sh will need to run tests from a different integration test subdirectory

I'll need to create the cluster with different CNI flags for each CNI plugin.

What else?
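One way to picture the two requirements above is a small lookup keyed by scenario name. This is a rough sketch, not the code in run.sh: the function names are invented, the `calico` entry is purely hypothetical, and the assumption that such a scenario would disable the bundled flannel via the k3s `--flannel-backend=none` option has not been validated against this repository.

```shell
#!/bin/sh
# Hypothetical mapping from a scenario name to the flags passed to k3d.
scenario_k3d_flags() {
  case "$1" in
    flannel)
      # flannel is what k3s ships by default, so nothing extra is needed.
      printf '%s\n' '--no-lb'
      ;;
    calico)
      # Assumption: another CNI would need the bundled flannel turned off.
      printf '%s\n' '--no-lb --k3s-arg "--flannel-backend=none@server:*"'
      ;;
    *)
      echo "unknown scenario: $1" >&2
      return 1
      ;;
  esac
}

# Each scenario keeps its tests in a subdirectory of the same name.
scenario_test_dir() {
  printf 'cni-plugin/integration/%s\n' "$1"
}
```

A caller could then do something like `K3D_CREATE_FLAGS="$(scenario_k3d_flags "$SCENARIO")"` before creating the cluster and run the tests found under `scenario_test_dir "$SCENARIO"`.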

Now that you mention it, we could use sed to replace one image with another image.
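A minimal sketch of that sed-based templating, assuming the manifest carries a literal placeholder such as `<replace-me>` where the image (or version) should go. The helper name `render_manifest` and the image reference in the usage note are made up for illustration.

```shell
#!/bin/sh
# render_manifest FILE PLACEHOLDER VALUE
# Print FILE with every occurrence of PLACEHOLDER replaced by VALUE.
# '|' is used as the sed delimiter so that image references containing
# '/' need no escaping. PLACEHOLDER is still treated as a regular
# expression, and VALUE must not contain '|' or '&'.
render_manifest() {
  sed -e "s|$2|$3|g" "$1"
}
```

Usage would look roughly like `render_manifest manifests/linkerd-cni.yaml '<replace-me>' 'example.test/cni-plugin:v0.0.0' | kubectl apply -f -`, with one invocation per entry in a testing matrix.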

cni-plugin/integration/run.sh

cni-plugin/integration/flannel/Dockerfile-tester

cni-plugin/integration/manifests/linkerd-cni.yaml

cni-plugin/integration/run.sh


mateiidavid

Looks good!

mateiidavid · 2023-01-17T17:59:54Z

cni-plugin/integration/run.sh

+
+# Wait for linkerd-cni daemonset to complete
+if ! k rollout status --timeout=30s daemonset/linkerd-cni -n linkerd-cni; then
+  echo "!! linkerd-cni didn't rollout properly, check logs";


I wonder if we can print the logs here ourselves? Might be messy but usually when a GH runner fails it's hard to get access to the host to see the logs (if at all possible). Could be solved by dumping out the logs?

Maybe we can do it in a subsequent PR.

I've added this and I've also taken the opportunity to beef up error handling in the cleanup function. If one of those functions errored then the rest of the cleanup wouldn't happen properly and that was annoying.
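The cleanup problem described here (one failing step preventing the rest from running) can be illustrated with a small helper. This is a sketch of the general pattern, not the actual cleanup function in run.sh; `run_cleanup_steps` and the step names in the usage note are hypothetical.

```shell
#!/bin/sh
# run_cleanup_steps STEP...
# Run every named step (a function or command) in order. A failing step
# is reported but does not stop the remaining steps. Returns non-zero
# at the end if any step failed.
run_cleanup_steps() {
  failed=0
  for step in "$@"; do
    if ! "$step"; then
      echo "!! cleanup step '$step' failed, continuing" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

It could be wired up with something like `trap 'run_cleanup_steps delete_test_pod delete_linkerd_cni' EXIT`, where both step names stand in for whatever the script actually needs to tear down.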


mateiidavid

Still looks good :D

mateiidavid · 2023-01-18T16:32:50Z

cni-plugin/integration/run.sh

+  k describe ds linkerd-cni
+  k logs linkerd-cni -n linkerd-cni


we don't need a printf || echo here, do we?

Not strictly, since if line 44 fails, line 45 would almost certainly fail too, but I think I'd like to err on the side of having more information, so I've put || echo on these lines as well.
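The `|| echo` pattern mentioned here might look like the sketch below. It assumes `k` is a thin wrapper around kubectl, as the quoted lines suggest; the function name, the explicit `-n linkerd-cni` on both commands, and the `ds/linkerd-cni` form for logs are additions for this example rather than what the PR contains.

```shell
#!/bin/sh
# Assumption: `k` simply forwards to kubectl.
k() { kubectl "$@"; }

# Print as much diagnostic output as possible. Each command is guarded
# on its own, so a failure in one does not hide the output of the other.
dump_linkerd_cni_diagnostics() {
  k describe ds linkerd-cni -n linkerd-cni \
    || echo "!! could not describe the linkerd-cni daemonset"
  k logs ds/linkerd-cni -n linkerd-cni \
    || echo "!! could not fetch linkerd-cni logs"
  return 0
}
```

Guarding each line separately costs little and means a CI log still shows whichever command did succeed.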


Steve Jenson added 18 commits November 22, 2022 22:42

modifying import paths and making a temporary copy of testutil/annota…

906bd63

…tions.go Signed-off-by: Steve Jenson <stevej@buoyant.io>

removed testutil, dockerized cni installer tests now pass

bd20d59

Signed-off-by: Steve Jenson <stevej@buoyant.io>

moving internal to pkg/linkerd-, removing Dockerfile until fixed, cha…

b00fc23

…ngining imports, removing linkerd2 k8s client with client-go Signed-off-by: Steve Jenson <stevej@buoyant.io>

gofmt install-cni_test.go

c212c91

Signed-off-by: Steve Jenson <stevej@buoyant.io>

go mod updates

0ad053f

Signed-off-by: Steve Jenson <stevej@buoyant.io>

adding pkg to Docker image

839b80b

Signed-off-by: Steve Jenson <stevej@buoyant.io>

updating dev from v32 to v35 for go

8bb3c2f

Signed-off-by: Steve Jenson <stevej@buoyant.io>

moving back to old dev image

35e5d12

Signed-off-by: Steve Jenson <stevej@buoyant.io>

use dev:v32-go for go lint workflow

976c910

Signed-off-by: Steve Jenson <stevej@buoyant.io>

fixing linter complaints

95620d6

Signed-off-by: Steve Jenson <stevej@buoyant.io>

fixing linter complaints

9d8b1d9

Signed-off-by: Steve Jenson <stevej@buoyant.io>

turning off noisy lint #1

9613bc6

Signed-off-by: Steve Jenson <stevej@buoyant.io>

turning off noisy lint #2

c5b6130

Signed-off-by: Steve Jenson <stevej@buoyant.io>

turning off noisy lint #3

b439f84

Signed-off-by: Steve Jenson <stevej@buoyant.io>

turning off noisy lint #4

0990cf0

Signed-off-by: Steve Jenson <stevej@buoyant.io>

turning off noisy lint #5

e85e73d

Signed-off-by: Steve Jenson <stevej@buoyant.io>

turning off noisy lint #6

9010489

Signed-off-by: Steve Jenson <stevej@buoyant.io>

adding in Dockerfile, just rules for building, and a workflow for tes…

2e46623

…ting the cni-plugin installer script Signed-off-by: Steve Jenson <stevej@buoyant.io>

stevej requested a review from a team as a code owner December 5, 2022 19:54

stevej marked this pull request as draft December 5, 2022 19:57

Steve Jenson added 10 commits December 7, 2022 21:19

remember to setup docker

48be4dc

Signed-off-by: Steve Jenson <stevej@buoyant.io>

remember to setup docker-qemu

9e3e7f6

Signed-off-by: Steve Jenson <stevej@buoyant.io>

where is docker?

8c550e3

Signed-off-by: Steve Jenson <stevej@buoyant.io>

back to a named ubuntu version, removing devcontainer

189abac

Signed-off-by: Steve Jenson <stevej@buoyant.io>

we need just

69d3f39

Signed-off-by: Steve Jenson <stevej@buoyant.io>

WIP import of CNI plugin integration test environment. does not run d…

634f6f2

…ue to image pull errors. Signed-off-by: Steve Jenson <stevej@buoyant.io>

Resolved merge conflicts

aa569a9

rewriting just rules to match new rules

3df15ca

Signed-off-by: Steve Jenson <stevej@buoyant.io>

bumping dev version, renaming smoke test

a3e9801

Signed-off-by: Steve Jenson <stevej@buoyant.io>

WIP for running smoke tests

8644b70

Signed-off-by: Steve Jenson <stevej@buoyant.io>

Steve Jenson added 5 commits January 11, 2023 21:01

Merge branch 'main' into stevej/cni-plugin-build-workflow

880404d

Signed-off-by: Steve Jenson <stevej@buoyant.io>

remove hardcoded filename

e79b7a1

Signed-off-by: Steve Jenson <stevej@buoyant.io>

clarified comment

6215df3

Signed-off-by: Steve Jenson <stevej@buoyant.io>

fixed merge conflict error

46e7ebe

Signed-off-by: Steve Jenson <stevej@buoyant.io>

fix log levels

8ad4fbf

Signed-off-by: Steve Jenson <stevej@buoyant.io>

stevej commented Jan 11, 2023


fix a log level

2686a61

Signed-off-by: Steve Jenson <stevej@buoyant.io>

stevej added the github_actions Pull requests that update Github_actions code label Jan 11, 2023

stevej marked this pull request as ready for review January 11, 2023 21:25

stevej changed the title ~~cni-plugin build and test workflow for installer script, PoC spike for k8s-based CNI testing~~ PoC spike for k8s-based CNI testing Jan 11, 2023

stevej changed the title ~~PoC spike for k8s-based CNI testing~~ PoC spike for k8s-based linkerd-cni testing Jan 11, 2023

mateiidavid reviewed Jan 12, 2023

stevej commented Jan 12, 2023

cni-plugin/integration/flannel/Dockerfile-tester

stevej commented Jan 12, 2023

cni-plugin/integration/manifests/linkerd-cni.yaml

stevej commented Jan 12, 2023

cni-plugin/integration/run.sh

Steve Jenson added 6 commits January 12, 2023 22:37

run test on all files in ./cni-plugin

a74b0ab

Signed-off-by: Steve Jenson <stevej@buoyant.io>

hcomment explaining why there's no ENTRYPOINT

367d112

Signed-off-by: Steve Jenson <stevej@buoyant.io>

use a map instead of an array for simplicity

4cb47d5

Signed-off-by: Steve Jenson <stevej@buoyant.io>

abstract which integration test subdirectory gets used, add internal …

0c1b31c

…to ensure those packages are tested Signed-off-by: Steve Jenson <stevej@buoyant.io>

go.yml is already running these tests are there no integration tests …

c426066

…in there to run Signed-off-by: Steve Jenson <stevej@buoyant.io>

breaking up a line

9712052

Signed-off-by: Steve Jenson <stevej@buoyant.io>

stevej commented Jan 13, 2023

cni-plugin/integration/run.sh

Steve Jenson added 2 commits January 14, 2023 04:11

renaming SUBDIRECTORY to SCENARIO and renaming a run just target to f…

aef27c6

…lannel to signify that this is the rule to crib for other scenarios Signed-off-by: Steve Jenson <stevej@buoyant.io>

merge main

96e2825

Signed-off-by: Steve Jenson <stevej@buoyant.io>

mateiidavid approved these changes Jan 17, 2023


better error handling of the cleanup() function, print more diagnosti…

cc78037

…c information if linkerd-cni rollout fails Signed-off-by: Steve Jenson <stevej@buoyant.io>

mateiidavid approved these changes Jan 18, 2023


add error handling for describe ds and logs

edf6db5

Signed-off-by: Steve Jenson <stevej@buoyant.io>

stevej merged commit a3c65e2 into main Jan 18, 2023

stevej deleted the stevej/cni-plugin-build-workflow branch January 18, 2023 21:53
