Skip to content

Conversation

@stevej
Copy link
Contributor

@stevej stevej commented Dec 5, 2022

PoC spike for testing integration of linkerd-cni within k3s. The initial test uses flannel as that's what k3s uses by default.

This branch is intended to kick off a discussion of ways to perform integration testing of linkerd-cni with commonly shared CNIs. Do we like this approach?

  • Github workflow rules for running linkerd-cni integration tests.
  • Dockerfile for linkerd-cni integration tests
  • Integration tests that ensure that linkerd-cni's install-cni.sh interacted nicely with flannel's conf file.
  • linkerd-cni.yaml generated from linkerd install-cni with arguments for running within k3d
  • run.sh executes the containerized integration tests within k8s.
  • justfile rules for building linkerd-cni Dockerfile and cni-plugin-test Dockerfile for integration tests
  • justfile rules for running integration tests for linkerd-cni

Signed-off-by: Steve Jenson stevej@buoyant.io

Steve Jenson added 18 commits November 22, 2022 22:42
…tions.go

Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…ngining imports, removing linkerd2 k8s client with client-go

Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…ting the cni-plugin installer script

Signed-off-by: Steve Jenson <stevej@buoyant.io>
@stevej stevej requested a review from a team as a code owner December 5, 2022 19:54
@stevej stevej marked this pull request as draft December 5, 2022 19:57
Steve Jenson added 10 commits December 7, 2022 21:19
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…ue to image pull errors.

Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Steve Jenson added 5 commits January 11, 2023 21:01
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
# TODO(stevej): add a k3d-create-debug
export K3S_DISABLE := "local-storage,traefik,servicelb,metrics-server@server:*"
export K3D_CREATE_FLAGS := '--no-lb'
export K3D_CREATE_FLAGS := '--no-lb --k3s-arg "--debug@server:*"'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to move this into a flag for creating a cluster with debug logging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I have in mind perhaps two options. Both of them involve creating a cluster with --debug by default. I think it makes sense to have it on my default and then override it in CI (if we want to). Maybe this particular way of doing things wouldn't be a good fit for the other linkerd repos but in this case I think it makes sense, especially since we are going to be pretty reliant on kubelet logs for debugging. Anyway:

  1. We could use something like:
_default_k3d_flags := '--no-lb --k3s-arg "--debug@server:*"'
export K3D_CREATE_FLAGS := env_var_or_default("K3D_CREATE_FLAGS", _default_k3d_flags)

One of way of doing this is to export the environment variable based on a recipe-local temp var. Sounds like a mouthful but it means that by default we will use --debug. We can then override this either locally:

$ just _default_k3d_flags='--no-lb' k3d-create

or in CI using the same syntax.

Another approach here is to simply hardcode the default value in K3D_CREATE_FLAGS, locally and CI we can always override by setting the env variable.

  1. We could have a recipe that handles this for us, e.g _k3d_create_init or something similar. We'd make use of a variable again to store default config and then export K3D_CREATE_FLAGS to the rest of the environment. This is more in line with what you're suggesting, but I think it just adds a bit more indirection when we can solve it maybe in a simpler way if we use env_var_or_default.

I'm by no means a just or makefile expert tho so lmk what you think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I turned this into an issue in the kanban board, number 65 there.

Signed-off-by: Steve Jenson <stevej@buoyant.io>
@stevej stevej added the github_actions Pull requests that update Github_actions code label Jan 11, 2023
@stevej stevej marked this pull request as ready for review January 11, 2023 21:25
@stevej stevej changed the title cni-plugin build and test workflow for installer script, PoC spike for k8s-based CNI testing PoC spike for k8s-based CNI testing Jan 11, 2023
@stevej stevej changed the title PoC spike for k8s-based CNI testing PoC spike for k8s-based linkerd-cni testing Jan 11, 2023
Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good 👌🏻 tbh there's nothing really blocking this from my side. I've left a few comments and will check back in but regardless of the outcome, as a first iteration I think this is good.

Comment on lines +20 to +21
# TODO(stevej): how can we parameterize this manifest with `version` so we
# can enable a testing matrix?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I've done to template out a node-debugger pod manifest whenever I needed to change nodeName before applying is to replace the value with a placeholder and then use sed.

nodeName: <replace-me>

cat manifest.yaml | sed -e 's/<replace-me>/node-foo/g | kubectl apply -f -`

I'm not saying it's a good solution but it might do until we move -- if we ever decide to -- to a harness that programatically creates two resources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also trying to think about how do I abstract this to testing against other common cni plugins. Does my approach scale past 1 plugin? I'd love your thoughts on this:

  • run.sh will need to run tests from a different integration test subdirectory
  • I'll need to create the cluster with different CNI flags for each CNI plugin.
  • What else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you mention it, we could use sed to replace one image with another image.

Steve Jenson added 6 commits January 12, 2023 22:37
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…to ensure those packages are tested

Signed-off-by: Steve Jenson <stevej@buoyant.io>
…in there to run

Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Steve Jenson added 2 commits January 14, 2023 04:11
…lannel to signify that this is the rule to crib for other scenarios

Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!


# Wait for linkerd-cni daemonset to complete
if ! k rollout status --timeout=30s daemonset/linkerd-cni -n linkerd-cni; then
echo "!! linkerd-cni didn't rollout properly, check logs";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can print the logs here ourselves? Might be messy but usually when a GH runner fails it's hard to get access to the host to see the logs (if at all possible). Could be solved by dumping out the logs?

Maybe we can do it in a subsequent PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this and I've also taken the opportunity to beef up error handling in the cleanup function. If one of those functions errored then the rest of the cleanup wouldn't happen properly and that was annoying.

…c information if linkerd-cni rollout fails

Signed-off-by: Steve Jenson <stevej@buoyant.io>
Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks good :D

Comment on lines 44 to 45
k describe ds linkerd-cni
k logs linkerd-cni -n linkerd-cni
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need a printf || echo here, do we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly since if line 44 fails, line 45 would almost certainly fail but I think I'd like to err on having more information so I've put || echo on these lines as well.

Signed-off-by: Steve Jenson <stevej@buoyant.io>
@stevej stevej merged commit a3c65e2 into main Jan 18, 2023
@stevej stevej deleted the stevej/cni-plugin-build-workflow branch January 18, 2023 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

github_actions Pull requests that update Github_actions code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants