-
Notifications
You must be signed in to change notification settings - Fork 35
PoC spike for k8s-based linkerd-cni testing #147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…tions.go Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…ngining imports, removing linkerd2 k8s client with client-go Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…ting the cni-plugin installer script Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…ue to image pull errors. Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
| # TODO(stevej): add a k3d-create-debug | ||
| export K3S_DISABLE := "local-storage,traefik,servicelb,metrics-server@server:*" | ||
| export K3D_CREATE_FLAGS := '--no-lb' | ||
| export K3D_CREATE_FLAGS := '--no-lb --k3s-arg "--debug@server:*"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to move this into a flag for creating a cluster with debug logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I have in mind perhaps two options. Both of them involve creating a cluster with --debug by default. I think it makes sense to have it on my default and then override it in CI (if we want to). Maybe this particular way of doing things wouldn't be a good fit for the other linkerd repos but in this case I think it makes sense, especially since we are going to be pretty reliant on kubelet logs for debugging. Anyway:
- We could use something like:
_default_k3d_flags := '--no-lb --k3s-arg "--debug@server:*"'
export K3D_CREATE_FLAGS := env_var_or_default("K3D_CREATE_FLAGS", _default_k3d_flags)
One of way of doing this is to export the environment variable based on a recipe-local temp var. Sounds like a mouthful but it means that by default we will use --debug. We can then override this either locally:
$ just _default_k3d_flags='--no-lb' k3d-create
or in CI using the same syntax.
Another approach here is to simply hardcode the default value in K3D_CREATE_FLAGS, locally and CI we can always override by setting the env variable.
- We could have a recipe that handles this for us, e.g
_k3d_create_initor something similar. We'd make use of a variable again to store default config and then export K3D_CREATE_FLAGS to the rest of the environment. This is more in line with what you're suggesting, but I think it just adds a bit more indirection when we can solve it maybe in a simpler way if we useenv_var_or_default.
I'm by no means a just or makefile expert tho so lmk what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I turned this into an issue in the kanban board, number 65 there.
Signed-off-by: Steve Jenson <stevej@buoyant.io>
mateiidavid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good 👌🏻 tbh there's nothing really blocking this from my side. I've left a few comments and will check back in but regardless of the outcome, as a first iteration I think this is good.
| # TODO(stevej): how can we parameterize this manifest with `version` so we | ||
| # can enable a testing matrix? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I've done to template out a node-debugger pod manifest whenever I needed to change nodeName before applying is to replace the value with a placeholder and then use sed.
nodeName: <replace-me>
cat manifest.yaml | sed -e 's/<replace-me>/node-foo/g | kubectl apply -f -`
I'm not saying it's a good solution but it might do until we move -- if we ever decide to -- to a harness that programatically creates two resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also trying to think about how do I abstract this to testing against other common cni plugins. Does my approach scale past 1 plugin? I'd love your thoughts on this:
- run.sh will need to run tests from a different integration test subdirectory
- I'll need to create the cluster with different CNI flags for each CNI plugin.
- What else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that you mention it, we could use sed to replace one image with another image.
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…to ensure those packages are tested Signed-off-by: Steve Jenson <stevej@buoyant.io>
…in there to run Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
…lannel to signify that this is the rule to crib for other scenarios Signed-off-by: Steve Jenson <stevej@buoyant.io>
Signed-off-by: Steve Jenson <stevej@buoyant.io>
mateiidavid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
cni-plugin/integration/run.sh
Outdated
|
|
||
| # Wait for linkerd-cni daemonset to complete | ||
| if ! k rollout status --timeout=30s daemonset/linkerd-cni -n linkerd-cni; then | ||
| echo "!! linkerd-cni didn't rollout properly, check logs"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can print the logs here ourselves? Might be messy but usually when a GH runner fails it's hard to get access to the host to see the logs (if at all possible). Could be solved by dumping out the logs?
Maybe we can do it in a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this and I've also taken the opportunity to beef up error handling in the cleanup function. If one of those functions errored then the rest of the cleanup wouldn't happen properly and that was annoying.
…c information if linkerd-cni rollout fails Signed-off-by: Steve Jenson <stevej@buoyant.io>
mateiidavid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks good :D
cni-plugin/integration/run.sh
Outdated
| k describe ds linkerd-cni | ||
| k logs linkerd-cni -n linkerd-cni |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need a printf || echo here, do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly since if line 44 fails, line 45 would almost certainly fail but I think I'd like to err on having more information so I've put || echo on these lines as well.
Signed-off-by: Steve Jenson <stevej@buoyant.io>
PoC spike for testing integration of linkerd-cni within k3s. The initial test uses flannel as that's what k3s uses by default.
This branch is intended to kick off a discussion of ways to perform integration testing of linkerd-cni with commonly shared CNIs. Do we like this approach?
install-cni.shinteracted nicely with flannel'sconffile.linkerd-cni.yamlgenerated fromlinkerd install-cniwith arguments for running within k3drun.shexecutes the containerized integration tests within k8s.Signed-off-by: Steve Jenson stevej@buoyant.io