Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cilium: test encryption workflows for GKE #15595

Merged
merged 1 commit into from
Apr 13, 2021

Conversation

jrfastab
Copy link
Contributor

@jrfastab jrfastab commented Apr 7, 2021

Test work flows with encryption and GKE.

This adds an additional run of cilium test connectivity to the gke workflow, but this time with encryption enabled. After this commit the workflow is the following.

cilium install ...
cilium hubble enable
kubectl port-forward ...
cilium connectivity test
cilium uninstall
cilium install --encryption ...
cilium hubble enable
kubectl port-forward
kubectl delete pod  ... // delete cilium-test pods
cilium connectivity test

above omits a few --wait ops and the cluster setup for clarity. For now I manually delete the pods to work-around issue, cilium/cilium-cli#156, where the uninstall does not delete cilium-test pods and the --restart-pods on install does not restart them.

I've done multiple runs of the workflow and most passed. I observed two errors that seem unrelated to this PR. First sometimes the cleanup fails with,

Error: We are currently unable to download the log. Please try again later.

And then once the initial (without encryption) connectivity test failed the pod->world test. It seems like a flake.

@jrfastab jrfastab requested review from a team as code owners April 7, 2021 18:43
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 7, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Apr 7, 2021
@jrfastab jrfastab requested review from kkourt and aanm April 7, 2021 18:43
@jrfastab jrfastab changed the title DO NOT MERGE test workflows for AKS/EKS/GKE cilium: test workflows for AKS/EKS/GKE Apr 7, 2021
@jrfastab jrfastab marked this pull request as draft April 7, 2021 18:43
@jrfastab jrfastab changed the title cilium: test workflows for AKS/EKS/GKE cilium: test encryption workflows for AKS/EKS/GKE Apr 7, 2021
@jrfastab jrfastab force-pushed the pr/jrfastab/test-aks-eks-workflows branch 2 times, most recently from 0fd1bca to c92e13c Compare April 8, 2021 19:12
@jrfastab jrfastab force-pushed the pr/jrfastab/test-aks-eks-workflows branch 2 times, most recently from 7104738 to ceb2533 Compare April 8, 2021 20:32
@jrfastab jrfastab force-pushed the pr/jrfastab/test-aks-eks-workflows branch from ceb2533 to 2996a7a Compare April 8, 2021 20:37
@jrfastab
Copy link
Contributor Author

jrfastab commented Apr 8, 2021

Looks like its working... succeeded 2 minutes ago in 12m 20s

@jrfastab
Copy link
Contributor Author

jrfastab commented Apr 9, 2021

This is consistently passing the tests portion, but I see an occasional,

Error: We are currently unable to download the log. Please try again later.

from the cleanup step. Is this a known issue?

@jrfastab jrfastab changed the title cilium: test encryption workflows for AKS/EKS/GKE cilium: test encryption workflows for GKE Apr 9, 2021
@jrfastab jrfastab force-pushed the pr/jrfastab/test-aks-eks-workflows branch from 2996a7a to 1fa955b Compare April 9, 2021 00:03
@jrfastab jrfastab marked this pull request as ready for review April 9, 2021 00:10
@jrfastab jrfastab requested a review from nbusseneau April 9, 2021 00:10
@pchaigno pchaigno added area/CI Continuous Integration testing issue or flake release-note/ci This PR makes changes to the CI. labels Apr 9, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 9, 2021
.github/workflows/gke.yaml Outdated Show resolved Hide resolved
@nbusseneau
Copy link
Member

nbusseneau commented Apr 9, 2021

This is consistently passing the tests portion, but I see an occasional,

Error: We are currently unable to download the log. Please try again later.

from the cleanup step. Is this a known issue?

I have seen that sometimes, but not much. I wonder if it's a GH related issue (i.e. not being able to retrieve workflow logs from the underlying machines for some reason, and we can't do anything about it), or if it's just that we hit the timeout-minutes specified for the job, which might explain why you have seen much more than I did (since you increased the running time).

Perhaps we ought to bump it to 20 or 25?

Actually, this also begs the question: do we want to split workflows so as not to test encryption when unnecessary, keeping the testing lean and fast?

@jrfastab
Copy link
Contributor Author

jrfastab commented Apr 9, 2021

It might be useful to split workflows so we can test one without the other.

@jrfastab
Copy link
Contributor Author

jrfastab commented Apr 9, 2021

@aanm wdyt split this into its own workflow or keep them together?

@aanm
Copy link
Member

aanm commented Apr 10, 2021

@jrfastab I think we should keep the encryption in the same workflow. Are the encryption tests taking that much time? It is preferable to test all things than test a couple things and then things end up breaking.

@nbusseneau what's the downside of enabling the workflow by default on all PRs with the test-me-please?

@aanm aanm removed their assignment Apr 12, 2021
@jrfastab
Copy link
Contributor Author

@aanm works for me. This is probably ready to go and deciding how/where to enable this can happen as a follow up?

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
@jrfastab jrfastab force-pushed the pr/jrfastab/test-aks-eks-workflows branch from 1fa955b to 748b8b5 Compare April 12, 2021 15:06
@aanm
Copy link
Member

aanm commented Apr 12, 2021

@aanm works for me. This is probably ready to go and deciding how/where to enable this can happen as a follow up?

correct

@nbusseneau
Copy link
Member

@nbusseneau what's the downside of enabling the workflow by default on all PRs with the test-me-please?

For potential external readers: discussed in community meeting => we are going to trigger the workflows with test-me-please on top of the existing Jenkins jobs.

@aanm works for me. This is probably ready to go and deciding how/where to enable this can happen as a follow up?

Do you think we should increase timeout from 20 to 25, as per proposed above?

@jrfastab
Copy link
Contributor Author

I don't think there is any reason to run all tests for a workflow change.

@jrfastab jrfastab added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 13, 2021
@qmonnet qmonnet merged commit cb8b505 into master Apr 13, 2021
1.10.0 automation moved this from In progress to Done Apr 13, 2021
@qmonnet qmonnet deleted the pr/jrfastab/test-aks-eks-workflows branch April 13, 2021 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/ci This PR makes changes to the CI.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

6 participants