Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Reduce build durations #14223

Merged
merged 5 commits into from
Dec 11, 2020
Merged

Conversation

pchaigno
Copy link
Member

@pchaigno pchaigno commented Nov 30, 2020

Summary of commits:

  1. Adds some helper functions to help in subsequent commits.
  2. Limits K8sHubble to run on GKE and 4.9 only.
  3. Limits K8sIdentity and K8sHealth to run on GKE only.
  4. and 5. Avoid installing Cilium for tests that are skipped or quarantined.

@pchaigno pchaigno added area/CI Continuous Integration testing issue or flake area/CI-improvement Topic or proposal to improve the Continuous Integration workflow release-note/ci This PR makes changes to the CI. labels Nov 30, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Nov 30, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/reduce-test-duration branch from 8283f9c to ca9e3ab Compare December 7, 2020 12:16
@pchaigno pchaigno requested review from a team and gandro and removed request for a team December 7, 2020 17:31
@pchaigno pchaigno requested review from a team and borkmann and removed request for a team December 7, 2020 17:32
@pchaigno pchaigno marked this pull request as ready for review December 7, 2020 17:33
@pchaigno pchaigno requested a review from a team as a code owner December 7, 2020 17:33
@pchaigno pchaigno requested a review from kkourt December 7, 2020 17:33
@pchaigno
Copy link
Member Author

pchaigno commented Dec 7, 2020

⚠️ @cilium/hubble @cilium/agent Please note the second commit test: Run Hubble, Identity, and Health tests on GKE only. Does it make sense to run those tests on more than just GKE? If we need to run those tests elsewhere, do we need to run them on all kernels?

@kkourt
Copy link
Contributor

kkourt commented Dec 8, 2020

I'm probably missing some context here, but from what I can tell the idea here is to run tests that do not depend on specific kernel versions on GKE so that we run them only once, instead for every different kernel that we test. Is this correct?

Also, out of curiosity, is there a reason that we run them in GKE instead of running them on a single kernel version?

@pchaigno
Copy link
Member Author

pchaigno commented Dec 8, 2020

I'm probably missing some context here, but from what I can tell the idea here is to run tests that do not depend on specific kernel versions on GKE so that we run them only once, instead for every different kernel that we test. Is this correct?

That's correct.

Also, out of curiosity, is there a reason that we run them in GKE instead of running them on a single kernel version?

Good question. I don't have a strong preference on the location, but:

  1. Our GKE pipeline is usually faster.
  2. It's currently one of our fastest test pipelines, so best to increase that rather than the longest (to minimize total build time).
  3. I think long term we want to move as much tests there as possible.

@gandro
Copy link
Member

gandro commented Dec 8, 2020

warning @cilium/hubble @cilium/agent Please note the second commit test: Run Hubble, Identity, and Health tests on GKE only. Does it make sense to run those tests on more than just GKE? If we need to run those tests elsewhere, do we need to run them on all kernels?

I don't think we need to run them on all kernels. However, I would prefer to have them being run on both a kernel that supports the kube-proxy replacement (i.e. one that does HostReachableServices) and one without. The trace events that the data path emits (and that hubble observes) are different when you have HostReachableServices vs. when you don't.

@pchaigno
Copy link
Member Author

pchaigno commented Dec 8, 2020

warning @cilium/hubble @cilium/agent Please note the second commit test: Run Hubble, Identity, and Health tests on GKE only. Does it make sense to run those tests on more than just GKE? If we need to run those tests elsewhere, do we need to run them on all kernels?

I don't think we need to run them on all kernels. However, I would prefer to have them being run on both a kernel that supports the kube-proxy replacement (i.e. one that does HostReachableServices) and one without. The trace events that the data path emits (and that hubble observes) are different when you have HostReachableServices vs. when you don't.

Makes sense 👍
That's only for K8sHubble, right? We can probably run it on GKE (has kube-proxy replacement) + K8s 4.9.

@gandro
Copy link
Member

gandro commented Dec 8, 2020

Makes sense +1
That's only for K8sHubble, right? We can probably run it on GKE (has kube-proxy replacement) + K8s 4.9.

Yes, K8sHubble is the primary test suite in that regard (there are other uses of Hubble in CI, but they don't test Hubble itself directly, just use Hubble to assert certain flows were forwarded at the right place, so they are not really relevant to the discussion here).

@pchaigno pchaigno force-pushed the pr/pchaigno/reduce-test-duration branch from 9dc9b50 to 946c120 Compare December 8, 2020 11:13
@pchaigno
Copy link
Member Author

pchaigno commented Dec 8, 2020

That's only for K8sHubble, right? We can probably run it on GKE (has kube-proxy replacement) + K8s 4.9.

Done. Thanks for taking a look @gandro! 🙏

Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two minor nits. I don't think is worth redoing the PR for them.

test/k8sT/DatapathConfiguration.go Show resolved Hide resolved
test/k8sT/hubble.go Show resolved Hide resolved
@pchaigno pchaigno force-pushed the pr/pchaigno/reduce-test-duration branch 2 times, most recently from 24bacd6 to ec435b5 Compare December 9, 2020 18:19
@pchaigno pchaigno requested review from a team December 9, 2020 18:19
@pchaigno pchaigno requested a review from a team as a code owner December 9, 2020 18:19
@pchaigno pchaigno requested review from a team and twpayne December 9, 2020 18:19
@pchaigno pchaigno changed the title [RFC] test: Reduce build durations test: Reduce build durations Dec 9, 2020
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hubble related changes look good to me!

And make use of the new RunsOnGKE/DoesNotRunOnGKE helpers in existing
testing code.

Signed-off-by: Paul Chaignon <paul@cilium.io>
The Hubble tests check packet trace events that change depending on
whether our kube-proxy replacement is used. We thus run those tests both
on GKE (with our kube-proxy replacement) and on 4.9 (without). We
however don't need to run on all our kernel versions.

This commit only moves all the initialization code into the Hubble
Observe Context, merging BeforeAlls, AfterAll, and variables.

Signed-off-by: Paul Chaignon <paul@cilium.io>
These tests should not be impacted by kernel versions and are therefore
safe to test only on GKE. The three test suites also needed to be
reorganized to avoid installing Cilium when the tests are skipped. To
that effect, all tests are moved into a new Context block.

Signed-off-by: Paul Chaignon <paul@cilium.io>
In K8sConformance, we skip the tests inside the It blocks instead of
using SkipIfIf. Because of that, ginkgo executes the
BeforeAll/BeforeEach blocks before figuring out the It blocks are
skipped anyway.

This commit fixes it by using SkipContextIf with all It blocks inside.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno force-pushed the pr/pchaigno/reduce-test-duration branch from ec435b5 to 683c1e9 Compare December 10, 2020 17:40
@pchaigno
Copy link
Member Author

Runtime-4.9 failed because of known flake #14125. The rest is green.

@jrajahalme jrajahalme merged commit b544268 into master Dec 11, 2020
@jrajahalme jrajahalme deleted the pr/pchaigno/reduce-test-duration branch December 11, 2020 04:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake area/CI-improvement Topic or proposal to improve the Continuous Integration workflow release-note/ci This PR makes changes to the CI.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

7 participants