Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow tests to run back to back #372

Merged
merged 10 commits into from
Feb 28, 2022

Conversation

KauzClay
Copy link
Member

@KauzClay KauzClay commented Feb 4, 2022

Fixes #371

I would like to be able to run the e2e tests multiple times without cleaning everything up. Particularly, I don't want to have to recreate the vcsim.

From what I have seen, vcsim will remember the state of things. So if in the source test, we power off two vms, when we run again and try to power off those same vms, nothing will happen. No events will be sent, and the test will fail.

Same thing happens for the TestBindingGOVC. The tag and category will already exist, so those commands won't prompt any new events.

Proposed Changes

  • 🧹 Update or clean up current behavior
  • added helper functions to create/cleanup vcsim and secret
  • use those helper functions in the tests
  • added a checkpoint configmap to the CreateSource function. This will make sure the source goes back in time for events in case the test events happen before the source is ready.
  • improved cleanup functions to also clean up completed job pods associated with listener and govc jobs

Pre-review Checklist

  • At least 80% unit test coverage (I believe this does not apply, since I am modifying a test)
  • E2E tests for any new behavior
  • Docs for any user-facing impact (I don't believe there is any user-facing impact)

Release Note

N/A no user-facing impact

@KauzClay KauzClay requested a review from a team as a code owner February 4, 2022 23:57
@codecov-commenter
Copy link

codecov-commenter commented Feb 5, 2022

Codecov Report

Merging #372 (9a7424b) into main (abcb2be) will increase coverage by 0.19%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #372      +/-   ##
==========================================
+ Coverage   83.04%   83.23%   +0.19%     
==========================================
  Files          27       27              
  Lines        1032     1032              
==========================================
+ Hits          857      859       +2     
+ Misses        147      146       -1     
+ Partials       28       27       -1     
Impacted Files Coverage Δ
pkg/vsphere/adapter.go 64.12% <0.00%> (+1.52%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abcb2be...9a7424b. Read the comment docs.

@embano1
Copy link
Contributor

embano1 commented Feb 5, 2022

QQ: shouldn't we actually do cleanup after test, eg delete deployments, instead of manually resetting state?

@KauzClay
Copy link
Member Author

KauzClay commented Feb 9, 2022

hi @embano1 , are you suggesting that the vcsim deployment be applied/deleted as part of the test too?

As far as I can tell, the vcsim deployment isn't applied as part of the test. You have to deploy it manually before running the e2e, and therefore, delete it manually too if you want to rerun

@embano1
Copy link
Contributor

embano1 commented Feb 9, 2022

hi @embano1 , are you suggesting that the vcsim deployment be applied/deleted as part of the test too?

Well, IMHO that would fix your problem and avoid sharing state between runs.

As far as I can tell, the vcsim deployment isn't applied as part of the test. You have to deploy it manually before running the e2e, and therefore, delete it manually too if you want to rerun

True, this repo still sets it up outside of the Go test framework. Two solutions:

  • add a step in the workflow to delete any state, e.g. vcsim
  • update the E2E tests to also manage the lifecycle of vcsim (which I typically do in other projetcts)

@KauzClay
Copy link
Member Author

KauzClay commented Feb 10, 2022

okay great, that sounds good then.

update the E2E tests to also manage the lifecycle of vcsim (which I typically do in other projetcts)

do you have an example of where you do this elsewhere? Is this a good example?

* it seems like when running the test, the events are being sent to
  vcsim before the vsphere source is ready. This means the source misses
  the events, and the test fails
* apparently, there is a default checkpoint window of 5 mins. However,
  that doesn't seem to be applying. This may need separate investigation
* creating a checkpoint configmap ourselves allows us to guarantee the
  source will ask vcsim for events from the past N minutes
* when the tests run, there are two lefover pods for the listener job and the govc job.
* this commit cleans those up, so upon finishing the test, all resources are cleaned up from the cluster
* now that e2e tests create and manage vcsim and secret, setup doesn't need to do it
Copy link
Contributor

@embano1 embano1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While reviewing this PR I saw that RunJobListener() calls pkgtest.CleanupOnInterrupt twice in the function. IMHO this is a bug if you look at how pkgtest.CleanupOnInterrupt works internally. Can you PTAL?

And in general I think the way we use pkgtest.CleanupOnInterrupt across the tests might not achieve the goal since multiple goroutines are registered, blocking for a signal and then racing to call os.Exit(1)

test/e2e/util.go Outdated Show resolved Hide resolved
test/e2e/util.go Show resolved Hide resolved
test/e2e/util.go Outdated Show resolved Hide resolved
Spec: corev1.PodSpec{
Containers: []corev1.Container{{
Name: vcsim,
Image: "vmware/vcsim:latest",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking: if we see docker.io 429s we can always fall back to using ko: to build vcsim since it's already a vendored dependency.

@KauzClay
Copy link
Member Author

@embano1 I addressed the nits, hopefully to what you were expecting :)

Also, with regard to this comment...

While reviewing this PR I saw that RunJobListener() calls pkgtest.CleanupOnInterrupt twice in the function. IMHO this is a bug if you look at how pkgtest.CleanupOnInterrupt works internally. Can you PTAL?

And in general I think the way we use pkgtest.CleanupOnInterrupt across the tests might not achieve the goal since multiple goroutines are registered, blocking for a signal and then racing to call os.Exit(1)

The way I'm reading it, it seems like calling it multiple times is okay. It looks like cleanup.funcs = append(cleanup.funcs, cleanupFunc) is just adding them to the slice of functions.

Initially I got hung up on the cleanup.once.Do(waitForInterrupt) line, but I believe the once.Do() means that every call after that will skip over that line, so I think there is only 1 goroutine actually blocking on a call to os.Exit(1).

I tested it out here: https://gist.github.com/KauzClay/abf3fe664e6867bca90dc16c8a11f594
And this is what I saw:

❯ go run main.go      
doing stuff
sleeping, cancel me
^CI am cleanup function 3
I am cleanup function 2
I am cleanup function 1
exit status 

Copy link
Contributor

@embano1 embano1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx mate!

And regarding CleanupOnInterrupt: I don't know which code I looked at when I wrote this. Just checked src again and it definitely does not call exit more than once. All good 👍

@gabo1208
Copy link
Contributor

Let me know if this is ready to review al lgtm :)

@gabo1208 gabo1208 merged commit 8c4047c into vmware-tanzu:main Feb 28, 2022
KauzClay added a commit to KauzClay/sources-for-knative that referenced this pull request Feb 28, 2022
* add helper for creating vcsim

* create vcsim in source test

* manually create checkpoint

* it seems like when running the test, the events are being sent to
  vcsim before the vsphere source is ready. This means the source misses
  the events, and the test fails
* apparently, there is a default checkpoint window of 5 mins. However,
  that doesn't seem to be applying. This may need separate investigation
* creating a checkpoint configmap ourselves allows us to guarantee the
  source will ask vcsim for events from the past N minutes

* remove dead code

* create vcsim in all tests

* clean up job pods

* when the tests run, there are two lefover pods for the listener job and the govc job.
* this commit cleans those up, so upon finishing the test, all resources are cleaned up from the cluster

* run update-codegen.sh

* update git workflow

* now that e2e tests create and manage vcsim and secret, setup doesn't need to do it

* refactor: address nits

* refactor: make context a variable
@KauzClay KauzClay mentioned this pull request Feb 28, 2022
3 tasks
KauzClay added a commit to KauzClay/sources-for-knative that referenced this pull request Feb 28, 2022
* add helper for creating vcsim

* create vcsim in source test

* manually create checkpoint

* it seems like when running the test, the events are being sent to
  vcsim before the vsphere source is ready. This means the source misses
  the events, and the test fails
* apparently, there is a default checkpoint window of 5 mins. However,
  that doesn't seem to be applying. This may need separate investigation
* creating a checkpoint configmap ourselves allows us to guarantee the
  source will ask vcsim for events from the past N minutes

* remove dead code

* create vcsim in all tests

* clean up job pods

* when the tests run, there are two lefover pods for the listener job and the govc job.
* this commit cleans those up, so upon finishing the test, all resources are cleaned up from the cluster

* run update-codegen.sh

* update git workflow

* now that e2e tests create and manage vcsim and secret, setup doesn't need to do it

* refactor: address nits

* refactor: make context a variable
@gabo1208 gabo1208 mentioned this pull request Feb 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] can't run e2e tests back to back without cleaning up vcsim
4 participants