Add api-machinery 'watch-consistency' e2e test #69829

Merged
merged 1 commit into from
Oct 20, 2018

Conversation

jpbetz
Contributor

@jpbetz jpbetz commented Oct 15, 2018

Added a watch consistency e2e test as originally proposed in #67717.

Plan is to make this a conformance test after it's had sufficient bake time.

This test continues to use ConfigMap for watch tests. We will expand the watch tests to cover other resources separately as part of #67718.

To ensure concurrent writes throughout the test, event production runs in the background. It is rate limited to no more than 200 events/sec and keeps running until the watchers complete their testing.

100 iterations was selected since that runs in ~15 seconds and the recommended limit for e2e tests is 20 seconds.

Release note: NONE

/kind cleanup
/sig api-machinery
/cc @lavalamp

@jpbetz jpbetz self-assigned this Oct 15, 2018
@k8s-ci-robot k8s-ci-robot added release-note-none (Denotes a PR that doesn't merit a release note), size/L (Denotes a PR that changes 100-499 lines, ignoring generated files), cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA), needs-kind (Indicates a PR lacks a `kind/foo` label and requires one), needs-sig (Indicates an issue or PR lacks a `sig/foo` label and requires one), sig/testing (Categorizes an issue or PR as relevant to SIG Testing) and removed needs-sig labels Oct 15, 2018
@jpbetz jpbetz changed the title Add api-machinery 'watch-consistency' e2e test Add api-machinery 'watch-consistency' e2e conformance test Oct 15, 2018
@k8s-ci-robot k8s-ci-robot added kind/cleanup (Categorizes issue or PR as related to cleaning up code, process, or technical debt), sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery) and removed needs-kind labels Oct 15, 2018
@@ -314,6 +316,49 @@ var _ = SIGDescribe("Watchers", func() {
expectEvent(testWatch, watch.Modified, testConfigMapThirdUpdate)
expectEvent(testWatch, watch.Deleted, nil)
})

/*
Testname: watch-consistency
Member

@lavalamp lavalamp Oct 15, 2018

      Something
      weird
    happened
   to
  this comment

Contributor Author

gah! fixed

@jpbetz jpbetz added area/conformance Issues or PRs related to kubernetes conformance tests area/etcd labels Oct 15, 2018
@k8s-ci-robot k8s-ci-robot added the sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. label Oct 15, 2018
Member

@lavalamp lavalamp left a comment

Nits. Do we want to resume a watch if it fails for some reason, or should that fail the test? I'm unsure. Watches shouldn't really randomly fail.

Expect(err).NotTo(HaveOccurred())
wcs = append(wcs, wc)
resourceVersion = waitForNextConfigMapEvent(wcs[0]).ResourceVersion
for _, wc := range wcs[1:] {
Member

I bet doing these all in parallel would significantly reduce the wall time.

Contributor Author

@jpbetz jpbetz Oct 16, 2018

A bit to my surprise, this didn't speed things up. I'm guessing that the buffered channels <-watch.ResultChan() in waitForNextConfigMapEvent() are sufficiently concurrent already, since the watches are initiated in the outer for loop.

Member

Interesting.


existing := []int{}
for i := 0; ; i++ {
	waitc := time.After(minWaitBetweenEvents) // rate limit
Member

tc := time.NewTicker(minWaitBetweenEvents)
defer tc.Stop()
for range tc.C

Member

Ah I see you have a multi channel select statement at the bottom, so the for statement I suggested won't work. I think it still makes sense to use the ticker?

Contributor Author

Code using the ticker reads better. Thanks!

Expect(err).NotTo(HaveOccurred())
existing = append(existing, i)
case updateEvent:
idx := rand.Int() % len(existing)
Member

rand.Intn(len(existing))

Mod leaves a biased result if len(existing) isn't a divisor of the size of rand.Int()'s range (a power of two). This 100% doesn't matter for this use but no need to leave the example here to confuse people :)

Contributor Author

yikes, yes, happy to leave a good example here.

const (
	createEvent = 0
	updateEvent = iota
	deleteEvent = iota
Member

I think you can just:

  createEvent = iota
  updateEvent
  deleteEvent

Contributor Author

Much better, fixed.

Expect(err).NotTo(HaveOccurred())
case deleteEvent:
idx := rand.Int() % len(existing)
name := fmt.Sprintf("cm-%d", existing[idx])
Member

nit: might be worth it to declare name := func(n int) string { return fmt.Sprintf("cm-%d", existing[n]) } somewhere?

Contributor Author

sg
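A sketch of the suggested helper, written as a function that returns the closure so it can be exercised on its own. `namer` is a hypothetical wrapper; only the `"cm-%d"` format and the `existing` slice come from the diff.

```go
package main

import "fmt"

// namer returns a closure that maps an index into existing to a ConfigMap
// name of the form "cm-<id>", following the review suggestion.
// Hypothetical wrapper for illustration; it is not part of the PR.
func namer(existing []int) func(n int) string {
	return func(n int) string { return fmt.Sprintf("cm-%d", existing[n]) }
}

func main() {
	existing := []int{3, 7, 9}
	name := namer(existing)
	fmt.Println(name(1)) // prints "cm-7"
}
```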

@lavalamp
Member

I think we need to get the test in and demonstrate that it works before adding it to conformance.

@jpbetz jpbetz changed the title Add api-machinery 'watch-consistency' e2e conformance test Add api-machinery 'watch-consistency' e2e test Oct 16, 2018
@jpbetz jpbetz removed the area/conformance Issues or PRs related to kubernetes conformance tests label Oct 16, 2018
@jpbetz
Contributor Author

jpbetz commented Oct 16, 2018

Thanks @lavalamp. Feedback applied. I've removed this from conformance (in code and in PR title/desc/labels). Once it's had some bake time I'll propose it for conformance via a separate PR.

@jpbetz jpbetz force-pushed the watch-e2e-test1 branch 2 times, most recently from d39bd21 to 8616fbb Compare October 16, 2018 20:57
@jpbetz
Contributor Author

jpbetz commented Oct 16, 2018

/test pull-kubernetes-e2e-kops-aws

@jpbetz
Contributor Author

jpbetz commented Oct 16, 2018

/retest

@jpbetz
Contributor Author

jpbetz commented Oct 16, 2018

/retest

@jpbetz
Contributor Author

jpbetz commented Oct 17, 2018

This is ready for review. Tests are all passing.

go func() {
	defer GinkgoRecover()
	produceConfigMapEvents(f, stopc, 5*time.Millisecond)
	close(donec)
Member

Nit: probably best to defer this.

Contributor Author

Fixed. Thanks!

@lavalamp
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 19, 2018
@jpbetz
Contributor Author

jpbetz commented Oct 19, 2018

/retest

@jpbetz
Contributor Author

jpbetz commented Oct 19, 2018

/retest

@lavalamp
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 19, 2018
@jpbetz
Contributor Author

jpbetz commented Oct 19, 2018

/retest

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

Labels
approved (Indicates a PR has been approved by an approver from all required OWNERS files)
area/etcd
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA)
kind/cleanup (Categorizes issue or PR as related to cleaning up code, process, or technical debt)
lgtm ("Looks good to me", indicates that a PR is ready to be merged)
release-note-none (Denotes a PR that doesn't merit a release note)
sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery)
sig/architecture (Categorizes an issue or PR as relevant to SIG Architecture)
sig/testing (Categorizes an issue or PR as relevant to SIG Testing)
size/L (Denotes a PR that changes 100-499 lines, ignoring generated files)

4 participants