Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MESOS: fix race condition in contrib/mesos/pkg/queue/delay #24916

Merged

Conversation

jdef
Copy link
Contributor

@jdef jdef commented Apr 28, 2016

attempt to fix #23822

/cc @ihmccreery

@jdef jdef self-assigned this Apr 28, 2016
@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Apr 28, 2016
@jdef jdef changed the title attempt to fix Unit flake in contrib/mesos/pkg/offers Fix race condition in contrib/mesos/pkg/offers Apr 28, 2016
@jdef jdef changed the title Fix race condition in contrib/mesos/pkg/offers attempt to fix Unit flake in contrib/mesos/pkg/queue/delay Apr 28, 2016
@jdef jdef changed the title attempt to fix Unit flake in contrib/mesos/pkg/queue/delay fix race condition in contrib/mesos/pkg/queue/delay Apr 28, 2016
@jdef jdef changed the title fix race condition in contrib/mesos/pkg/queue/delay MESOS: fix race condition in contrib/mesos/pkg/queue/delay Apr 28, 2016
@jdef jdef added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Apr 28, 2016
@@ -164,6 +188,7 @@ func (q *DelayQueue) pop(next func() *qitem, cancel <-chan struct{}) interface{}
select {
case <-cancel:
item.readd(item)
sitAndSpin(&q.cond, ch)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prevents an (unlikely) goroutine leak (any future broadcast received by the Wait would cause the above goroutine to exit)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a q.cond.Broadcast instead do any harm here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's simpler and (in real life usage) probably all that is needed. because we're not creating these queues and throwing them away in the real world, we use them indefinitely. however it's less correct so I think I prefer this version.

@jdef
Copy link
Contributor Author

jdef commented Apr 29, 2016

renamed the new func I added to something more appropriate

@@ -138,6 +139,29 @@ func (q *DelayQueue) Pop() interface{} {
}, nil)
}

func finishWaiting(cond *sync.Cond, waitFinished <-chan struct{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is pretty ugly. But I don't know a better solution right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. I'd really like to have something cleaner, but this should stop the
bleeding.

On Fri, Apr 29, 2016 at 8:46 AM, Dr. Stefan Schimanski <
notifications@github.com> wrote:

In contrib/mesos/pkg/queue/delay.go
#24916 (comment)
:

@@ -138,6 +139,29 @@ func (q *DelayQueue) Pop() interface{} {
}, nil)
}

+func finishWaiting(cond *sync.Cond, waitFinished <-chan struct{}) {

this is pretty ugly. But I don't know a better solution right now.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
https://github.com/kubernetes/kubernetes/pull/24916/files/a74573277f81565b3aeff8c5d4bfe2a3110c1ccd#r61570968

@jdef jdef added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 29, 2016
@nikhiljindal
Copy link
Contributor

cc @k8s-oncall This is fixing a flaky test.
Can we have a hand merge?

@nikhiljindal nikhiljindal added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label May 3, 2016
@jdef
Copy link
Contributor Author

jdef commented May 4, 2016

Unrelated test failure:


Summarizing 1 Failure:

[Fail] [k8s.io] Networking [k8s.io] Granular Checks [It] should
function for pod communication between nodes
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/networking.go:257

On Tue, May 3, 2016 at 8:41 PM, Kubernetes Bot notifications@github.com
wrote:

GCE e2e build/test failed for commit a745732
a745732
.

Please reference the list of currently known flakes
https://github.com/kubernetes/kubernetes/issues?q=is:issue+label:kind/flake+is:open
when examining this failure. If you request a re-test, you must reference
the issue describing the flake.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#24916 (comment)

@wojtek-t
Copy link
Member

@k8s-bot e2e test this please github issue: #26311

@wojtek-t wojtek-t added this to the v1.3 milestone May 30, 2016
@wojtek-t
Copy link
Member

@k8s-bot unit test this please, issue: #26345

1 similar comment
@wojtek-t
Copy link
Member

@k8s-bot unit test this please, issue: #26345

@goltermann
Copy link
Contributor

@wojtek-t @jdef - what's next here? just prod the test bot again?

@sttts
Copy link
Contributor

sttts commented Jun 7, 2016

@k8s-bot test this issue: #26632

@sttts
Copy link
Contributor

sttts commented Jun 7, 2016

@k8s-bot test this issue: #23545

@yujuhong
Copy link
Contributor

yujuhong commented Jun 7, 2016

@k8s-bot node e2e test this issue: #26431

@goltermann
Copy link
Contributor

@k8s-bot node e2e test this issue: #IGNORE

@goltermann
Copy link
Contributor

@k8s-bot node e2e test this issue: #24476

@jdef
Copy link
Contributor Author

jdef commented Jun 17, 2016

@k8s-bot node e2e test this issue: #IGNORE

@jdef
Copy link
Contributor Author

jdef commented Jun 17, 2016

@k8s-bot e2e test this issue: #IGNORE

@jdef
Copy link
Contributor Author

jdef commented Jun 17, 2016

omg the tests finally passed. quick, merge queue, quick!

@k8s-bot
Copy link

k8s-bot commented Jun 17, 2016

GCE e2e build/test passed for commit a745732.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Jun 17, 2016

GCE e2e build/test passed for commit a745732.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit c538106 into kubernetes:master Jun 17, 2016
@jdef jdef deleted the jdef_fix_23822 branch June 17, 2016 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/platform/mesos lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unit flake in contrib/mesos/pkg/offers
9 participants