Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix scheduler race #11150

Merged
merged 1 commit into from
Jul 13, 2015
Merged

Fix scheduler race #11150

merged 1 commit into from
Jul 13, 2015

Conversation

bprashanth
Copy link
Contributor

The scheduler counts pods by combining the pods in the scheduled and assumed pod stores. The listing needs a to be synchronized with the informer.

/ref #10720
@davidopp @gmarek @lavalamp

@wojtek-t
Copy link
Member

Good catch - LGTM.

@k8s-bot
Copy link

k8s-bot commented Jul 13, 2015

GCE e2e build/test passed for commit 0728c08.

@gmarek gmarek added this to the v1.0 milestone Jul 13, 2015
@gmarek
Copy link
Contributor

gmarek commented Jul 13, 2015

Thanks @bprashanth great catch. I missed this, as nothing else was locked in this file:/

@thockin
Copy link
Member

thockin commented Jul 13, 2015

Does this need to cherry-pick to release-1.0 ?

@zmerlynn FYI

@thockin
Copy link
Member

thockin commented Jul 13, 2015

Waiting for OK tag before merge

@gmarek
Copy link
Contributor

gmarek commented Jul 13, 2015

It's a (very) probable cause of max-pods e2e test failure. Without fixing this it's possible that scheduler will overallocate Nodes (it happens in the test). Locking of modeller is distributed, which makes changes around LockedAction slightly tricky though...

@bprashanth
Copy link
Contributor Author

Shippable looks like a unittest flake (https://app.shippable.com/builds/55a3cd28a4a0be0e007c1f22), test-go passes locally so I kicked it.

Risk: low-medium, since it just adds a lock around a non blocking operation.
Value: @gmarek is right.

overcommit nodes are notoriously hard to debug and the kubelet won't reject these pods either, so I think we should consider either this or #11161.

@davidopp to make the call about cherry-pick

@davidopp
Copy link
Member

LGTM

Thanks for finding and fixing!

I'll add this to my list of things to cherry-pick.

@davidopp davidopp added lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-merge labels Jul 13, 2015
thockin added a commit that referenced this pull request Jul 13, 2015
@thockin thockin merged commit e71b6f9 into kubernetes:master Jul 13, 2015
@lavalamp
Copy link
Member

Good find!

zmerlynn added a commit that referenced this pull request Jul 14, 2015
…50-upstream-release-1.0

Automated cherry pick of #11150 upstream release 1.0
@rrati
Copy link

rrati commented Jul 16, 2015

@davidopp Did this get merged to 1.0?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants