Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler cache: API and implementation #21016

Merged
merged 1 commit into from
Feb 20, 2016

Conversation

hongchaodeng
Copy link
Contributor

This is the cache interface, impl. and tests separated out from #20669.
It depends on #20977.

@hongchaodeng
Copy link
Contributor Author

/cc @wojtek-t

@k8s-github-robot
Copy link

Labelling this PR as size/XL

@k8s-github-robot k8s-github-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 11, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 11, 2016

GCE e2e build/test failed for commit 67f9bc61a565ec89a3a1585072cbf740ecc9e10f.

@wojtek-t wojtek-t assigned wojtek-t and unassigned davidopp Feb 11, 2016
return nil
}

key := mustGetPodKey(pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving discussion from parent PR here:

@WojtekT
"I'm against having "mustGetPodKey" function. We shouldn't have any "panic" in the code in places where it depends on the data (there can be some data-corruption it shouldn't blow up our components).
It should be:

key, err := getPodKey(pod)
if err != nil {
return err
}

And you can move it outside lock BTW."

@hongchaodeng
"To return error is not intended, because this error case is not transient error (e.g. data corruption). The pod must have a unique key, e.g. a name.

Currently, Pod.Name is unique within Pod.Namespace, and we get "Namespace/Name" as key by using MetaNamespaceKeyFunc. If some changes breaks this compatibility, it is very dangerous and we shouldn't swallow the error.

We can extract namespace and name from api.Pod directly. But depending on MetaNamespaceKeyFunc seems more maintainable.

Not quite sure about this part. Just let you know that we shouldn't let such error not handled."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with you here.

Basically, assume that you have a corrupted data in etcd. Then:

  • if we leave as you are suggesting (with panic), this will cause scheduler crash-loop
  • if we change to what I'm suggesting, we may end up with scheduler trying to schedule that pod multiple pods, but it will be scheduling other pods in the meantime too.
    The second option is much better, since it doesn't break the system completely.

So I would like this to be changed to return an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just point out one thing because I didn't know why corrupted data could cause the panic.

api.Pod is ensured from compiler level to have GetObjectMeta() and be instance of ObjectMetaAccessor. Thus, MetaNamespaceKeyFunc should always return the key. See here and here.

It's orthogonal to returning error. I also agree with you that It's fine to return error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that we should always be able to access all necessary fields. But we can e.g. change "MetaNamespaceKeyFunc" to do some other kind of validation, and it can potentially return an error.

So I would suggest changing to return an error.

@wojtek-t
Copy link
Member

@hongchaodeng
I've added some comments. Once you apply them and fix tests, I will take a look again (but generally it looks ok).

@hongchaodeng
Copy link
Contributor Author

Addressed all comments. The test failure was caused by the dependency on #20977. I have picked it here to show the test result.

@k8s-bot
Copy link

k8s-bot commented Feb 12, 2016

GCE e2e test build/test passed for commit 38335efbbf259efab82bf7be71c5c851f4d61664.

@k8s-bot
Copy link

k8s-bot commented Feb 12, 2016

GCE e2e test build/test passed for commit a8ee96af7c9cbdd532d43727d2905e14e388b58d.

@k8s-github-robot
Copy link

The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label?


cache.mu.Lock()
defer cache.mu.Unlock()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove empty line

@hongchaodeng
Copy link
Contributor Author

@wojtek-t

Addressed comments
Thanks for your review too.

@wojtek-t
Copy link
Member

LGTM - thanks!

@wojtek-t wojtek-t added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 16, 2016
@k8s-bot
Copy link

k8s-bot commented Feb 16, 2016

GCE e2e test build/test passed for commit 5c3d303.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Feb 16, 2016

GCE e2e test build/test passed for commit 5c3d303.

@k8s-github-robot
Copy link

@k8s-bot test this

Tests are more than 48 hours old. Re-running tests.

@k8s-bot
Copy link

k8s-bot commented Feb 18, 2016

GCE e2e test build/test passed for commit 5c3d303.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e build/test failed for commit 5c3d303.

@wojtek-t
Copy link
Member

@k8s-bot e2e test this please github issue: #21463 #21467

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit 5c3d303.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e test build/test passed for commit 5c3d303.

@wojtek-t
Copy link
Member

@k8s-bot unit test this please github issue: #21451

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e build/test failed for commit 5c3d303.

@wojtek-t
Copy link
Member

@k8s-bot e2e test this please github issue: #20787

@k8s-bot
Copy link

k8s-bot commented Feb 19, 2016

GCE e2e build/test failed for commit 5c3d303.

@wojtek-t
Copy link
Member

@k8s-bot e2e test this please github issue: #21467

@k8s-bot
Copy link

k8s-bot commented Feb 20, 2016

GCE e2e test build/test passed for commit 5c3d303.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Feb 20, 2016

GCE e2e test build/test passed for commit 5c3d303.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants