Store the latest cloud provider node addresses #65226
Conversation
@sjenning PTAL
pkg/kubelet/kubelet.go
// Last list of node addresses retrieved from the cloud provider
cloudproviderLastNodeAddresses []v1.NodeAddress
// Last error retrieved from the cloud provider
cloudproviderLastError error
why is this needed here as member var and not just in the kubelet_node_status code block?
Both variables need to be at the struct level so the goroutine does not store its values into local variables that disappear. Plus, the cloud-related code is going away at some point and the variables are not exported, so it's fine to keep it that way. We can change it and polish later.
pkg/kubelet/kubelet_node_status.go
@@ -486,7 +486,7 @@ func (kl *Kubelet) setNodeAddress(node *v1.Node) error {
 	kl.cloudproviderRequestMux.Unlock()

 	go func() {
-		nodeAddresses, err = instances.NodeAddresses(context.TODO(), kl.nodeName)
+		kl.cloudproviderLastNodeAddresses, kl.cloudproviderLastError = instances.NodeAddresses(context.TODO(), kl.nodeName)
i may be missing something obvious, but seems like cloudProviderLastError can just move next to declaration of var err error above on line 476
The culprit is actually the case <-kl.cloudproviderRequestSync case. Each time the goroutine asking for the node addresses finishes, it sends a value to the kl.cloudproviderRequestSync channel. If this happens after the kl.cloudproviderRequestTimeout, kl.cloudproviderRequestSync is non-empty, and in the next node status iteration the select picks the first case <-kl.cloudproviderRequestSync instead of waiting for the goroutine to finish. Given that nodeAddresses and err are local variables and both are nil (because the goroutine responsible for setting them stores the values into the previous call's local variables, which no longer exist in the setNodeAddress scope), nodeAddresses and err are always nil from the moment the timeout first occurs.
Thus we need to store both (the list of node addresses and the error) into variables that are not local to the setNodeAddress method.
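The failure mode described above can be reproduced deterministically in a small sketch (hypothetical, simplified names): a signal left in the sync channel by a goroutine that finished after the timeout makes the next call's select fire immediately, returning that call's still-nil locals.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Buffered so a late goroutine's send does not block after a timeout.
var cloudproviderRequestSync = make(chan int, 1)

// slowFetch stands in for a cloud provider call that outlives the timeout.
func slowFetch() ([]string, error) {
	time.Sleep(50 * time.Millisecond)
	return []string{"10.0.0.1"}, nil
}

func setNodeAddress() ([]string, error) {
	var nodeAddresses []string // fresh locals on every call
	var err error
	go func() {
		nodeAddresses, err = slowFetch()
		cloudproviderRequestSync <- 0 // may land after the timeout below
	}()
	select {
	case <-cloudproviderRequestSync:
		return nodeAddresses, err // nil, nil if the signal was stale
	case <-time.After(10 * time.Millisecond):
		return nil, errors.New("timed out waiting for cloud provider")
	}
}

func main() {
	cloudproviderRequestSync <- 0 // simulate a stale signal from a past round
	addrs, err := setNodeAddress()
	fmt.Println(addrs == nil, err == nil) // both nil: the bug
}
```

The stale signal is drained before the in-flight goroutine ever writes its result, so the caller observes a nil address list with a nil error, which is exactly the symptom the comment describes.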
there must be a simpler way to do this. we've added 6 fields to the kubelet struct to support the timeout in this single area of the code. i'm not saying i see it yet. but there must be. i'm too tired to see it atm though.
this looks like it is writing to member variables asynchronously outside a lock... isn't that a crashing data race?
It is, though the chance of it actually happening is very low. I am addressing this in another PR.
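The straightforward remedy for that race is to guard the buffered fields with a lock so the goroutine's writes and the status loop's reads are synchronized. A minimal sketch, with hypothetical names:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// addressBuffer holds the latest cloud provider result behind a mutex.
type addressBuffer struct {
	mu    sync.Mutex
	addrs []string
	err   error
}

// set is called from the fetching goroutine.
func (b *addressBuffer) set(addrs []string, err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.addrs, b.err = addrs, err
}

// get is called from the node status update loop.
func (b *addressBuffer) get() ([]string, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.addrs, b.err
}

func main() {
	buf := &addressBuffer{}
	go buf.set([]string{"10.0.0.1"}, nil) // async cloud provider result
	time.Sleep(10 * time.Millisecond)
	addrs, _ := buf.get()
	fmt.Println(addrs)
}
```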
Ok, I will update this PR with the suggested refactoring
it may help to describe the reason for the change here, for posterity, to give some clarity on the issue we are having.
/retest
@ingvagabund what would you think about abstracting this to a
and abstract all this internal state out of the kubelet and remove the 6 new fields between #62543 and this PR?
I know this PR has been proven to work, so we might do this later. But I had the idea in my head; wanted to get it in writing for later.
@sjenning what you are saying makes sense. So far I am aware of two requests that are sent:
Do we want to have a cloud request manager for each cloud resource we ask for? Or group them based on some criteria? There are cloud resources we ask for once in a while (e.g. the node external ID) and cloud resources we ask for periodically (e.g. the node addresses). So maybe s/
Yes, I think we are on the same wavelength. CloudRequestSyncManager is good too. We can use it to basically buffer all cloudprovider requests the kubelet makes since, as we've found out, not all cloudprovider code is equal. The real question is do you want to go ahead with this PR for now and do that as a follow-on or modify this PR. I'm on the fence about it.
After some thought, I am in favor of going forward with this fix and doing the refactor in a follow-on PR.
/hold data race question in https://github.com/kubernetes/kubernetes/pull/65226/files#r198143459 is unresolved
/lgtm cancel
452a7d2 to 3203fa6
@dims thank you very much :)
@vishh PTAL
It looks like this PR is decoupling updating node status from fetching updated node IPs.
However it is not clear why kubelet needs to keep watching for node addresses continually in the first place.
Also, if there were any production issues that this PR addresses, it would help describe them here in this PR or in a separate issue (preferred).
}

// NodeAddresses does not wait for cloud provider to return a node addresses.
// It always returns node addresses or an error.
Is this comment true? The logic below is blocking on cloud provider API call to succeed at least once.
The first call to NodeAddresses can be blocking, but the remaining ones are not. The assumption is that a node needs to register first before it starts periodically invoking the method, so it is negligible that the first call hangs for some time. I can rephrase the statement to be clearer about that.
@vishh I agree that it might be unnecessary to poll the cloudprovider for our addresses if we can guarantee that the information we get on the first call never changes. However, we didn't want to tackle that wider-ranging issue here. That is very cloudprovider-specific and would require acks from all cloud providers saying "yes, our cloudprovider will not return changing return values for NodeAddresses()". This PR is to simply protect the kubelet from latency introduced by the cloudprovider on a NodeAddresses() call. In our case (Azure), this was a throttling mechanism which could cause the node status update loop to stall and the Node to go NotReady.
/retest
/approve
this is an improvement over what exists prior.
i think it's up for more debate whether the cloud provider interface should be handling this for us by taking a timeout or something on each of its calls.
case <-collected:
	return nodeAddresses, err
case <-time.Tick(2 * nodeAddressesRetryPeriod):
	return nil, fmt.Errorf("Timeout after %v waiting for address to appear", 2*nodeAddressesRetryPeriod)
nit: s/Timeout/timeout
see the original code had this casing issue as well; doesn't report back directly to the end user, so not a huge deal.
actually, this is a test, so even less of a deal. got confused in my review.
// Request timeout
cloudproviderRequestTimeout time.Duration
// Handles requests to cloud provider with timeout
cloudResourceSyncManager *cloudResourceSyncManager
it would be good to have a broader discussion on whether we want all interaction with the cloud to go through this interface in the future, or whether we want to change the cloud provider interface to accept a context w/ timeout on operations so each caller can decide how to handle it across the code base. for now, this cleans up the existing member vars on kubelet so is a nice incremental improvement.
kubelet.cloudproviderRequestParallelism = make(chan int, 1)
kubelet.cloudproviderRequestSync = make(chan int)
kubelet.cloudproviderRequestTimeout = 10 * time.Second
kubelet.cloudResourceSyncManager = NewCloudResourceSyncManager(kubelet.cloud, kubelet.nodeName, kubelet.nodeStatusUpdateFrequency)
i think we can update this in the future from 10s to 1m, 5m, 10m, etc.
an issue or // todo to track it would be good.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: derekwaynecarr, dims, sjenning. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold cancel
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
@ingvagabund: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Would it make sense to backport this to 1.11 to fix #68270?
Having correct IP addresses set on the nodes is plumbing for all subsequent operations on k8s. It would be really helpful to backport this fix to 1.11. We have a lot of users who run 1.11 who are currently facing this problem and are not planning to move to k8s 1.12 yet.
I'm echoing @alena1108. This is a very critical fix and we have seen this on v1.11.2. The missing internal IP means the API server can't talk to the worker node, and it breaks everything. We also don't want to move to v1.12 so fast because I'm afraid there will be new unknown issues. So please consider backporting this to v1.11.x. Thank you.
We ran into this problem too. Our solution was to just pass the
Follow on to kubernetes#65226.
…#65226-upstream-release-1.11 Automated cherry pick of #65226: Put all the node address cloud provider retrieval complex
What this PR does / why we need it:
Buffer the recently retrieved node addresses so they can be used as soon as the next node status update is run.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #65814
Special notes for your reviewer:
Release note: