capi: Replicas can go stale in nodeGroup causing erratic behaviour #3104

Closed

enxebre opened this issue Apr 30, 2020 · 2 comments · Fixed by #3177

Labels

area/provider/cluster-api

Member

enxebre commented Apr 30, 2020 •

edited

DeleteNodes() is sometimes called from goroutines, see https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-1.18.1/cluster-autoscaler/core/scale_down.go#L1067-L1090

Our implementation does not account for that properly, see https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_nodegroup.go#L84-L149 : concurrent goroutines can race to call setSize() with different values, each working from a stale replicas().

Also, when we call setSize() we leave the replicas number stale in the machineSet held by the scalableResource struct, which is what replicas() reads from, see https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-1.18.1/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_machineset.go#L63-L85

We need to:

- Ensure replicas() never returns a stale number, and unit test it.
- Ensure concurrent calls to deleteNodes() never use a stale replicas number. We should probably lock deleteNodes() so that each subsequent caller picks up a valid replicas().
- Make the deleteNodes() logic more robust by comparing against minSize here https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_nodegroup.go#L106 rather than against 1.
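
To make the failure mode above concrete, here is a minimal, self-contained sketch of the lost update. It is not the provider's code: `fakeAPI`, `cachedResource` and `deleteOne` are hypothetical names invented for illustration, and the only thing taken from the issue is the shape of the bug (setSize() writes a new value, replicas() keeps answering from a local snapshot).

```go
package main

import "fmt"

// fakeAPI stands in for the API server, the single source of truth.
// (Hypothetical type, for illustration only.)
type fakeAPI struct{ replicas int }

// cachedResource mimics a scalable resource that keeps a local snapshot of
// the replica count it saw when it was built.
type cachedResource struct {
	api    *fakeAPI
	cached int
}

func newCachedResource(api *fakeAPI) *cachedResource {
	return &cachedResource{api: api, cached: api.replicas}
}

// replicas answers from the local snapshot; this is where staleness creeps in.
func (r *cachedResource) replicas() int { return r.cached }

// setSize writes to the API but does not refresh the snapshot.
func (r *cachedResource) setSize(n int) { r.api.replicas = n }

// deleteOne shrinks the group by one, based on whatever replicas() reports.
func (r *cachedResource) deleteOne() { r.setSize(r.replicas() - 1) }

func main() {
	api := &fakeAPI{replicas: 3}
	r := newCachedResource(api)

	r.deleteOne() // reads 3, writes 2
	r.deleteOne() // still reads 3, writes 2 again

	fmt.Println(api.replicas) // prints 2, although two deletions were requested
}
```

The sketch is sequential on purpose: the stale snapshot alone is enough to lose an update, and concurrent callers only make it more likely.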

/area provider/cluster-api

k8s-ci-robot added the area/provider/cluster-api label

enxebre changed the title ~~Replicas can go stale in nodeGroup causing erratic behaviour~~ capi: Replicas can go stale in nodeGroup causing erratic behaviour

Member Author

enxebre commented Apr 30, 2020

Contributor

elmiko commented Apr 30, 2020

++, thanks Alberto!

enxebre mentioned this issue

Bug 1823667: UPSTREAM: <carry>: Get replicas always from API server openshift/kubernetes-autoscaler#147

Merged

enxebre added a commit to enxebre/autoscaler that referenced this issue


          UPSTREAM: <carry>: Get replicas always from API server

5f4f4c1

When getting Replicas(), the local struct in the scalable resource might be stale. To mitigate possible side effects, we always want to get a fresh replicas value.

This is one in a series of PRs to mitigate kubernetes#3104.
Once they are all merged we'll put up a PR upstream.
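
A rough sketch of what "always get a fresh replicas value" can look like is shown below. The `scaleGetter` interface and the `mapClient` fake are assumptions made up for this example and are not the autoscaler's or client-go's API; the point is only that Replicas() keeps no local copy and asks the client on every call.

```go
package main

import (
	"errors"
	"fmt"
)

// scaleGetter is a hypothetical interface standing in for whatever client
// reads the current replica count from the API server.
type scaleGetter interface {
	GetReplicas(name string) (int, error)
}

// mapClient is an in-memory fake used only to make the sketch runnable.
type mapClient map[string]int

func (m mapClient) GetReplicas(name string) (int, error) {
	n, ok := m[name]
	if !ok {
		return 0, errors.New("not found: " + name)
	}
	return n, nil
}

// scalableResource holds no replica count of its own.
type scalableResource struct {
	client scaleGetter
	name   string
}

// Replicas asks the client every time, so it cannot return a stale snapshot.
func (r scalableResource) Replicas() (int, error) {
	return r.client.GetReplicas(r.name)
}

func main() {
	client := mapClient{"ms-a": 3}
	r := scalableResource{client: client, name: "ms-a"}

	client["ms-a"] = 2 // simulate a change made elsewhere

	n, err := r.Replicas()
	fmt.Println(n, err) // prints: 2 <nil>
}
```

The trade-off is one read per call, which matters if the method is invoked on every autoscaler loop.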

enxebre added a commit to enxebre/autoscaler that referenced this issue


          UPSTREAM: <carry>: Compare against minSize in deleteNodes()

2bad5d0

When calling deleteNodes() we should fail early if the operation could take the nodeGroup below its minSize().

This is one in a series of PRs to mitigate kubernetes#3104.
Once they are all merged we'll put up a PR upstream.
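
The fail-early check described in this commit message could be sketched as follows. `checkScaleDown` is a hypothetical helper written for this example, not the function in the provider, and the error wording is invented.

```go
package main

import "fmt"

// checkScaleDown returns an error when removing toDelete nodes from a group
// that currently has replicas nodes would leave it below minSize.
// (Hypothetical helper, for illustration only.)
func checkScaleDown(replicas, toDelete, minSize int) error {
	if replicas-toDelete < minSize {
		return fmt.Errorf(
			"unable to delete %d machines: size would go from %d to %d, below minSize %d",
			toDelete, replicas, replicas-toDelete, minSize)
	}
	return nil
}

func main() {
	fmt.Println(checkScaleDown(3, 1, 2)) // allowed: 3 -> 2 with minSize 2
	fmt.Println(checkScaleDown(3, 2, 2)) // rejected: 3 -> 1 with minSize 2
}
```

The check is only as good as the replicas value passed in, so it needs to be paired with a fresh read.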

enxebre mentioned this issue

Bug 1823667: UPSTREAM: <carry>: Compare against minSize in deleteNodes() openshift/kubernetes-autoscaler#148

Closed

elmiko added a commit to elmiko/kubernetes-autoscaler that referenced this issue


          UPSTREAM: <carry>: Add mutex to DeleteNodes

e40bd66

This change adds a mutex to the MachineController structure which is
used to gate access to the DeleteNodes function.

This is one in a series of PRs to mitigate kubernetes#3104
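
As an illustration of gating deletion behind a mutex, here is a small sketch. The `controller` type only borrows the idea of a mutex on a controller struct from the commit message; its fields, `deleteNodes` signature and `runConcurrentDeletes` helper are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// controller is a hypothetical stand-in for the MachineController.
type controller struct {
	mu       sync.Mutex
	replicas int // stands in for the value held by the API server
}

// deleteNodes holds the lock across the whole read-check-write sequence, so
// concurrent callers each see the size left by the previous one.
func (c *controller) deleteNodes(n, minSize int) error {
	c.mu.Lock()
	defer c.mu.Unlock()

	if c.replicas-n < minSize {
		return fmt.Errorf("refusing to go from %d to %d, below minSize %d",
			c.replicas, c.replicas-n, minSize)
	}
	c.replicas -= n
	return nil
}

// runConcurrentDeletes starts workers goroutines that each try to delete one
// node, and returns how many attempts were rejected.
func runConcurrentDeletes(c *controller, workers, minSize int) int {
	errs := make([]error, workers)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			errs[i] = c.deleteNodes(1, minSize) // each goroutine owns its slot
		}(i)
	}
	wg.Wait()

	rejected := 0
	for _, err := range errs {
		if err != nil {
			rejected++
		}
	}
	return rejected
}

func main() {
	c := &controller{replicas: 10}
	rejected := runConcurrentDeletes(c, 8, 5)
	fmt.Println(c.replicas, rejected) // prints: 5 3
}
```

Because the check and the write happen under the same lock, exactly five of the eight attempts succeed regardless of scheduling order.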

elmiko mentioned this issue

Bug 1823667: UPSTREAM: <carry>: Add mutex to DeleteNodes openshift/kubernetes-autoscaler#149

Merged

enxebre added a commit to enxebre/autoscaler that referenced this issue


          UPSTREAM: <carry>: Compare against minSize in deleteNodes()

e7f689b

When calling deleteNodes() we should fail early if the operation could take the nodeGroup below its minSize().

This is one in a series of PRs to mitigate kubernetes#3104.
Once they are all merged we'll put up a PR upstream.

enxebre mentioned this issue

Bug 1823667: UPSTREAM: <carry>: Compare against minSize in deleteNodes() openshift/kubernetes-autoscaler#150

Merged

elmiko added a commit to elmiko/kubernetes-autoscaler that referenced this issue


          UPSTREAM: <carry>: Add mutex to DeleteNodes

c2e44a6

This change adds a mutex to the MachineController structure which is
used to gate access to the DeleteNodes function.

This is one in a series of PRs to mitigate kubernetes#3104.
Once a solution has been reached, this will be contributed upstream.

elmiko added a commit to elmiko/kubernetes-autoscaler that referenced this issue


          Add mutex to DeleteNodes in cluster-autoscaler CAPI provider

f1407a1

This change adds a mutex to the MachineController structure which is
used to gate access to the DeleteNodes function.

This is one in a series of PRs to mitigate kubernetes#3104

elmiko pushed a commit to elmiko/kubernetes-autoscaler that referenced this issue


          Get replicas always from API server for cluster-autoscaler CAPI provider

9c8b78a

When getting Replicas(), the local struct in the scalable resource might be stale. To mitigate possible side effects, we always want to get a fresh replicas value.

This is one in a series of PRs to mitigate kubernetes#3104

elmiko pushed a commit to elmiko/kubernetes-autoscaler that referenced this issue


          Compare against minSize in deleteNodes() in cluster-autoscaler CAPI provider

dac1f7d

When calling deleteNodes() we should fail early if the operation could take the nodeGroup below its minSize().

This is one in a series of PRs to mitigate kubernetes#3104

elmiko mentioned this issue

Fix stale replicas issue with cluster-autoscaler CAPI provider #3177

Merged

k8s-ci-robot closed this as completed in #3177

enxebre mentioned this issue

Ensure ClusterAPI DeleteNodes accounts for out of band changes scale #4634

Merged

enxebre added a commit to enxebre/autoscaler that referenced this issue


          Get capi targetsize from cache

a80af8e

kubernetes#3104 ensured that access to replicas during scale-down operations was never stale, by reading from the API server.
kubernetes#3312 preserved that behaviour while moving to the unstructured client.
kubernetes#4443 regressed that behaviour while trying to reduce the API server load.
kubernetes#4634 restored the never-stale replicas behaviour, at the cost of loading the API server again.

Currently, on e.g. a 48-minute-old cluster, it makes 1.4k GET requests to the scale subresource.
This PR tries to satisfy both goals: non-stale replicas during scale down, and an API server that is not overloaded. To achieve that, it lets targetSize, which is called on every autoscaling cluster state loop, come from the cache.

Also note that the scale down implementation has changed: https://github.com/kubernetes/autoscaler/commits/master/cluster-autoscaler/core/scaledown.
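
The split described here (cheap cached reads on the hot path, a fresh read only when deleting) can be sketched roughly as below. Everything in it is an assumption for illustration: `countingAPI` and `nodeGroup` are invented types, and in the real provider the cache would be kept up to date by an informer rather than by DeleteNodes itself.

```go
package main

import "fmt"

// countingAPI is a hypothetical API server stand-in that counts reads.
type countingAPI struct {
	replicas int
	gets     int
}

func (a *countingAPI) getReplicas() int {
	a.gets++
	return a.replicas
}

// nodeGroup keeps a cached size next to a handle on the API.
type nodeGroup struct {
	api    *countingAPI
	cached int
}

// TargetSize runs on every autoscaler loop, so it serves the cached value
// and never touches the API.
func (g *nodeGroup) TargetSize() int { return g.cached }

// DeleteNodes reads a fresh value before changing the size, so it does not
// act on a stale count. Updating the cache here is a simplification.
func (g *nodeGroup) DeleteNodes(n int) {
	fresh := g.api.getReplicas()
	g.api.replicas = fresh - n
	g.cached = g.api.replicas
}

func main() {
	api := &countingAPI{replicas: 5}
	g := &nodeGroup{api: api, cached: 5}

	for i := 0; i < 1000; i++ {
		_ = g.TargetSize() // no API reads on the hot path
	}

	api.replicas = 4 // out-of-band change the cache has not seen yet
	g.DeleteNodes(1) // reads 4 from the API, writes 3

	fmt.Println(g.TargetSize(), api.gets) // prints: 3 1
}
```

A thousand TargetSize calls cost no API reads, while the single deletion still starts from the real value rather than the cached 5.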

enxebre mentioned this issue

Get capi targetsize from cache #5025

Merged
