
Fleet autoscaler should not modify a Fleet's spec #3229

Open
jeremylvln opened this issue Jun 21, 2023 · 3 comments
Labels
kind/bug (These are bugs.) · stale (Pending closure unless there is a strong objection.)

Comments

jeremylvln (Contributor) commented Jun 21, 2023

What happened:

I created a FleetAutoscaler managing a Fleet with a Buffer policy. The autoscaler works by modifying the Fleet's spec with an updated replicas count. This is a problem because, if another controller is in charge of managing Agones resources, that controller and Agones will fight over the replicas field: each one thinks the other's value is wrong, and they keep overwriting it in a cycle.

I consider this more a bug than a feature request, because it really should not be done this way, but this ticket has a foot in both categories.

What you expected to happen:

The expected behavior is something like Kubernetes' built-in Deployment and HorizontalPodAutoscaler, where the spec is not modified and the scaling is handled internally instead. I see three ways of handling this issue:

  1. Rather than modifying the spec, modify the status. This way a watching controller will not try to fight Agones, but I agree the semantics are a bit violated, as one resource should not modify the status of another.
  2. Refactor the code so that the fleet reconciler takes into account the replicas count requested by the autoscaler, if any, and otherwise uses the replicas count from the spec. This couples the fleet's reconcile code to the autoscaler somewhat, but it is still a valid option.
  3. Implement the scale subresource on the Fleet and manage any runtime modification of the replicas through it (see below).

From Kubernetes's documentation about HPAs (link):

The HorizontalPodAutoscaler controller accesses corresponding workload resources that support scaling (such as Deployments and StatefulSet). These resources each have a subresource named scale, an interface that allows you to dynamically set the number of replicas and examine each of their current states. For general information about subresources in the Kubernetes API, see Kubernetes API Concepts.
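
For context, exposing a scale subresource on a custom resource is done in its CustomResourceDefinition. The snippet below is a minimal sketch of that generic Kubernetes mechanism only; it is not Agones's actual CRD manifest, and the schema and paths are illustrative.

```yaml
# Sketch of the generic CRD scale subresource declaration (option 3).
# Not Agones's actual CRD; schema and paths are illustrative.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: fleets.agones.dev
spec:
  group: agones.dev
  names:
    kind: Fleet
    plural: fleets
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        x-kubernetes-preserve-unknown-fields: true
    subresources:
      # /status and /scale become separate API endpoints on the resource
      status: {}
      scale:
        specReplicasPath: .spec.replicas
        statusReplicasPath: .status.replicas
        labelSelectorPath: .status.labelSelector
```

With a declaration like this, clients (including an autoscaler) can read and write the replica count through the scale endpoint instead of patching the Fleet object as a whole.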

How to reproduce it (as minimally and precisely as possible):

Create a dummy Fleet. Create a FleetAutoscaler with dummy values.
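
For example, something along these lines (a minimal sketch; names, buffer values, and the container image are placeholders, not the exact manifests I used):

```yaml
# Minimal reproduction sketch; names, buffer values, and the image are placeholders.
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: example-fleet
spec:
  replicas: 2
  template:
    spec:
      ports:
      - name: default
        containerPort: 7654
      template:
        spec:
          containers:
          - name: simple-game-server
            image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.17
---
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: example-fleet-autoscaler
spec:
  fleetName: example-fleet
  policy:
    type: Buffer
    buffer:
      bufferSize: 2
      minReplicas: 2
      maxReplicas: 10
```

Once the autoscaler reconciles, spec.replicas on the Fleet drifts away from the applied value, which is what a second controller applying the same manifest will then fight against.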

Anything else we need to know?:

Environment:

  • Agones version: 1.31.0
  • Kubernetes version (use kubectl version): 1.27
  • Cloud provider or hardware configuration: Scaleway
  • Install method (yaml/helm): yaml
  • Troubleshooting guide log(s): N/A
  • Others: N/A
jeremylvln (Contributor, Author) commented:

Update: If I understood the Fleet API correctly, it seems that the scale subresource is at least declared (implemented?). So there is already a step toward solution 3.

roberthbailey (Member) commented:

The scale subresource has been implemented on Fleet for quite a while and is used in the fleet quickstart: https://agones.dev/site/docs/getting-started/create-fleet/#3-scale-up-the-fleet

I'm not sure I quite understand your issue though. If you use the scale subresource, that is just an indirect way of changing the replicas count in the spec for the Fleet, is it not?

Further down in the HPA documentation that you linked, it says:

When you configure autoscaling for a Deployment, you bind a HorizontalPodAutoscaler to a single Deployment. The HorizontalPodAutoscaler manages the replicas field of the Deployment.

So even if the HPA controller is changing the replicas field via the scale subresource, it still manages that field. The scale subresource is just a generic interface to the replicas field so that the HPA controller doesn't need to have code added for any new resource types that can be scaled up and down.

Also, to your point about conflicting controllers, the HPA documentation says:

When an HPA is enabled, it is recommended that the value of spec.replicas of the Deployment and / or StatefulSet be removed from their manifest(s). If this isn't done, any time a change to that object is applied, for example via kubectl apply -f deployment.yaml, this will instruct Kubernetes to scale the current number of Pods to the value of the spec.replicas key. This may not be desired and could be troublesome when an HPA is active.

and the same is true for a Fleet. You can prevent your controller and the fleet autoscaler from fighting over the replicas field by not including replicas in your fleet specification, and allowing the fleet autoscaler to set the field on your behalf.
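
For illustration, a sketch of such a Fleet manifest (the image and names are placeholders, and this is untested):

```yaml
# Sketch of a Fleet manifest that leaves replica management to the
# FleetAutoscaler by omitting spec.replicas; image and names are placeholders.
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: example-fleet
spec:
  # no replicas field; the fleet autoscaler sets the count at runtime
  template:
    spec:
      ports:
      - name: default
        containerPort: 7654
      template:
        spec:
          containers:
          - name: simple-game-server
            image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.17
```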

github-actions bot commented:

This issue is marked as Stale due to inactivity for more than 30 days. To avoid being marked as 'stale', please add the 'awaiting-maintainer' label or add a comment. Thank you for your contributions.

@github-actions github-actions bot added the stale Pending closure unless there is a strong objection. label Jun 15, 2024