
Prioritise Allocation from Nodes with Allocated/Ready GameServers

One of the first parts of Node autoscaling (#368) - making sure we essentially
bin pack our allocated game servers.

This change makes allocation prioritise `Nodes` that already have the most
`Allocated` `GameServers`, and in the case of a tie, the `Nodes` that have the
most `Ready` `GameServers`.

This sets us up for the next part, such that when we scale down a Fleet,
it removes `GameServers` from `Nodes` that have the least `GameServers` on
them.
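As an illustrative sketch of the ordering described above (the `NodeCount` type and helper name here are hypothetical, not part of the Agones codebase): a `Node` is preferred if it has more `Allocated` `GameServers`, with `Ready` `GameServers` as the tie-breaker.

```go
// Sketch of the "Packed" node-priority rule: more Allocated GameServers
// wins; on a tie, more Ready GameServers wins. Illustrative only.
package main

import "fmt"

// NodeCount tallies the GameServers on a single Node (hypothetical type).
type NodeCount struct {
	Allocated int
	Ready     int
}

// packedPreferred reports whether Node a should be preferred over Node b
// when choosing where to allocate from.
func packedPreferred(a, b NodeCount) bool {
	if a.Allocated != b.Allocated {
		return a.Allocated > b.Allocated // more Allocated wins
	}
	return a.Ready > b.Ready // tie on Allocated: more Ready wins
}

func main() {
	a := NodeCount{Allocated: 3, Ready: 1}
	b := NodeCount{Allocated: 3, Ready: 5}
	fmt.Println(packedPreferred(b, a)) // true: tie on Allocated, b has more Ready
}
```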
markmandel committed Oct 2, 2018
1 parent 9e94bec commit 1923a41679abec3ff458e9fc9392671cfd3d3ce8
@@ -69,6 +69,9 @@ Documentation and usage guides on how to develop and host dedicated game servers
- [CPP Simple](./examples/cpp-simple) (C++) - C++ example that starts up, stays healthy and then shuts down after 60 seconds.
- [Xonotic](./examples/xonotic) - Wraps the SDK around the open source FPS game [Xonotic](http://www.xonotic.org) and hosts it on Agones.
### Advanced
- [Scheduling and Autoscaling](./docs/scheduling_autoscaling.md)
## Get involved
- [Slack](https://join.slack.com/t/agones/shared_invite/enQtMzE5NTE0NzkyOTk1LWQ2ZmY1Mjc4ZDQ4NDJhOGYxYTY2NTY0NjUwNjliYzVhMWFjYjMxM2RlMjg3NGU0M2E0YTYzNDIxNDMyZGNjMjU)
@@ -251,4 +251,6 @@ simple-udp-mzhrl-zg9rq Ready 10.30.64.99 [map[name:default port:7745]]
## Next Steps
Read the advanced [Scheduling and Autoscaling](scheduling_autoscaling.md) guide for more details on autoscaling.
If you want to use your own GameServer container make sure you have properly integrated the [Agones SDK](../sdks/).
@@ -15,6 +15,7 @@ metadata:
name: fleet-example
spec:
replicas: 2
scheduling: Packed
strategy:
type: RollingUpdate
rollingUpdate:
@@ -53,6 +54,11 @@ This is a very common pattern in the Kubernetes ecosystem.
The `spec` field is the actual `Fleet` specification and it is composed as follows:
- `replicas` is the number of `GameServers` to keep Ready or Allocated in this Fleet
- `scheduling`(⚠️⚠️⚠️ **This is currently a development feature and has not been released** ⚠️⚠️⚠️) defines how GameServers are organised across the cluster. Currently only affects Allocation, but will expand
in future releases. Options include:
"Packed" (default) is aimed at dynamic Kubernetes clusters, such as cloud providers, wherein we want to bin pack
resources. "Distributed" is aimed at static Kubernetes clusters, wherein we want to distribute resources across the entire
cluster. See [Scheduling and Autoscaling](scheduling_autoscaling.md) for more details.
- `strategy` is the `GameServer` replacement strategy for when the `GameServer` template is edited.
- `type` is the replacement strategy for when the GameServer template is changed. The default option is "RollingUpdate", but "Recreate" is also available.
- `RollingUpdate` will increment by `maxSurge` value on each iteration, while decrementing by `maxUnavailable` on each iteration, until all GameServers have been switched from one version to another.
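As a rough illustration of how percentage-based `maxSurge` and `maxUnavailable` values resolve against a replica count (following the Kubernetes Deployment convention of rounding surge up and unavailability down; the helper below is ours, not an Agones or Kubernetes API):

```go
// Illustrates resolving percentage maxSurge/maxUnavailable against a
// replica count during a RollingUpdate. Surge rounds up, unavailable
// rounds down, per the Kubernetes Deployment convention.
package main

import (
	"fmt"
	"math"
)

// resolvePercent converts a percentage against replicas, rounding
// up or down as directed (hypothetical helper for illustration).
func resolvePercent(percent, replicas int, roundUp bool) int {
	v := float64(percent) * float64(replicas) / 100.0
	if roundUp {
		return int(math.Ceil(v))
	}
	return int(math.Floor(v))
}

func main() {
	replicas := 10
	surge := resolvePercent(25, replicas, true)        // 2.5 rounds up to 3
	unavailable := resolvePercent(25, replicas, false) // 2.5 rounds down to 2
	fmt.Println(surge, unavailable) // 3 2
}
```

So with 10 replicas and the 25% defaults, each iteration may create up to 3 extra `GameServers` while taking at most 2 old ones out of service.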
@@ -0,0 +1,113 @@
# Scheduling and Autoscaling
⚠️⚠️⚠️ **This is currently a development feature and has not been released** ⚠️⚠️⚠️
> Autoscaling is currently ongoing work within Agones. The work you see here is just the beginning.
Table of Contents
=================
* [Fleet Autoscaling](#fleet-autoscaling)
    * [Autoscaling Concepts](#autoscaling-concepts)
* [Allocation Scheduling](#allocation-scheduling)
* [Fleet Scheduling](#fleet-scheduling)
* [Packed](#packed)
* [Allocation Scheduling Strategy](#allocation-scheduling-strategy)
* [Distributed](#distributed)
        * [Allocation Scheduling Strategy](#allocation-scheduling-strategy-1)
Scheduling and autoscaling go hand in hand, as where in the cluster `GameServers` are provisioned
impacts how to autoscale fleets up and down (or if you would even want to).
## Fleet Autoscaling
Fleet autoscaling is currently the only type of autoscaling that exists in Agones. It is also only available as a simple
buffer autoscaling strategy. Have a look at the [Create a Fleet Autoscaler](create_fleetautoscaler.md) quickstart,
and the [Fleet Autoscaler Specification](fleetautoscaler_spec.md) for details.
Node scaling, and more sophisticated fleet autoscaling will be coming in future releases ([design](https://github.com/GoogleCloudPlatform/agones/issues/368))
## Autoscaling Concepts
To facilitate autoscaling, we need to combine several concepts and pieces of functionality, described below.
### Allocation Scheduling
Allocation scheduling refers to the order in which `GameServers`, and specifically their backing `Pods` are chosen
from across the Kubernetes cluster within a given `Fleet` when [allocation](./create_fleet.md#4-allocate-a-game-server-from-the-fleet) occurs.
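A minimal sketch of the first step any allocation scheduling strategy needs: tallying a `Fleet`'s `GameServers` by the `Node` their backing `Pod` landed on (the `gameServer` struct below is a stand-in for the real CRD type, for illustration only):

```go
// Groups GameServers by Node name so a scheduling strategy can rank
// Nodes before picking one to allocate from. Illustrative types only.
package main

import "fmt"

type gameServer struct {
	NodeName string
	State    string // e.g. "Ready" or "Allocated"
}

// countByNode tallies how many GameServers sit on each Node.
func countByNode(list []gameServer) map[string]int {
	counts := map[string]int{}
	for _, gs := range list {
		counts[gs.NodeName]++
	}
	return counts
}

func main() {
	list := []gameServer{
		{NodeName: "node-a", State: "Ready"},
		{NodeName: "node-a", State: "Allocated"},
		{NodeName: "node-b", State: "Ready"},
	}
	fmt.Println(countByNode(list)["node-a"]) // 2
}
```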
## Fleet Scheduling
There are two scheduling strategies for Fleets - each designed for different types of Kubernetes Environments.
### Packed
```yaml
apiVersion: "stable.agones.dev/v1alpha1"
kind: Fleet
metadata:
name: simple-udp
spec:
replicas: 100
scheduling: Packed
template:
spec:
ports:
- containerPort: 7654
template:
spec:
containers:
- name: simple-udp
image: gcr.io/agones-images/udp-server:0.4
```
This is the *default* Fleet scheduling strategy. It is designed for dynamic Kubernetes environments, wherein you wish
to scale up and down as load increases or decreases, such as in a Cloud environment where you are paying
for the infrastructure you use.
It attempts to _pack_ as much as possible into the smallest set of nodes, to make
scaling infrastructure down as easy as possible.
Currently, Allocation scheduling is the only aspect this strategy affects, but in future releases it will
also affect `GameServer` `Pod` scheduling, and `Fleet` scale down scheduling.
#### Allocation Scheduling Strategy
Under the "Packed" strategy, allocation will prioritise allocating `GameServers` on `Nodes` that already have
`Allocated` `GameServers` running on them.
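As a hedged sketch of this behaviour (the types and helper below are illustrative, not the actual Agones allocation code): pick a `Ready` `GameServer` from the `Node` with the most `Allocated` `GameServers`, breaking ties on the `Ready` count.

```go
// Sketch of "Packed" allocation selection: prefer a Ready GameServer on
// the Node with the most Allocated GameServers, tie-breaking on Ready
// count. Illustrative stand-in types, not the Agones implementation.
package main

import "fmt"

type gameServer struct {
	Name     string
	NodeName string
	State    string
}

// pickPacked returns the Ready GameServer on the most "packed" Node,
// or nil if none is Ready.
func pickPacked(list []gameServer) *gameServer {
	allocated := map[string]int{}
	ready := map[string]int{}
	for _, gs := range list {
		switch gs.State {
		case "Allocated":
			allocated[gs.NodeName]++
		case "Ready":
			ready[gs.NodeName]++
		}
	}
	var best *gameServer
	for i := range list {
		gs := &list[i]
		if gs.State != "Ready" {
			continue
		}
		if best == nil ||
			allocated[gs.NodeName] > allocated[best.NodeName] ||
			(allocated[gs.NodeName] == allocated[best.NodeName] &&
				ready[gs.NodeName] > ready[best.NodeName]) {
			best = gs
		}
	}
	return best
}

func main() {
	list := []gameServer{
		{Name: "gs-1", NodeName: "node-a", State: "Ready"},
		{Name: "gs-2", NodeName: "node-b", State: "Allocated"},
		{Name: "gs-3", NodeName: "node-b", State: "Ready"},
	}
	// node-b already has an Allocated GameServer, so gs-3 is chosen.
	fmt.Println(pickPacked(list).Name) // gs-3
}
```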
### Distributed
```yaml
apiVersion: "stable.agones.dev/v1alpha1"
kind: Fleet
metadata:
name: simple-udp
spec:
replicas: 100
scheduling: Distributed
template:
spec:
ports:
- containerPort: 7654
template:
spec:
containers:
- name: simple-udp
image: gcr.io/agones-images/udp-server:0.4
```
This Fleet scheduling strategy is designed for static Kubernetes environments, such as when you are running Kubernetes
on bare metal, and the cluster size rarely changes, if at all.
This attempts to distribute the load across the entire cluster as much as possible, to take advantage of the static
size of the cluster.
Currently, the only thing this scheduling strategy affects is Allocation scheduling, but in future releases it will
also affect `GameServer` `Pod` scheduling, and `Fleet` scale down scheduling.
#### Allocation Scheduling Strategy
Under the "Distributed" strategy, allocation will prioritise allocating `GameServers` to `Nodes` that have the fewest
`Allocated` `GameServers` on them.
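A hedged sketch of the comparison this implies (the helper name is hypothetical): the preference is simply the inverse of "Packed", favouring the `Node` with the fewest `Allocated` `GameServers` so load spreads across the statically sized cluster.

```go
// Sketch of the "Distributed" node preference: fewer Allocated
// GameServers wins, spreading load across the cluster. Illustrative
// helper, not part of the Agones API.
package main

import "fmt"

// distributedPreferred reports whether a Node with allocatedA Allocated
// GameServers should be preferred over one with allocatedB.
func distributedPreferred(allocatedA, allocatedB int) bool {
	return allocatedA < allocatedB // fewest Allocated wins
}

func main() {
	fmt.Println(distributedPreferred(0, 4)) // true: the emptier Node is preferred
}
```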
@@ -27,6 +27,13 @@ metadata:
spec:
# the number of GameServers to keep Ready or Allocated in this Fleet
replicas: 2
# defines how GameServers are organised across the cluster. Currently only affects Allocation, but will expand
# in future releases. Options include:
# "Packed" (default) is aimed at dynamic Kubernetes clusters, such as cloud providers, wherein we want to bin pack
# resources
# "Distributed" is aimed at static Kubernetes clusters, wherein we want to distribute resources across the entire
# cluster
scheduling: Packed
# a GameServer template - see:
# https://github.com/GoogleCloudPlatform/agones/blob/master/docs/gameserver_spec.md for all the options
strategy:
@@ -22,11 +22,27 @@ import (
)
const (
// Packed scheduling strategy will prioritise allocating GameServers
// on Nodes with the most Allocated, and then Ready GameServers
// to bin pack as many Allocated GameServers on a single node.
// This is most useful for dynamic Kubernetes clusters - such as on Cloud Providers.
// In future versions, this will also impact Fleet scale down, and Pod Scheduling.
Packed SchedulingStrategy = "Packed"
// Distributed scheduling strategy will prioritise allocating GameServers
// on Nodes with the least Allocated, and then Ready GameServers
// to distribute Allocated GameServers across many nodes.
// This is most useful for statically sized Kubernetes clusters - such as on physical hardware.
// In future versions, this will also impact Fleet scale down, and Pod Scheduling.
Distributed SchedulingStrategy = "Distributed"
// FleetGameServerSetLabel is the label that the name of the Fleet
// is set to on the GameServerSet the Fleet controls
FleetGameServerSetLabel = stable.GroupName + "/fleet"
)
type SchedulingStrategy string
// +genclient
// +genclient:noStatus
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
@@ -56,6 +72,8 @@ type FleetSpec struct {
Replicas int32 `json:"replicas"`
// Deployment strategy
Strategy appsv1.DeploymentStrategy `json:"strategy"`
// Scheduling strategy. Defaults to "Packed".
Scheduling SchedulingStrategy `json:"scheduling"`
// Template the GameServer template to apply for this Fleet
Template GameServerTemplateSpec `json:"template"`
}
@@ -105,6 +123,10 @@ func (f *Fleet) ApplyDefaults() {
f.Spec.Strategy.Type = appsv1.RollingUpdateDeploymentStrategyType
}
if f.Spec.Scheduling == "" {
f.Spec.Scheduling = Packed
}
if f.Spec.Strategy.Type == appsv1.RollingUpdateDeploymentStrategyType {
if f.Spec.Strategy.RollingUpdate == nil {
f.Spec.Strategy.RollingUpdate = &appsv1.RollingUpdateDeployment{}
@@ -60,11 +60,13 @@ func TestFleetApplyDefaults(t *testing.T) {
// gate
assert.EqualValues(t, "", f.Spec.Strategy.Type)
assert.EqualValues(t, "", f.Spec.Scheduling)
f.ApplyDefaults()
assert.Equal(t, appsv1.RollingUpdateDeploymentStrategyType, f.Spec.Strategy.Type)
assert.Equal(t, "25%", f.Spec.Strategy.RollingUpdate.MaxUnavailable.String())
assert.Equal(t, "25%", f.Spec.Strategy.RollingUpdate.MaxSurge.String())
assert.Equal(t, Packed, f.Spec.Scheduling)
}
func TestFleetUpperBoundReplicas(t *testing.T) {
@@ -20,7 +20,7 @@ import (
"sync"
"agones.dev/agones/pkg/apis/stable"
stablev1alpha1 "agones.dev/agones/pkg/apis/stable/v1alpha1"
"agones.dev/agones/pkg/apis/stable/v1alpha1"
"agones.dev/agones/pkg/client/clientset/versioned"
getterv1alpha1 "agones.dev/agones/pkg/client/clientset/versioned/typed/stable/v1alpha1"
"agones.dev/agones/pkg/client/informers/externalversions"
@@ -95,7 +95,7 @@ func NewController(
eventBroadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{Interface: kubeClient.CoreV1().Events("")})
c.recorder = eventBroadcaster.NewRecorder(scheme.Scheme, corev1.EventSource{Component: "fleetallocation-controller"})
kind := stablev1alpha1.Kind("FleetAllocation")
kind := v1alpha1.Kind("FleetAllocation")
wh.AddHandler("/mutate", kind, admv1beta1.Create, c.creationMutationHandler)
wh.AddHandler("/validate", kind, admv1beta1.Create, c.creationValidationHandler)
wh.AddHandler("/validate", kind, admv1beta1.Update, c.mutationValidationHandler)
@@ -120,7 +120,7 @@ func (c *Controller) Run(workers int, stop <-chan struct{}) error {
func (c *Controller) creationMutationHandler(review admv1beta1.AdmissionReview) (admv1beta1.AdmissionReview, error) {
c.logger.WithField("review", review).Info("creationMutationHandler")
obj := review.Request.Object
fa := &stablev1alpha1.FleetAllocation{}
fa := &v1alpha1.FleetAllocation{}
err := json.Unmarshal(obj.Raw, fa)
if err != nil {
@@ -157,10 +157,10 @@ func (c *Controller) creationMutationHandler(review admv1beta1.AdmissionReview)
}
// When a GameServer is deleted, the FleetAllocation should go with it
ref := metav1.NewControllerRef(gs, stablev1alpha1.SchemeGroupVersion.WithKind("GameServer"))
ref := metav1.NewControllerRef(gs, v1alpha1.SchemeGroupVersion.WithKind("GameServer"))
fa.ObjectMeta.OwnerReferences = append(fa.ObjectMeta.OwnerReferences, *ref)
fa.Status = stablev1alpha1.FleetAllocationStatus{GameServer: gs}
fa.Status = v1alpha1.FleetAllocationStatus{GameServer: gs}
newFA, err := json.Marshal(fa)
if err != nil {
@@ -191,7 +191,7 @@ func (c *Controller) creationMutationHandler(review admv1beta1.AdmissionReview)
func (c *Controller) creationValidationHandler(review admv1beta1.AdmissionReview) (admv1beta1.AdmissionReview, error) {
c.logger.WithField("review", review).Info("creationValidationHandler")
obj := review.Request.Object
fa := &stablev1alpha1.FleetAllocation{}
fa := &v1alpha1.FleetAllocation{}
if err := json.Unmarshal(obj.Raw, fa); err != nil {
return review, errors.Wrapf(err, "error unmarshalling original FleetAllocation json: %s", obj.Raw)
}
@@ -225,8 +225,8 @@ func (c *Controller) creationValidationHandler(review admv1beta1.AdmissionReview
func (c *Controller) mutationValidationHandler(review admv1beta1.AdmissionReview) (admv1beta1.AdmissionReview, error) {
c.logger.WithField("review", review).Info("mutationValidationHandler")
newFA := &stablev1alpha1.FleetAllocation{}
oldFA := &stablev1alpha1.FleetAllocation{}
newFA := &v1alpha1.FleetAllocation{}
oldFA := &v1alpha1.FleetAllocation{}
if err := json.Unmarshal(review.Request.Object.Raw, newFA); err != nil {
return review, errors.Wrapf(err, "error unmarshalling new FleetAllocation json: %s", review.Request.Object.Raw)
@@ -256,8 +256,8 @@ func (c *Controller) mutationValidationHandler(review admv1beta1.AdmissionReview
}
// allocate allocates a GameServer from a given Fleet
func (c *Controller) allocate(f *stablev1alpha1.Fleet, fam *stablev1alpha1.FleetAllocationMeta) (*stablev1alpha1.GameServer, error) {
var allocation *stablev1alpha1.GameServer
func (c *Controller) allocate(f *v1alpha1.Fleet, fam *v1alpha1.FleetAllocationMeta) (*v1alpha1.GameServer, error) {
var allocation *v1alpha1.GameServer
// can only allocate one at a time, as we don't want two separate processes
// trying to allocate the same GameServer to different clients
c.allocationMutex.Lock()
@@ -272,19 +272,19 @@ func (c *Controller) allocate(f *stablev1alpha1.Fleet, fam *stablev1alpha1.Fleet
return allocation, err
}
for _, gs := range gsList {
if gs.Status.State == stablev1alpha1.Ready && gs.ObjectMeta.DeletionTimestamp.IsZero() {
allocation = gs
break
}
switch f.Spec.Scheduling {
case v1alpha1.Packed:
allocation = findReadyGameServerForAllocation(gsList, packedComparator)
case v1alpha1.Distributed:
allocation = findReadyGameServerForAllocation(gsList, distributedComparator)
}
if allocation == nil {
return allocation, ErrNoGameServerReady
}
gsCopy := allocation.DeepCopy()
gsCopy.Status.State = stablev1alpha1.Allocated
gsCopy.Status.State = v1alpha1.Allocated
if fam != nil {
c.patchMetadata(gsCopy, fam)
@@ -300,7 +300,7 @@ func (c *Controller) allocate(f *stablev1alpha1.Fleet, fam *stablev1alpha1.Fleet
}
// patch the labels and annotations of an allocated GameServer with metadata from a FleetAllocation
func (c *Controller) patchMetadata(gs *stablev1alpha1.GameServer, fam *stablev1alpha1.FleetAllocationMeta) {
func (c *Controller) patchMetadata(gs *v1alpha1.GameServer, fam *v1alpha1.FleetAllocationMeta) {
// patch ObjectMeta labels
if fam.Labels != nil {
if gs.ObjectMeta.Labels == nil {