Updates allocation load testing documentation (#2883)

* Updates allocation load testing documentation

Co-authored-by: Robert Bailey <robertbailey@google.com>
googleforgames · Jan 5, 2023 · 73da855
1 parent 4a86106
commit 73da855

Showing 5 changed files with 122 additions and 110 deletions.
diff --git a/site/content/en/docs/Advanced/allocator-service.md b/site/content/en/docs/Advanced/allocator-service.md
@@ -53,7 +53,7 @@ If the `agones-allocator` service is installed as a `LoadBalancer` [using a rese
 
 ```bash
 EXTERNAL_IP=$(kubectl get services agones-allocator -n agones-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
-helm upgrade --install --wait \
+helm upgrade my-release agones/agones -n agones-system --wait \
    --set agones.allocator.service.loadBalancerIP=${EXTERNAL_IP} \
    ...
 ```
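If the jsonpath lookup in the snippet above returns nothing (for example, because the load balancer has not been assigned an address yet), `EXTERNAL_IP` is empty and `--set agones.allocator.service.loadBalancerIP=` receives an empty value. A minimal sketch of a guard follows; the helper name `is_ipv4` is hypothetical and is not part of Agones, Helm, or kubectl.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  local ip="$1" octet
  local -a octets
  # Require four groups of 1-3 digits separated by dots.
  [[ "$ip" =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || return 1
  IFS='.' read -r -a octets <<< "$ip"
  for octet in "${octets[@]}"; do
    # Force base 10 so a leading zero is not parsed as octal.
    (( 10#$octet <= 255 )) || return 1
  done
}
```

Before running `helm upgrade`, a check such as `is_ipv4 "$EXTERNAL_IP" || { echo "no external IP yet" >&2; exit 1; }` would stop the script early instead of applying an empty address.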

diff --git a/test/load/allocation/README.md b/test/load/allocation/README.md
@@ -1,50 +1,45 @@
-# Load Test for gRPC allocation service
+# Load Tests for gRPC Allocation Service
 
-This load tests aims to validate performance of the gRPC allocation service.
+This directory contains the [Allocation Load Test](#allocation-load-test) and [Scenario Tests](#scenario-tests)
+for testing the performance of the gRPC allocation service.
 
-## Kubernetes Cluster Setup
+## Prerequisites
 
-For the load test you can follow the regular Kubernetes and Agones setup. In order to test the allocation performance in isolation, we let Agones to get all
-the game servers to Ready state before starting a test.
-Here are the few important things:
-- If you are running in GCP, use a regional cluster instead of a zonal cluster to ensure high availability of the cluster control plane
-- Use a dedicated node pool for the Agones controllers with multiple CPUs per node, e.g. `e2-standard-4'
-- In the default node pool (where the Game Server pods are created), 75 nodes are required to make sure there are enough nodes available for all game servers to move into the ready state. When using a regional cluster, with three zones with the region, that will require a configuration of 25 nodes per zone.
+1. A [Kubernetes cluster](https://agones.dev/site/docs/installation/creating-cluster/) with [Agones](https://agones.dev/site/docs/installation/install-agones/)
+    - We recommend installing Agones using the [Helm](https://agones.dev/site/docs/installation/install-agones/helm/) package manager.
+    - If you are running in GCP, use a regional cluster instead of a zonal cluster to ensure high availability of the cluster control plane.
+    - Use a dedicated node pool for the Agones controllers with multiple CPUs per node, e.g. `e2-standard-4`.
+    - For Allocation Load Test:
+      - In the default node pool (where the Game Server pods are created), 75 nodes are required so that all game servers can move into the `Ready` state. When using a regional GKE cluster with three zones, that requires a configuration of 25 nodes per zone.
+    - For Scenario Tests:
+      - See [Kubernetes Cluster Setup for Scenario Tests](#kubernetes-cluster-setup-for-scenario-tests)
+2. A configured [Allocator Service](https://agones.dev/site/docs/advanced/allocator-service/)
+    - The allocator service uses gRPC. In order to be able to call the service, TLS
+and mTLS have to be set up on the Server and Client.
+3. (Optional) [Metrics](https://agones.dev/site/docs/guides/metrics/) for monitoring Agones workloads
 
-## Fleet Setting
+# Allocation Load Test
 
-We used the sample [fleet configuration](./fleet.yaml) with some minor modifications. We updated the `replicas` to 4000.
-Also we set the `automaticShutdownDelaySec` parameter to 10 so simple-game-server game servers shutdown after 10
-minutes (see below).
-This helps to easily re-run the test without having to delete the game servers and allows to run tests continously.
-
-```yaml
-apiVersion: "agones.dev/v1"
-kind: Fleet
- ...
- spec:
-  # the number of GameServers to keep Ready
-  replicas: 4000
-  ...
-        # The GameServer's Pod template
-        template:
-            spec:
-                containers:
-                - args:
-                  # We setup the simple-game-server server to shutdown 10 mins after allocation
-                  - -automaticShutdownDelaySec=600
-                  image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.14
-                  name: simple-game-server
-  ...
-```
+This load test aims to validate performance of the gRPC allocation service.
 
-## Configuring the Allocator Service
+## Fleet Setting
+
+We used the sample [fleet configuration](./fleet.yaml). We set the `automaticShutdownDelaySec` parameter to 600 so that simple-game-server game servers shut down after 10
+minutes. This makes it easy to re-run the test without having to delete the game servers, and allows tests to run continuously.
 
-The allocator service uses gRPC. In order to be able to call the service, TLS and mTLS has to be setup.
-For more information visit [Allocator Service](https://agones.dev/site/docs/advanced/allocator-service/).
 
 ## Running the test
 
+```
+kubectl apply -f ./fleet.yaml
+```
+Wait until the fleet shows 4000 ready game servers before running the allocation script.
+```
+kubectl get fleet
+NAME              SCHEDULING   DESIRED   CURRENT   ALLOCATED   READY   AGE
+load-test-fleet   Packed       4000      4000      0           4000    2m38s
+```
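The wait described above can also be scripted rather than checked by hand. The following is a sketch only: the function name `wait_for_ready_fleet` and its retry parameters are hypothetical, it assumes the fleet lives in the current namespace, and it reads the ready count from the Fleet's `status.readyReplicas` field.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll a Fleet until its ready replica count reaches the target.
# Usage: wait_for_ready_fleet <fleet-name> <target> [attempts] [sleep-seconds]
wait_for_ready_fleet() {
  local fleet="$1" target="$2" attempts="${3:-60}" delay="${4:-10}"
  local ready i
  for (( i = 1; i <= attempts; i++ )); do
    # Read the ready count from the Fleet status; treat an empty result as 0.
    ready="$(kubectl get fleet "$fleet" -o jsonpath='{.status.readyReplicas}' 2>/dev/null || true)"
    ready="${ready:-0}"
    echo "attempt ${i}/${attempts}: ${ready}/${target} ready"
    if (( ready >= target )); then
      return 0
    fi
    sleep "$delay"
  done
  echo "fleet ${fleet} did not reach ${target} ready game servers" >&2
  return 1
}
```

With the fleet name from the sample output, `wait_for_ready_fleet load-test-fleet 4000 && ./runAllocation.sh 40 10` would start the allocation run only once all game servers are ready.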
+
 You can use the provided runAllocation.sh script by providing two parameters:
 - number of clients (to do parallel allocations)
 - number of allocations per client
@@ -76,15 +71,15 @@ TESTRUNSCOUNT=1 ./runAllocation.sh 40 10
 ```
 
 
-# Running Scenario tests
+# Scenario Tests
 
 The scenario test allows you to generate a variable number of allocations to
 your cluster over time, simulating a game where clients arrive in an unsteady
 pattern. The game servers used in the test are configured to shut down after
 being allocated, simulating the GameServer churn that is expected during
 normal game play.
 
-## Kubernetes Cluster Setup
+## Kubernetes Cluster Setup for Scenario Tests
 
 For the scenario test to achieve high throughput, you can create multiple groups
 of nodes in your cluster. During testing (on GKE), we created a node pool for
@@ -108,25 +103,28 @@ availability of the cluster control plane.
 The following commands were used to construct a cluster for testing:
 
 ```bash
-gcloud container clusters create scenario-test --cluster-version=1.21 \
+export REGION="us-west1"
+export VERSION="1.23"
+
+gcloud container clusters create scenario-test --cluster-version=$VERSION \
   --tags=game-server --scopes=gke-default --num-nodes=2 \
   --no-enable-autoupgrade --machine-type=n2-standard-2 \
-  --region=us-west1 --enable-ip-alias --cluster-ipv4-cidr 10.0.0.0/10
+  --region=$REGION --enable-ip-alias
 
 gcloud container node-pools create kube-system --cluster=scenario-test \
   --no-enable-autoupgrade \
   --node-taints components.gke.io/gke-managed-components=true:NoExecute \
-  --num-nodes=1 --machine-type=n2-standard-16 --region us-west1
+  --num-nodes=1 --machine-type=n2-standard-16 --region $REGION
 
 gcloud container node-pools create agones-system --cluster=scenario-test \
   --no-enable-autoupgrade --node-taints agones.dev/agones-system=true:NoExecute \
   --node-labels agones.dev/agones-system=true --num-nodes=1 \
-  --machine-type=n2-standard-16 --region us-west1
+  --machine-type=n2-standard-16 --region $REGION
 
 gcloud container node-pools create game-servers --cluster=scenario-test \
   --node-taints scenario-test.io/game-servers=true:NoExecute --num-nodes=1 \
   --machine-type n2-standard-2 --no-enable-autoupgrade \
-  --region us-west1 --tags=game-server --scopes=gke-default \
+  --region $REGION --tags=game-server --scopes=gke-default \
   --enable-autoscaling --max-nodes=300 --min-nodes=175
 ```
 
@@ -156,64 +154,7 @@ by running [`./configure-agones.sh`](configure-agones.sh).
 
 ## Fleet Setting
 
-We used the following fleet configuration:
-
-```
-apiVersion: "agones.dev/v1"
-kind: Fleet
-metadata:
-  name: scenario-test
-spec:
-  replicas: 10
-  template:
-    metadata:
-      labels:
-        gameName: simple-game-server
-    spec:
-      ports:
-      - containerPort: 7654
-        name: default
-      health:
-        initialDelaySeconds: 30
-        periodSeconds: 60
-      template:
-        spec:
-          tolerations:
-          - effect: NoExecute
-            key: scenario-test.io/game-servers
-            operator: Equal
-            value: 'true'
-          containers:
-          - name: simple-game-server
-            image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.14
-            args:
-            - -automaticShutdownDelaySec=60
-            - -readyIterations=10
-            resources:
-              limits:
-                cpu: 20m
-                memory: 24Mi
-              requests:
-                cpu: 20m
-                memory: 24Mi
-```
-
-and fleet autoscaler configuration:
-
-```
-apiVersion: "autoscaling.agones.dev/v1"
-kind: FleetAutoscaler
-metadata:
-  name: fleet-autoscaler-scenario-test
-spec:
-  fleetName: scenario-test
-  policy:
-    type: Buffer
-    buffer:
-      bufferSize: 2000
-      minReplicas: 10000
-      maxReplicas: 20000
-```
+We used the sample [fleet configuration](./scenario-fleet.yaml) and [fleet autoscaler configuration](./autoscaler.yaml).
 
 To reduce pod churn in the system, the simple game servers are configured to
 return themselves to `Ready` after being allocated the first 10 times following
@@ -223,12 +164,6 @@ integration pattern. After 10 simulated game sessions, the simple game servers
 then exit automatically. The fleet configuration above sets each game session to
 last for 1 minute, representing a short game.
 
-## Configuring the Allocator Service
-
-The allocator service uses gRPC. In order to be able to call the service, TLS
-and mTLS have to be set up. For more information visit
-[Allocator Service](https://agones.dev/site/docs/advanced/allocator-service/).
-
 ## Running the test
 
 You can use the provided runScenario.sh script by providing one parameter (a

diff --git a/test/load/allocation/autoscaler.yaml b/test/load/allocation/autoscaler.yaml
@@ -0,0 +1,26 @@
+# Copyright 2023 Google LLC All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+apiVersion: "autoscaling.agones.dev/v1"
+kind: FleetAutoscaler
+metadata:
+  name: fleet-autoscaler-scenario-test
+spec:
+  fleetName: scenario-test
+  policy:
+    type: Buffer
+    buffer:
+      bufferSize: 2000
+      minReplicas: 10000
+      maxReplicas: 20000
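To make these numbers concrete: with an absolute `bufferSize`, the Buffer policy aims for the number of allocated game servers plus the buffer, bounded by `minReplicas` and `maxReplicas`. The sketch below reproduces only that arithmetic with the values from this file as defaults. The function name `desired_replicas` is hypothetical, and the real autoscaler's behavior may differ in details.

```bash
#!/usr/bin/env bash
# Hypothetical helper: target fleet size for a Buffer policy with an absolute buffer.
# Usage: desired_replicas <allocated> [bufferSize] [minReplicas] [maxReplicas]
desired_replicas() {
  local allocated="$1" buffer="${2:-2000}" min="${3:-10000}" max="${4:-20000}"
  # Keep `buffer` game servers available on top of the allocated ones.
  local want=$(( allocated + buffer ))
  # Clamp the result to the configured bounds.
  if (( want < min )); then want="$min"; fi
  if (( want > max )); then want="$max"; fi
  echo "$want"
}
```

Under these assumptions, `desired_replicas 0` prints `10000` (the minimum dominates), `desired_replicas 9000` prints `11000`, and `desired_replicas 19000` prints `20000` (capped at the maximum).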
diff --git a/test/load/allocation/scenario-fleet.yaml b/test/load/allocation/scenario-fleet.yaml
@@ -0,0 +1,51 @@
+# Copyright 2023 Google LLC All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+apiVersion: "agones.dev/v1"
+kind: Fleet
+metadata:
+  name: scenario-test
+spec:
+  replicas: 10
+  template:
+    metadata:
+      labels:
+        gameName: simple-game-server
+    spec:
+      ports:
+      - containerPort: 7654
+        name: default
+      health:
+        initialDelaySeconds: 30
+        periodSeconds: 60
+      template:
+        spec:
+          tolerations:
+          - effect: NoExecute
+            key: scenario-test.io/game-servers
+            operator: Equal
+            value: 'true'
+          containers:
+          - name: simple-game-server
+            image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.14
+            args:
+            - -automaticShutdownDelaySec=60
+            - -readyIterations=10
+            resources:
+              limits:
+                cpu: 20m
+                memory: 24Mi
+              requests:
+                cpu: 20m
+                memory: 24Mi
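The per-pod requests above are small, but they add up at the fleet sizes used in the scenario test. A rough sizing sketch follows, assuming only the `20m` CPU and `24Mi` memory requests from this file; the function name `fleet_requests` is hypothetical, and the result ignores system pods and per-node allocatable overhead.

```bash
#!/usr/bin/env bash
# Hypothetical helper: total resource requests for a number of game server pods.
# Usage: fleet_requests <replicas> [cpu-millicores-per-pod] [memory-Mi-per-pod]
fleet_requests() {
  local replicas="$1" cpu_milli="${2:-20}" mem_mi="${3:-24}"
  local total_cpu_milli=$(( replicas * cpu_milli ))
  local total_mem_mi=$(( replicas * mem_mi ))
  # Print totals in the same units Kubernetes uses for requests.
  echo "cpu=${total_cpu_milli}m memory=${total_mem_mi}Mi"
}
```

For the autoscaler's minimum of 10000 replicas, `fleet_requests 10000` prints `cpu=200000m memory=240000Mi`, i.e. 200 cores and roughly 234 GiB of requests; at the maximum of 20000 replicas both figures double.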
diff --git a/test/load/allocation/variable.txt b/test/load/allocation/variable.txt
@@ -11,7 +11,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-
+#
 ### Varying allocations
 #Duration,Number_of_clients/allocations
 # run for 20 mins with 10 clients