Commit 55b1738

Authored Oct 24, 2023
Add event batch processing results and rerun reconfig test (#1186) (#1188)
Cherry Picking #1186 onto 1.0 release
1 parent 36bc032 commit 55b1738

3 files changed

+102
-75
lines changed


tests/reconfig/results/1.0.0/1.0.0.md

+78
@@ -0,0 +1,78 @@
1+
# Reconfiguration testing Results
2+
3+
<!-- TOC -->
- [Reconfiguration testing Results](#reconfiguration-testing-results)
  - [Test environment](#test-environment)
  - [Results Tables](#results-tables)
    - [NGINX Reloads and Time to Ready](#nginx-reloads-and-time-to-ready)
    - [Event Batch Processing](#event-batch-processing)
  - [NumResources -> Total Resources](#numresources---total-resources)
  - [Observations](#observations)
<!-- TOC -->
12+
13+
## Test environment
14+
15+
GKE cluster:
16+
17+
- Node count: 3
18+
- Instance Type: e2-medium
19+
- k8s version: 1.27.3-gke.100
20+
- Zone: us-central1-c
21+
- Total vCPUs: 6
22+
- Total RAM: 12GB
23+
- Max pods per node: 110
24+
25+
NGF deployment:
26+
27+
- NGF version: edge - git commit 29b45e38bacd7c4f22834938105e3cda4f29f6d1
28+
- NGINX Version: 1.25.2
29+
30+
## Results Tables
31+
32+
### NGINX Reloads and Time to Ready
33+
34+
| Test number | NumResources | TimeToReadyTotal (s) | TimeToReadyAvgSingle (s) | NGINX reloads | NGINX reload avg time (ms) | <= 500ms | <= 1000ms |
|-------------|--------------|----------------------|--------------------------|---------------|----------------------------|----------|-----------|
| 1           | 30           | 1                    | 1                        | 2             | 191                        | 100%     | 100%      |
| 1           | 150          | 2                    | 2                        | 2             | 440                        | 50%      | 100%      |
| 2           | 30           | 50                   | <1                       | 93            | 162                        | 100%     | 100%      |
| 2           | 150          | 208                  | <1                       | 396           | 281                        | 96.46%   | 100%      |
| 3           | 30           | 1                    | 1                        | 93            | 129                        | 100%     | 100%      |
| 3           | 150          | 1                    | 1                        | 453           | 130                        | 100%     | 100%      |
42+
43+
44+
### Event Batch Processing
45+
46+
| Test number | NumResources | Event Batch Total | Event Batch Processing avg time (ms) | <= 500ms | <= 1000ms |
|-------------|--------------|-------------------|--------------------------------------|----------|-----------|
| 1           | 30           | 69                | 6.232                                | 100%     | 100%      |
| 1           | 150          | 309               | 3.638                                | 99.68%   | 100%      |
| 2           | 30           | 465               | 38.759                               | 100%     | 100%      |
| 2           | 150          | 1941              | 68.539                               | 98.51%   | 100%      |
| 3           | 30           | 374               | 36.834                               | 99.73%   | 99.73%    |
| 3           | 150          | 1812              | 40.411                               | 99.94%   | 99.94%    |
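
The `<= 500ms` and `<= 1000ms` columns give the share of observations at or below each threshold. A minimal sketch of that calculation, assuming the figures are derived from cumulative histogram bucket counts; `bucket_share` is a hypothetical helper, and the counts in the usage line are illustrative rather than taken from the test runs:

```python
def bucket_share(cumulative_bucket_count, total_count):
    """Return the percentage of observations at or below a bucket's upper bound.

    cumulative_bucket_count: observations with a value <= the bucket bound
                             (histogram buckets are assumed to be cumulative).
    total_count:             all observations recorded by the histogram.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= cumulative_bucket_count <= total_count:
        raise ValueError("bucket count must be between 0 and total_count")
    return round(100.0 * cumulative_bucket_count / total_count, 2)


# Illustrative counts: 373 of 374 batches finishing within the threshold.
print(bucket_share(373, 374))  # 99.73
```

A value below 100% in the `<= 1000ms` column therefore means at least one observation took longer than one second.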
54+
55+
56+
## NumResources -> Total Resources
57+
| NumResources | Gateways | Secrets | ReferenceGrants | Namespaces | application Pods | application Services | HTTPRoutes | Total Resources |
| ------------ | -------- | ------- | --------------- | ---------- | ---------------- | -------------------- | ---------- | --------------- |
| x            | 1        | 1       | 1               | x+1        | 2x               | 2x                   | 3x         | <total>         |
| 30           | 1        | 1       | 1               | 31         | 60               | 60                   | 90         | 244             |
| 150          | 1        | 1       | 1               | 151        | 300              | 300                  | 450        | 1204            |
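
The `x` row of the table can be checked with a few lines of arithmetic. The sketch below simply encodes the per-column formulas from that row and sums them, which works out to `8x + 4`; the function name `total_resources` is hypothetical:

```python
def total_resources(x):
    """Return the per-kind resource counts for a run with NumResources = x."""
    counts = {
        "Gateways": 1,
        "Secrets": 1,
        "ReferenceGrants": 1,
        "Namespaces": x + 1,
        "application Pods": 2 * x,
        "application Services": 2 * x,
        "HTTPRoutes": 3 * x,
    }
    # The total is computed before the key is added, so it is not double counted.
    counts["Total Resources"] = sum(counts.values())
    return counts


for x in (30, 150):
    print(x, total_resources(x)["Total Resources"])
# 30 244
# 150 1204
```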
62+
63+
## Observations
64+
65+
1. We are reloading after reconciling a ReferenceGrant even when there is no Gateway. This is because we treat every
   upsert/delete of a ReferenceGrant as a change. This means we will regenerate NGINX config every time a ReferenceGrant
   is created, updated (generation must change), or deleted, even if it does not apply to the accepted Gateway.
68+
69+
Issue filed: https://github.com/nginxinc/nginx-gateway-fabric/issues/1124
70+
71+
2. We are reloading after reconciling an HTTPRoute even when there is no accepted Gateway and no config being generated.
72+
73+
Issue filed: https://github.com/nginxinc/nginx-gateway-fabric/issues/1123
74+
75+
3. The majority of NGINX reloads fell in the <= 500ms bucket, and all of them fell in the <= 1000ms bucket. Reload time
   increased with the number of configured resources that result in NGINX configuration changes.
77+
78+
4. No errors (NGF or NGINX) were observed in any test run.

tests/reconfig/results/v1.0.0.md

-61
This file was deleted.

tests/reconfig/setup.md

+24-14
@@ -13,8 +13,8 @@
1313

1414
## Goals
1515

16-
- Measure how long it takes NGF to reconfigure NGINX when a number of Gateway API and referenced core Kubernetes
17-
resources are created at once.
16+
- Measure how long it takes NGF to reconfigure NGINX and update statuses when a number of Gateway API and
17+
referenced core Kubernetes resources are created at once.
1818
- Two runs of each test should be run with differing numbers of resources. Each run will deploy:
1919
- a single Gateway, Secret, and ReferenceGrant
2020
- `x+1` number of namespaces
@@ -38,7 +38,8 @@
3838
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v0.8.1/standard-install.yaml
3939
```
4040

41-
3. Deploy NGF from edge using Helm install (NOTE: For Test 1, deploy AFTER resources):
41+
3. Deploy NGF from edge using Helm install and wait for LoadBalancer Service to be ready
42+
(NOTE: For Test 1, deploy AFTER resources):
4243

4344
```console
4445
helm install my-release oci://ghcr.io/nginxinc/charts/nginx-gateway-fabric --version 0.0.0-edge \
@@ -65,10 +66,20 @@
6566
kubectl port-forward $GW_POD -n nginx-gateway 9113:9113 &
6667
```
6768

68-
6. Measure Time To Ready as described in each test, get the reload count, and get the average NGINX reload duration.
69-
The average reload duration can be computed by taking the `nginx_gateway_fabric_nginx_reloads_milliseconds_sum`
70-
metric value and dividing it by the `nginx_gateway_fabric_nginx_reloads_milliseconds_count` metric value.
71-
7. For accuracy, repeat the test suite once or twice, take the averages, and look for any anomalies or outliers.
69+
6. Measure NGINX Reloads and Time to Ready Results
70+
1. TimeToReadyTotal as described in each test - NGF logs.
71+
2. TimeToReadyAvgSingle which is the average time between updating any resource and the
72+
NGINX configuration being reloaded - NGF logs.
73+
3. NGINX Reload count - metrics.
74+
4. Average NGINX reload duration - metrics.
75+
1. The average reload duration can be computed by taking the `nginx_gateway_fabric_nginx_reloads_milliseconds_sum`
76+
metric value and dividing it by the `nginx_gateway_fabric_nginx_reloads_milliseconds_count` metric value.
77+
7. Measure Event Batch Processing Results
78+
1. Event Batch Total - metrics.
79+
2. Average Event Batch Processing duration - metrics.
80+
1. The average event batch processing duration can be computed by taking the `nginx_gateway_fabric_event_batch_processing_milliseconds_sum`
81+
metric value and dividing it by the `nginx_gateway_fabric_event_batch_processing_milliseconds_count` metric value.
82+
8. For accuracy, repeat the test suite once or twice, take the averages, and look for any anomalies or outliers.
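
The two averages in steps 6 and 7 are each a `_sum` value divided by the matching `_count` value. A minimal sketch of that division over scraped metrics text follows. It assumes the metrics are exposed in the Prometheus text format; the helper names, the `class="nginx"` label, and the sample values are illustrative assumptions, not output from a real run:

```python
import re


def metric_value(metrics_text, name):
    """Return the value of the first sample named `name` in Prometheus text format."""
    # Match the exact metric name, an optional {label="..."} set, then the value.
    pattern = re.compile(
        "^" + re.escape(name) + r"(?:\{[^}]*\})?\s+(\S+)", re.MULTILINE
    )
    match = pattern.search(metrics_text)
    if match is None:
        raise KeyError(name)
    return float(match.group(1))


def average_ms(metrics_text, base_name):
    """Average duration in milliseconds: <base_name>_sum / <base_name>_count."""
    total = metric_value(metrics_text, base_name + "_sum")
    count = metric_value(metrics_text, base_name + "_count")
    if count == 0:
        raise ValueError("no observations recorded for " + base_name)
    return total / count


# Illustrative sample only; the label set and values are assumptions.
SAMPLE = """\
# TYPE nginx_gateway_fabric_nginx_reloads_milliseconds histogram
nginx_gateway_fabric_nginx_reloads_milliseconds_sum{class="nginx"} 382
nginx_gateway_fabric_nginx_reloads_milliseconds_count{class="nginx"} 2
"""

print(average_ms(SAMPLE, "nginx_gateway_fabric_nginx_reloads_milliseconds"))  # 191.0
```

The same `average_ms` call works for the event batch average by passing `nginx_gateway_fabric_event_batch_processing_milliseconds` as the base name.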
7283

7384
## Tests
7485

@@ -79,8 +90,8 @@
7990
e.g. `cd scripts && bash create-resources-gw-last.sh 30`. The script will deploy backend apps and services, wait
8091
60 seconds for them to be ready, and deploy 1 Gateway, 1 RefGrant, 1 Secret, and HTTPRoutes.
8192
2. Deploy NGF
82-
3. Check logs for time it takes from start-up -> config written and NGINX reloaded. Get reload count and average reload
83-
duration from metrics and logs.
93+
3. Measure TimeToReadyTotal as the time it takes from start-up -> config written and
94+
NGINX reloaded. Measure the other results as described in steps 6-7 of the [Setup](#setup) section.
8495

8596
### Test 2: Start NGF, deploy Gateway, create many resources attached to GW
8697

@@ -89,9 +100,8 @@
89100
2. Run the provided script with the required number of resources,
90101
e.g. `cd scripts && bash create-resources-routes-last.sh 30`. The script will deploy backend apps and services,
91102
wait 60 seconds for them to be ready, and deploy 1 Gateway, 1 Secret, 1 RefGrant, and HTTPRoutes at the same time.
92-
3. Check logs for time it takes from NGF receiving first resource update -> final config written, and NGINX's final
93-
reload. Check logs for average individual HTTPRoute TTR also. Get reload count and average reload duration from
94-
metrics and logs.
103+
3. Measure TimeToReadyTotal as the time it takes from NGF receiving the first HTTPRoute resource update -> final
104+
config written and NGINX reloaded. Measure the other results as described in steps 6-7 of the [Setup](#setup) section.
95105

96106
### Test 3: Start NGF, create many resources attached to a Gateway, deploy the Gateway
97107

@@ -101,5 +111,5 @@
101111
e.g. `cd scripts && bash create-resources-gw-last.sh 30`.
102112
The script will deploy the namespaces, backend apps and services, 1 Secret, 1 ReferenceGrant, and the HTTPRoutes;
103113
wait 60 seconds for the backend apps to be ready, and then deploy 1 Gateway for all HTTPRoutes.
104-
3. Check logs for time it takes from NGF receiving gateway resource -> config written and NGINX reloaded. Get reload
105-
count and average reload duration from metrics and logs.
114+
3. Measure TimeToReadyTotal as the time it takes from NGF receiving gateway resource -> config written and NGINX reloaded.
115+
Measure the other results as described in steps 6-7 of the [Setup](#setup) section.
