Description
The qa alldevices test ignores location, so it often connects clients to devices on the other side of the world. As a result, we end up testing global internet routing between qa hosts and devices. We don't need to test this, because we expect users to connect to the closest available device rather than a distant one. So let's make the qa alldevices test location-aware and keep connections within the same region.
If we add a qa agent in London, associate each device with its closest qa agent, and balance devices across agents whose latency to the device is below a threshold, the 75 devices onchain as of 2025-12-23 are distributed as follows at each threshold:
| Threshold | fra-mn-qa01 | sgp-mn-qa01 | sfo-mn-qa01 | nyc-mn-qa01 | lon-mn-qa01 | ams-mn-qa01 | Total |
|---|---|---|---|---|---|---|---|
| 10ms | 8 | 10 | 11 | 24 | 6 | 16 | 75 |
| 25ms | 11 | 10 | 11 | 18 | 12 | 13 | 75 |
| 50ms | 12 | 10 | 14 | 15 | 12 | 12 | 75 |
| 100ms | 12 | 11 | 13 | 13 | 13 | 13 | 75 |
| 150ms | 12 | 12 | 13 | 13 | 12 | 13 | 75 |
A 50ms threshold gives reasonably good balancing: the busiest agent (nyc-mn-qa01) has 15 devices, so we can complete all tests in 15 batches (a recent run had 17 batches).
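As a quick check of the table, the sketch below recomputes the batch count for each threshold. It assumes each agent tests one device per batch, so the number of batches equals the largest per-agent device count. The names `COUNTS` and `batches_needed` are illustrative only, not existing code.

```python
# Devices per QA agent at each latency threshold, copied from the table above.
# Column order: fra, sgp, sfo, nyc, lon, ams.
COUNTS = {
    10: [8, 10, 11, 24, 6, 16],
    25: [11, 10, 11, 18, 12, 13],
    50: [12, 10, 14, 15, 12, 12],
    100: [12, 11, 13, 13, 13, 13],
    150: [12, 12, 13, 13, 12, 13],
}


def batches_needed(counts):
    """Assumes one device per agent per batch, so the busiest agent sets the count."""
    return max(counts)


for threshold_ms, counts in COUNTS.items():
    assert sum(counts) == 75  # every row covers all 75 devices
    spread = max(counts) - min(counts)
    print(f"{threshold_ms}ms: {batches_needed(counts)} batches, spread {spread}")
```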
Prerequisite:
- Add a GetLatency call to the QA agent
Testing process:
- When the alldevices test starts, call GetLatency for each client
- Create a list of devices for each host.
  - If there are multiple hosts with <50ms latency for that device, assign the device to the host with the fewest devices
  - Otherwise, associate each device with the host with the best latency
- Loop through n batches, where n is the largest number of devices associated with a host
  - Look at each host
    - If there are > 2 hosts, and the current host has 0 devices left to test, remove the current host
    - If there are 2 hosts, and the current host has 0 devices left to test, test the previously tested device again
  - Test all hosts in current batch
  - Look at each host
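The assignment and batching steps above could be sketched roughly as follows. This is a minimal illustration, not the actual test code. It assumes the latency measurements are already collected into a `{device: {host: ms}}` mapping (for example from the proposed GetLatency call), that ties between equally loaded hosts go to the lower-latency host, and that an exhausted host is dropped only while more than two hosts remain. The names `assign_devices` and `plan_batches` are hypothetical.

```python
THRESHOLD_MS = 50.0  # threshold proposed in the description


def assign_devices(latency_ms, threshold_ms=THRESHOLD_MS):
    """Map each device to a host. latency_ms is {device: {host: ms}}."""
    hosts = sorted({h for per_host in latency_ms.values() for h in per_host})
    assignment = {h: [] for h in hosts}
    for device in sorted(latency_ms):
        per_host = latency_ms[device]
        near = [h for h, ms in per_host.items() if ms < threshold_ms]
        if len(near) > 1:
            # Several hosts are close enough: pick the one with the fewest
            # devices so far (assumption: lower latency breaks ties).
            host = min(near, key=lambda h: (len(assignment[h]), per_host[h]))
        else:
            # Otherwise fall back to the host with the best latency.
            host = min(per_host, key=per_host.get)
        assignment[host].append(device)
    return assignment


def plan_batches(assignment, min_hosts=2):
    """Return a list of batches, each a {host: device} mapping."""
    queues = {h: list(devs) for h, devs in assignment.items() if devs}
    active = list(queues)
    last_tested = {}
    batches = []
    n = max((len(devs) for devs in queues.values()), default=0)
    for _ in range(n):
        batch = {}
        for host in list(active):
            if queues[host]:
                last_tested[host] = queues[host].pop(0)
                batch[host] = last_tested[host]
            elif len(active) > min_hosts:
                active.remove(host)  # host is done, enough hosts remain
            else:
                batch[host] = last_tested[host]  # re-test the previous device
        batches.append(batch)
    return batches
```

With made-up measurements for four devices and three hosts, `plan_batches(assign_devices(latency))` yields two batches: the second drops the exhausted host while three remain, then re-tests the previous device on the other exhausted host once only two are left. Because assignment is greedy, the order in which devices are processed affects the final balance; the sketch sorts device names only to keep the result deterministic.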