https://github.com/AlmaLinux/mirrors/blob/mirrors_service/src/backend/api/handlers.py#L73
When a client matches the network data service cone for any mirror, the offered mirror list is built using the following logic:
- Append the mirrors that match the client's ASN or subnet
- Sort all mirrors by distance, filter out the mirrors already matched in the first step, and extend the list to the desired minimum length (LENGTH_CLOUD_MIRRORS_LIST)
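To make the two steps concrete, here is a minimal sketch of the logic as described above. It is not the code from handlers.py: the function name, the parameters, the use of plain strings for mirrors, and the value of LENGTH_CLOUD_MIRRORS_LIST are all assumptions for illustration.

```python
# Assumed value for illustration; the real constant lives in the service config.
LENGTH_CLOUD_MIRRORS_LIST = 5


def build_mirror_list_deterministic(network_matches, mirrors_by_distance):
    """Hypothetical helper mirroring the two steps described above.

    network_matches: mirrors matching the client's ASN or subnet.
    mirrors_by_distance: all mirrors, already sorted nearest first.
    """
    # Step 1: start with the mirrors that match on network data.
    suitable_mirrors = list(network_matches)

    # Step 2: pad with the nearest mirrors that were not already matched.
    matched = set(suitable_mirrors)
    nearby = [m for m in mirrors_by_distance if m not in matched]
    missing = max(0, LENGTH_CLOUD_MIRRORS_LIST - len(suitable_mirrors))
    suitable_mirrors.extend(nearby[:missing])
    return suitable_mirrors
```

With identical inputs, every call returns the same mirrors in the same order, so every client inside the same service cone receives the same list.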
The issue is that this process is completely deterministic, so there is no load balancing for clients that fall within any mirror's network service cone. Neither the initial suitable_mirrors list nor the geographic extension that pads the list to the minimum number of offered mirrors is shuffled.
Failure modes:
- A particularly large install base of hosts relies on a local mirror. If this mirror goes offline, then until the health checkers disqualify it, ALL clients on the local subnet are steered to the same next-closest public mirror.
- An EXTREMELY large install base of hosts with two local mirrors cannot load balance between them. The mirrors are always returned in the same order (probably the order in which they happened to be added to the mirror database).
Recommendation:
- Shuffle the list of mirrors that match on network data before the geographically based extension of the list.
- Generate a longer geographic extension of the list, shuffle it, and then use it to extend the original list, i.e. find the 2 * LENGTH_CLOUD_MIRRORS_LIST closest mirrors, shuffle this list, then suitable_mirrors.extend(nearby_mirrors_shuffled[:LENGTH_CLOUD_MIRRORS_LIST - len(suitable_mirrors)])
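A minimal sketch of both recommendations, assuming mirrors are hashable values and the distance-sorted list is already available. The function name, the parameters, the optional rng argument, and the value of LENGTH_CLOUD_MIRRORS_LIST are hypothetical and are not taken from handlers.py.

```python
import random

# Assumed value for illustration; the real constant lives in the service config.
LENGTH_CLOUD_MIRRORS_LIST = 5


def build_mirror_list_shuffled(network_matches, mirrors_by_distance, rng=random):
    """Hypothetical helper: shuffle both the matched and the nearby mirrors.

    network_matches: mirrors matching the client's ASN or subnet.
    mirrors_by_distance: all mirrors, already sorted nearest first.
    rng: the random module or a random.Random instance (useful in tests).
    """
    # Recommendation 1: shuffle the network-matched mirrors so that
    # multiple local mirrors share the load.
    suitable_mirrors = list(network_matches)
    rng.shuffle(suitable_mirrors)

    # Recommendation 2: take a pool twice the target length of the
    # closest remaining mirrors and shuffle it before extending.
    matched = set(suitable_mirrors)
    candidates = [m for m in mirrors_by_distance if m not in matched]
    nearby_mirrors_shuffled = candidates[:2 * LENGTH_CLOUD_MIRRORS_LIST]
    rng.shuffle(nearby_mirrors_shuffled)

    # Slice before extending: list.extend() returns None, so the slice
    # has to be applied to the argument, not to the result of the call.
    missing = max(0, LENGTH_CLOUD_MIRRORS_LIST - len(suitable_mirrors))
    suitable_mirrors.extend(nearby_mirrors_shuffled[:missing])
    return suitable_mirrors
```

The network-matched mirrors still come first, so local mirrors keep their priority. Only the order within each group varies per request, which spreads fallback traffic across several nearby public mirrors when a local mirror is down.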