New Balancing algorithm (Peak) EWMA #1570

git001 · 2022-02-24T01:15:50Z

Your Feature Request

It would be nice to have a balancing Algorithm which uses the backend server round-trip time (RTT) for weighting of the backend servers.
There is the peak exponentially-weighted moving average (“peak EWMA”) algorithm which uses the RTT from health checks or normal requests to measure a "fast" backend.

Cite from https://twitter.github.io/finagle/guide/Clients.html#power-of-two-choices-p2c-peak-ewma

Power of Two Choices (P2C) + Peak EWMA [2]

Backed by the P2C distributor, Peak EWMA uses a moving average over an endpoint’s round-trip time (RTT) that is highly sensitive to peaks. This average is then weighted by the number of outstanding requests, effectively increasing our resolution per-request. It is designed to react to slow endpoints more quickly than least loaded by penalizing them when they exhibit slow response times. This load metric operates under the assumption that a loaded endpoint takes time to recover and so it is generally safe for the advertised load to incorporate an endpoint’s history. However, this assumption breaks down in the presence of long polling clients.

Here also the link of linkerd' description https://linkerd.io/2016/03/16/beyond-round-robin-load-balancing-for-latency/

As far as I know the haproxy code I think there will be several steps to implement this feature.

optional: add bc_rtt or server_rtt similar to fc_rtt https://github.com/haproxy/haproxy/blob/master/src/tcp_sample.c#L478
add a rtt to the server struct? or use the already created calc from https://github.com/haproxy/haproxy/blob/master/src/log.c#L2452 "%Td"
add to tcp-check that the rtt from the check request is set to the server.
calculate the weight based on rtt and requests in queue for the backend servers, I think this is the trickiest part.
add balance ewma or balance pwma

Any opinions on this?

What are you trying to do?

I try to keep the latency for the clients low even when some servers handle "slow" requests.

Output of `haproxy -vv`

Feature haproxy versions

The text was updated successfully, but these errors were encountered:

wtarreau · 2022-02-24T06:42:19Z

I didn't know about peak EWMA, and it may actually solve this old desire to automatically compute weights based on response time and load without creating an oscillator. But at first glance it seems to me that we don't need to create a new LB algo for this, but instead use this to constantly adjust weights so that it could be used with any LB algo (e.g. even consistent hash, thus allowing cache farms to smoothly adjust to the load). An in-depth lecture of the document is needed to get the whole picture, though. Thanks Alex!

git001 · 2022-12-09T13:01:40Z

That's a quite good hint in case the rtt is not available.

https://github.com/tower-rs/tower/blob/master/tower/src/load/peak_ewma.rs#L32

/// When no latency information has been measured for an endpoint, an arbitrary default
/// RTT of 1 second is used to prevent the endpoint from being overloaded before a
/// meaningful baseline can be established..

Let me write some questions for implementation here so that we have something like a brainstorming for the implementation phase.

Should be there any default value for "rtt" something like latency default 1ms
What value should be used when no rtt is available
- Health check duration?
- Time of first byte or first body?
How often should the weight be calculated (update rate)?

Let me try to create a simple picture for the flow.

flowchart TD;
  subgraph backends [backend]

  subgraph be_servers [servers]
      direction LR
      id3([Server 1]) -..- id4([Server N]);
  end

   subgraph requests [requests]
      id3([Server 1])
      id4([Server N]);
  end

  subgraph health [health_check]
      id3([Server 1])
      id4([Server N]);
  end

  requests --- |get rrt or duration and set weight| be_servers
  health --- |get rrt or duration and set weight| be_servers

end

git001 · 2023-05-03T12:11:28Z

added bc_rtt with this commit 5529c99

jaroslawr · 2023-05-30T09:48:49Z

Should this really be based on TCP-level RTT? I think the linkerd/finagle docs use "round-trip time" to mean "RPC round-trip time" so the time between:
A) when a RPC request started being sent, and
B) when RPC response has been completely received
This should be highly correlated with overall server load because the time it takes for CPUs to compute the RPC response is nearly always much much higher than the network latency and this time increases as CPUs become more busy. Meanwhile I would expect the TCP-level RTT exposed by the kernel to be dominated by network latency - the time the backend server spends processing the TCP segments using its CPU should be a negligible part of the overall measured RTT unless the server is under really really heavy load, so I would expect this to not vary that much with changing server load and hence be much worse for load balancing. I think this TCP RTT is updated e.g. when you receive an ACK after sending a TCP segment, so that it is updated several times even just in the process of sending the request.

git001 · 2023-05-30T11:27:27Z

@jaroslawr Well you are right, there should be more the one factor for the algorithm to decide which server should be used as you can see on the picture. The TCP-level RTT is just one factor, but this feature was not in HAProxy so I contributed this feature 😃 . I think the #1977 is a per-requirement before the (Peak) EWMA algorithm could be implemented, but that's just my opinion. 🤷‍♂️

git001 added the type: feature This issue describes a feature request / wishlist. label Feb 24, 2022

git001 mentioned this issue Mar 16, 2022

Make health checks able to reuse idle connections #1613

Open

git001 mentioned this issue Dec 22, 2022

Server weight modulation based on smoothed average measurement #1977

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New Balancing algorithm (Peak) EWMA #1570

New Balancing algorithm (Peak) EWMA #1570

git001 commented Feb 24, 2022

wtarreau commented Feb 24, 2022

git001 commented Dec 9, 2022 •

edited

git001 commented May 3, 2023

jaroslawr commented May 30, 2023

git001 commented May 30, 2023

New Balancing algorithm (Peak) EWMA #1570

New Balancing algorithm (Peak) EWMA #1570

Comments

git001 commented Feb 24, 2022

Your Feature Request

What are you trying to do?

Output of haproxy -vv

wtarreau commented Feb 24, 2022

git001 commented Dec 9, 2022 • edited

git001 commented May 3, 2023

jaroslawr commented May 30, 2023

git001 commented May 30, 2023

Output of `haproxy -vv`

git001 commented Dec 9, 2022 •

edited