feat: PeakEWMA load balancer with response success rate #2274
Use supermonkey to mock `time.Now` instead of a function variable
Use `Host.Weight()` as factor
Make random chosen at most once for each host (cherry picked from commit d8f4988)
# Conflicts:
#	pkg/upstream/cluster/host.go
#	pkg/upstream/cluster/mock_test.go
#	pkg/upstream/cluster/stats.go
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## master #2274 +/- ##
==========================================
- Coverage 60.30% 60.29% -0.01%
==========================================
Files 422 423 +1
Lines 37220 37410 +190
==========================================
+ Hits 22446 22558 +112
- Misses 12540 12611 +71
- Partials 2234 2241 +7
... and 6 files with indirect coverage changes
Here is the formula for EWMA: [formula not preserved]. In this PR, the specific calculation method is given: [formula not preserved].
Issues associated with this PR
#2252
Solutions
Forked from #2253, with two differences.
Duration with EWMA
The first difference is a two-level EWMA implementation for the duration metric. The first level aggregates the samples of each one-second interval into an arithmetic mean; the second level decays those means with an exponentially weighted moving average (EWMA).
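The two-level idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type `twoLevelEWMA`, the smoothing factor `alpha`, and the helper `demo` are hypothetical names, and the sketch makes two simplifying assumptions (an idle window does not decay the value, and several elapsed windows are folded as one). Timestamps are passed in explicitly so the example does not depend on `time.Now`.

```go
package main

import (
	"fmt"
	"time"
)

// twoLevelEWMA is a hypothetical type for illustration only.
// Level 1: arithmetic mean of the samples inside one window.
// Level 2: EWMA over the per-window means.
type twoLevelEWMA struct {
	alpha  float64       // weight of the newest window mean (assumed parameter)
	window time.Duration // first-level aggregation interval
	start  time.Time     // start of the current window
	sum    float64       // sum of samples in the current window
	count  int           // number of samples in the current window
	value  float64       // current second-level EWMA
	seeded bool          // whether value holds at least one window mean
}

func newTwoLevelEWMA(alpha float64, window time.Duration, now time.Time) *twoLevelEWMA {
	return &twoLevelEWMA{alpha: alpha, window: window, start: now}
}

// roll closes the current window once it has elapsed and folds its
// arithmetic mean into the EWMA. Empty windows leave the value unchanged.
func (e *twoLevelEWMA) roll(now time.Time) {
	if now.Sub(e.start) < e.window {
		return
	}
	if e.count > 0 {
		mean := e.sum / float64(e.count)
		if !e.seeded {
			e.value, e.seeded = mean, true
		} else {
			e.value = e.alpha*mean + (1-e.alpha)*e.value
		}
	}
	e.start, e.sum, e.count = now, 0, 0
}

// Observe records one duration sample (any unit) at time now.
func (e *twoLevelEWMA) Observe(now time.Time, sample float64) {
	e.roll(now)
	e.sum += sample
	e.count++
}

// Value returns the decayed mean as of time now.
func (e *twoLevelEWMA) Value(now time.Time) float64 {
	e.roll(now)
	return e.value
}

// demo feeds two windows of samples and returns the value after each.
func demo() (first, second float64) {
	t0 := time.Unix(0, 0)
	e := newTwoLevelEWMA(0.5, time.Second, t0)
	e.Observe(t0, 10)
	e.Observe(t0.Add(500*time.Millisecond), 30) // window mean: 20
	first = e.Value(t0.Add(time.Second))
	e.Observe(t0.Add(1200*time.Millisecond), 40) // window mean: 40
	second = e.Value(t0.Add(2 * time.Second))
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println(first, second)
}
```

Averaging inside the window first means a burst of requests in one second counts as a single data point for the decay step, so the decay rate depends on elapsed time rather than on request volume.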
Success Rate with EWMA
The second difference is an additional success-rate metric, intended for fail-fast scenarios.
For example, in a mixed deployment, servers with 1, 2, and 4 cores (1C, 2C, 4C) exist at the same time, and a slot-based concurrency control algorithm manages the remaining available cores. When a 1C server is overloaded and starts failing fast, a load balancing algorithm based only on response time may mistakenly treat it as the best host and pass over the 2C or 4C servers (since they have many active connections). Factoring the response success rate into the calculation avoids this, because the balancer can then tell that the short response time comes from fast failure rather than from genuinely fast service.
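The scenario above can be illustrated with a small sketch. The scoring formula here is an assumption made for illustration, not the one used in this PR: cost is taken as `durationEWMA * (active + 1) / weight`, optionally divided by the success rate. The names `hostStats`, `score`, `pick`, `exampleHosts`, and the floor `minSuccessRate` are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// hostStats is a hypothetical snapshot of per-host metrics.
type hostStats struct {
	name         string
	durationEWMA float64 // decayed response time in milliseconds
	successRate  float64 // decayed success rate in [0, 1]
	active       int     // in-flight requests
	weight       float64 // static weight, assumed to be positive
}

// minSuccessRate is an assumed floor that avoids division by zero.
const minSuccessRate = 0.01

// score returns a cost; lower is better. The formula is an assumption.
func score(h hostStats, useSuccessRate bool) float64 {
	cost := h.durationEWMA * float64(h.active+1) / h.weight
	if useSuccessRate {
		cost /= math.Max(h.successRate, minSuccessRate)
	}
	return cost
}

// pick returns the name of the host with the lowest cost.
func pick(hosts []hostStats, useSuccessRate bool) string {
	best, bestCost := "", math.Inf(1)
	for _, h := range hosts {
		if c := score(h, useSuccessRate); c < bestCost {
			best, bestCost = h.name, c
		}
	}
	return best
}

// exampleHosts models an overloaded 1C host that fails fast
// next to a busy but healthy 4C host.
func exampleHosts() []hostStats {
	return []hostStats{
		{name: "small-1C", durationEWMA: 2, successRate: 0.10, active: 1, weight: 1},
		{name: "big-4C", durationEWMA: 10, successRate: 0.99, active: 7, weight: 4},
	}
}

func main() {
	hosts := exampleHosts()
	fmt.Println("latency only:     ", pick(hosts, false))
	fmt.Println("with success rate:", pick(hosts, true))
}
```

With latency alone, the fast-failing 1C host has the lowest cost and keeps attracting traffic; once the cost is divided by the success rate, the healthy 4C host wins.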
The success rate also uses a two-level EWMA so that older samples decay over time, which avoids the low sensitivity of an arithmetic mean once the sample count becomes very large.
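A small simulation makes the sensitivity argument concrete. It is a sketch under assumed parameters (the window size, `alpha`, and the function names `ewma` and `simulate` are hypothetical): a host serves many fully successful windows and then a few fully failing ones, and the cumulative arithmetic mean is compared with the decayed per-window rate.

```go
package main

import "fmt"

// ewma folds one new sample into the previous decayed value.
func ewma(prev, sample, alpha float64) float64 {
	return alpha*sample + (1-alpha)*prev
}

// simulate runs healthyWindows windows in which every request succeeds,
// followed by failingWindows windows in which every request fails.
// It returns the cumulative arithmetic-mean success rate and the
// EWMA of the per-window success rates.
func simulate(healthyWindows, failingWindows, perWindow int, alpha float64) (mean, decayed float64) {
	var ok, total int
	decayed = 1.0 // assume the host starts out healthy
	for i := 0; i < healthyWindows+failingWindows; i++ {
		windowOK := perWindow
		if i >= healthyWindows {
			windowOK = 0
		}
		ok += windowOK
		total += perWindow
		windowRate := float64(windowOK) / float64(perWindow)
		decayed = ewma(decayed, windowRate, alpha)
	}
	return float64(ok) / float64(total), decayed
}

func main() {
	mean, decayed := simulate(1000, 3, 1000, 0.5)
	fmt.Printf("arithmetic mean: %.4f, EWMA: %.4f\n", mean, decayed)
}
```

After one million successes, three failing windows barely move the arithmetic mean, while the decayed rate halves with each failing window under these assumed parameters.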
Benchmark
Code Style
Goimports has run
Golint result