Add ThresholdShedder Strategy for loadbalancer and expose loadbalance metric to prometheus #6772
@hangc0276 Thanks for this PR. This looks exciting.
@hangc0276 this is a great feature! Looking pretty great!
/pulsarbot run-failure-checks
… metric to prometheus (apache#6772)

### Motivation

The only overload shedder strategy is `OverloadShedder`, which collects each broker's max resource usage and compares it with a threshold (default value is 85%). When the max resource usage reaches the threshold, it triggers bundle unloading, which migrates some of the bundles to other brokers. The overload shedder strategy has the following drawbacks:

- It does not support configuring other overload shedder strategies.
- It is hard to determine the threshold value. The default threshold is 85%, but a broker's max resource usage rarely reaches 85%, which leads to unbalanced traffic between brokers. The read cache hit rate of the brokers with heavy traffic will decrease.
- When you restart most of the brokers of a Pulsar cluster at the same time, all the traffic in the cluster goes to the remaining brokers. The restarted brokers will have no traffic for a long time, because the max resource usage of the remaining brokers does not reach the threshold.

### Changes

1. Support multiple overload shedder strategies, which only need to be configured in `broker.conf`.
2. Develop the `ThresholdShedder` strategy. The main idea is as follows:
   - Calculate the average resource usage of the brokers and compare each broker's resource usage with that average. If it is greater than the average plus the threshold, the overload shedder is triggered: `broker resource usage > average resource usage + threshold`
   - Each kind of resource (i.e. bandwithIn, bandwithOut, CPU, Memory, Direct Memory) has a weight (default is 1.0) that is applied when calculating a broker's resource usage.
   - Record the historical average resource usage of the Pulsar broker cluster. The new average resource usage is calculated as follows: `new_avg = old_avg * factor + (1-factor) * avg`
     - `new_avg`: the newest average resource usage
     - `old_avg`: the old average resource usage, which was calculated in the last round
     - `factor`: the decrease factor, default value is `0.9`
     - `avg`: the average resource usage of the brokers
3. Expose load balance metrics to Prometheus.
4. Fix a bug in `OverloadShedder` so that the bundles selected for unloading are the ones on the overloaded broker itself.

Please help check this implementation. If it is OK, I will add test cases.
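The threshold rule and the history update in the description above can be illustrated with a short, self-contained sketch. This is not the code from this PR: the class `ThresholdShedderSketch` and its method names are hypothetical, resource usage is expressed as a fraction between 0 and 1, the weighted usage is assumed to be the maximum of the weighted per-resource values (mirroring the "max resource usage" idea of `OverloadShedder`), and the first round is assumed to seed the history with the current average.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and structure are hypothetical, not taken from the PR. */
public class ThresholdShedderSketch {
    private final double factor;     // decrease factor for the history (0.9 by default per the description)
    private final double threshold;  // allowed margin above the cluster average, as a fraction
    private double historyAvg = -1;  // negative value means "no history recorded yet"

    public ThresholdShedderSketch(double factor, double threshold) {
        this.factor = factor;
        this.threshold = threshold;
    }

    /**
     * Weighted usage of one broker. Both arrays are ordered as
     * bandwidth in, bandwidth out, CPU, memory, direct memory.
     * Assumption: the result is the maximum of usage[i] * weight[i].
     */
    public static double weightedUsage(double[] usages, double[] weights) {
        double max = 0.0;
        for (int i = 0; i < usages.length; i++) {
            max = Math.max(max, usages[i] * weights[i]);
        }
        return max;
    }

    /** Applies new_avg = old_avg * factor + (1 - factor) * avg and returns new_avg. */
    public double updateAverage(double avg) {
        // Assumption: with no history yet, the first observed average is used as-is.
        historyAvg = historyAvg < 0 ? avg : historyAvg * factor + (1 - factor) * avg;
        return historyAvg;
    }

    /** Returns the brokers whose usage exceeds the smoothed average plus the threshold. */
    public List<String> findOverloaded(Map<String, Double> brokerUsage) {
        double avg = brokerUsage.values().stream()
                .mapToDouble(Double::doubleValue).average().orElse(0.0);
        double smoothed = updateAverage(avg);
        List<String> overloaded = new ArrayList<>();
        for (Map.Entry<String, Double> e : brokerUsage.entrySet()) {
            if (e.getValue() > smoothed + threshold) {
                overloaded.add(e.getKey());
            }
        }
        return overloaded;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: no broker reaches 85%, yet one is far above the average.
        Map<String, Double> usage = new LinkedHashMap<>();
        usage.put("broker-a", 0.70);
        usage.put("broker-b", 0.30);
        usage.put("broker-c", 0.20);
        ThresholdShedderSketch shedder = new ThresholdShedderSketch(0.9, 0.10);
        System.out.println(shedder.findOverloaded(usage));
    }
}
```

In this hypothetical cluster a fixed 85% threshold would never trigger shedding, while comparing against the cluster average singles out `broker-a`. The threshold value of 0.10 is an arbitrary choice for the example.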
@hangc0276 Hi, I'm having issues running the ThresholdShedder. Is there any special configuration that should be pointed out? I believe I have it enabled correctly, but I don't see any load shed when I would expect it to be, just from looking at the resource metrics. I can share more details if needed. If I should open a new issue for this, let me know!