Skip to content

Conversation

@Nicklee007
Copy link
Contributor

@Nicklee007 Nicklee007 commented Jul 13, 2022

Motivation

As the PIP 7: Pulsar Failure domain and Anti affinity namespaces design, the anti affinity namespaces should be distributed to evenly across all domain and all the brokers.
But in method getAntiAffinityNamespaceOwnedBrokers , brokerToAntiAffinityNamespaceCount add a count even the namespace equal the given bundle's namespace which will be load. The behavior will cause the namespace easy to distributed to
a broker which has another namespace in anti affinity group, the behavior broke the anti affinity balance. It's better behavior is the same namespace should be distributed to those broker which has loaded the same namespace when all broker have load at least one namespace in ti affinity group.

there is some case
ns-0 ns-1 ns-2 are all set the same anti affinity group like 'a-group

broker-0 own ns-0 bundle-0;
broker-1 own ns-1 bundle-0;
broker-2 own ns-2 bundle-0;
then another ns-2 bundle-1 need choice a broker to load. As the old policy, broker-0 broker-1 and broker-2 are all satisfy the least NamespaceCount; but if ns-2 bundle-1 load to broker-0 or broker-1 will broke the anti affinity balance. ns-2 bundle-1 need be load by broker-2 is better.

Also, the behavior will cause give up doLoadShedding when the all broker own one namespace in anti affinity group, but the different broker owned namespace bundle count and payload is different.
So brokerToAntiAffinityNamespaceCount should exclude the given namespace count add.

Modifications

  1. In LoadManagerShared.class getAntiAffinityNamespaceOwnedBrokers method, exclude the given namespace count add.
  2. changed shouldAntiAffinityNamespaceUnload check;
  3. add some unit test.

Documentation

  • doc-not-needed

Matching PR in forked repository

PR in forked repository: Nicklee007#5

@Nicklee007 Nicklee007 force-pushed the improve-anti-affinity-owned-brokers-check branch 3 times, most recently from 8ee2884 to 668d99e Compare July 20, 2022 03:00
@Nicklee007
Copy link
Contributor Author

@codelipenghui @Jason918 @eolivelli Could you help to review this PR, Thx.

@Jason918
Copy link
Contributor

there is some case
broker-0 own ns-0 bundle-0;
broker-1 own ns-1 bundle-0;
broker-2 own ns-2 bundle-0;
then another ns-2 bundle-1 need choice a broker to load. As the old policy, broker-0 broker-1 and broker-2 are all satisfy the least NamespaceCount; but if ns-2 bundle-1 load to broker-0 or broker-1 will broke the anti affinity balance. ns-2 bundle-1 need be load by broker-2 is better.

What's the anti affinity setting in this case?

@Nicklee007
Copy link
Contributor Author

Nicklee007 commented Jul 28, 2022

What's the anti affinity setting in this case?

@Jason918 ns-0 ns-1 ns-2 are all set the same anti affinity group like 'a-group', and the three ns be anti affinity load is expected.

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jul 28, 2022
@Nicklee007
Copy link
Contributor Author

/pulsarbot run-failure-checks

@Nicklee007 Nicklee007 force-pushed the improve-anti-affinity-owned-brokers-check branch from 4a3e4e8 to 197faef Compare September 21, 2022 09:17
@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Oct 22, 2022
@Nicklee007 Nicklee007 force-pushed the improve-anti-affinity-owned-brokers-check branch 4 times, most recently from c29ea60 to 83e5536 Compare December 8, 2022 12:35
@Nicklee007
Copy link
Contributor Author

/pulsarbot run-failure-checks

@codecov-commenter
Copy link

codecov-commenter commented Jan 11, 2023

Codecov Report

Merging #16563 (118b16b) into master (b05fddb) will decrease coverage by 14.98%.
The diff coverage is 0.00%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #16563       +/-   ##
=============================================
- Coverage     45.64%   30.66%   -14.99%     
+ Complexity    11043     6016     -5027     
=============================================
  Files           773      633      -140     
  Lines         74463    59892    -14571     
  Branches       8018     6241     -1777     
=============================================
- Hits          33986    18363    -15623     
- Misses        36687    38973     +2286     
+ Partials       3790     2556     -1234     
Flag Coverage Δ
unittests 30.66% <0.00%> (-14.99%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...sar/broker/loadbalance/impl/LoadManagerShared.java 39.39% <0.00%> (-4.03%) ⬇️
...n/java/org/apache/pulsar/client/api/RawReader.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pulsar/broker/admin/v1/Brokers.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pulsar/broker/admin/v2/Brokers.java 0.00% <0.00%> (-100.00%) ⬇️
...va/org/apache/pulsar/broker/admin/v1/Clusters.java 0.00% <0.00%> (-100.00%) ⬇️
.../org/apache/pulsar/broker/admin/v1/Properties.java 0.00% <0.00%> (-100.00%) ⬇️
.../org/apache/pulsar/client/impl/RawMessageImpl.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pulsar/broker/admin/v2/ResourceGroups.java 0.00% <0.00%> (-100.00%) ⬇️
...ar/common/naming/PartitionedManagedLedgerInfo.java 0.00% <0.00%> (-100.00%) ⬇️
...e/pulsar/broker/admin/impl/ResourceQuotasBase.java 0.00% <0.00%> (-96.43%) ⬇️
... and 308 more

@Nicklee007
Copy link
Contributor Author

/pulsarbot run-failure-checks

@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Feb 11, 2023
@poorbarcode poorbarcode modified the milestones: 3.0.0, 3.1.0 Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants