Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug causing removed brokers to be considered as candidates #343

Merged
merged 7 commits into from
Apr 12, 2017

Conversation

bobbeyreese
Copy link
Contributor

Motivation

It has been observed that when brokers are removed, ModularLoadManagerImpl continues to consider them for assignment.

Modifications

The internal broker data map now removes brokers when they are no longer in the active available brokers cache. A unit test file ModularLoadManagerImplTest has been added in a similar fashion to SimpleLoadManagerImplTest along with a test for the candidate selection correctness.

Result

Only active brokers will be considered for candidate selection.

// Now disable the secondary broker.
secondaryLoadManager.disableBroker();
// Try 5 more selections, ensure they all go to the first broker.
for (int i = 2; i < 7; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about also checking brokerDataMap and check if broker2 is present or not?

brokerDataMap.put(broker, new BrokerData(localData));
}
} catch (Exception e) {
log.warn("Error reading broker data from cache for broker - [{}], [{}]", broker, e.getMessage());
}
}
// Remove obsolete brokers.
for (final String broker : brokerDataMap.keySet()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, updating brokerDataMap will still not fix issue 100% because updateAllBrokerData being called async when broker receives the watch. So, before it gets updated, lookup may select broker and it fails to get broker-url while creating lookup result.
So, while selecting broker at selectBrokerForAssignment it should read directly from the cache (as per your previous change) so, in that way cache is consistent in terms of selecting-broker and creating-lookupResult.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could even still be an issue if the broker becomes unavailable while filtering policies and selecting candidates, though it would be impractical to handle this case perfectly so I will reduce synchronization and use availableActiveBrokers to give the best likelihood of selecting an active broker.


// Make sure broker is still active.
try {
if (!availableActiveBrokers.get().contains(broker)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this double check of broker that is part of availableActiveBrokers because we have already verified at line-508. So, we can remove it and immediately return derived broker.

Copy link
Contributor

@rdhabalia rdhabalia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@rdhabalia rdhabalia merged commit fc10931 into apache:master Apr 12, 2017
rdhabalia pushed a commit that referenced this pull request May 2, 2017
* Make candidate selection only consider active brokers
hrsakai pushed a commit to hrsakai/pulsar that referenced this pull request Dec 10, 2020
…ata://' (apache#343)

---

Master Issue: #<xyz>

*Motivation*

Make the oauth2 read the private key can handle with  'file://' schema and 'data://' schema.
hangc0276 pushed a commit to hangc0276/pulsar that referenced this pull request May 26, 2021
Thread sync problem.
Simple change. In order to keep CI stable.
It can be optimized later.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants