Skip to content

Conversation

@Demogorgon314
Copy link
Member

Fixes #16705

Motivation

When the getBundleWithHighestThroughput method try to use ((ModularLoadManagerWrapper)loadManager).getLoadManager().getBundleDataOrDefault(bundle.toString()); get the bundle load data, the load data might already updated by load manager.

public void writeBundleDataOnZooKeeper() {
updateBundleData();
// Write the bundle data to metadata store.
List<CompletableFuture<Void>> futures = new ArrayList<>();
for (Map.Entry<String, BundleData> entry : loadData.getBundleData().entrySet()) {
final String bundle = entry.getKey();
final BundleData data = entry.getValue();
futures.add(bundlesCache.readModifyUpdateOrCreate(getBundleDataPath(bundle), __ -> data)
.thenApply(__ -> null));
}

Also, we should give the bundleWithHighestThroughput a default value, since is possible to return a null value.

Modifications

  • Disable the writeResourceQuotasToZooKeeper task by set load balancer enabled false.

Documentation

Check the box below or label this PR directly.

Need to update docs?

  • doc-required
    (Your PR needs to update docs and you will update later)

  • doc-not-needed
    (Please explain why)

  • doc
    (Your PR contains doc changes)

  • doc-complete
    (Docs have been already added)

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jul 21, 2022
public void testSplitBundleWithHighestThroughput() throws Exception {

conf.setLoadManagerClassName(ModularLoadManagerImpl.class.getName());
conf.setLoadBalancerEnabled(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, it's a little confusing. The test is related to the load manager but need to disable the load balancer in the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think without this change, the flaky test can also be fixed? Since we will not return null for method getBundleWithHighestThroughput

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, disable the load balancer just to cancel the load report task.

You can see the below code, in this test, it tries to write the mock load data to storage, but if the load report task starts, the mock data might be overridden.

NamespaceBundle targetNamespaceBundle = bundles.findBundle(TopicName.get(topic + "0"));
String bundle = targetNamespaceBundle.getBundleRange();
String path = ModularLoadManagerImpl.getBundleDataPath(namespace + "/" + bundle);
NamespaceBundleStats defaultStats = new NamespaceBundleStats();
defaultStats.msgThroughputIn = 100000;
defaultStats.msgThroughputOut = 100000;
BundleData bd = new BundleData(10, 19, defaultStats);
bd.setTopics(10);
byte[] data = ObjectMapperFactory.getThreadLocal().writeValueAsBytes(bd);
pulsar.getLocalMetadataStore().put(path, data, Optional.empty());

NamespaceBundles bundles = getBundles(nsName);
double maxMsgThroughput = -1;
NamespaceBundle bundleWithHighestThroughput = null;
NamespaceBundle bundleWithHighestThroughput = bundles.getBundles().get(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, I think even if the bundle is updated, we should always be able to find the one with the highest load?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the bundle don't have any connected topic, the check bundleData.getTopics() > 0 will never pass, so the bundleWithHighestThroughput is null.

Copy link
Contributor

@Technoboy- Technoboy- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss more about this modification, because it seems change back to #12957.

@codelipenghui codelipenghui modified the milestones: 2.11.0, 2.12.0 Jul 26, 2022
@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Sep 27, 2022
@congbobo184
Copy link
Contributor

@Demogorgon314 hi, I move this PR to release/2.9.5, if you have any questions, please ping me. thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky-test: NamespaceServiceTest.testSplitBundleWithHighestThroughput

6 participants