
Partitions in partitioned topic should be automatically distributed to multiple brokers #386

Closed
yush1ga opened this issue Apr 27, 2017 · 6 comments
Labels
deprecated/question, lifecycle/stale

Comments

@yush1ga
Contributor

yush1ga commented Apr 27, 2017

Expected behavior

Partitions in a partitioned topic are automatically distributed across multiple brokers, as in the following image.

Actual behavior

Partitions in a partitioned topic are distributed to only one broker.
We know it is effective to split the namespace bundle and unload it.
However, we think this step is a little complicated.
Can we distribute partitions across multiple brokers automatically, without splitting the namespace bundle and unloading it?

@merlimat
Contributor

> Can we distribute partitions across multiple brokers automatically, without splitting the namespace bundle and unloading it?

"Bundles" were always meant to be an internal concept that users should never need to know about. They facilitate the service discovery caching mechanism so that each broker only needs to cache a the bundles assigned instead of the individual topics.

There are a few workaround options:

Enable auto-splitting

In conf/broker.conf:

# enable/disable namespace bundle auto split
loadBalancerAutoBundleSplitEnabled=true

I think we should turn on this config by default.

Also take a look at the split thresholds:

# maximum topics in a bundle, otherwise bundle split will be triggered
loadBalancerNamespaceBundleMaxTopics=1000

# maximum sessions (producers + consumers) in a bundle, otherwise bundle split will be triggered
loadBalancerNamespaceBundleMaxSessions=1000

# maximum msgRate (in + out) in a bundle, otherwise bundle split will be triggered
loadBalancerNamespaceBundleMaxMsgRate=1000

# maximum bandwidth (in + out) in a bundle, otherwise bundle split will be triggered
loadBalancerNamespaceBundleMaxBandwidthMbytes=100
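
To check whether auto-splitting has taken effect, you can list the namespace's current bundle ranges. A minimal sketch, assuming a namespace named my-tenant/my-namespace (adjust to your own namespace):

$ bin/pulsar-admin namespaces bundles my-tenant/my-namespace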

Start with multiple bundles

When creating a namespace, you can specify upfront the number of bundles to start with:

$ bin/pulsar-admin namespaces create my-namespace --bundles 16

The default is to start with a single bundle. It may be good to have a configurable default number of bundles for a particular cluster, so that all namespaces get pre-created with more bundles by default.
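
To verify how the partitions of a topic end up being spread, you can look up the owning broker of each partition with a recent pulsar-admin. A sketch, assuming a partitioned topic named my-topic in my-tenant/my-namespace; each partition is addressed as a regular topic with a -partition-N suffix:

$ bin/pulsar-admin topics lookup persistent://my-tenant/my-namespace/my-topic-partition-0
$ bin/pulsar-admin topics lookup persistent://my-tenant/my-namespace/my-topic-partition-1

Partitions whose names hash into different bundles should resolve to different brokers once those bundles are assigned across the cluster.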

Automatic bundle reassignment after splitting

Currently, after a bundle split happens, the new bundles are not immediately moved off the current broker. If the load manager decides that the broker is overloaded, it can then unload the newly created bundles.

As part of the work on the ModularLoadManager, @bobbeyreese is working on adding an immediate unload after splits. It would be great if you guys have cycles to try that out and provide feedback.
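
In the meantime, a split followed by an immediate unload can also be triggered manually. A sketch, assuming the namespace and default bundle range from the examples above; the --unload flag, where available, asks the broker to unload the newly split bundles right away:

$ bin/pulsar-admin namespaces split-bundle my-tenant/my-namespace --bundle 0x00000000_0xffffffff --unload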

@merlimat added the type/enhancement label on Apr 28, 2017
@merlimat
Contributor

@yush1ga In particular, I meant to refer to this PR: #385

@sijie
Member

sijie commented Nov 28, 2018

@merlimat is there anything more we need to do for this task?

@JevonQ

JevonQ commented Jan 17, 2019

@merlimat Since there are several thresholds that control bundle splitting, there is a scenario where a single busy topic causes the bundle to split. I'm wondering whether splitting the bundle can spread the load in that case? If yes, how?

@jiazhai added the deprecated/question and triage/week-1 labels and removed the type/enhancement and triage/week-20 labels on Jan 3, 2020
@jiazhai
Member

jiazhai commented Jan 3, 2020

@JevonQ The load balancer will help spread the load. https://pulsar.apache.org/docs/en/administration-load-balance
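
If the load balancer does not move a hot bundle quickly enough, the bundle can also be unloaded by hand so that another broker picks it up. A sketch, assuming a hypothetical bundle range taken from the output of `pulsar-admin namespaces bundles`:

$ bin/pulsar-admin namespaces unload my-tenant/my-namespace --bundle 0x00000000_0x40000000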

hrsakai pushed a commit to hrsakai/pulsar that referenced this issue Dec 10, 2020
### Issue
Retry policy not effective with non-FQDN topic.

- reproduction
	```go
	client, _ := pulsar.NewClient(pulsar.ClientOptions{URL: "pulsar://localhost:6650"})
	consumer, _ := client.Subscribe(pulsar.ConsumerOptions{
		Topic:            "topic-01",
		SubscriptionName: "my-sub",
		RetryEnable:      true,
		DLQ:              &pulsar.DLQPolicy{MaxDeliveries: 2},
	})
	msg, _ := consumer.Receive(context.Background())
	consumer.ReconsumeLater(msg, 5*time.Second)
	```
- logs

	```
	RN[0000] consumer of topic [persistent://public/default/topic-01] not exist unexpectedly  topic="[topic-01 persistent://public/default/my-sub-RETRY]"
	```

### Cause
For the MultiTopicConsumer `consumers` map field:
- key: user provided topic, maybe non-FQDN.
- value: consumer instance.

`ReconsumeLater` uses the msg's FQDN topic as the key to find the `consumer` in `consumers`; if it does not match the non-FQDN topic, the call is silently ignored, so the Retry policy does not take effect.

### Modifications
- Normalize user-provided topics to FQDN topics before initializing consumers.
- Add a non-FQDN topic consumption case to the Retry policy tests.


### Verifying this change

- [x] Make sure that the change passes the CI checks.
hangc0276 pushed a commit to hangc0276/pulsar that referenced this issue May 26, 2021
This version update is convenient for tests in a real environment, since there's no binary download URL for the original pulsar `2.8.0-rc-202101252233`.

This PR fixes the API incompatibility problems that are introduced by apache#9397 and apache#9302.

Another significant change between these two versions is apache#9338, which introduced a metadata-store API for cluster resources. This PR fixes the test failure caused by it as well. Since KoP's `tests` module only uses one `MockZooKeeper` to manage z-nodes (see `KopProtocolHandlerTestBase#createMockZooKeeper`), the mocked `createConfigurationMetadataStore` method returns `mockedZooKeeper` here instead of a `mockedZooKeeperGlobal`, unlike what Pulsar does in `MockedPulsarServiceBaseTest`.

Besides, there's a test bug in `testBrokerHandleTopicMetadataRequest` that was not exposed by the previous Pulsar. This PR fixes it.
@tisonkun
Member

Closed as stale. Please open a new issue if it's still relevant in maintained versions.

@tisonkun closed this as not planned on Nov 14, 2022