Introduce auto bundle split and unloading of split bundle in ModularLoadManager #857

Merged
merged 5 commits into from
Oct 26, 2017

rdhabalia
Contributor

Motivation

As described in #385
ModularLoadManagerImpl lacked bundle split capabilities. Additionally, automatic bundle splits reassigned the bundles to the same broker when the bundle could more usefully be reassigned after being split.

Modifications

This PR uses the same logic as #385 to find bundles that are eligible for splitting, but it includes additional changes that #385 did not address:

  • no dependency on a zk.exist watch
  • bundle split happens only at the leader
  • the split-bundle API has an additional query param to unload the split bundle, which gives more flexibility in controlling whether split bundles are unloaded
  • bundle data for the original (pre-split) bundle is deleted from zk

Result

ModularLoadManager can auto split and unload bundles.

@rdhabalia rdhabalia added type/enhancement type/feature labels Oct 25, 2017
@rdhabalia rdhabalia added this to the 1.21.0-incubating milestone Oct 25, 2017
@rdhabalia rdhabalia self-assigned this Oct 25, 2017
@merlimat
Contributor

Nice! I will take a look and test it soon.

@@ -838,7 +838,8 @@ public void unloadNamespaceBundle(@PathParam("property") String property, @PathP
 @ApiResponses(value = { @ApiResponse(code = 403, message = "Don't have admin permission") })
 public void splitNamespaceBundle(@PathParam("property") String property, @PathParam("cluster") String cluster,
 @PathParam("namespace") String namespace, @PathParam("bundle") String bundleRange,
-@QueryParam("authoritative") @DefaultValue("false") boolean authoritative) {
+@QueryParam("authoritative") @DefaultValue("false") boolean authoritative,
+@QueryParam("unload") @DefaultValue("false") boolean unload) {
Contributor

Since the brokers already have the loadBalancerAutoUnloadSplitBundlesEnabled parameter, wouldn't this unload option be redundant?

Just trigger the split and broker will decide whether to unload or not.

Contributor Author

That's correct, but this admin API is triggered by both the load balancer and the CLI tool. Sometimes we might want to just split the bundle using the CLI tool without unloading it, so this query param gives the flexibility to do either.
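
For illustration, a minimal sketch of how a caller might build the split request with the new flag. The URL layout is copied from the request log that appears later in this thread (`PUT .../split?unload=false`); the class and method names are hypothetical and are not part of Pulsar.

```java
import java.net.URI;

final class SplitRequestSketch {
    // Builds the admin URL for splitting one bundle.
    // 'unload' maps to the "unload" query param added in this PR.
    // Path layout is taken from the request log in this thread, not from API docs.
    static URI splitUri(String adminBase, String property, String cluster,
                        String namespace, String bundleRange, boolean unload) {
        return URI.create(String.format(
                "%s/admin/namespaces/%s/%s/%s/%s/split?unload=%b",
                adminBase, property, cluster, namespace, bundleRange, unload));
    }
}
```

A CLI-style caller would pass `unload=false` to split without unloading, while the load balancer could pass whatever its own configuration dictates.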

return;
}
// Value may be changed dynamically.
if (conf.getLoadBalancerAutoBundleSplitEnabled()) {
Contributor

if (!conf.getLoadBalancerAutoBundleSplitEnabled()) {
    return;
}

Contributor Author

yes, will fix it.

}
// Value may be changed dynamically.
if (conf.getLoadBalancerAutoBundleSplitEnabled()) {
log.info("Check bundle-split");// TODO: remove this check
Contributor

Remove the TODO here

@@ -41,6 +41,8 @@
import java.util.regex.Pattern;
import java.util.stream.Collectors;

import javax.validation.constraints.Future;
Contributor

This is probably not the correct import.

Contributor Author

yes, this import is not being used, will remove it.

@merlimat
Contributor

@rdhabalia I'm getting one exception reported even though the split itself succeeds:

2017-10-25 11:07:25,192 - INFO  - [pulsar-load-manager-27-1:ModularLoadManagerImpl@415] - Writing local data to ZooKeeper because maximum change Infinity% exceeded threshold 10%; time since last report written is 35.002 seconds
2017-10-25 11:07:25,195 - INFO  - [main-EventThread:ZooKeeperDataCache@145] - [State:CONNECTED Timeout:30000 sessionid:0x15f54b6a93d0003 local:/127.0.0.1:53217 remoteserver:127.0.0.1/127.0.0.1:2181 lastZxid:3374 xid:807 sent:807 recv:809 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/loadbalance/brokers/localhost:8080
2017-10-25 11:07:25,198 - INFO  - [pulsar-modular-load-manager-76-1:ModularLoadManagerImpl@604] - Load-manager splitting budnle sample/standalone/test-2/0x00000000_0xffffffff and unloading false
2017-10-25 11:07:25,200 - INFO  - [pulsar-modular-load-manager-76-1:PulsarService@586] - Admin api url: http://localhost:8080
2017-10-25 11:07:25,231 - INFO  - [pulsar-web-77-19:Namespaces@843] - [null] Split namespace bundle sample/standalone/test-2/0x00000000_0xffffffff
2017-10-25 11:07:25,235 - INFO  - [pulsar-web-77-19:PulsarWebResource@221] - Successfully validated clusters on property [sample]
2017-10-25 11:07:25,237 - INFO  - [pulsar-web-77-19:OwnershipCache@208] - Trying to acquire ownership of sample/standalone/test-2/0x00000000_0x7fffffff
2017-10-25 11:07:25,237 - INFO  - [pulsar-web-77-19:OwnershipCache@208] - Trying to acquire ownership of sample/standalone/test-2/0x7fffffff_0xffffffff
2017-10-25 11:07:25,238 - INFO  - [main-EventThread:OwnershipCache@213] - Successfully acquired ownership of /namespace/sample/standalone/test-2/0x00000000_0x7fffffff
2017-10-25 11:07:25,238 - INFO  - [main-EventThread:OwnershipCache@213] - Successfully acquired ownership of /namespace/sample/standalone/test-2/0x7fffffff_0xffffffff
2017-10-25 11:07:25,242 - INFO  - [main-EventThread:ZooKeeperDataCache@145] - [State:CONNECTED Timeout:30000 sessionid:0x15f54b6a93d0003 local:/127.0.0.1:53217 remoteserver:127.0.0.1/127.0.0.1:2181 lastZxid:3377 xid:811 sent:811 recv:814 queuedpkts:0 pendingresp:0 queuedevents:1] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/admin/local-policies/sample/standalone/test-2
2017-10-25 11:07:25,245 - INFO  - [pulsar-ordered-21-1:NamespaceBundleFactory@113] - Policy updated for namespace sample/standalone/test-2, refreshing the bundle cache.
2017-10-25 11:07:25,256 - INFO  - [pulsar-web-77-19:Namespaces@860] - [null] Successfully split namespace bundle sample/standalone/test-2/0x00000000_0xffffffff
2017-10-25 11:07:25,258 - INFO  - [pulsar-web-77-19:Slf4jRequestLog@60] - 127.0.0.1 - - [25/Oct/2017:18:07:25 +0000] "PUT //localhost:8080/admin/namespaces/sample/standalone/test-2/0x00000000_0xffffffff/split?unload=false HTTP/1.1" 204 0 "-" "Jersey/2.23.2 (HttpUrlConnection 1.8.0_121)" 32
2017-10-25 11:07:25,259 - INFO  - [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@648] - Got user-level KeeperException when processing sessionid:0x15f54b6a93d0003 type:delete cxid:0x32c zxid:0xd32 txntype:-1 reqpath:n/a Error Path:/loadbalance/bundle-data/sample/standalone/test-2 Error:KeeperErrorCode = NoNode for /loadbalance/bundle-data/sample/standalone/test-2
2017-10-25 11:07:25,261 - WARN  - [pulsar-modular-load-manager-76-1:ModularLoadManagerImpl@832] - Failed to delete bundle-data sample/standalone/test-2/0x00000000_0xffffffff from zookeeper
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /loadbalance/bundle-data/sample/standalone/test-2/0x00000000_0xffffffff
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
	at org.apache.bookkeeper.zookeeper.ZooKeeperClient.access$1501(ZooKeeperClient.java:60)
	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$10.call(ZooKeeperClient.java:555)
	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$10.call(ZooKeeperClient.java:549)
	at org.apache.bookkeeper.zookeeper.ZooWorker.syncCallWithRetries(ZooWorker.java:122)
	at org.apache.bookkeeper.zookeeper.ZooKeeperClient.delete(ZooKeeperClient.java:549)
	at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.deleteBundleDataFromZookeeper(ModularLoadManagerImpl.java:830)
	at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.checkNamespaceBundleSplit(ModularLoadManagerImpl.java:614)
	at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.updateAll(ModularLoadManagerImpl.java:428)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	at java.lang.Thread.run(Thread.java:745)
2017-10-25 11:07:25,264 - INFO  - [pulsar-modular-load-manager-76-1:ModularLoadManagerImpl@615] - Successfully split namespace bundle sample/standalone/test-2/0x00000000_0xffffffff
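
In the log above, the full range `0x00000000_0xffffffff` is split into `0x00000000_0x7fffffff` and `0x7fffffff_0xffffffff`, which is consistent with cutting the range at its midpoint. A minimal sketch of that arithmetic follows, assuming a plain midpoint split; the class and method names are hypothetical and the broker's actual split logic may differ.

```java
final class BundleRangeSketch {
    // Splits a "0xLOW_0xHIGH" range string at its midpoint.
    // Assumption: a plain midpoint split, inferred from the log output only.
    static String[] splitAtMidpoint(String range) {
        String[] parts = range.split("_");
        long lo = Long.decode(parts[0]);   // Long.decode accepts the "0x" prefix
        long hi = Long.decode(parts[1]);
        long mid = lo + (hi - lo) / 2;     // integer division rounds down
        return new String[] { format(lo, mid), format(mid, hi) };
    }

    // Formats both boundaries as zero-padded 8-digit hex, matching the log style.
    private static String format(long lo, long hi) {
        return String.format("0x%08x_0x%08x", lo, hi);
    }
}
```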

@rdhabalia
Contributor Author

2017-10-25 11:07:25,261 - WARN  - [pulsar-modular-load-manager-76-1:ModularLoadManagerImpl@832] - Failed to delete bundle-data sample/standalone/test-2/0x00000000_0xffffffff from zookeeper
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /loadbalance/bundle-data/sample/standalone/test-2/0x00000000_0xffffffff

After splitting the bundle we delete the old bundle-data. It seems this exception happens while deleting the old bundle znode. I added this logic at the end, after testing this PR change. Let me take a look at it.

@merlimat
Contributor

delete BundleData only if it exists (initially it may not be present)

Uhm, I'm still seeing it present, e.g.:

[zk: localhost:2181(CONNECTED) 18] ls /loadbalance/bundle-data/sample/standalone/test-3
[0x6fffffff_0x7fffffff, 0x3fffffff_0x4fffffff, 0xbfffffff_0xcfffffff, 0x0fffffff_0x1fffffff, 
0x5fffffff_0x6fffffff, 0x2fffffff_0x3fffffff, 0xdfffffff_0xffffffff, 0xafffffff_0xbfffffff, 
0x7fffffff_0x9fffffff, 0xcfffffff_0xdfffffff, 0x9fffffff_0xafffffff, 0x4fffffff_0x5fffffff, 
0x1fffffff_0x2fffffff, 0x00000000_0x0fffffff]

All the bundle data is still in ZK some time after the splits (in this case I started with 1 bundle and let auto-split take over).

Same thing about the bundles list :

$ bin/pulsar-admin brokers namespaces --url localhost:8080 standalone
{
  "sample/standalone/test-3/0x1fffffff_0x2fffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : true
  },
  "sample/standalone/test-3/0xcfffffff_0xdfffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : true
  },
  "sample/standalone/test-3/0x1fffffff_0x3fffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : false
  },
  "pulsar/standalone/localhost:8080/0x00000000_0xffffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : true
  },
  "sample/standalone/test-3/0x7fffffff_0xffffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : false
  },
  "sample/standalone/test-3/0x00000000_0x0fffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : true
  },
...

This I think is not related to this PR, though something we might want to think about.

At the same time, the locks for the pre-split bundles are still there, but I think that is ok.

@rdhabalia
Contributor Author

This I think is not related to this PR, though something we might want to think about

Sorry, I couldn't understand that. This exception was related to this PR only.
If bundle-data is not present, we create it when the leader updates the resource quota from the load report.

Now, if a bundle split happens before the leader's resource-quota update has created the bundle-data, then the bundle-data znode will not be present. So the leader should not try to delete a znode that doesn't exist.
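
A minimal sketch of the "delete only if it exists" behaviour described here. `ZnodeStore` and `InMemoryStore` are hypothetical stand-ins for the ZooKeeper client so the example stays self-contained; they are not Pulsar or ZooKeeper APIs. Only the `/loadbalance/bundle-data/` prefix is taken from the log above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class BundleDataCleanupSketch {
    /** Hypothetical stand-in for a ZooKeeper client. */
    interface ZnodeStore {
        boolean exists(String path);
        void delete(String path); // throws IllegalStateException if the node is absent
    }

    /** In-memory implementation used only to exercise the sketch. */
    static final class InMemoryStore implements ZnodeStore {
        final Set<String> paths = ConcurrentHashMap.newKeySet();

        public boolean exists(String path) {
            return paths.contains(path);
        }

        public void delete(String path) {
            if (!paths.remove(path)) {
                throw new IllegalStateException("NoNode for " + path);
            }
        }
    }

    /** Returns true if a znode was deleted, false if there was nothing to delete. */
    static boolean deleteBundleDataIfPresent(ZnodeStore store, String bundle) {
        String path = "/loadbalance/bundle-data/" + bundle;
        if (!store.exists(path)) {
            return false; // bundle-data was never written for this bundle
        }
        try {
            store.delete(path);
            return true;
        } catch (IllegalStateException raced) {
            // The node disappeared between the check and the delete; treat as absent.
            return false;
        }
    }
}
```

The check and the delete are not atomic, so the sketch still tolerates a missing node at delete time instead of relying on the existence check alone.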

Uhm, I'm still seeing that being present eg:

After this fix I tested again with 3 brokers, starting with 1 bundle in the namespace. I don't see the exception in the broker log, and I also don't see bundle-data in ZooKeeper.

[zk: localhost:2181(CONNECTED) 14] ls /loadbalance/bundle-data/sample/standalone/a1                     
[]
[zk: localhost:2181(CONNECTED) 15] ls /namespace/sample/standalone/a1              
[0x07ffffff_0x0fffffff, 0x082fffff_0x083fffff, 0x082fffff_0x0837ffff, 0x081fffff_0x083fffff, 0x07ffffff_0x08ffffff, 0x07ffffff_0x087fffff, 0x00000000_0xffffffff, 0x07ffffff_0x09ffffff, 0x00000000_0x1fffffff, 0x07ffffff_0x083fffff, 0x07ffffff_0x0bffffff, 0x00000000_0x7fffffff, 0x00000000_0x3fffffff, 0x00000000_0x0fffffff]

Are you still able to see bundle-data after a clean bundle split by the load balancer?

Same thing about the bundles list :

$ bin/pulsar-admin brokers namespaces --url localhost:8080 standalone
{
  "sample/standalone/test-3/0x1fffffff_0x2fffffff" : {
    "broker_assignment" : "shared",
    "is_controlled" : false,
    "is_active" : true
  },

I didn't understand it fully, but here is what happens:

  1. Broker-1 owns bundle b1 and the leader splits it into b2 and b3. Broker-1 still owns b1, with in-memory state=disabled and zk state=enabled.
  2. If Broker-2 doesn't yet know about the split (because it might not have received the watch), it redirects the topic request to Broker-1.
  3. Broker-1 knows the correct bundle for the looked-up topic (either b2 or b3) and redirects the request to the appropriate broker.

So all old split bundles will disappear once that broker restarts. We can't remove them from the broker immediately because, in step 2, Broker-2 may not have received the watch, and in that case we want Broker-2 to redirect the request to Broker-1, believing that the old split bundle is still owned by Broker-1.
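
The redirect in step 3 can be sketched roughly as follows. This is a simplified model, not the broker's lookup code: the `Bundle` record, the `active` flag, the half-open range rule, and the broker names are all assumptions made for illustration.

```java
import java.util.List;

final class SplitLookupSketch {
    /** Hypothetical ownership record; field names are illustrative only. */
    static final class Bundle {
        final long lo;
        final long hi;
        final String owner;
        final boolean active; // false for a parent bundle retained after its split

        Bundle(long lo, long hi, String owner, boolean active) {
            this.lo = lo;
            this.hi = hi;
            this.owner = owner;
            this.active = active;
        }

        // Assumes half-open ranges [lo, hi); the real boundary rule may differ.
        boolean contains(long hash) {
            return lo <= hash && hash < hi;
        }
    }

    /**
     * Resolves the broker that a lookup should be redirected to: the owner of
     * the active bundle covering the topic hash. Disabled parents are skipped.
     */
    static String redirectTarget(List<Bundle> known, long topicHash) {
        for (Bundle b : known) {
            if (b.active && b.contains(topicHash)) {
                return b.owner;
            }
        }
        throw new IllegalStateException("no active bundle covers hash " + topicHash);
    }
}
```

Because the disabled parent b1 stays in Broker-1's view, a stale redirect from Broker-2 still lands on a broker that can resolve the hash to b2 or b3.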

@merlimat
Contributor

So, all old split-bundles will be disappeared once that broker will restart. We can't remove it from broker immediately because in step-2 if broker-2 may not receive watch and in that case we want broker-2 to redirect request to broker-1 by thinking that old-split-bundle still owned by broker-1.

Correct, though we could clean them up after a certain time. We should at least be guaranteed that the other broker will have received the notification within a zkSessionTimeout period. So, let's say, we could schedule a task to clean them up after ~1 min to avoid the stale state in the broker reports.
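
A minimal sketch of such a delayed cleanup, using the standard `ScheduledExecutorService`. The class and its methods are hypothetical; the delay would be chosen relative to zkSessionTimeout (for example about one minute), and this is not what the PR implements.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class StaleBundleCleanupSketch {
    // Pre-split bundles kept around (disabled) so stale lookups can still be redirected.
    private final Set<String> retained = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService scheduler;

    StaleBundleCleanupSketch(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    /**
     * Keeps the pre-split bundle for the given delay, then drops it.
     * The returned future completes once the cleanup has run.
     */
    ScheduledFuture<?> retainThenCleanUp(String parentBundle, long delay, TimeUnit unit) {
        retained.add(parentBundle);
        return scheduler.schedule(() -> {
            retained.remove(parentBundle);
        }, delay, unit);
    }

    boolean isRetained(String bundle) {
        return retained.contains(bundle);
    }
}
```

Whether a fixed delay is safe depends on every broker actually receiving the watch in time, which is the concern raised in the reply below.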

@merlimat
Contributor

@rdhabalia Anyway, I meant that this last part (cleaning up the bundles' ownership) is unrelated to this PR.

@merlimat merlimat left a comment

👍

@rdhabalia
Contributor Author

though we could clean them up after a certain time. We should at least be guaranteed that the other broker will have received the notification within a zkSessionTimeout period.

Yes, but many times we have seen (e.g. on a replication-cluster change) that the broker doesn't get the watch. I will make this change and test this scenario in a separate PR, then.

@apache apache deleted a comment from merlimat Oct 26, 2017
@merlimat
Contributor

@rdhabalia
Contributor Author

yes, checking it..
