Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

splitting bundle error #5510

Closed
sunlc210 opened this issue Oct 31, 2019 · 15 comments
Closed

splitting bundle error #5510

sunlc210 opened this issue Oct 31, 2019 · 15 comments
Labels
area/broker lifecycle/stale type/bug The PR fixed a bug or issue reported a bug

Comments

@sunlc210
Copy link

sunlc210 commented Oct 31, 2019

Met the split bundle error when supply huge data into pulsar,block the data transfer

20:11:38.774 [pulsar-modular-load-manager-29-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Load-manager splitting bundle crossjh2wx/jh2wx/0x8a000000_0x8b000000 and unloading true
20:11:38.779 [pulsar-web-30-6] INFO  org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Split namespace bundle crossjh2wx/jh2wx/0x8a000000_0x8b000000
20:11:38.779 [pulsar-web-30-6] ERROR org.apache.pulsar.broker.web.PulsarWebResource - [null] Failed to validate namespace bundle crossjh2wx/jh2wx/0x8a000000_0x8b000000
java.lang.IllegalArgumentException: Invalid upper boundary for bundle
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122) ~[com.google.guava-guava-21.0.jar:?]
        at org.apache.pulsar.common.naming.NamespaceBundles.validateBundle(NamespaceBundles.java:114) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleRange(PulsarWebResource.java:465) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleOwnership(PulsarWebResource.java:491) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at org.apache.pulsar.broker.admin.impl.NamespacesBase.internalSplitNamespaceBundle(NamespacesBase.java:741) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at org.apache.pulsar.broker.admin.v2.Namespaces.splitNamespaceBundle(Namespaces.java:330) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at sun.reflect.GeneratedMethodAccessor85.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_111]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_111]
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76) ~[org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:183) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:415) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:104) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:277) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:316) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:298) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:268) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:289) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:256) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:703) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
        at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:416) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
        at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:370) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:389) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:342) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:229) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.apache.pulsar.broker.web.ResponseHandlerFilter.doFilter(ResponseHandlerFilter.java:53) [org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.Server.handle(Server.java:503) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) [org.eclipse.jetty-jetty-io-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [org.eclipse.jetty-jetty-io-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) [org.eclipse.jetty-jetty-io-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
20:11:38.783 [pulsar-web-30-6] INFO  org.eclipse.jetty.server.RequestLog - 172.18.17.1 - - [30/Oct/2019:20:11:38 +0800] "PUT /admin/v2/namespaces/crossjh2wx/jh2wx/0x8a000000_0x8b000000/split?unload=true HTTP/1.1" 500 5943 "-" "Pulsar-Java-v2.4.1" 6
20:11:38.784 [pulsar-modular-load-manager-29-1] ERROR org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Failed to split namespace bundle crossjh2wx/jh2wx/0x8a000000_0x8b000000
org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException: HTTP 500 Internal Server Error
        at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:204) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.1.jar:2.4.1]
        at org.apache.pulsar.client.admin.internal.NamespacesImpl.splitNamespaceBundle(NamespacesImpl.java:516) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.1.jar:2.4.1]
        at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.checkNamespaceBundleSplit(ModularLoadManagerImpl.java:664) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.updateAll(ModularLoadManagerImpl.java:449) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_111]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_111]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: javax.ws.rs.InternalServerErrorException: HTTP 500 Internal Server Error
        at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1098) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.translate(JerseyInvocation.java:883) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$1(JerseyInvocation.java:767) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:316) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:298) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:229) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:414) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:765) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:456) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
        at org.glassfish.jersey.client.JerseyInvocation$Builder.put(JerseyInvocation.java:340) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
        at org.apache.pulsar.client.admin.internal.NamespacesImpl.splitNamespaceBundle(NamespacesImpl.java:514) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.1.jar:2.4.1]
        ... 10 more
@sunlc210 sunlc210 added the type/bug The PR fixed a bug or issue reported a bug label Oct 31, 2019
@tuteng
Copy link
Member

tuteng commented Nov 6, 2019

I seem to have reproduced this problem today.
When multiple producers deliver messages to the same topic at the same time, the bundle will be split when the current bundle load is very high, so multiple producers will split the same bundle.

reproduce:

  1. Start three broker, loadBalancerAutoUnloadSplitBundlesEnabled=true

  2. Execute on any broker

./bin/pulsar-perf produce -o 100000 -r 100000 -n 3 test-topic-n

test-topic-n lookup broker-3

15:14:38.690 [pulsar-ordered-OrderedExecutor-2-0-EventThread] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/test-topic-o] Topic closed
15:14:38.693 [pulsar-ordered-OrderedExecutor-2-0-EventThread] INFO  org.apache.pulsar.broker.namespace.OwnershipCache - [/namespace/public/default/0xf63fffff_0xf65fffff] Removed zk lock for service unit: OK
15:14:38.693 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Unloading public/default/0xf63fffff_0xf65fffff namespace-bundle with 1 topics completed in 262.0 ms
15:14:38.693 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Disabling ownership: public/default/0xf65fffff_0xf67fffff
15:14:38.698 [pulsar-ordered-OrderedExecutor-2-0-EventThread] INFO  org.apache.pulsar.broker.namespace.OwnershipCache - [/namespace/public/default/0xf65fffff_0xf67fffff] Removed zk lock for service unit: OK
15:14:38.698 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Unloading public/default/0xf65fffff_0xf67fffff namespace-bundle with 0 topics completed in 4.0 ms
15:14:38.698 [pulsar-web-32-3] INFO  org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Successfully split namespace bundle public/default/0xf63fffff_0xf67fffff
15:14:38.699 [pulsar-web-32-3] INFO  org.eclipse.jetty.server.RequestLog - - - [06/Nov/2019:15:14:38 +0000] "PUT /admin/v2/namespaces/public/default/0xf63fffff_0xf67fffff/split?unload=true HTTP/1.1" 204 0 "-" "Pulsar-Java-v2.4.1" 296
15:14:38.701 [pulsar-modular-load-manager-31-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Successfully split namespace bundle public/default/0xf63fffff_0xf67fffff
15:14:39.041 [pulsar-io-24-7] INFO  org.apache.pulsar.broker.service.ServerCnx - New connection from /127.0.0.1:40652
15:14:39.043 [pulsar-io-24-8] INFO  org.apache.pulsar.broker.service.ServerCnx - New connection from /127.0.0.1:40654
15:14:39.053 [pulsar-3-12] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - 2 brokers being considered for assignment of public/default/0xf63fffff_0xf65fffff
15:14:39.055 [pulsar-io-24-9] INFO  org.apache.pulsar.broker.service.ServerCnx - New connection from /127.0.0.1:40656
15:14:40.865 [pulsar-ordered-OrderedExecutor-2-0-EventThread] INFO  org.apache.pulsar.zookeeper.ZooKeeperDataCache - [State:CONNECTED Timeout:30000 sessionid:0x20019517baa0024 local:/:48318 remoteserver:209.97.175.128/209.97.175.128:2181 lastZxid:4294973170 xid:788 sent:788 recv:850 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/loadbalance/brokers/159.65.9.22:8080
15:14:40.869 [pulsar-modular-load-manager-31-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Load-manager splitting bundle public/default/0xf63fffff_0xf67fffff and unloading true
15:14:40.874 [pulsar-web-32-5] INFO  org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Split namespace bundle public/default/0xf63fffff_0xf67fffff
15:14:40.878 [pulsar-web-32-5] ERROR org.apache.pulsar.broker.web.PulsarWebResource - [null] Failed to validate namespace bundle public/default/0xf63fffff_0xf67fffff
java.lang.IllegalArgumentException: Invalid upper boundary for bundle
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122) ~[com.google.guava-guava-21.0.jar:?]
	at org.apache.pulsar.common.naming.NamespaceBundles.validateBundle(NamespaceBundles.java:114) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleRange(PulsarWebResource.java:465) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleOwnership(PulsarWebResource.java:491) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at org.apache.pulsar.broker.admin.impl.NamespacesBase.internalSplitNamespaceBundle(NamespacesBase.java:741) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at org.apache.pulsar.broker.admin.v2.Namespaces.splitNamespaceBundle(Namespaces.java:330) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_222]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_222]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_222]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_222]
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76) ~[org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:183) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:415) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:104) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:277) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:316) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:298) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:268) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:289) [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:256) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:703) [org.glassfish.jersey.core-jersey-server-2.27.jar:?]
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:416) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:370) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:389) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:342) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:229) [org.glassfish.jersey.containers-jersey-container-servlet-core-2.27.jar:?]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.apache.pulsar.broker.web.ResponseHandlerFilter.doFilter(ResponseHandlerFilter.java:53) [org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) [org.eclipse.jetty-jetty-servlet-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.Server.handle(Server.java:503) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) [org.eclipse.jetty-jetty-server-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) [org.eclipse.jetty-jetty-io-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [org.eclipse.jetty-jetty-io-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) [org.eclipse.jetty-jetty-io-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366) [org.eclipse.jetty-jetty-util-9.4.12.v20180830.jar:9.4.12.v20180830]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
15:14:40.886 [pulsar-web-32-5] INFO  org.eclipse.jetty.server.RequestLog - - - [06/Nov/2019:15:14:40 +0000] "PUT /admin/v2/namespaces/public/default/0xf63fffff_0xf67fffff/split?unload=true HTTP/1.1" 500 6024 "-" "Pulsar-Java-v2.4.1" 14
15:14:40.900 [pulsar-modular-load-manager-31-1] ERROR org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Failed to split namespace bundle public/default/0xf63fffff_0xf67fffff
org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException: HTTP 500 Internal Server Error
	at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:204) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.1.jar:2.4.1]
	at org.apache.pulsar.client.admin.internal.NamespacesImpl.splitNamespaceBundle(NamespacesImpl.java:516) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.1.jar:2.4.1]
	at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.checkNamespaceBundleSplit(ModularLoadManagerImpl.java:664) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl.updateAll(ModularLoadManagerImpl.java:449) ~[org.apache.pulsar-pulsar-broker-2.4.1.jar:2.4.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_222]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_222]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: javax.ws.rs.InternalServerErrorException: HTTP 500 Internal Server Error
	at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1098) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
	at org.glassfish.jersey.client.JerseyInvocation.translate(JerseyInvocation.java:883) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
	at org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$1(JerseyInvocation.java:767) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:316) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:298) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:229) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:414) ~[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
	at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:765) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
	at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:456) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
	at org.glassfish.jersey.client.JerseyInvocation$Builder.put(JerseyInvocation.java:340) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
	at org.apache.pulsar.client.admin.internal.NamespacesImpl.splitNamespaceBundle(NamespacesImpl.java:514) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.1.jar:2.4.1]
	... 10 more

It can be seen that public/default/0xf63fffff_0xf67fffff made a split and failed the second time, throwing an exception of 500.

And I found that the topic was balanced on another broker broker-2.

15:15:16.049 [pulsar-ordered-OrderedExecutor-4-0-EventThread] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/test-topic-o] Topic closed
15:15:16.052 [pulsar-ordered-OrderedExecutor-4-0-EventThread] INFO  org.apache.pulsar.broker.namespace.OwnershipCache - [/namespace/public/default/0xf63fffff_0xf64fffff] Removed zk lock for service unit: OK
15:15:16.052 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Unloading public/default/0xf63fffff_0xf64fffff namespace-bundle with 1 topics completed in 195.0 ms
15:15:16.053 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Disabling ownership: public/default/0xf64fffff_0xf65fffff
15:15:16.056 [pulsar-ordered-OrderedExecutor-4-0-EventThread] INFO  org.apache.pulsar.broker.namespace.OwnershipCache - [/namespace/public/default/0xf64fffff_0xf65fffff] Removed zk lock for service unit: OK
15:15:16.056 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Unloading public/default/0xf64fffff_0xf65fffff namespace-bundle with 0 topics completed in 3.0 ms
15:15:16.056 [pulsar-web-32-9] INFO  org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Successfully split namespace bundle public/default/0xf63fffff_0xf65fffff
15:15:16.057 [pulsar-web-32-9] INFO  org.eclipse.jetty.server.RequestLog -  - - [06/Nov/2019:15:15:15 +0000] "PUT /admin/v2/namespaces/public/default/0xf63fffff_0xf65fffff/split?authoritative=true&unload=true HTTP/1.1" 204 0 "-" "Pulsar-Java-v2.4.1" 244

@tuteng
Copy link
Member

tuteng commented Nov 7, 2019

@sijie @jiazhai

@lhotari
Copy link
Member

lhotari commented Nov 11, 2020

I'm running into the same issue with a custom build of Pulsar built from v2.6.2-candidate-2 tag (same as v2.6.2).
In my case, I have a separate namespace with auto topic creation disabled and the application creates topics explicitly.
The scenario where this fails creates 7500 topics in 3 minutes at the rate of 25 topics/second. 5 messages are produced in the each created topic and then later on read using a reader (with 25 simultaneous "virtual users"). Things start failing very soon in this scenario.

The workaround is to disable bundle auto splitting and use a reasonable default number of namespace bundles..

loadBalancerAutoBundleSplitEnabled=false
defaultNumberOfNamespaceBundles=32

With this workaround, Pulsar is able to handle the simple load test in a 3 node cluster (3 zk, 3 bookie, 3 broker).

@frankjkelly
Copy link
Contributor

Thanks @lhotari I have hit this too in Apache Pulsar 2.6.1
CC @addisonj

@frankjkelly
Copy link
Contributor

In chatting with @merlimat in the latest Office hours he felt this was fixed in 2.7

@lhotari
Copy link
Member

lhotari commented Aug 26, 2021

There's a report on Slack from a 2.7.3 user experiencing this issue.

@bpereto
Copy link
Contributor

bpereto commented Jan 28, 2022

we experience this also in version 2.8.1 - on high load (publish rate at 220k msg/s)
bundle splitting seems not to be useful at this stage, because we have 1 topic (2 partitions).

setup:

  • 1 proxy
  • 2 brokers
  • 3 bookies
  • 3 zookeepers

proxy

13:58:58.588 [pulsar-proxy-io-2-1] ERROR org.apache.pulsar.client.impl.ClientCnx - [id: 0xcab5b0cc .. Close connection because received internal-server error java.lang.IllegalStateException: Namespace bundle public/default/0x00e60000_0x00e68000 is being unloaded
13:58:58.590 [pulsar-proxy-io-2-1] WARN  org.apache.pulsar.proxy.server.LookupProxyHandler - Failed to lookup topic netflow.goflow-proxy.pb-partition-1: java.lang.IllegalStateException: Namespace bundle public/default/0x00e60000_0x00e68000 is being unloaded

broker
scraped logs:

13:57:59.179 [BookKeeperClientWorker-OrderedExecutor-3-0] WARN  org.apache.bookkeeper.client.BookieWatcherImpl - New ensemble: [lake03:3181, lake01:3181] is not adhering to Placement Policy. quarantinedBookies: []

13:58:03.348 [pulsar-modular-load-manager-33-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Load-manager splitting bundle public/default/0x00e60000_0x00e80000 and unloading true
13:58:03.353 [pulsar-web-41-2] INFO  org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Split namespace bundle public/default/0x00e60000_0x00e80000
13:58:03.358 [pulsar-web-41-2] ERROR org.apache.pulsar.broker.web.PulsarWebResource - [null] Invalid bundle range public/default/0x00e60000_0x00e80000, Invalid upper boundary for bundle
13:58:03.361 [pulsar-modular-load-manager-33-1] ERROR org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Failed to split namespace bundle public/default/0x00e60000_0x00e80000
org.apache.pulsar.client.admin.PulsarAdminException$PreconditionFailedException: Invalid upper boundary for bundle
13:58:58.476 [pulsar-ordered-OrderedExecutor-5-0] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Disabling ownership: public/default/0x00e68000_0x00e70000
13:58:58.478 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L396 E6613
13:58:58.587 [pulsar-io-4-8] INFO  org.apache.pulsar.broker.lookup.TopicLookupBase - Failed to lookup null for topic persistent://public/default/netflow.goflow-proxy.pb-partition-1 with error Namespace bundle public/default/0x00e60000_0x00e68000 is being unloaded

with the workaround to disable bundle splitting loadBalancerAutoBundleSplitEnabled=false it seems stable.

@lhotari
Copy link
Member

lhotari commented Jan 28, 2022

@bpereto Thanks for reporting and sharing the relevant logs.

@frankjkelly
Copy link
Contributor

@lhotari @sijie any thoughts on how this might be mitigated e.g. smaller namespace bundles or load balancing on different conditions?

This has some interesting ideas #13340 which I'm not 100% apply but might be useful?

@lhotari
Copy link
Member

lhotari commented Mar 15, 2022

There's a bundle splitting related fix in #14680 .

@frankjkelly
Copy link
Contributor

Thanks @lhotari I don't think that was the issue in our case but it's certainly good to see.

@lhotari
Copy link
Member

lhotari commented May 5, 2022

I now noticed that there's an old issue #1117 which has a similar problem description. PR #1428 was merged to address that. Perhaps some issue is remaining.
I've been testing with a maintenance branch based on branch-2.7 . I'll try to see if the problem can still be reproduced in master.

@lhotari
Copy link
Member

lhotari commented May 5, 2022

btw. In my 2.7.x testing, I'm also hitting a problem fixed by #12252 . That is also related to bundle splitting. I'll soon experiment by cherry-picking #14680 and #12252 to see if that helps.

@frankjkelly
Copy link
Contributor

Thanks for the updates @lhotari

@tisonkun
Copy link
Member

tisonkun commented Dec 6, 2022

Closed as stale and it seems to be fixed. Please open a new issue if it's still relevant to the maintained versions.

@tisonkun tisonkun closed this as completed Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker lifecycle/stale type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

No branches or pull requests

8 participants