-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Open
Description
When a test controller is restarted (such as in PinotTaskManagerStatelessTest), the controller will be come unavailable for ~1sec until it returns back to normal.
Issue was first discovered in #8390 but this could mean that the startController command didn't return a ready controller upon finish.
see code block:
Details
``` Connected to the target VM, address: '127.0.0.1:54163', transport: 'socket' Mar 28, 2022 2:03:04 PM org.glassfish.grizzly.http.server.NetworkListener start INFO: Started listener bound to [0.0.0.0:18998] Mar 28, 2022 2:03:04 PM org.glassfish.grizzly.http.server.HttpServer start INFO: [HttpServer] Started. Mar 28, 2022 2:03:10 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow INFO: Stopped listener bound to [0.0.0.0:18998] 14:03:10.409 [HelixController-pipeline-task-PinotTaskManagerStatelessTest-(b0992338_TASK)] ERROR org.apache.helix.controller.GenericHelixController - Cluster manager: localhost_18998 is not leader for PinotTaskManagerStatelessTest. Pipeline will not be invoked Mar 28, 2022 2:03:11 PM org.glassfish.grizzly.http.server.NetworkListener start INFO: Started listener bound to [0.0.0.0:18998] Mar 28, 2022 2:03:11 PM org.glassfish.grizzly.http.server.HttpServer start INFO: [HttpServer-1] Started. 14:03:36.069 [ZkClient-EventThread-113-localhost:2191] ERROR org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager - Caught exception while processing change for path /PinotTaskManagerStatelessTest/PROPERTYSTORE/CONFIGS/TABLE java.lang.IllegalStateException: ZkClient already closed! at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1171) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:725) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:718) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildNames(ZkBaseDataAccessor.java:511) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildNames(ZkCacheBaseDataAccessor.java:675) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.store.zk.AutoFallbackPropertyStore.getChildNames(AutoFallbackPropertyStore.java:283) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:692) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:681) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.refreshWatchers(PinotRealtimeSegmentManager.java:318) ~[classes/:?] at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:295) [classes/:?] at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleChildChange(PinotRealtimeSegmentManager.java:383) [classes/:?] at org.apache.helix.manager.zk.zookeeper.ZkClient$8.run(ZkClient.java:1070) [helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [helix-core-0.9.8.jar:0.9.8] 14:03:36.076 [ZkClient-EventThread-113-localhost:2191] ERROR org.apache.helix.manager.zk.zookeeper.ZkClient - Error handling event ZkEvent[Children of /PinotTaskManagerStatelessTest/PROPERTYSTORE/CONFIGS/TABLE changed sent to org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager@4ebfdbb4] java.lang.IllegalStateException: ZkClient already closed! at org.apache.helix.manager.zk.zookeeper.ZkClient.retryUntilConnected(ZkClient.java:1171) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:725) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.zookeeper.ZkClient.getChildren(ZkClient.java:718) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkBaseDataAccessor.getChildNames(ZkBaseDataAccessor.java:511) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildNames(ZkCacheBaseDataAccessor.java:675) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.store.zk.AutoFallbackPropertyStore.getChildNames(AutoFallbackPropertyStore.java:283) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:692) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.getChildren(ZkCacheBaseDataAccessor.java:681) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.refreshWatchers(PinotRealtimeSegmentManager.java:318) ~[classes/:?] at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.processPropertyStoreChange(PinotRealtimeSegmentManager.java:295) ~[classes/:?] at org.apache.pinot.controller.helix.core.realtime.PinotRealtimeSegmentManager.handleChildChange(PinotRealtimeSegmentManager.java:383) ~[classes/:?] at org.apache.helix.manager.zk.zookeeper.ZkClient$8.run(ZkClient.java:1070) ~[helix-core-0.9.8.jar:0.9.8] at org.apache.helix.manager.zk.zookeeper.ZkEventThread.run(ZkEventThread.java:69) [helix-core-0.9.8.jar:0.9.8] Mar 28, 2022 2:03:36 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow INFO: Stopped listener bound to [0.0.0.0:18998] 14:03:36.416 [HelixController-pipeline-task-PinotTaskManagerStatelessTest-(767f1028_TASK)] ERROR org.apache.helix.controller.GenericHelixController - Cluster manager: localhost_18998 is not leader for PinotTaskManagerStatelessTest. Pipeline will not be invoked===============================================
Default Suite
Total tests run: 1, Failures: 0, Skips: 0
Disconnected from the target VM, address: '127.0.0.1:54163', transport: 'socket'
Process finished with exit code 0
</details>
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels