[KYUUBI #1969] Fix race on some service during start and stop phase
### _Why are the changes needed?_

#1969 might be fixed by this.

```
12:18:19.885 SparkTBinaryFrontendHandler-Pool: Thread-1260 INFO SparkSQLSessionManager: Session stopped due to shared level is Connection.
12:18:19.886 SparkTBinaryFrontendHandler-Pool: Thread-1260 INFO SparkSQLEngine: Service: [SparkTBinaryFrontend] is stopping.
12:18:19.886 SparkTBinaryFrontendHandler-Pool: Thread-1260 INFO SparkTBinaryFrontendService: SparkTBinaryFrontend has stopped
12:18:19.886 SparkTBinaryFrontendHandler-Pool: Thread-1260 INFO SparkTBinaryFrontendService: Service: [EngineServiceDiscovery] is stopping.
12:18:19.890 SparkTBinaryFrontendHandler-Pool: Thread-1260 DEBUG FailedDeleteManager: Path being added to guaranteed delete set: /kyuubi/CONNECTION/933147f8-1fcb-4cda-875f-0b219f97aaf3/serviceUri=localhost:43147;version=1.5.0-SNAPSHOT;sequence=0000000000
12:18:19.892 ProcessThread(sid:0 cport:40861): INFO PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x100000e6ac00000 type:delete cxid:0xb zxid:0x9 txntype:-1 reqpath:n/a Error Path:/kyuubi/CONNECTION/933147f8-1fcb-4cda-875f-0b219f97aaf3/serviceUri=localhost:43147;version=1.5.0-SNAPSHOT;sequence=0000000000 Error:KeeperErrorCode = NoNode for /kyuubi/CONNECTION/933147f8-1fcb-4cda-875f-0b219f97aaf3/serviceUri=localhost:43147;version=1.5.0-SNAPSHOT;sequence=0000000000
12:18:19.892 SyncThread:0 DEBUG FinalRequestProcessor: Processing request:: sessionid:0x100000e6ac00000 type:delete cxid:0xa zxid:0x8 txntype:2 reqpath:n/a
12:18:19.893 SparkTBinaryFrontendHandler-Pool: Thread-1260 ERROR ServiceDiscoveryClient: Failed to close the persistent ephemeral znodenull
java.io.IOException: java.lang.InterruptedException
	at org.apache.curator.framework.recipes.nodes.PersistentNode.close(PersistentNode.java:296) ~[curator-recipes-2.12.0.jar:?]
	at org.apache.kyuubi.ha.client.zookeeper.ServiceDiscoveryClient.deregisterService(ServiceDiscoveryClient.scala:117) ~[kyuubi-ha_2.12-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
	at org.apache.kyuubi.ha.client.EngineServiceDiscovery.stop(EngineServiceDiscovery.scala:35) ~[kyuubi-ha_2.12-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
	at org.apache.kyuubi.service.CompositeService.$anonfun$stop$2(CompositeService.scala:75) ~[kyuubi-common_2.12-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
```

For connection-level engines, deregistration is triggered by session close and runs on a thread of the SparkTBinaryFrontendHandler pool. That pool might be stopped before the thread finishes deleting the ZooKeeper node.
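One plausible reading of the log above is that the handler thread stops the frontend, which interrupts the pool's own threads, and then enters the blocking ZooKeeper cleanup with its interrupt flag already set. The Java sketch below reproduces that pattern with plain `java.util.concurrent` classes. It is not Kyuubi code and not the change made in this PR: `StopRaceSketch`, `blockingCleanup`, `stopFromHandlerThread`, and `stopFromDedicatedThread` are hypothetical names, and `Thread.sleep` stands in for the znode deletion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class StopRaceSketch {

    // Hypothetical stand-in for a blocking cleanup such as deleting a znode.
    // Returns false if the calling thread was interrupted before it finished.
    static boolean blockingCleanup() {
        try {
            Thread.sleep(50);
            return true;
        } catch (InterruptedException e) {
            return false;
        }
    }

    // Waits for a result and converts checked exceptions to unchecked ones.
    static boolean await(Future<Boolean> result) {
        try {
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Racy pattern: a pool thread stops its own pool, then runs the cleanup.
    // shutdownNow() interrupts all worker threads, including the caller.
    static boolean stopFromHandlerThread() {
        ExecutorService handlerPool = Executors.newFixedThreadPool(1);
        Future<Boolean> cleaned = handlerPool.submit(() -> {
            handlerPool.shutdownNow();
            return blockingCleanup();
        });
        return await(cleaned);
    }

    // One possible mitigation (an assumption, not the PR's fix): run the stop
    // sequence on a thread that does not belong to the pool being stopped.
    static boolean stopFromDedicatedThread() {
        ExecutorService handlerPool = Executors.newFixedThreadPool(1);
        CompletableFuture<Boolean> cleaned = new CompletableFuture<>();
        handlerPool.submit(() -> {
            Thread stopper = new Thread(() -> {
                handlerPool.shutdownNow();
                cleaned.complete(blockingCleanup());
            }, "stop-sketch");
            stopper.start();
        });
        return await(cleaned);
    }

    public static void main(String[] args) {
        System.out.println("cleanup on handler thread finished: " + stopFromHandlerThread());
        System.out.println("cleanup on dedicated thread finished: " + stopFromDedicatedThread());
    }
}
```

In the first variant the cleanup is cut short because the thread that runs it has just been interrupted by its own pool's shutdown. In the second it completes, since the thread doing the cleanup is outside the pool.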
This PR also fixes some other race issues.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #1980 from yaooqinn/conc.

Closes #1969

efd1270 [Kent Yao] address comments
aed6636 [Kent Yao] Fix race on some service during stop phase
6359262 [Kent Yao] Fix race on some service during stop phase

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: ulysses-you <ulyssesyou@apache.org>