Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The LaunchEngine may be closed multiple times #1806

Closed
3 tasks done
cfmcgrady opened this issue Jan 19, 2022 · 0 comments
Closed
3 tasks done

The LaunchEngine may be closed multiple times #1806

cfmcgrady opened this issue Jan 19, 2022 · 0 comments

Comments

@cfmcgrady
Copy link
Contributor

cfmcgrady commented Jan 19, 2022

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

When a query run took more than kyuubi.operation.idle.timeout (3 hours), operation LaunchEngine will be closed multiple times.

  1. LaunchEngine will be closed by SessionManager.TimeoutChecker for the first time when the time reaches kyuubi.operation.idle.timeout 3 hours
  2. LaunchEngine will be closed for the second time when the Kyuubi server receives the request TCloseSessionReq

Affects Version(s)

master

Kyuubi Server Log Output

22/01/17 03:11:03 INFO operation.LaunchEngine: Processing etl's query[901dd0f0-917a-410c-9a1b-16bff14dee36]: INITIALIZED_STATE -> PENDING_STATE, statement: LAUNCH_ENGINE
22/01/17 03:11:03 INFO log.OperationLog: Creating operation log file /opt/kyuubi/work/server_operation_logs/2ff1515a-e59e-49c5-bdad-eb412d6dab27/901dd0f0-917a-410c-9a1b-16bff14dee36
22/01/17 03:11:03 INFO operation.LaunchEngine: Processing etl's query[901dd0f0-917a-410c-9a1b-16bff14dee36]: PENDING_STATE -> RUNNING_STATE, statement: LAUNCH_ENGINE
22/01/17 03:11:04 INFO operation.LaunchEngine: Processing etl's query[901dd0f0-917a-410c-9a1b-16bff14dee36]: RUNNING_STATE -> FINISHED_STATE, statement: LAUNCH_ENGINE, time taken: 0.024 seconds
22/01/17 06:12:39 WARN operation.KyuubiOperationManager: Operation OperationHandle [type=UNKNOWN_OPERATION, identifier: 901dd0f0-917a-410c-9a1b-16bff14dee36] is timed-out and will be closed
22/01/17 06:12:39 INFO operation.LaunchEngine: Processing etl's query[901dd0f0-917a-410c-9a1b-16bff14dee36]: FINISHED_STATE -> CLOSED_STATE, statement: LAUNCH_ENGINE
org.apache.kyuubi.KyuubiSQLException: Invalid OperationHandle [type=UNKNOWN_OPERATION, identifier: 901dd0f0-917a-410c-9a1b-16bff14dee36]
	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
	at org.apache.kyuubi.operation.OperationManager.removeOperation(OperationManager.scala:89)
	at org.apache.kyuubi.operation.OperationManager.closeOperation(OperationManager.scala:104)
	at org.apache.kyuubi.session.AbstractSession.$anonfun$closeOperation$1(AbstractSession.scala:182)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.kyuubi.session.AbstractSession.withAcquireRelease(AbstractSession.scala:76)
	at org.apache.kyuubi.session.AbstractSession.closeOperation(AbstractSession.scala:180)
	at org.apache.kyuubi.session.KyuubiSessionImpl.close(KyuubiSessionImpl.scala:129)
	at org.apache.kyuubi.session.SessionManager.closeSession(SessionManager.scala:93)
	at org.apache.kyuubi.service.AbstractBackendService.closeSession(AbstractBackendService.scala:49)
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.org$apache$kyuubi$server$BackendServiceTimeMetric$$super$closeSession(KyuubiServer.scala:139)
	at org.apache.kyuubi.server.BackendServiceTimeMetric.$anonfun$closeSession$1(BackendServiceTimeMetric.scala:43)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.kyuubi.metrics.MetricsSystem$.timerTracing(MetricsSystem.scala:103)
	at org.apache.kyuubi.server.BackendServiceTimeMetric.closeSession(BackendServiceTimeMetric.scala:43)
	at org.apache.kyuubi.server.BackendServiceTimeMetric.closeSession$(BackendServiceTimeMetric.scala:41)
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.closeSession(KyuubiServer.scala:139)
	at org.apache.kyuubi.service.ThriftBinaryFrontendService.CloseSession(ThriftBinaryFrontendService.scala:228)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1397)
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1382)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.kyuubi.service.authentication.HadoopThriftAuthBridgeServer$TUGIAssumingProcessor.process(HadoopThriftAuthBridgeServer.scala:162)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@cfmcgrady cfmcgrady added the bug label Jan 19, 2022
@cfmcgrady cfmcgrady changed the title The LaunchEngine may close multi-time. The LaunchEngine may be closed multiple times Jan 19, 2022
yaooqinn pushed a commit that referenced this issue Jan 20, 2022
<!--
Thanks for sending a pull request!

Here are some tips for you:
  1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html
  2. If the PR is related to an issue in https://github.com/apache/incubator-kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
  3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
-->

### _Why are the changes needed?_
<!--
Please clarify why the changes are needed. For instance,
  1. If you add a feature, you can talk about the use case of it.
  2. If you fix a bug, you can clarify why it is a bug.
-->

close #1806

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #1807 from cfmcgrady/kyuubi-1806.

Closes #1806

c70a594 [Fu Chen] fix

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 5bc7ea1)
Signed-off-by: Kent Yao <yao@apache.org>
pan3793 pushed a commit that referenced this issue Mar 1, 2022
…ed multiple times

### _Why are the changes needed?_

After [PR-1807](#1807), the error will still occur.

```
11:11:05.589 INFO org.apache.kyuubi.operation.log.OperationLog: Creating operation log file /Users/fchen/Project/bigdata/incubator-kyuubi/server_operation_logs/6a640069-6a26-4817-8ad7-77385360bb9d/a09d4b0a-85ec-4640-8ab9-c6a37146fd1f
11:11:05.603 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: INITIALIZED_STATE -> PENDING_STATE, statement: LAUNCH_ENGINE
11:11:05.604 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: PENDING_STATE -> RUNNING_STATE, statement: LAUNCH_ENGINE
11:11:25.075 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: RUNNING_STATE -> FINISHED_STATE, statement: LAUNCH_ENGINE, time taken: 19.471 seconds
11:12:58.257 WARN org.apache.kyuubi.operation.KyuubiOperationManager: Operation OperationHandle [type=UNKNOWN_OPERATION, identifier: a09d4b0a-85ec-4640-8ab9-c6a37146fd1f] is timed-out and will be closed
11:12:58.258 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: FINISHED_STATE -> CLOSED_STATE, statement: LAUNCH_ENGINE
11:13:04.615 INFO org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Received request of closing SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d]
11:13:04.618 INFO org.apache.kyuubi.session.KyuubiSessionManager: SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d] is closed, current opening sessions 0
11:13:04.619 ERROR org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Error closing session:
org.apache.kyuubi.KyuubiSQLException: Invalid OperationHandle [type=UNKNOWN_OPERATION, identifier: a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]
	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69) ~[classes/:?]
	at org.apache.kyuubi.operation.OperationManager.removeOperation(OperationManager.scala:89) ~[classes/:?]
	at org.apache.kyuubi.operation.OperationManager.closeOperation(OperationManager.scala:104) ~[classes/:?]
	at org.apache.kyuubi.session.AbstractSession.$anonfun$closeOperation$1(AbstractSession.scala:181) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
	at org.apache.kyuubi.session.AbstractSession.withAcquireRelease(AbstractSession.scala:75) ~[classes/:?]
	at org.apache.kyuubi.session.AbstractSession.closeOperation(AbstractSession.scala:179) ~[classes/:?]
	at org.apache.kyuubi.session.KyuubiSessionImpl.close(KyuubiSessionImpl.scala:149) ~[classes/:?]
	at org.apache.kyuubi.session.SessionManager.closeSession(SessionManager.scala:93) ~[classes/:?]
	at org.apache.kyuubi.service.AbstractBackendService.closeSession(AbstractBackendService.scala:49) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.org$apache$kyuubi$server$BackendServiceMetric$$super$closeSession(KyuubiServer.scala:142) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.$anonfun$closeSession$1(BackendServiceMetric.scala:43) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
	at org.apache.kyuubi.metrics.MetricsSystem$.timerTracing(MetricsSystem.scala:111) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.closeSession(BackendServiceMetric.scala:43) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.closeSession$(BackendServiceMetric.scala:41) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.closeSession(KyuubiServer.scala:142) ~[classes/:?]
	at org.apache.kyuubi.service.TFrontendService.CloseSession(TFrontendService.scala:158) [classes/:?]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1397) [hive-service-rpc-2.3.9.jar:2.3.9]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1382) [hive-service-rpc-2.3.9.jar:2.3.9]
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) [libthrift-0.16.0.jar:0.16.0]
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [libthrift-0.16.0.jar:0.16.0]
	at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36) [classes/:?]
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250) [libthrift-0.16.0.jar:0.16.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
11:13:04.629 INFO org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Finished closing SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d]

```

```
A => LaunchEngine Operation
                                                                    TCloseSessionReq
        A-PENDING        A-FINISHED                                       |
             Δ               Δ                                            |
             |               |                                            ▽
----t1------t2------t3------t4------------------------------t5------------t6--------------t7->
     |              |                                       |
     ▽              |                                       ▽
A-INITIALIZED       ▽                                A-CLOSED (timed-out)
                A-RUNNING
                             |<-- operation.idle.timeout -->|<-- operation.idle.timeout -->|

```

1. the variable `LaunchEngine.lastAccessTime` will be reset to `System.currentTimeMillis()` when the operation is timeouts(at time t5).

https://github.com/apache/incubator-kyuubi/blob/67d1c5dd513327b350dac3d6855af5223a594087/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala#L75-L89

2. when the Kyuubi server receives the request TCloseSessionReq at time t6 between t5 and t7. Since t6 < t7, `LaunchEngine.isTimedOut()` will return false, and we will close the operation LaunchEngine again.

https://github.com/apache/incubator-kyuubi/blob/67d1c5dd513327b350dac3d6855af5223a594087/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala#L189-L196

### _How was this patch tested?_

Pass CI.

Closes #1983 from cfmcgrady/kyuubi-1806-followup.

Closes #1983

Closes #1806

c764269 [Fu Chen] Fix LaunchEngine may be closed multiple times

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
pan3793 pushed a commit that referenced this issue Mar 1, 2022
…ed multiple times

### _Why are the changes needed?_

After [PR-1807](#1807), the error will still occur.

```
11:11:05.589 INFO org.apache.kyuubi.operation.log.OperationLog: Creating operation log file /Users/fchen/Project/bigdata/incubator-kyuubi/server_operation_logs/6a640069-6a26-4817-8ad7-77385360bb9d/a09d4b0a-85ec-4640-8ab9-c6a37146fd1f
11:11:05.603 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: INITIALIZED_STATE -> PENDING_STATE, statement: LAUNCH_ENGINE
11:11:05.604 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: PENDING_STATE -> RUNNING_STATE, statement: LAUNCH_ENGINE
11:11:25.075 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: RUNNING_STATE -> FINISHED_STATE, statement: LAUNCH_ENGINE, time taken: 19.471 seconds
11:12:58.257 WARN org.apache.kyuubi.operation.KyuubiOperationManager: Operation OperationHandle [type=UNKNOWN_OPERATION, identifier: a09d4b0a-85ec-4640-8ab9-c6a37146fd1f] is timed-out and will be closed
11:12:58.258 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: FINISHED_STATE -> CLOSED_STATE, statement: LAUNCH_ENGINE
11:13:04.615 INFO org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Received request of closing SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d]
11:13:04.618 INFO org.apache.kyuubi.session.KyuubiSessionManager: SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d] is closed, current opening sessions 0
11:13:04.619 ERROR org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Error closing session:
org.apache.kyuubi.KyuubiSQLException: Invalid OperationHandle [type=UNKNOWN_OPERATION, identifier: a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]
	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69) ~[classes/:?]
	at org.apache.kyuubi.operation.OperationManager.removeOperation(OperationManager.scala:89) ~[classes/:?]
	at org.apache.kyuubi.operation.OperationManager.closeOperation(OperationManager.scala:104) ~[classes/:?]
	at org.apache.kyuubi.session.AbstractSession.$anonfun$closeOperation$1(AbstractSession.scala:181) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
	at org.apache.kyuubi.session.AbstractSession.withAcquireRelease(AbstractSession.scala:75) ~[classes/:?]
	at org.apache.kyuubi.session.AbstractSession.closeOperation(AbstractSession.scala:179) ~[classes/:?]
	at org.apache.kyuubi.session.KyuubiSessionImpl.close(KyuubiSessionImpl.scala:149) ~[classes/:?]
	at org.apache.kyuubi.session.SessionManager.closeSession(SessionManager.scala:93) ~[classes/:?]
	at org.apache.kyuubi.service.AbstractBackendService.closeSession(AbstractBackendService.scala:49) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.org$apache$kyuubi$server$BackendServiceMetric$$super$closeSession(KyuubiServer.scala:142) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.$anonfun$closeSession$1(BackendServiceMetric.scala:43) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
	at org.apache.kyuubi.metrics.MetricsSystem$.timerTracing(MetricsSystem.scala:111) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.closeSession(BackendServiceMetric.scala:43) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.closeSession$(BackendServiceMetric.scala:41) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.closeSession(KyuubiServer.scala:142) ~[classes/:?]
	at org.apache.kyuubi.service.TFrontendService.CloseSession(TFrontendService.scala:158) [classes/:?]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1397) [hive-service-rpc-2.3.9.jar:2.3.9]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1382) [hive-service-rpc-2.3.9.jar:2.3.9]
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) [libthrift-0.16.0.jar:0.16.0]
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [libthrift-0.16.0.jar:0.16.0]
	at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36) [classes/:?]
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250) [libthrift-0.16.0.jar:0.16.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
11:13:04.629 INFO org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Finished closing SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d]

```

```
A => LaunchEngine Operation
                                                                    TCloseSessionReq
        A-PENDING        A-FINISHED                                       |
             Δ               Δ                                            |
             |               |                                            ▽
----t1------t2------t3------t4------------------------------t5------------t6--------------t7->
     |              |                                       |
     ▽              |                                       ▽
A-INITIALIZED       ▽                                A-CLOSED (timed-out)
                A-RUNNING
                             |<-- operation.idle.timeout -->|<-- operation.idle.timeout -->|

```

1. the variable `LaunchEngine.lastAccessTime` will be reset to `System.currentTimeMillis()` when the operation is timeouts(at time t5).

https://github.com/apache/incubator-kyuubi/blob/67d1c5dd513327b350dac3d6855af5223a594087/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala#L75-L89

2. when the Kyuubi server receives the request TCloseSessionReq at time t6 between t5 and t7. Since t6 < t7, `LaunchEngine.isTimedOut()` will return false, and we will close the operation LaunchEngine again.

https://github.com/apache/incubator-kyuubi/blob/67d1c5dd513327b350dac3d6855af5223a594087/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala#L189-L196

### _How was this patch tested?_

Pass CI.

Closes #1983 from cfmcgrady/kyuubi-1806-followup.

Closes #1983

Closes #1806

c764269 [Fu Chen] Fix LaunchEngine may be closed multiple times

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 31ba97f)
Signed-off-by: Cheng Pan <chengpan@apache.org>
beryllw pushed a commit to beryllw/incubator-kyuubi that referenced this issue Jan 11, 2024
<!--
Thanks for sending a pull request!

Here are some tips for you:
  1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/contributions.html
  2. If the PR is related to an issue in https://github.com/apache/incubator-kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
  3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
-->

### _Why are the changes needed?_
<!--
Please clarify why the changes are needed. For instance,
  1. If you add a feature, you can talk about the use case of it.
  2. If you fix a bug, you can clarify why it is a bug.
-->

close apache#1806

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes apache#1807 from cfmcgrady/kyuubi-1806.

Closes apache#1806

c70a594 [Fu Chen] fix

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 5bc7ea1)
beryllw pushed a commit to beryllw/incubator-kyuubi that referenced this issue Jan 11, 2024
… may be closed multiple times

### _Why are the changes needed?_

After [PR-1807](apache#1807), the error will still occur.

```
11:11:05.589 INFO org.apache.kyuubi.operation.log.OperationLog: Creating operation log file /Users/fchen/Project/bigdata/incubator-kyuubi/server_operation_logs/6a640069-6a26-4817-8ad7-77385360bb9d/a09d4b0a-85ec-4640-8ab9-c6a37146fd1f
11:11:05.603 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: INITIALIZED_STATE -> PENDING_STATE, statement: LAUNCH_ENGINE
11:11:05.604 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: PENDING_STATE -> RUNNING_STATE, statement: LAUNCH_ENGINE
11:11:25.075 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: RUNNING_STATE -> FINISHED_STATE, statement: LAUNCH_ENGINE, time taken: 19.471 seconds
11:12:58.257 WARN org.apache.kyuubi.operation.KyuubiOperationManager: Operation OperationHandle [type=UNKNOWN_OPERATION, identifier: a09d4b0a-85ec-4640-8ab9-c6a37146fd1f] is timed-out and will be closed
11:12:58.258 INFO org.apache.kyuubi.operation.LaunchEngine: Processing anonymous's query[a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]: FINISHED_STATE -> CLOSED_STATE, statement: LAUNCH_ENGINE
11:13:04.615 INFO org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Received request of closing SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d]
11:13:04.618 INFO org.apache.kyuubi.session.KyuubiSessionManager: SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d] is closed, current opening sessions 0
11:13:04.619 ERROR org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Error closing session:
org.apache.kyuubi.KyuubiSQLException: Invalid OperationHandle [type=UNKNOWN_OPERATION, identifier: a09d4b0a-85ec-4640-8ab9-c6a37146fd1f]
	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69) ~[classes/:?]
	at org.apache.kyuubi.operation.OperationManager.removeOperation(OperationManager.scala:89) ~[classes/:?]
	at org.apache.kyuubi.operation.OperationManager.closeOperation(OperationManager.scala:104) ~[classes/:?]
	at org.apache.kyuubi.session.AbstractSession.$anonfun$closeOperation$1(AbstractSession.scala:181) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
	at org.apache.kyuubi.session.AbstractSession.withAcquireRelease(AbstractSession.scala:75) ~[classes/:?]
	at org.apache.kyuubi.session.AbstractSession.closeOperation(AbstractSession.scala:179) ~[classes/:?]
	at org.apache.kyuubi.session.KyuubiSessionImpl.close(KyuubiSessionImpl.scala:149) ~[classes/:?]
	at org.apache.kyuubi.session.SessionManager.closeSession(SessionManager.scala:93) ~[classes/:?]
	at org.apache.kyuubi.service.AbstractBackendService.closeSession(AbstractBackendService.scala:49) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.org$apache$kyuubi$server$BackendServiceMetric$$super$closeSession(KyuubiServer.scala:142) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.$anonfun$closeSession$1(BackendServiceMetric.scala:43) ~[classes/:?]
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
	at org.apache.kyuubi.metrics.MetricsSystem$.timerTracing(MetricsSystem.scala:111) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.closeSession(BackendServiceMetric.scala:43) ~[classes/:?]
	at org.apache.kyuubi.server.BackendServiceMetric.closeSession$(BackendServiceMetric.scala:41) ~[classes/:?]
	at org.apache.kyuubi.server.KyuubiServer$$anon$1.closeSession(KyuubiServer.scala:142) ~[classes/:?]
	at org.apache.kyuubi.service.TFrontendService.CloseSession(TFrontendService.scala:158) [classes/:?]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1397) [hive-service-rpc-2.3.9.jar:2.3.9]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1382) [hive-service-rpc-2.3.9.jar:2.3.9]
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) [libthrift-0.16.0.jar:0.16.0]
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [libthrift-0.16.0.jar:0.16.0]
	at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:36) [classes/:?]
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250) [libthrift-0.16.0.jar:0.16.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
11:13:04.629 INFO org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Finished closing SessionHandle [6a640069-6a26-4817-8ad7-77385360bb9d]

```

```
A => LaunchEngine Operation
                                                                    TCloseSessionReq
        A-PENDING        A-FINISHED                                       |
             Δ               Δ                                            |
             |               |                                            ▽
----t1------t2------t3------t4------------------------------t5------------t6--------------t7->
     |              |                                       |
     ▽              |                                       ▽
A-INITIALIZED       ▽                                A-CLOSED (timed-out)
                A-RUNNING
                             |<-- operation.idle.timeout -->|<-- operation.idle.timeout -->|

```

1. the variable `LaunchEngine.lastAccessTime` will be reset to `System.currentTimeMillis()` when the operation is timeouts(at time t5).

https://github.com/apache/incubator-kyuubi/blob/67d1c5dd513327b350dac3d6855af5223a594087/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala#L75-L89

2. when the Kyuubi server receives the request TCloseSessionReq at time t6 between t5 and t7. Since t6 < t7, `LaunchEngine.isTimedOut()` will return false, and we will close the operation LaunchEngine again.

https://github.com/apache/incubator-kyuubi/blob/67d1c5dd513327b350dac3d6855af5223a594087/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala#L189-L196

### _How was this patch tested?_

Pass CI.

Closes apache#1983 from cfmcgrady/kyuubi-1806-followup.

Closes apache#1983

Closes apache#1806

c764269 [Fu Chen] Fix LaunchEngine may be closed multiple times

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 31ba97f)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant