
[Bug] kyuubi.frontend.thrift.binary.bind.port address already in use when run flink-yarn-session after start KyuubiServer #5957

Closed
2 of 4 tasks
hanson2021 opened this issue Jan 10, 2024 · 7 comments
Labels
kind:bug This is clearly a bug priority:major

Comments

@hanson2021

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the bug

After starting KyuubiServer, running Flink SQL in yarn-session mode fails because the engine initializes its server reusing kyuubi.frontend.thrift.binary.bind.port. Running Flink SQL in yarn-application mode and running Spark SQL both work.
When the deprecated kyuubi.frontend.bind.port=10009 is used instead, all engines are OK.
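The failure mode is a plain TCP port conflict: the server has already bound the configured thrift port, and the Flink engine, seeing the same kyuubi.frontend.thrift.binary.bind.port value, tries to bind it again. A minimal sketch of that mechanism (illustrative code, not Kyuubi's; the class and method names are made up):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.ServerSocket;

public class PortConflictDemo {
    // Returns true if a second bind to an already-bound port fails,
    // mirroring the engine's "Address already in use (Bind failed)".
    static boolean secondBindFails() {
        try (ServerSocket server = new ServerSocket(0)) { // server grabs a free port
            int port = server.getLocalPort();
            // The "engine" then tries to bind the very same port.
            try (ServerSocket engine = new ServerSocket(port)) {
                return false; // not reached: the port is already taken
            } catch (BindException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("conflict=" + secondBindFails());
    }
}
```

Running it prints `conflict=true`, matching the java.net.BindException in the engine log.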

Affects Version(s)

master/1.8.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

2024-01-10 10:22:54.155 INFO main org.apache.kyuubi.engine.flink.FlinkSQLBackendService: Service[FlinkSQLBackendService] is initialized.
2024-01-10 10:22:54.231 ERROR main org.apache.kyuubi.engine.flink.FlinkTBinaryFrontendService: java.net.BindException: Address already in use (Bind failed)
2024-01-10 10:22:54.231 ERROR main org.apache.kyuubi.engine.flink.FlinkSQLEngine: Fatal error occurs, thus stopping the engines
org.apache.kyuubi.KyuubiException: Failed to initialize frontend service on sh1-bigdata-m04/10.2.50.73:10009.
        at org.apache.kyuubi.service.TBinaryFrontendService.initialize(TBinaryFrontendService.scala:127)
        at org.apache.kyuubi.service.CompositeService.$anonfun$initialize$1(CompositeService.scala:40)
        at org.apache.kyuubi.service.CompositeService.$anonfun$initialize$1$adapted(CompositeService.scala:40)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:58)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:51)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.kyuubi.service.CompositeService.initialize(CompositeService.scala:40)
        at org.apache.kyuubi.service.Serverable.initialize(Serverable.scala:46)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine.initialize(FlinkSQLEngine.scala:43)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.$anonfun$startEngine$2(FlinkSQLEngine.scala:124)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.$anonfun$startEngine$2$adapted(FlinkSQLEngine.scala:123)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.startEngine(FlinkSQLEngine.scala:123)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.main(FlinkSQLEngine.scala:100)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine.main(FlinkSQLEngine.scala)
Caused by: java.net.BindException: Address already in use (Bind failed)
        at java.net.PlainSocketImpl.socketBind(Native Method)
        at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
        at java.net.ServerSocket.bind(ServerSocket.java:375)
        at java.net.ServerSocket.<init>(ServerSocket.java:237)
        at org.apache.kyuubi.service.TBinaryFrontendService.initialize(TBinaryFrontendService.scala:101)
        ... 14 more

Kyuubi Server Configurations

kyuubi.frontend.thrift.binary.bind.port=10009
kyuubi.engine.type=FLINK_SQL
flink.execution.target=yarn-session

Kyuubi Engine Configurations

No response

Additional context

No response

Are you willing to submit PR?

  • Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
  • No. I cannot submit a PR at this time.
@hanson2021 hanson2021 added kind:bug This is clearly a bug priority:major labels Jan 10, 2024
@pan3793
Member

pan3793 commented Jan 10, 2024

cc @link3280

@pan3793
Member

pan3793 commented Jan 10, 2024

The engine should not see the configured kyuubi.frontend.thrift.binary.bind.port

@link3280
Contributor

The engine should not see the configured kyuubi.frontend.thrift.binary.bind.port

I find the engines use kyuubiConf.setIfMissing(KyuubiConf.FRONTEND_THRIFT_BINARY_BIND_PORT, 0). Could we simply override kyuubi.frontend.thrift.binary.bind.port with 0 on the engine side?
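For context, setIfMissing only writes the default when the key is absent, so a kyuubi.frontend.thrift.binary.bind.port value leaked from the server's config wins over the engine's intended 0 (random port). A toy sketch of the two semantics using a plain Map (the SetIfMissingDemo class is hypothetical, not Kyuubi code):

```java
import java.util.HashMap;
import java.util.Map;

public class SetIfMissingDemo {
    static final String PORT_KEY = "kyuubi.frontend.thrift.binary.bind.port";

    // Mimics KyuubiConf.setIfMissing vs. an unconditional set.
    static String resolvedPort(Map<String, String> conf, boolean forceOverride) {
        if (forceOverride) {
            conf.put(PORT_KEY, "0");         // set: engine always gets a random port
        } else {
            conf.putIfAbsent(PORT_KEY, "0"); // setIfMissing: a leaked value wins
        }
        return conf.get(PORT_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> leaked = new HashMap<>();
        leaked.put(PORT_KEY, "10009"); // server-side value seen by the engine
        System.out.println(resolvedPort(new HashMap<>(leaked), false)); // 10009
        System.out.println(resolvedPort(new HashMap<>(leaked), true));  // 0
    }
}
```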

@pan3793
Member

pan3793 commented Jan 10, 2024

Do you mean replacing setIfMissing with set? In Spark cases, we sometimes want to use a specific port for debugging.

I think the right direction is filtering those configurations out before passing args to the engine launch command.
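A hedged sketch of that direction: drop server-only keys before building the --conf args of the engine launch command (the SERVER_ONLY set and the helper below are illustrative, not Kyuubi's actual API):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class ConfFilterDemo {
    // Hypothetical set of server-only keys; the real list would live in Kyuubi.
    static final Set<String> SERVER_ONLY = Set.of(
        "kyuubi.frontend.thrift.binary.bind.port",
        "kyuubi.frontend.rest.bind.port");

    // Drop server-only entries before building the engine launch command.
    static List<String> toEngineArgs(Map<String, String> conf) {
        return conf.entrySet().stream()
            .filter(e -> !SERVER_ONLY.contains(e.getKey()))
            .map(e -> "--conf " + e.getKey() + "=" + e.getValue())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put("kyuubi.frontend.thrift.binary.bind.port", "10009");
        conf.put("kyuubi.engine.type", "FLINK_SQL");
        conf.put("flink.execution.target", "yarn-session");
        toEngineArgs(conf).forEach(System.out::println); // port key is filtered out
    }
}
```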

@link3280
Contributor

link3280 commented Jan 12, 2024

@pan3793 Understood. But oddly enough, I couldn't find the code related to the filtering. Could you give me some pointers? Another possible approach is filtering config options with isServerOnly, WDYT?

@link3280
Contributor

link3280 commented Feb 5, 2024

ping @pan3793

@ysmintor

I use Kyuubi 1.9.0.
@hanson2021 I followed your approach, and I can run in yarn-application mode, but I still get errors in yarn-session mode.
If I previously used yarn-application mode with the same user, I can enter the Kyuubi beeline interface; however, I get an error once I submit a Flink select statement.

I believe @link3280 and @pan3793 will solve the yarn-session bugs eventually. I would like to provide the information to reproduce them.

This is the root user, which previously connected in yarn-application mode successfully:

0: jdbc:hive2://node4:18009/> select 1+111 as re;
Error: Error operating ExecuteStatement: org.apache.flink.table.api.TableException: Failed to execute sql
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:976)
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1424)
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeOperation(OperationExecutor.java:435)
        at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:197)
        at org.apache.kyuubi.engine.flink.operation.ExecuteStatement.executeStatement(ExecuteStatement.scala:78)
        at org.apache.kyuubi.engine.flink.operation.ExecuteStatement.runInternal(ExecuteStatement.scala:62)
        at org.apache.kyuubi.operation.AbstractOperation.run(AbstractOperation.scala:173)
        at org.apache.kyuubi.session.AbstractSession.runOperation(AbstractSession.scala:100)
        at org.apache.kyuubi.session.AbstractSession.$anonfun$executeStatement$1(AbstractSession.scala:130)
        at org.apache.kyuubi.session.AbstractSession.withAcquireRelease(AbstractSession.scala:81)
        at org.apache.kyuubi.session.AbstractSession.executeStatement(AbstractSession.scala:127)
        at org.apache.kyuubi.service.AbstractBackendService.executeStatement(AbstractBackendService.scala:66)
        at org.apache.kyuubi.service.TFrontendService.ExecuteStatement(TFrontendService.scala:253)
        at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
        at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
        at org.apache.kyuubi.shaded.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.kyuubi.shaded.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:35)
        at org.apache.kyuubi.shaded.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.flink.util.FlinkException: Failed to execute job 'collect'.
        at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:2221)
        at org.apache.flink.table.planner.delegation.DefaultExecutor.executeAsync(DefaultExecutor.java:95)
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:957)
        ... 21 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
        at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$12(RestClusterClient.java:453)
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
        at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
        at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:272)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
        at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575)
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:943)
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
        ... 3 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Not found: /v1/jobs]
        at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:590)
        at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$4(RestClient.java:570)
        at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966)
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
        ... 4 more (state=,code=0)

Connecting with a new user produces output like this:

SJCK-BASE80:/opt/bigdata/kyuubi # bin/beeline -u 'jdbc:hive2://node4:18009/;#kyuubi.engine.type=FLINK_SQL;flink.execution.target=yarn-session;yarn.application.id=application_1709174207782_0177' -n roota
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
Connecting to jdbc:hive2://node4:18009/;#kyuubi.engine.type=FLINK_SQL;flink.execution.target=yarn-session;yarn.application.id=application_1709174207782_0177
2024-03-28 11:19:57.017 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.operation.LaunchEngine: Processing roota's query[937e38e4-4cfc-4633-b2ba-0c5ae31bd784]: PENDING_STATE -> RUNNING_STATE, statement:
LaunchEngine
2024-03-28 11:19:57.019 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: Starting
2024-03-28 11:19:57.019 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Initiating client connection, connectString=10.133.195.122:2181 sessionTimeout=60000 watcher=org.apache.kyuubi.shaded.curator.ConnectionState@62389686
2024-03-28 11:19:57.021 INFO KyuubiSessionManager-exec-pool: Thread-196-SendThread(SJCK-GBASE80:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Opening socket connection to server SJCK-GBASE80/10.133.195.122:2181. Will not attempt to authenticate using SASL (unknown error)
2024-03-28 11:19:57.022 INFO KyuubiSessionManager-exec-pool: Thread-196-SendThread(SJCK-GBASE80:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Socket connection established to SJCK-GBASE80/10.133.195.122:2181, initiating session
2024-03-28 11:19:57.024 INFO KyuubiSessionManager-exec-pool: Thread-196-SendThread(SJCK-GBASE80:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Session establishment complete on server SJCK-GBASE80/10.133.195.122:2181, sessionid = 0x119ce24d0810007, negotiated timeout = 60000
2024-03-28 11:19:57.025 INFO KyuubiSessionManager-exec-pool: Thread-196-EventThread org.apache.kyuubi.shaded.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2024-03-28 11:19:57.045 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.engine.ProcBuilder: Creating roota's working directory at /opt/bigdata/kyuubi/work/roota
2024-03-28 11:19:57.053 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.engine.EngineRef: Launching engine:
/usr/local/jdk-17.0.10/bin/java \
        -Xmx1g \
        -cp /opt/bigdata/kyuubi/externals/engines/flink/kyuubi-flink-sql-engine_2.12-1.9.0.jar:/opt/bigdata/flink/opt/flink-sql-client-1.17.2.jar:/opt/bigdata/flink/opt/flink-sql-gateway-1.17.2.jar:/opt/bigdata/flink/lib/*:/opt/bigdata/flink/conf:/opt/bigdata/hadoop/etc/hadoop:/opt/bigdata/hadoop/etc/hadoop:/opt/bigdata/hadoop/share/hadoop/common/lib/*:/opt/bigdata/hadoop/share/hadoop/common/*:/opt/bigdata/hadoop/share/hadoop/hdfs:/opt/bigdata/hadoop/share/hadoop/hdfs/lib/*:/opt/bigdata/hadoop/share/hadoop/hdfs/*:/opt/bigdata/hadoop/share/hadoop/mapreduce/lib/*:/opt/bigdata/hadoop/share/hadoop/mapreduce/*:/opt/bigdata/hadoop/share/hadoop/yarn:/opt/bigdata/hadoop/share/hadoop/yarn/lib/*:/opt/bigdata/hadoop/share/hadoop/yarn/*:/opt/bigdata/hadoop/bin/hadoop org.apache.kyuubi.engine.flink.FlinkSQLEngine \
        --conf kyuubi.session.user=roota \
        --conf flink.app.name=kyuubi_USER_FLINK_SQL_roota_default_4c0efb34-450f-496d-9588-12ab86698b83 \
        --conf flink.execution.target=yarn-session \
        --conf hive.server2.thrift.resultset.default.fetch.size=1000 \
        --conf kyuubi.client.ipAddress=10.133.195.122 \
        --conf kyuubi.client.version=1.9.0 \
        --conf kyuubi.engine.share.level=USER \
        --conf kyuubi.engine.submit.time=1711595997043 \
        --conf kyuubi.engine.type=FLINK_SQL \
        --conf kyuubi.frontend.protocols=THRIFT_BINARY,REST \
        --conf kyuubi.ha.addresses=10.133.195.122:2181 \
        --conf kyuubi.ha.engine.ref.id=4c0efb34-450f-496d-9588-12ab86698b83 \
        --conf kyuubi.ha.namespace=/kyuubi_1.9.0_USER_FLINK_SQL/roota/default \
        --conf kyuubi.ha.zookeeper.auth.type=NONE \
        --conf kyuubi.metrics.prometheus.port=18007 \
        --conf kyuubi.server.ipAddress=10.133.151.189 \
        --conf kyuubi.session.connection.url=10.133.151.189:18009 \
        --conf kyuubi.session.real.user=roota \
        --conf yarn.application.id=application_1709174207782_0177
2024-03-28 11:19:57.055 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.engine.ProcBuilder: Logging to /opt/bigdata/kyuubi/work/roota/kyuubi-flink-sql-engine.log.0
2024-03-28 11:19:59.051 INFO Curator-Framework-0 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2024-03-28 11:19:59.054 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Session: 0x119ce24d0810007 closed
2024-03-28 11:19:59.054 INFO KyuubiSessionManager-exec-pool: Thread-196-EventThread org.apache.kyuubi.shaded.zookeeper.ClientCnxn: EventThread shut down for session: 0x119ce24d0810007
2024-03-28 11:19:59.055 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.operation.LaunchEngine: Processing roota's query[937e38e4-4cfc-4633-b2ba-0c5ae31bd784]: RUNNING_STATE -> ERROR_STATE, time taken: 2.038 seconds
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/flink/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: 'void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)'
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:84)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:318)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:303)
        at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1827)
        at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:709)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:659)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:570)
        at org.apache.kyuubi.Utils$.currentUser(Utils.scala:217)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<init>(FlinkSQLEngine.scala:66)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<clinit>(FlinkSQLEngine.scala)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine.main(FlinkSQLEngine.scala)
Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Exception in thread "main" java.lang.NoSuchMethodError: 'void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)'
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:84)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:318)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:303)
        at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1827)
        at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:709)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:659)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:570)
        at org.apache.kyuubi.Utils$.currentUser(Utils.scala:217)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<init>(FlinkSQLEngine.scala:66)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<clinit>(FlinkSQLEngine.scala)
        at org.apache.kyuubi.engine.flink.FlinkSQLEngine.main(FlinkSQLEngine.scala)
 See more: /opt/bigdata/kyuubi/work/roota/kyuubi-flink-sql-engine.log.0
        at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
        at org.apache.kyuubi.engine.ProcBuilder.$anonfun$start$1(ProcBuilder.scala:234)
        at java.base/java.lang.Thread.run(Thread.java:842)
.
FYI: The last 10 line(s) of log are:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/flink/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
        at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
        at org.apache.kyuubi.engine.ProcBuilder.getError(ProcBuilder.scala:281)
        at org.apache.kyuubi.engine.ProcBuilder.getError$(ProcBuilder.scala:270)
        at org.apache.kyuubi.engine.flink.FlinkProcessBuilder.getError(FlinkProcessBuilder.scala:39)
        at org.apache.kyuubi.engine.EngineRef.$anonfun$create$1(EngineRef.scala:236)
        at org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient.tryWithLock(ZookeeperDiscoveryClient.scala:166)
        at org.apache.kyuubi.engine.EngineRef.tryWithLock(EngineRef.scala:178)
        at org.apache.kyuubi.engine.EngineRef.create(EngineRef.scala:183)
        at org.apache.kyuubi.engine.EngineRef.$anonfun$getOrCreate$1(EngineRef.scala:317)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.kyuubi.engine.EngineRef.getOrCreate(EngineRef.scala:317)
        at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:159)
        at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:133)
        at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
        at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:133)
        at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
        at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:133)
        at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:842) (state=,code=0)

And my kyuubi-defaults.conf is

## Kyuubi Configurations

#
# kyuubi.authentication                    NONE
#
kyuubi.frontend.bind.host                10.133.151.189
kyuubi.frontend.protocols                THRIFT_BINARY,REST
#kyuubi.frontend.thrift.binary.bind.port  18009
kyuubi.frontend.rest.bind.port           18099
kyuubi.frontend.bind.port                18009
#
kyuubi.engine.type                       FLINK_SQL
kyuubi.engine.share.level                USER
# kyuubi.session.engine.initialize.timeout PT3M
#
# kyuubi.ha.addresses                      node2:2181,node6:2181,node8:2181
# kyuubi.ha.namespace                      kyuubi
#
# Monitoring
kyuubi.metrics.prometheus.port           18007

# Details in https://kyuubi.readthedocs.io/en/master/configuration/settings.html

# Spark engine
# spark.master                              yarn
# spark.yarn.queue                          default

# Flink engine
flink.execution.target                    yarn-session
yarn.application.id                       application_1709174207782_0177

kyuubi-env.sh

export JAVA_HOME=/usr/local/jdk-17.0.10
export SPARK_HOME=/opt/bigdata/spark
export FLINK_HOME=/opt/bigdata/flink
export HIVE_HOME=/opt/bigdata/hive
# export FLINK_HADOOP_CLASSPATH=/path/to/hadoop-client-runtime-3.3.2.jar:/path/to/hadoop-client-api-3.3.2.jar
# export FLINK_HADOOP_CLASSPATH=/opt/bigdata/hadoop/share/hadoop/client/hadoop-client-runtime-3.1.4.jar:/opt/bigdata/hadoop/share/hadoop/client/hadoop-client-api-3.1.4.jar
export FLINK_HADOOP_CLASSPATH=`hadoop classpath`
export HIVE_HADOOP_CLASSPATH=${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.2.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.1.4.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-api-3.1.4.jar:${HADOOP_HOME}/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar
export HADOOP_CONF_DIR=/opt/bigdata/hadoop/etc/hadoop
# export YARN_CONF_DIR=/usr/ndp/current/yarn/conf
# export KYUUBI_JAVA_OPTS="-Xmx10g -XX:MaxMetaspaceSize=512m -XX:MaxDirectMemorySize=1024m -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+UnlockDiagnosticVMOptions -XX:+UseCondCardMark -XX:+UseGCOverheadLimit -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -verbose:gc -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=20M"
# export KYUUBI_BEELINE_OPTS="-Xmx2g -XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+UseCondCardMark"

@pan3793 pan3793 closed this as completed in fe5377e Jun 6, 2024
pan3793 added a commit that referenced this issue Jun 6, 2024
# 🔍 Description

This is the root cause of #5957, which was accidentally introduced in b315123 and thus affects 1.8.0, 1.8.1, 1.8.2, 1.9.0, and 1.9.1.

`kyuubi-defaults.conf` is a server-side configuration file; all Kyuubi confs the engine requires should be passed via CLI args to the sub-process.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6455 from pan3793/flink-conf-load.

Closes #5957

2972fbc [Cheng Pan] Flink engine should not load kyuubi-defaults.conf

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit fe5377e)
Signed-off-by: Cheng Pan <chengpan@apache.org>
pan3793 added a commit that referenced this issue Jun 6, 2024
4 participants