Description
Discussed in #14289
Originally posted by harryruirui November 15, 2022
Roles assigned: 3 FEs [FE-A, FE-B, FE-C (master)] and N BEs.
Everything seemed fine during the BE upgrade, and the FEs were OK with the metadata from the former version.
After updating FE-A, when I connect directly to FE-A:
mysql> SHOW PROC '/frontends';
ERROR 1105 (HY000): TApplicationException, msg: Internal error processing forward
mysql> admin set frontend config("disable_tablet_scheduler" = "false");
ERROR 1105 (HY000): TApplicationException, msg: Internal error processing forward
mysql> admin set frontend config("disable_tablet_scheduler" = "false");
ERROR 1105 (HY000): TApplicationException, msg: Internal error processing forward
And the log on FE-A:
2022-11-15 15:55:54,412 INFO (replayer|82) [Catalog.replayJournal():2444] replayed journal id is 52750, replay to journal id is 52751
2022-11-15 15:55:55,596 INFO (replayer|82) [Catalog.replayJournal():2444] replayed journal id is 52751, replay to journal id is 52752
2022-11-15 15:55:59,423 INFO (replayer|82) [Catalog.replayJournal():2444] replayed journal id is 52752, replay to journal id is 52753
2022-11-15 15:56:03,717 INFO (doris-mysql-nio-pool-0|133) [MasterOpExecutor.forward():102] Forward statement 6 to Master TNetworkAddress(hostname:10.18.136.53, port:9020)
2022-11-15 15:56:03,718 WARN (doris-mysql-nio-pool-0|133) [StmtExecutor.execute():482] execute Exception. stmt[6, 1f19a3138ff840f6-8cda867f4c4ae1b4]
org.apache.thrift.TApplicationException: Internal error processing forward
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79) ~[spark-dpp-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
    at org.apache.doris.thrift.FrontendService$Client.recvForward(FrontendService.java:467) ~[spark-dpp-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
    at org.apache.doris.thrift.FrontendService$Client.forward(FrontendService.java:454) ~[spark-dpp-1.0-SNAPSHOT.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.MasterOpExecutor.forward(MasterOpExecutor.java:106) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.MasterOpExecutor.execute(MasterOpExecutor.java:61) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.forwardToMaster(StmtExecutor.java:537) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:359) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:322) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:216) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.dispatch(ConnectProcessor.java:353) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:543) ~[doris-fe.jar:1.0-SNAPSHOT]
    at org.apache.doris.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:50) ~[doris-fe.jar:1.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
2022-11-15 15:56:04,308 INFO (replayer|82) [Catalog.replayJournal():2444] replayed journal id is 52753, replay to journal id is 52754
2022-11-15 15:56:04,309 INFO (replayer|82) [DatabaseTransactionMgr.replayUpsertTransactionState():1646] replay a committed transaction TransactionState. transaction id: 6392, label: 3bcd5a2d5c114c74-945cd9144cfe1999, db id: 11004, table id list: 11005, callback id: 11523, coordinator: FE: 10.18.136.53, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1668498933724, commit time: 1668498964306, finish time: -1, reason: attactment: RLTaskTxnCommitAttachment [filteredRows=0, loadedRows=2757, unselectedRows=0, receivedBytes=852966, taskExecutionTimeMs=30581, taskId=null, jobId=0, progress=KafkaProgress [partitionIdToOffset=2_21951290859]]
2022-11-15 15:56:04,331 INFO (replayer|82) [Catalog.replayJournal():2444] replayed journal id is 52754, replay to journal id is 52755
When I connect to FE-B or FE-C, nothing is wrong.
After upgrading all FE nodes, everything recovered. I think there is a problem with the communication among FE followers running the 2 different versions.
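For anyone triaging a similar log, here is a minimal sketch that pairs each "Forward statement" line with a later "execute Exception" line for the same statement id and reports which master address the failed forward was sent to. The function name `find_failed_forwards` and both regular expressions are my own and hypothetical: they are written against the log lines quoted in this report, and other Doris versions may format these lines differently. Matching on the statement id alone is also an assumption; it may not be unique across connections.

```python
import re

# Patterns written against the FE log lines quoted in this report (assumption:
# other versions may word these messages differently).
FORWARD_RE = re.compile(
    r"Forward statement (\d+) to Master TNetworkAddress\(hostname:([^,]+), port:(\d+)\)"
)
FAILURE_RE = re.compile(r"execute Exception\. stmt\[(\d+),")


def find_failed_forwards(lines):
    """Return (stmt_id, master_host, master_port) for forwarded statements that later failed."""
    forwarded = {}  # stmt id -> (host, port) the statement was forwarded to
    failed = []
    for line in lines:
        m = FORWARD_RE.search(line)
        if m:
            forwarded[int(m.group(1))] = (m.group(2), int(m.group(3)))
            continue
        m = FAILURE_RE.search(line)
        if m:
            stmt_id = int(m.group(1))
            # Only report failures for statements we saw being forwarded.
            if stmt_id in forwarded:
                host, port = forwarded[stmt_id]
                failed.append((stmt_id, host, port))
    return failed


sample = [
    "2022-11-15 15:56:03,717 INFO (doris-mysql-nio-pool-0|133) [MasterOpExecutor.forward():102] "
    "Forward statement 6 to Master TNetworkAddress(hostname:10.18.136.53, port:9020)",
    "2022-11-15 15:56:03,718 WARN (doris-mysql-nio-pool-0|133) [StmtExecutor.execute():482] "
    "execute Exception. stmt[6, 1f19a3138ff840f6-8cda867f4c4ae1b4]",
]
print(find_failed_forwards(sample))
```

In this report the only failed forward went to 10.18.136.53:9020, so the log on that FE around 15:56:03 is probably the next place to look for the underlying exception.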