Ensure the DataNode can obtain the complete list of ConfigNodes #12220

Closed
liyuheng55555 wants to merge 3 commits into apache:master from liyuheng55555:Working/fix-cluster-restart-IT

liyuheng55555 (Collaborator) commented Mar 22, 2024

Recently, some initialization work was moved earlier, into the DataNode startup phase. This has slightly slowed DataNode startup and, in turn, lengthened the interval between DataNode registration and the start of its RPC service.

As a result, if a ConfigNode and a DataNode are started at the same time, the DataNode may fail to obtain the complete list of ConfigNodes.

The occurrence of this fault requires a specific sequence of events:

  • The DataNode's registration returns before the ConfigNode has completed its own registration, so the DataNode receives an incomplete list of ConfigNodes.
  • After the ConfigNode completes registration, it fails to send the complete list to the DataNode via RPC, because the DataNode's RPC service has not yet started.
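
The two steps above can be replayed in a small toy model. This is only a sketch of the race as described here, not IoTDB code: `StartupRaceSketch`, `ToyDataNode`, `replay`, and the node names `cn-0` and `cn-1` are all hypothetical, and the real registration and push paths are assumed to behave as the description says.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the startup race; every name here is hypothetical, not an IoTDB API. */
class StartupRaceSketch {

  /** Stand-in for a DataNode that only accepts pushes once its RPC service is up. */
  static class ToyDataNode {
    private final List<String> knownConfigNodes = new ArrayList<>();
    private boolean rpcStarted = false;

    /** Registration returns whichever ConfigNodes are registered at that moment. */
    void register(List<String> registeredConfigNodes) {
      knownConfigNodes.clear();
      knownConfigNodes.addAll(registeredConfigNodes);
    }

    void startRpc() {
      rpcStarted = true;
    }

    /** A push from the ConfigNode side; it is lost if RPC is not serving yet. */
    boolean receiveConfigNodeList(List<String> fullList) {
      if (!rpcStarted) {
        return false;
      }
      knownConfigNodes.clear();
      knownConfigNodes.addAll(fullList);
      return true;
    }

    List<String> knownConfigNodes() {
      return new ArrayList<>(knownConfigNodes);
    }
  }

  /** Replays the sequence; the flag controls whether RPC is up before the push arrives. */
  static List<String> replay(boolean rpcStartsBeforePush) {
    List<String> registered = new ArrayList<>();
    registered.add("cn-0");

    ToyDataNode dataNode = new ToyDataNode();
    dataNode.register(registered); // step 1: DataNode gets a partial list

    if (rpcStartsBeforePush) {
      dataNode.startRpc();
    }
    registered.add("cn-1"); // the second ConfigNode finishes registering
    dataNode.receiveConfigNodeList(registered); // step 2: push of the full list
    dataNode.startRpc(); // slow startup: RPC comes up only now (no-op if already up)

    return dataNode.knownConfigNodes();
  }
}
```

In this model, `replay(false)` leaves the DataNode knowing only `cn-0`, while `replay(true)` leaves it with both entries, which is why shortening the gap between registration and RPC startup avoids the fault.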

A temporary fix is to move the classLoader() call to some point after the RPC service has started.

For reference, the classLoader function:

  private void classLoader() {
    try {
      // StatementGenerator
      Class.forName(StatementGenerator.class.getName());
      Class.forName(ASTVisitor.class.getName());
      Class.forName(SqlLexer.class.getName());
      Class.forName(CommonTokenStream.class.getName());
      Class.forName(IoTDBSqlParser.class.getName());
      // SourceRewriter
      Class.forName(SourceRewriter.class.getName());
      Class.forName(DistributionPlanContext.class.getName());
      // LogicalPlaner
      Class.forName(LogicalPlanVisitor.class.getName());
      Class.forName(LogicalQueryPlan.class.getName());
      // TsFileProcessor
      Class.forName(TsFileProcessor.class.getName());
    } catch (ClassNotFoundException e) {
      logger.error("load class error: ", e);
    }
  }
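
The reordering proposed as a temporary fix could look roughly like the following. This is a minimal sketch under assumptions: `StartupOrderSketch` and its method names are hypothetical stand-ins for the real DataNode startup steps, and a JDK class is preloaded in place of the IoTDB classes listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical startup sequence; method names do not mirror the real DataNode code. */
class StartupOrderSketch {
  private final List<String> steps = new ArrayList<>();

  void register() {
    steps.add("register");
  }

  void startRpcService() {
    steps.add("startRpcService");
  }

  /** Warm-up in the style of classLoader(): Class.forName forces early class initialization. */
  void preloadClasses() {
    try {
      // A JDK class stands in for StatementGenerator, TsFileProcessor, and the others.
      Class.forName(java.util.concurrent.ConcurrentHashMap.class.getName());
    } catch (ClassNotFoundException e) {
      System.err.println("load class error: " + e);
    }
    steps.add("preloadClasses");
  }

  /** Ordering suggested by the temporary fix: preload only after RPC is serving. */
  List<String> start() {
    register();
    startRpcService();
    preloadClasses();
    return new ArrayList<>(steps);
  }
}
```

With this ordering the slow warm-up no longer sits between registration and RPC startup, so a ConfigNode push that arrives during preloading can still be received.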

liyuheng55555 force-pushed the Working/fix-cluster-restart-IT branch from 297ea5f to a5a1daa on March 22, 2024 10:09
HxpSerein (Collaborator) left a comment

LGTM

OneSizeFitsQuorum (Contributor) left a comment

Better to resolve it thoroughly; otherwise the code will look even uglier.
