cmd: move unlock logic to the end of startNode to fix temporary bad block when node starts #1141

Merged


@yoomee1313 yoomee1313 commented Feb 3, 2022

Proposed changes

closes #1103

Types of changes

Please put an x in the boxes related to your change.

  • Bugfix
  • New feature or enhancement
  • Others

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING GUIDELINES doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes ($ make test)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules


@yoomee1313 yoomee1313 added this to the v1.8.0 milestone Feb 3, 2022
@yoomee1313 yoomee1313 self-assigned this Feb 3, 2022
@yoomee1313 yoomee1313 added this to In progress in Consensus via automation Feb 3, 2022
@@ -120,6 +110,17 @@ func startNode(ctx *cli.Context, stack *node.Node) {
	} else {
		startKlaytnAuxiliaryService(ctx, stack)
	}

Member

@aidan-kwon aidan-kwon Feb 9, 2022

You mean that the code above should finish before other code running in another goroutine executes, so you moved the following code to the bottom of the function, right? Can you explain in detail which functions running out of order cause the bad block issue?
Your change seems to mitigate the issue somewhat, but it cannot resolve it completely.
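
To illustrate the concern about ordering across goroutines, here is a minimal sketch of how an explicit readiness signal guarantees ordering, whereas call order alone does not. The `engine` type and its methods are hypothetical and are not taken from the Klaytn codebase; the sketch only assumes that some initialization runs in a separate goroutine.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical stand-in for a component that is
// initialized asynchronously. It is not a real Klaytn type.
type engine struct {
	ready  chan struct{} // closed once initialization has finished
	once   sync.Once
	config string
}

func newEngine() *engine {
	return &engine{ready: make(chan struct{})}
}

// start performs initialization in its own goroutine, similar in
// shape to a `go x.Start()` call. Only the first call has an effect.
func (e *engine) start(cfg string) {
	go func() {
		e.once.Do(func() {
			e.config = cfg
			close(e.ready) // signal that config is now safe to read
		})
	}()
}

// configAfterReady blocks until initialization is done, then returns
// the config. The channel close orders the write before this read.
func (e *engine) configAfterReady() string {
	<-e.ready
	return e.config
}

func main() {
	e := newEngine()
	e.start("istanbul")
	fmt.Println(e.configAfterReady())
}
```

Without the `ready` channel, a reader started right after `start` could observe an empty `config`, no matter where the call sits in the enclosing function.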

Member

I think this change is okay if fully resolving the issue would require a lot of engineering, but we need to figure out whether other race conditions can occur for the same reason.

Contributor Author

@aidan-kwon It seems that the node used an uninitialized istanbul chainConfig because sync starts before the istanbul chainConfig is set.

The following functions are called to set the istanbul chainConfig: startKlaytnAuxiliaryService -> cn.StartMining -> go s.miner.Start() -> self.worker.start() -> istanbul.Start -> sb.SetChain(chain)

In sb.SetChain(chain), chain is assigned to sb.chain, which is what makes chainConfig available. So I changed the code to call startKlaytnAuxiliaryService before the unlocking process.
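
A minimal sketch of the failure mode described above, assuming a backend whose configuration stays nil until `SetChain` is called. The `backend` and `chainConfig` types, the `VerifyBlock` method, and the epoch value are hypothetical stand-ins, not the actual Klaytn implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// chainConfig and backend are hypothetical stand-ins for the
// consensus engine state; they are not the real Klaytn types.
type chainConfig struct {
	epoch uint64
}

type backend struct {
	config *chainConfig // nil until SetChain runs
}

var errNoConfig = errors.New("chain config not set")

// SetChain models the step where the engine receives its chain
// configuration.
func (b *backend) SetChain(cfg *chainConfig) {
	b.config = cfg
}

// VerifyBlock models a sync-time check that depends on the
// configuration and fails if it runs before SetChain.
func (b *backend) VerifyBlock(number uint64) error {
	if b.config == nil {
		return errNoConfig
	}
	return nil
}

func main() {
	b := &backend{}

	// Sync starting first: the check runs against a nil config.
	fmt.Println("before SetChain:", b.VerifyBlock(1))

	// Configuration set first: the same check succeeds.
	b.SetChain(&chainConfig{epoch: 30}) // arbitrary illustrative value
	fmt.Println("after SetChain:", b.VerifyBlock(1))
}
```

In this model, a block checked before `SetChain` is rejected even though the block itself is fine, which would match a bad block that only appears temporarily at startup.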

Member

I don't follow the explanation. As far as I know, the moved code is not affected by chainConfig. Which code uses the uninitialized istanbul chainConfig?

@kjhman21 kjhman21 removed this from the v1.8.0 milestone Feb 10, 2022
Consensus automation moved this from In progress to Review in progress Feb 11, 2022
@yoomee1313 yoomee1313 merged commit 0b78111 into klaytn:dev Feb 11, 2022
Consensus automation moved this from Review in progress to Done Feb 11, 2022
@yoomee1313 yoomee1313 deleted the temporary-badblock-when-node-starts-fix branch February 11, 2022 02:41
Development

Successfully merging this pull request may close these issues.

A bad block occurs when downloader starts before the node's init job is done
4 participants