Skip to content

Fix a step regression bug in Tendermint::worker::new_blocks #1772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 11, 2019

Conversation

foriequal0
Copy link
Contributor

@foriequal0 foriequal0 commented Sep 11, 2019

We can observe the regression in the node's peer state update log.

Reconstructed timeline of the node from log of other nodes:
(Timestamp may not be monotonic due to the network recover)

  1. Being partitioned after sending vote for height: 468212, view: 0, step: Precommit.
  2. Recovered from partition around height: 468215, view: 0, step: Precommit
  3. Sends pending previous messages for { height: 468212, view: 0, step: Precommit }
Peer state update step: VoteStep { height: 468212, view: 0, step: Precommit }
Received RequestMessage for VoteStep { height: 468212, view: 0, step: Precommit }
  1. Sends body request.
Received body request from Global xxx.xxx.xxx.xxx:9486
  1. Receives 468213, 468214 (proposal)
Peer #Global xxx.xxx.xxx.xxx:9486 status update: total_score: 37299169980127389355583666307342289824362786282821, best_hash: 0xde5c…9196
  • Tendermint::worker::new_blocks([468213, 468214(proposal)], [])
    • height_changed = true;
      self.move_to_height(468213);
      self.save_last_confirmed_view(0);
    • self.on_imported_proposal(468214);
      • self.move_to_height(468214);
        self.move_to_step(Step::Prevote);
        Peer state update step: VoteStep { height: 468214, view: 0, step: Prevote } proposal None
        
        • self.request_messages_to_all({ 468214, 0, Prevote });
        Received RequestMessage for VoteStep { height: 468214, view: 0, step: Prevote }
        
        • self.generate_and_broadcast_message();
        VoteOn { step: VoteStep { height: 468214, view: 0, step: Prevote }, block_hash: None }
        
    • self.move_to_step(Step::Propose); } // this is a regression
      Peer state update step: VoteStep { height: 468214, view: 0, step: Propose } proposal None
      
      • network::Event::SetTimerStep { Step::Propose }
  1. on_timeout for Step::Propose
    • self.move_to_step(Step::Prevote); }
      Peer state update step: VoteStep { height: 468214, view: 0, step: Prevote } proposal Some
      
      • self.votes_received = BitSet::new();
      • self.request_messages_to_all({ 468214, 0, Prevote });
        Received RequestMessage for VoteStep { height: 468214, view: 0, step: Prevote }
        
      • self.generate_and_broadcast_message(); // this is a double vote
        VoteOn { step: VoteStep { height: 468214, view: 0, step: Prevote }, block_hash: Some }
        
Node id:
350381008       388247645       729999778       167729828
        760824946       913276364       744880070
Log timestamp (Seconds after 2019-09-06 01:03)          Log message
47.4196 45.2747 41.7924 41.7540 33.3642 23.2340 25.8503	Peer state update step: VoteStep { height: 468212, view: 0, step: Precommit }
                                        25.8494 26.0468	Received RequestMessage for VoteStep { height: 468212, view: 0, step: Precommit } ...
                                                26.0472	Received body request from Global xxx.xxx.xxx.xxx:9486

47.4215 45.2721 41.7926 41.7545 33.3635 26.2126 26.2135	Peer #Global xxx.xxx.xxx.xxx:9486 status update: total_score: 37299169980127389355583666307342289824362786282821, best_hash: 0xde5c…9196

47.4226         41.7933         33.3646                 Peer state update step: VoteStep { height: 468214, view: 0, step: Prevote } proposal None
                                        26.3621         Received RequestMessage for VoteStep { height: 468214, view: 0, step: Prevote }
                41.7965         33.3651 26.3634 26.4077 VoteOn { step: VoteStep { height: 468214, view: 0, step: Prevote }, block_hash: None }

47.4232	45.2751         41.7552	33.3653                 Peer state update step: VoteStep { height: 468214, view: 0, step: Propose } proposal None

        45.2762                                         Peer state update step: VoteStep { height: 468214, view: 0, step: Prevote } proposal Some
                                        29.2148         Received RequestMessage for VoteStep { height: 468214, view: 0, step: Prevote }
                41.7970         33.3658         29.2175 VoteOn { step: VoteStep { height: 468214, view: 0, step: Prevote }, block_hash: Some }
47.4241	45.2740         41.7564         29.7166         Peer #Global xxx.xxx.xxx.xxx:9486 status update: total_score: 37299169980127389355583666307342289824362786282821, best_hash: 0xde5c…9196

167729828.log
350381008.log
388247645.log
729999778.log
744880070.log
760824946.log
913276364.log

@foriequal0 foriequal0 added the bug Something isn't working label Sep 11, 2019
@foriequal0 foriequal0 requested a review from majecty September 11, 2019 05:17
@foriequal0 foriequal0 self-assigned this Sep 11, 2019
@@ -1456,7 +1456,7 @@ impl Worker {
self.save_last_confirmed_view(prev_block_view);
}
}
if height_changed {
if height_changed && (self.height, Step::Propose) > (self.height, self.step.to_step()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the self.height needed?
It seems the two expressions are the same.

  1. (self.height, Step::Propose) > (self.height, self.step.to_step())
  2. Step::Propose > self.step.to_step()

@foriequal0 foriequal0 force-pushed the fix/on-blocks branch 4 times, most recently from bc24e15 to ad32319 Compare September 11, 2019 07:49
A regression in the step can cause a node double-vote.
@majecty majecty merged commit 2b12d0b into CodeChain-io:master Sep 11, 2019
@foriequal0 foriequal0 deleted the fix/on-blocks branch September 26, 2019 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants