This repository has been archived by the owner on Sep 6, 2018. It is now read-only.

The restart node should only apply committed logs #69

Closed
xiang90 opened this issue Jul 10, 2013 · 4 comments

Comments

@xiang90
Contributor

xiang90 commented Jul 10, 2013

The restarted node can learn which entries are committed from the first heartbeat it receives from the current leader (but it is slow to replay all the logs after the server has been started).

Maybe we can keep the committed index and flush it to disk every few seconds?

@benbjohnson
Contributor

@ongardie Should a server wait until receiving an AppendEntries RPC (to receive the commitIndex) from the current leader before applying commands in the log? Or should the committed index be saved to disk?
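
For reference, the first option might look roughly like the sketch below: the node applies nothing until an AppendEntries RPC carries the leader's commit index, and then applies only up to that point. The `node` and `entry` types and the `onAppendEntries` method are hypothetical, and the sketch assumes the RPC's log consistency check has already passed.

```go
package main

// entry is a hypothetical log entry; log[i] holds the entry with index i+1.
type entry struct {
	Command string
}

// node is a hypothetical, stripped-down view of a server's state.
type node struct {
	log         []entry
	commitIndex uint64
	lastApplied uint64
	applied     []string // stands in for the state machine
}

// onAppendEntries advances commitIndex from the leader's value, capped at
// the last local log index, then applies every newly committed entry.
// commitIndex never moves backwards.
func (n *node) onAppendEntries(leaderCommit uint64) {
	last := uint64(len(n.log))
	if leaderCommit > n.commitIndex {
		if leaderCommit < last {
			n.commitIndex = leaderCommit
		} else {
			n.commitIndex = last
		}
	}
	for n.lastApplied < n.commitIndex {
		n.lastApplied++
		n.applied = append(n.applied, n.log[n.lastApplied-1].Command)
	}
}
```

Entries past `commitIndex` stay in the log unapplied until a later RPC confirms them.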

@ongardie

To make sure everyone's clear, persisting commitIndex isn't needed for safety, since a new leader can always figure this out again with help from a quorum, and then tells everyone else through AppendEntries RPCs. I assume the question is when/whether it's beneficial to persist the commitIndex to disk. You can certainly do it, and as xiangli points out, you can do so asynchronously or periodically. But here are a couple reasons why you may not want to:

  • It's extra code.
  • If you're booting a server, hopefully it's in the minority of your cluster that's not needed for availability.
  • If you're booting a server, it's probably already experienced significant downtime. If you delay applying log commands even by a few minutes, it wouldn't necessarily add significantly more downtime.
  • If you need to read the log from a magnetic disk, you're lucky if you can get 100MB/s. Applying the commands should be much faster (a few hundred MB/s probably), so Amdahl's law says you can't expect much gain by overlapping the two. A good SSD might change this, though.

So I guess the most super-optimized implementations would do this, but I probably wouldn't bother.
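
To put rough numbers on the Amdahl's law point: take a log of size $S$ MB, a read rate of 100 MB/s, and an apply rate of 300 MB/s (an assumed value for "a few hundred MB/s"). Running the two phases back to back versus perfectly overlapping them gives:

$$
T_{\text{sequential}} = \frac{S}{100} + \frac{S}{300} = \frac{4S}{300},
\qquad
T_{\text{overlapped}} = \max\!\left(\frac{S}{100}, \frac{S}{300}\right) = \frac{3S}{300},
\qquad
\frac{T_{\text{sequential}}}{T_{\text{overlapped}}} = \frac{4}{3} \approx 1.33
$$

Under those assumed rates, the best case is a startup about 25% shorter, because the disk read dominates either way.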

@benbjohnson
Contributor

@ongardie I understand that it's not needed for safety. I guess my biggest concern was that if the entire cluster goes down and then reboots, every node is waiting for an AppendEntries (AE) RPC. I suppose a node could wait for an election timeout before trying to replay the log. That way it wouldn't get delayed by replaying first if that's not needed.

@xiang90
Contributor Author

xiang90 commented Jul 26, 2013

fixed.

@xiang90 xiang90 closed this as completed Jul 26, 2013