Send heartbeat with a blank log #151

Closed
Tracked by #253
ariesdevil opened this issue Jan 25, 2022 · 1 comment · Fixed by #483
ariesdevil commented Jan 25, 2022

We can send the heartbeat as a blank log, so that the heartbeat is combined with append-entries.

@github-actions

👋 Thanks for opening this issue!

Get help or engage by:

  • /help : to print help messages.
  • /assignme : to assign this issue to you.

@ariesdevil ariesdevil self-assigned this Jan 25, 2022
@drmingdrmer drmingdrmer mentioned this issue Mar 19, 2022
3 tasks
drmingdrmer added a commit to drmingdrmer/openraft that referenced this issue Jul 31, 2022
Heartbeat in standard raft is the way for a leader to assert that it is still alive.
- A leader sends a heartbeat at a regular interval.
- A follower that receives a heartbeat believes there is an active leader, so for a short period it rejects election requests (`send_vote`) from any node that cannot reach the leader.

Openraft heartbeat is a blank log

The standard heartbeat mechanism depends on clock time.
But raft, as a distributed consensus protocol, already has its own well-defined **pseudo time**.
The **pseudo time** in openraft is a tuple `(vote, last_log_id)`, compared in dictionary order.
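
The dictionary-order comparison can be sketched as follows. This is only an illustration: `Vote`, `LogId` and `PseudoTime` here are simplified, hypothetical types, not openraft's real definitions (which carry more fields and different ordering details). Rust's derived `Ord` compares struct fields in declaration order, which gives the dictionary order described above.

```rust
// Simplified, hypothetical stand-ins; NOT openraft's actual types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Vote {
    term: u64,
    node_id: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct LogId {
    term: u64,
    index: u64,
}

/// Pseudo time `(vote, last_log_id)`.
/// Derived `Ord` compares `vote` first, then `last_log_id`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct PseudoTime {
    vote: Vote,
    last_log_id: Option<LogId>,
}

/// Appending one blank log under the same vote only advances `last_log_id`.
fn after_blank_log(t: PseudoTime, log_id: LogId) -> PseudoTime {
    PseudoTime { last_log_id: Some(log_id), ..t }
}

fn main() {
    let vote = Vote { term: 2, node_id: 1 };
    let before = PseudoTime { vote, last_log_id: Some(LogId { term: 2, index: 5 }) };
    let after = after_blank_log(before, LogId { term: 2, index: 6 });

    // A blank log moves the pseudo time forward without touching the vote.
    assert!(after > before);

    // The vote is compared first: a greater vote wins regardless of the log.
    let newer_vote = PseudoTime { vote: Vote { term: 3, node_id: 0 }, last_log_id: None };
    assert!(newer_vote > after);
}
```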

Why it works

To refuse an election started by a node that has not received recent messages from the current leader,
the active leader just sends a **blank log** to increase the **pseudo time** on a quorum.

Because the leader must have the greatest **pseudo time**,
a follower that compares the **pseudo time** automatically refuses an election request from a node that cannot reach the leader.

Comparing the **pseudo time** is already done by `handle_vote_request()`,
so there is no need to add another timer for the active leader.
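
A minimal sketch of why the blank log is enough. It assumes, as in standard Raft, that a voter grants a vote only when the candidate's `last_log_id` is not behind its own. The `Follower` and `VoteRequest` types and the `grants()` method are hypothetical, and this is not the body of openraft's `handle_vote_request()`.

```rust
// Hypothetical, simplified model; NOT openraft's actual vote handling.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct LogId {
    term: u64,
    index: u64,
}

struct VoteRequest {
    term: u64,
    last_log_id: Option<LogId>,
}

struct Follower {
    term: u64,
    last_log_id: Option<LogId>,
}

impl Follower {
    /// Receiving a blank-log heartbeat only advances the last log id.
    fn append_blank(&mut self, log_id: LogId) {
        if Some(log_id) > self.last_log_id {
            self.last_log_id = Some(log_id);
        }
    }

    /// Assumed rule: grant only if the candidate is not behind in term or log.
    fn grants(&self, req: &VoteRequest) -> bool {
        req.term >= self.term && req.last_log_id >= self.last_log_id
    }
}

fn main() {
    let mut follower = Follower { term: 2, last_log_id: Some(LogId { term: 2, index: 5 }) };

    // A node cut off from the leader starts an election with a stale log.
    let isolated = VoteRequest { term: 3, last_log_id: Some(LogId { term: 2, index: 5 }) };

    // Without a recent heartbeat the follower would grant the vote.
    assert!(follower.grants(&isolated));

    // After the leader's blank log arrives, the same request is refused,
    // with no timer involved on the follower side.
    follower.append_blank(LogId { term: 2, index: 6 });
    assert!(!follower.grants(&isolated));
}
```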

Other changes:

- Feature: add API to switch timeout-based events:
  - `Raft::enable_tick()`: switch on/off election and heartbeat.
  - `Raft::enable_heartbeat()`: switch on/off heartbeat.
  - `Raft::enable_elect()`: switch on/off election.

  These methods make some test code easier to write.
  The corresponding `Config` entries are also added:
  `Config::enable_tick`
  `Config::enable_heartbeat`
  `Config::enable_elect`

- Refactor: remove Engine `Command::RejectElection`.
  Rejecting an election is now part of `handle_vote_req()`, since the
  blank-log heartbeat is introduced.

- Refactor: heartbeat is removed from `ReplicationCore`.
  Instead, heartbeat is emitted by `RaftCore`.

- Fix: when sending append-entries fails, do not clear the
  `need_to_replicate` flag.

- CI: add test with higher network delay.

- Doc: explain why a blank log is used as heartbeat.

- Fix: datafuselabs#151
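
How the three switches listed above relate can be modelled roughly as below. The `Switches` struct and its two methods are hypothetical stand-ins, not openraft's `Config` or `Raft` API. The only behaviour assumed is what the list states: the tick switch covers both election and heartbeat, while the other two each cover one.

```rust
/// Hypothetical stand-in for the three flags; NOT openraft's `Config`.
#[derive(Debug, Clone, Copy)]
struct Switches {
    enable_tick: bool,
    enable_heartbeat: bool,
    enable_elect: bool,
}

impl Switches {
    /// Heartbeat fires only if both the tick and the heartbeat switch are on.
    fn heartbeat_active(&self) -> bool {
        self.enable_tick && self.enable_heartbeat
    }

    /// Election fires only if both the tick and the election switch are on.
    fn election_active(&self) -> bool {
        self.enable_tick && self.enable_elect
    }
}

fn main() {
    let all_on = Switches { enable_tick: true, enable_heartbeat: true, enable_elect: true };
    assert!(all_on.heartbeat_active() && all_on.election_active());

    // A test that wants a quiet leader can turn off only the heartbeat.
    let no_heartbeat = Switches { enable_heartbeat: false, ..all_on };
    assert!(!no_heartbeat.heartbeat_active());
    assert!(no_heartbeat.election_active());

    // Turning off the tick disables both timeout-based events at once.
    let no_tick = Switches { enable_tick: false, ..all_on };
    assert!(!no_tick.heartbeat_active() && !no_tick.election_active());
}
```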
drmingdrmer added a commit that referenced this issue Aug 1, 2022
* Feature: use blank log for heartbeat

- Fix: #151