
Learners do not appear in Membership.all_nodes #608

Closed
avantgardnerio opened this issue Dec 3, 2022 · 6 comments
@avantgardnerio

avantgardnerio commented Dec 3, 2022

Describe the bug

This may be my own misunderstanding, but I assumed (and infer from the docs in main) that Membership.all_nodes should include learners:

    /// A node-id key that is in `nodes` but is not in `configs` is a **learner**.

However, when I add a learner in my test, I see that it receives messages (it is added to the volatile state LeaderState.nodes), but it is not present in the leader's Membership state, and no EntryPayload::Membership is ever sent to the group.
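The quoted doc comment implies the learner set can be derived as the node IDs known to the cluster minus the union of all voter configs. A minimal self-contained sketch of that expectation (using my own stand-in types, not openraft's actual API):

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for openraft's `Membership` (hypothetical types, for illustration).
struct Membership {
    /// Joint-consensus voter configs.
    configs: Vec<BTreeSet<u64>>,
    /// Every node the cluster knows about, voters and learners alike.
    all_nodes: BTreeSet<u64>,
}

impl Membership {
    /// Per the doc comment: a node in `all_nodes` that appears in no config is a learner.
    fn learners(&self) -> BTreeSet<u64> {
        let voters: BTreeSet<u64> = self.configs.iter().flatten().copied().collect();
        self.all_nodes.difference(&voters).copied().collect()
    }
}

fn main() {
    // Mirrors the reported test setup: voters {1, 2, 3, 4}, learner 5.
    let m = Membership {
        configs: vec![[1, 2, 3, 4].into_iter().collect()],
        all_nodes: [1, 2, 3, 4, 5].into_iter().collect(),
    };
    let expected: BTreeSet<u64> = [5].into_iter().collect();
    assert_eq!(m.learners(), expected);
    println!("learners = {:?}", m.learners());
}
```

The bug report is that node 5 never shows up in `all_nodes` at all, so no such derivation is possible from the leader's metrics.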

To Reproduce
Steps to reproduce the behavior:

  1. Create a raft group
  2. Add a learner
  3. Wait for things to settle
  4. Print the leader state and the learner state
  5. Observe that the learner is in sync with the leader, but neither side lists the learner as a member

Expected behavior

The learner is shown in all_nodes.

Actual behavior

Test: shutting down learner=RaftMetrics { running_state: Ok(()), id: 5, state: Learner, current_term: 2, last_log_index: Some(3), last_applied: Some(LogId { term: 2, index: 3 }), current_leader: Some(4), membership_config: EffectiveMembership { log_id: LogId { term: 0, index: 0 }, membership: Membership { configs: [{1, 2, 3, 4}], all_nodes: {1, 2, 3, 4} } }, snapshot: None, leader_metrics: None }
Test: shutting down leader=RaftMetrics { running_state: Ok(()), id: 4, state: Leader, current_term: 2, last_log_index: Some(3), last_applied: Some(LogId { term: 2, index: 3 }), current_leader: Some(4), membership_config: EffectiveMembership { log_id: LogId { term: 0, index: 0 }, membership: Membership { configs: [{1, 2, 3, 4}], all_nodes: {1, 2, 3, 4} } }, snapshot: None, leader_metrics: Some(LeaderMetrics { replication: {1: ReplicationMetrics { matched: Some(LogId { term: 2, index: 3 }) }, 3: ReplicationMetrics { matched: Some(LogId { term: 2, index: 3 }) }, 5: ReplicationMetrics { matched: Some(LogId { term: 2, index: 3 }) }, 2: ReplicationMetrics { matched: Some(LogId { term: 2, index: 3 }) }} }) }

snip:

learner=RaftMetrics... membership: Membership { configs: [{1, 2, 3, 4}], all_nodes: {1, 2, 3, 4} } }
leader=RaftMetrics... membership: Membership { configs: [{1, 2, 3, 4}], all_nodes: {1, 2, 3, 4} } }

(the ID of the learner is 5 during this test)

Env (please complete the following information):

  • OS: ubuntu 22.04
  • CPU: lambda book something
  • Rust-toolchain: nightly
  • Openraft version 0.7.3

Additional context

N/A

@github-actions

github-actions bot commented Dec 3, 2022

👋 Thanks for opening this issue!

Get help or engage by:

  • /help : to print help messages.
  • /assignme : to assign this issue to you.

@drmingdrmer
Member

0.7.3 does not implement persistent learner storage in Membership.

If you need this feature, use the main branch instead. :(
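For reference, depending on the unreleased main branch can be done with a git dependency in Cargo.toml (the repository URL shown is an assumption about the upstream location at the time; adjust to your setup):

```toml
[dependencies]
openraft = { git = "https://github.com/datafuselabs/openraft", branch = "main" }
```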

@avantgardnerio
Author

Thanks for your response. I switched to the master branch, and I can confirm it works there!

I did want to ask about some possibly unrelated behavior: on the 0.7.x branch, I saw that Blank messages were sent as heartbeats, but I didn't think they were appended to the log or applied to the state machine. On master, I see lots of Blank traffic:

Node 1: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }, payload: Blank }
Node 2: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }, payload: Blank }
Node 3: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }, payload: Blank }
Node 5: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }, payload: Blank }
Node 4: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }
Node 4: sending AppendEntriesRequest to 1 AppendEntriesRequest { vote: Vote { term: 1, node_id: 4, committed: true }, prev_log_id: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }), entries: [Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }], leader_commit: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }) }
Node 4: sending AppendEntriesRequest to 2 AppendEntriesRequest { vote: Vote { term: 1, node_id: 4, committed: true }, prev_log_id: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }), entries: [Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }], leader_commit: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }) }
Node 4: sending AppendEntriesRequest to 3 AppendEntriesRequest { vote: Vote { term: 1, node_id: 4, committed: true }, prev_log_id: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }), entries: [Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }], leader_commit: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }) }
Node 4: sending AppendEntriesRequest to 5 AppendEntriesRequest { vote: Vote { term: 1, node_id: 4, committed: true }, prev_log_id: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }), entries: [Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }], leader_commit: Some(LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 69 }) }
Node 1: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }
Node 2: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }
Node 3: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }
Node 5: append_to_log log_idx=LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 } entry=Entry { log_id: LogId { leader_id: LeaderId { term: 1, node_id: 4 }, index: 70 }, payload: Blank }

Is that normal? It seems like heartbeats in the raft log would cause quite a bit of write amplification.

@avantgardnerio
Author

Follow-up confirmation: this filter used to work for detecting heartbeats:

impl RaftNetwork<ColDbRaftConfig> for GrpcNetworkConnection {
    async fn send_append_entries(
        &mut self,
        req: AppendEntriesRequest<ColDbRaftConfig>,
    ) -> Result<AppendEntriesResponse<Nid>, RPCError<Nid, UuidNode, AppendEntriesError<Nid>>> {
        // Heartbeats used to be AppendEntries RPCs carrying no entries,
        // so an empty `entries` list identified them.
        let is_heartbeat = req.entries.is_empty();
        if !is_heartbeat {
            println!(
                "Node {}: sending AppendEntriesRequest to {} {:?}",
                self.owner.node_id, self.target, req
            );
        }
        // ... forward the request over gRPC ...
    }
}

@drmingdrmer
Member

Yes, the heartbeat message has changed:

In the main branch, I am trying to implement heartbeats as messages that carry a blank log entry, instead of just empty messages; i.e., every heartbeat commits a blank log entry.

This simplifies the logic for granting a candidate's vote request, but it introduces unnecessary IO (flushing the blank log entries to disk).

See:
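Given that change, the `entries.is_empty()` filter from the earlier comment no longer matches, since heartbeats now arrive as batches of blank entries. A heartbeat-only batch could instead be detected by checking that every entry is blank; a minimal self-contained sketch (my own enum standing in for openraft's `EntryPayload`):

```rust
/// Simplified stand-in for openraft's `EntryPayload` (hypothetical, for illustration).
enum Payload {
    Blank,
    Normal(String),
}

/// With blank-log heartbeats, an AppendEntries batch that is empty or
/// contains only `Blank` entries is (heuristically) a heartbeat.
/// Note: `all` on an empty iterator is true, so this also covers the
/// old-style empty-entries heartbeat.
fn looks_like_heartbeat(entries: &[Payload]) -> bool {
    entries.iter().all(|p| matches!(p, Payload::Blank))
}

fn main() {
    assert!(looks_like_heartbeat(&[]));
    assert!(looks_like_heartbeat(&[Payload::Blank]));
    assert!(!looks_like_heartbeat(&[Payload::Normal("set x=1".into())]));
    println!("heartbeat detection ok");
}
```

This is only a logging heuristic: a blank entry still has to be appended and replicated like any other log entry, which is exactly the write amplification being asked about.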

@avantgardnerio
Author

Thanks for the clarification!
