Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: send empty appends when replication is paused #14633

Merged
merged 7 commits into from
Nov 8, 2022

Conversation

pav-kv
Copy link
Contributor

@pav-kv pav-kv commented Oct 26, 2022

When Inflights to a particular node is full, i.e. MaxInflightMsgs for the append messages flow is saturated, it is still necessary to continue sending MsgApp to ensure progress. Currently this is achieved by "forgetting" the first in-flight message in the window, which frees up quota for one new MsgApp.

This new message is constructed in such a way that it potentially has multiple entries, or a large entry. The effect of this is that the in-flight limitations can be exceeded arbitrarily, for as long as the flow to this node continues being saturated. In particular, if a follower is stuck, the leader will keep sending entries to it.

This commit makes the MsgApp empty when Inflights is saturated, and prevents the described leakage of Entries to slow followers.

Signed-off-by: Pavel Kalinnikov pavel@cockroachlabs.com

@pav-kv
Copy link
Contributor Author

pav-kv commented Oct 26, 2022

  • Fix the tests in raft_flow_control_test.go
  • Make a regression test for the described unlimited leakage behaviour.

@codecov-commenter
Copy link

codecov-commenter commented Oct 26, 2022

Codecov Report

Merging #14633 (b525d02) into main (5073af6) will decrease coverage by 0.28%.
The diff coverage is 90.19%.

@@            Coverage Diff             @@
##             main   #14633      +/-   ##
==========================================
- Coverage   75.81%   75.53%   -0.29%     
==========================================
  Files         457      457              
  Lines       37292    37298       +6     
==========================================
- Hits        28274    28174     -100     
- Misses       7271     7348      +77     
- Partials     1747     1776      +29     
Flag Coverage Δ
all 75.53% <90.19%> (-0.29%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
raft/tracker/inflights.go 95.83% <ø> (-0.09%) ⬇️
raft/raft.go 89.93% <89.47%> (-0.33%) ⬇️
raft/tracker/progress.go 95.91% <89.47%> (-1.76%) ⬇️
raft/rafttest/interaction_env_handler_add_nodes.go 71.64% <90.90%> (+2.28%) ⬆️
raft/rafttest/interaction_env.go 100.00% <100.00%> (ø)
client/pkg/v3/testutil/leak.go 62.83% <0.00%> (-7.08%) ⬇️
client/v3/experimental/recipes/queue.go 58.62% <0.00%> (-6.90%) ⬇️
client/v3/leasing/util.go 91.66% <0.00%> (-6.67%) ⬇️
server/etcdserver/api/v3rpc/interceptor.go 77.60% <0.00%> (-5.21%) ⬇️
server/storage/mvcc/watchable_store.go 89.13% <0.00%> (-5.08%) ⬇️
... and 22 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@pav-kv pav-kv force-pushed the send_empty_append branch 2 times, most recently from 8bf78f9 to e00dba8 Compare October 28, 2022 18:29
@pav-kv
Copy link
Contributor Author

pav-kv commented Oct 28, 2022

@tbg This is now ready for review.

raft/tracker/progress.go Outdated Show resolved Hide resolved
Copy link
Contributor

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I'd mainly try to rearrange code and add more comments so that the intricacies of this dance are bit easier to grasp for future readers. See inline comments.

I also think it is worthwhile to add a TestInteraction test for this.

raft/raft.go Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
@pav-kv pav-kv force-pushed the send_empty_append branch 3 times, most recently from 7680b2a to cae22cd Compare October 31, 2022 23:08
Copy link
Contributor Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tbg Addressed most comments, notably added the data-driven test.

raft/raft.go Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
@pav-kv pav-kv force-pushed the send_empty_append branch 6 times, most recently from 1928d4d to 9d81638 Compare November 1, 2022 22:19
@pav-kv
Copy link
Contributor Author

pav-kv commented Nov 1, 2022

@ahrtr Please review or redirect to someone in etcd/raft folks. Commit-by-commit review is recommended. Only the 5-th commit raft: send empty appends when replication is paused changes the behaviour.

Overall commits structure:

  • First 2 commits only add and strengthen tests. In particular, the first commit captures the problematic behaviour that this PR solves.
  • Commits 3 and 4 contain drive-by small refactorings that don't change behaviour, but pave the way towards the fix.
  • Commit 5 is the fix. Notice how it modifies the behaviour, which is reflected in tests.
  • The last commit removes a no longer used method.

raft/testdata/replicate_pause.txt Show resolved Hide resolved
raft/raft.go Show resolved Hide resolved
raft/raft.go Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
@tbg
Copy link
Contributor

tbg commented Nov 3, 2022

Looks good to me.

@ahrtr mind taking a look? There is an E2E failure1; I find these impossible to decode and they often seem to be flaky, so not sure if it's something that we should just retry.

Footnotes

  1. https://github.com/etcd-io/etcd/actions/runs/3373310602/jobs/5598102673#step:5:1112

@pav-kv
Copy link
Contributor Author

pav-kv commented Nov 4, 2022

There is an E2E failure

Probably flaky. The last run is good.

raft/tracker/progress.go Outdated Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
raft/raft.go Show resolved Hide resolved
Copy link
Contributor Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed comments form @ahrtr

raft/raft.go Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
raft/tracker/progress.go Outdated Show resolved Hide resolved
@pav-kv pav-kv force-pushed the send_empty_append branch 3 times, most recently from 5817a3f to b525d02 Compare November 8, 2022 13:15
raft/tracker/progress.go Show resolved Hide resolved
Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me.

Thank you @pavelkalinnikov

@ahrtr
Copy link
Member

ahrtr commented Nov 8, 2022

We recently added some mix-versions test, and also removed some dependencies, so please rebase this PR. Once all pipelines are green, then it's OK to merge this PR.

This commit adds a data-driven test which simulates conditions under which Raft
messages flow to a particular node is throttled while in StateReplicate. The
test demonstrates that MsgApp messages with non-empty Entries may "leak" to a
paused stream every time there is successful heartbeat exchange.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
- avoid large indented blocks, leave the main block unindented
- declare pb.Message inlined in the sending call

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Previously, Progress update on MsgApp send was scattered across raft.go and
tracker/progress.go. This commit better encapsulates this logic in the Progress
type.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
When Inflights to a particular node is full, i.e. MaxInflightMsgs for the
append messages flow is saturated, it is still necessary to continue sending
MsgApp to ensure progress. Currently this is achieved by "forgetting" the first
in-flight message in the window, which frees up quota for one new MsgApp.

This new message is constructed in such a way that it potentially has multiple
entries, or a large entry. The effect of this is that the in-flight limitations
can be exceeded arbitrarily, for as long as the flow to this node continues
being saturated. In particular, if a follower is stuck, the leader will keep
sending entries to it.

This commit makes the MsgApp empty when Inflights is saturated, and prevents
the described leakage of Entries to slow followers.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Make the field name and comment clearer on the fact that it's used both in
StateProbe and StateReplicate. The old name ProbeSent was slightly confusing,
and also triggered thinking that it's used only in StateProbe.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
@pav-kv
Copy link
Contributor Author

pav-kv commented Nov 8, 2022

We recently added some mix-versions test, and also removed some dependencies, so please rebase this PR. Once all pipelines are green, then it's OK to merge this PR.

@ahrtr Rebased. Could you trigger CI once again and/or merge?

@ahrtr
Copy link
Member

ahrtr commented Nov 8, 2022

We recently added some mix-versions test, and also removed some dependencies, so please rebase this PR. Once all pipelines are green, then it's OK to merge this PR.

@ahrtr Rebased. Could you trigger CI once again and/or merge?

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants