storage: replace remote proposal tracking with uncommitted log size protection #31408

Merged
craig bot merged 4 commits into cockroachdb:master on Oct 17, 2018

5 participants
@nvanbenschoten
Member

nvanbenschoten commented Oct 15, 2018

Closes #30064.

This change reverts most of the non-testing code from 03b116f and f2f3fd2
and replaces it with use of the MaxUncommittedEntriesSize config. This
configuration was added in etcd-io/etcd#10167 and provides protection against
unbounded Raft log growth when a Raft group stops being able to commit
entries. It makes proposals into Raft safer because proposers don't need
to verify before the fact that the proposal isn't a duplicate that might
be blowing up the size of the Raft group.

By default, the configuration is set to double the Replica's proposal quota.
The logic here is that the quotaPool should be responsible for throttling
proposals in all cases except for unbounded Raft re-proposals because it
queues efficiently instead of dropping proposals on the floor indiscriminately.

Release note (bug fix): Fix a bug where Raft proposals could get
stuck if forwarded to a leader who could not itself append a new
entry to its log.

This will be backported, but not to 2.1.0. The plan is to get it into 2.1.1.

@nvanbenschoten nvanbenschoten requested review from bdarnell and tschottdorf Oct 15, 2018

@nvanbenschoten nvanbenschoten requested review from cockroachdb/admin-ui-prs as code owners Oct 15, 2018

cockroach-teamcity commented Oct 15, 2018

This change is Reviewable

@bdarnell

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/storagepb/state.proto, line 93 at r3 (raw file):

  uint64 last_index = 2;
  uint64 num_pending = 3;
  uint64 num_remote_pending = 9;

Add reserved 9.

@tschottdorf

:lgtm:

Curious whether etcd-io/etcd#10063 will break something.

Reviewed 2 of 2 files at r1, 11 of 11 files at r2, 8 of 8 files at r3, 3 of 3 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/base/config.go, line 101 at r2 (raw file):

	// defaultRaftLogMaxSize specifies the upper bound that a single Range's
	// Raft log is limited to.
	defaultRaftLogMaxSize = envutil.EnvOrDefaultInt64(

Consider renaming this so that it's obvious that it's the threshold after which truncation is preferred over letting the Raft log grow further due to straggler followers. Perhaps RaftLogForceTruncationThreshold?


pkg/base/config.go, line 107 at r2 (raw file):

	// that a leader will send to followers in a single MsgApp.
	defaultRaftMaxSizePerMsg = envutil.EnvOrDefaultInt(
		"COCKROACH_RAFT_MAX_SIZE_PER_MSG", 16*1024)

/* 16 kb */


pkg/base/config.go, line 496 at r3 (raw file):

	if cfg.RaftProposalQuota == 0 {
		// By default, set this to a fraction of RaftLogMaxSize. See comment
		// above for the tradeoffs of setting this higher or lower.

"See comment above" -> "See the comment on the field."

@vilterp

LGTM for UI stuff

@nvanbenschoten

TFTRs!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/base/config.go, line 101 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Consider renaming this so that it's obvious that it's the threshold after which truncation is preferred over letting the Raft log grow further due to straggler followers. Perhaps RaftLogForceTruncationThreshold?

Done.


pkg/base/config.go, line 107 at r2 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

/* 16 kb */

Done.


pkg/base/config.go, line 496 at r3 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

"See comment above" -> "See the comment on the field."

Done.
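
The defaulting discussed in this thread, combined with the PR description's rule that the uncommitted-entries limit defaults to double the proposal quota, could be sketched roughly as follows. This is an illustration under assumptions, not the code in `pkg/base/config.go`: the struct `raftConfig`, the field name `RaftMaxUncommittedEntriesSize`, the 4 MiB log size, and the one-quarter fraction are all hypothetical choices made for the example.

```go
package main

import "fmt"

// raftConfig is a hypothetical stand-in for the Raft configuration struct.
type raftConfig struct {
	RaftLogMaxSize                int64  // upper bound on a Range's Raft log, in bytes
	RaftProposalQuota             int64  // bytes of in-flight proposals the quota pool allows
	RaftMaxUncommittedEntriesSize uint64 // limit on uncommitted log bytes (assumed field name)
}

// setDefaults fills in any zero-valued fields.
func (cfg *raftConfig) setDefaults() {
	if cfg.RaftLogMaxSize == 0 {
		cfg.RaftLogMaxSize = 4 << 20 // assumed default of 4 MiB
	}
	if cfg.RaftProposalQuota == 0 {
		// A fraction of RaftLogMaxSize; one quarter is an assumption here.
		cfg.RaftProposalQuota = cfg.RaftLogMaxSize / 4
	}
	if cfg.RaftMaxUncommittedEntriesSize == 0 {
		// Double the proposal quota, so that the quota pool, which queues
		// proposals, throttles first; the hard limit, which drops them,
		// only engages for unbounded re-proposals.
		cfg.RaftMaxUncommittedEntriesSize = 2 * uint64(cfg.RaftProposalQuota)
	}
}

func main() {
	var cfg raftConfig
	cfg.setDefaults()
	fmt.Println(cfg.RaftLogMaxSize, cfg.RaftProposalQuota, cfg.RaftMaxUncommittedEntriesSize)
}
```

Deriving each default from the previous one keeps the three limits ordered even when a test overrides only one of them.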


pkg/storage/storagepb/state.proto, line 93 at r3 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Add reserved 9.

Done.

nvanbenschoten added some commits Oct 14, 2018

vendor: Update etcd
Picks up etcd-io/etcd#10167. Future commits will use the new setting
to replace broken logic that prevented unbounded Raft log growth.

This also picks up etcd-io/etcd#10063.

Release note: None

storage: move Raft log configurations into base.RaftConfig
This centralizes all Raft configuration and makes it easier
to configure in tests.

Release note: None

storage: replace remote proposal tracking with uncommitted log size protection

This change reverts most of the non-testing code from 03b116f and f2f3fd2
and replaces it with use of the MaxUncommittedEntriesSize config. This
configuration was added in etcd-io/etcd#10167 and provides protection against
unbounded Raft log growth when a Raft group stops being able to commit
entries. It makes proposals into Raft safer because proposers don't need
to verify before the fact that the proposal isn't a duplicate that might
be blowing up the size of the Raft group.

By default, the configuration is set to double the Replica's proposal quota.
The logic here is that the quotaPool should be responsible for throttling
proposals in all cases except for unbounded Raft re-proposals because it
queues efficiently instead of dropping proposals on the floor indiscriminately.

Release note (bug fix): Fix a bug where Raft proposals could get
stuck if forwarded to a leader who could not itself append a new
entry to its log.

nvanbenschoten commented Oct 17, 2018

bors r+

craig bot pushed a commit that referenced this pull request Oct 17, 2018

Merge #31408
31408: storage: replace remote proposal tracking with uncommitted log size protection r=nvanbenschoten a=nvanbenschoten

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>

craig bot commented Oct 17, 2018

Build succeeded

@craig craig bot merged commit 0ffdb68 into cockroachdb:master Oct 17, 2018

3 checks passed

GitHub CI (Cockroach): TeamCity build finished
bors: Build succeeded
license/cla: Contributor License Agreement is signed.