storage: use raft log size (in bytes) as metric for truncation #7438

Merged
merged 1 commit into from Jul 7, 2016

Conversation

Projects
None yet
4 participants
@d4l3k
Contributor

d4l3k commented Jun 23, 2016

The raft log currently truncates based of the number of entries (max 10000).
These entries can be of any size so it makes more sense to use the byte size of
the raft log as a metric for truncation.

If the raft log is greater than 64MB and there is a behind node, it will
truncate the log to the quorum committed index.

See #7065, #7201.

This PR has the changes that were discussed last Tuesday. I've also reenabled TestGetTruncatableIndexes since it seems to be fixed with these changes (verified with 4000 iterations using stress). I haven't removed the comments yet since I'm not entirely sure why.

This is a new pull request since I closed the old one while I was working on it and then force pushed which stops me from reopening it.


This change is Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jun 23, 2016

Collaborator

Reviewed 13 of 13 files at r1.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r1] (raw file):

// getTruncatableIndexes returns the total number of stale raft log entries that
// can be truncated and the oldest index that cannot be pruned. If estimate is

remove this change

We need a comment here that describes in some detail the raft log truncation strategy.


storage/raft_log_queue.go, line 100 [r1] (raw file):

  oldestIndex := getMaximumMatchedIndex(raftStatus)

  // Truncate raft logs to quorum index if the raft log is too big, or the

raftStatus.Commit is not the quorum index...


storage/raft_log_queue.go, line 160 [r1] (raw file):

}

// getMaximumMatchedIndex returns the index that has been committed by every

"the maximum index"


storage/replica.go, line 206 [r1] (raw file):

      // Last index persisted to the raft log (not necessarily committed).
      lastIndex uint64
      // The size of the persisted raft log.

"The size in bytes"


storage/replica_raftstorage.go, line 457 [r1] (raw file):

// of r.lastIndex and returns a new value. We do this rather than
// modifying r.lastIndex directly because this modification needs to
// be atomic with the commit of the batch.

needs an update


Comments from Reviewable

Collaborator

tamird commented Jun 23, 2016

Reviewed 13 of 13 files at r1.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r1] (raw file):

// getTruncatableIndexes returns the total number of stale raft log entries that
// can be truncated and the oldest index that cannot be pruned. If estimate is

remove this change

We need a comment here that describes in some detail the raft log truncation strategy.


storage/raft_log_queue.go, line 100 [r1] (raw file):

  oldestIndex := getMaximumMatchedIndex(raftStatus)

  // Truncate raft logs to quorum index if the raft log is too big, or the

raftStatus.Commit is not the quorum index...


storage/raft_log_queue.go, line 160 [r1] (raw file):

}

// getMaximumMatchedIndex returns the index that has been committed by every

"the maximum index"


storage/replica.go, line 206 [r1] (raw file):

      // Last index persisted to the raft log (not necessarily committed).
      lastIndex uint64
      // The size of the persisted raft log.

"The size in bytes"


storage/replica_raftstorage.go, line 457 [r1] (raw file):

// of r.lastIndex and returns a new value. We do this rather than
// modifying r.lastIndex directly because this modification needs to
// be atomic with the commit of the batch.

needs an update


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jun 23, 2016

Contributor

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

remove this change

We need a comment here that describes in some detail the raft log truncation strategy.

Done.

storage/raft_log_queue.go, line 100 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

raftStatus.Commit is not the quorum index...

How so? Isn't it the latest committed index that the current node knows about? Hence the quorum committed index?

storage/raft_log_queue.go, line 160 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"the maximum index"

Done.

storage/replica.go, line 206 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"The size in bytes"

Done.

storage/replica_raftstorage.go, line 457 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

needs an update

Done.

Comments from Reviewable

Contributor

d4l3k commented Jun 23, 2016

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

remove this change

We need a comment here that describes in some detail the raft log truncation strategy.

Done.

storage/raft_log_queue.go, line 100 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

raftStatus.Commit is not the quorum index...

How so? Isn't it the latest committed index that the current node knows about? Hence the quorum committed index?

storage/raft_log_queue.go, line 160 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"the maximum index"

Done.

storage/replica.go, line 206 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"The size in bytes"

Done.

storage/replica_raftstorage.go, line 457 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

needs an update

Done.

Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jun 23, 2016

Collaborator

Reviewed 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

Done.

This isn't detailed enough. An uninitiated reader will not understand what this really does from this description.

storage/raft_log_queue.go, line 101 [r2] (raw file):

  // Truncate raft logs to the quorum committed index if the raft log is too
  // big, or the oldest node is too far behind.

can you refine "too far behind"? probably "or there exists a node which can only be caught up by snapshot", but even that description is anemic. I think this comment should be reworked completely to make the consequences of this decision clear.


Comments from Reviewable

Collaborator

tamird commented Jun 23, 2016

Reviewed 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

Done.

This isn't detailed enough. An uninitiated reader will not understand what this really does from this description.

storage/raft_log_queue.go, line 101 [r2] (raw file):

  // Truncate raft logs to the quorum committed index if the raft log is too
  // big, or the oldest node is too far behind.

can you refine "too far behind"? probably "or there exists a node which can only be caught up by snapshot", but even that description is anemic. I think this comment should be reworked completely to make the consequences of this decision clear.


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jun 23, 2016

Contributor

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

This isn't detailed enough. An uninitiated reader will not understand what this really does from this description.

But then they won't have to suffer like I did!

Better?


storage/raft_log_queue.go, line 101 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can you refine "too far behind"? probably "or there exists a node which can only be caught up by snapshot", but even that description is anemic. I think this comment should be reworked completely to make the consequences of this decision clear.

Better?

Comments from Reviewable

Contributor

d4l3k commented Jun 23, 2016

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

This isn't detailed enough. An uninitiated reader will not understand what this really does from this description.

But then they won't have to suffer like I did!

Better?


storage/raft_log_queue.go, line 101 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can you refine "too far behind"? probably "or there exists a node which can only be caught up by snapshot", but even that description is anemic. I think this comment should be reworked completely to make the consequences of this decision clear.

Better?

Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jun 23, 2016

Collaborator

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

But then they won't have to suffer like I did!

Better?

Still find this hard to read, but OK.

storage/raft_log_queue.go, line 102 [r3] (raw file):

  // Always truncate the oldest raft log entry which has been committed by all replicas.
  oldestIndex := getMaximumMatchedIndex(raftStatus)

this is misnamed, right? it's the latest (and more accurately maximum) index, not the oldest. This misnaming also causes the below comment to be difficult to parse.


Comments from Reviewable

Collaborator

tamird commented Jun 23, 2016

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r1] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

But then they won't have to suffer like I did!

Better?

Still find this hard to read, but OK.

storage/raft_log_queue.go, line 102 [r3] (raw file):

  // Always truncate the oldest raft log entry which has been committed by all replicas.
  oldestIndex := getMaximumMatchedIndex(raftStatus)

this is misnamed, right? it's the latest (and more accurately maximum) index, not the oldest. This misnaming also causes the below comment to be difficult to parse.


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jun 23, 2016

Contributor

Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


storage/raft_log_queue.go, line 102 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this is misnamed, right? it's the latest (and more accurately maximum) index, not the oldest. This misnaming also causes the below comment to be difficult to parse.

It's the index that we can truncate to, which could be thought of as the "oldest index" we keep. Renamed and clarified the below comment.

Comments from Reviewable

Contributor

d4l3k commented Jun 23, 2016

Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


storage/raft_log_queue.go, line 102 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this is misnamed, right? it's the latest (and more accurately maximum) index, not the oldest. This misnaming also causes the below comment to be difficult to parse.

It's the index that we can truncate to, which could be thought of as the "oldest index" we keep. Renamed and clarified the below comment.

Comments from Reviewable

@tschottdorf

This comment has been minimized.

Show comment
Hide comment
@tschottdorf

tschottdorf Jun 24, 2016

Member

:lgtm:. Mind adding in the commit message that we keep track only in-memory, and that that's sufficient (can be the same text as requested in one of my comments below)


Reviewed 13 of 13 files at r1, 3 of 3 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, 1 of 1 files at r5.
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


storage/raft_log_queue.go, line 169 [r1] (raw file):

      }
  }
  return smallestMatch

Whatever you're returning here must never be > raftStatus.Commit. Pretty sure that's not violated (because that means something being in all logs, but not having been committed, and Raft would have updated the commit before it got here), but let's be paranoid and enforce smallestMatch <= Commit.


storage/raft_log_queue.go, line 73 [r3] (raw file):

via the raft log (due to a previous truncation),


storage/raft_log_queue_test.go, line 63 [r1] (raw file):

func TestGetTruncatableIndexes(t *testing.T) {
  defer leaktest.AfterTest(t)()
  //t.Skip("TODO(bram): #7056")

Remove.


storage/replica.go, line 206 [r1] (raw file):

      // Last index persisted to the raft log (not necessarily committed).
      lastIndex uint64
      // The size of the persisted raft log.

More words. Someone who wasn't in the room when the semantics were discussed will have no idea how this field behaves.


Comments from Reviewable

Member

tschottdorf commented Jun 24, 2016

:lgtm:. Mind adding in the commit message that we keep track only in-memory, and that that's sufficient (can be the same text as requested in one of my comments below)


Reviewed 13 of 13 files at r1, 3 of 3 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, 1 of 1 files at r5.
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


storage/raft_log_queue.go, line 169 [r1] (raw file):

      }
  }
  return smallestMatch

Whatever you're returning here must never be > raftStatus.Commit. Pretty sure that's not violated (because that means something being in all logs, but not having been committed, and Raft would have updated the commit before it got here), but let's be paranoid and enforce smallestMatch <= Commit.


storage/raft_log_queue.go, line 73 [r3] (raw file):

via the raft log (due to a previous truncation),


storage/raft_log_queue_test.go, line 63 [r1] (raw file):

func TestGetTruncatableIndexes(t *testing.T) {
  defer leaktest.AfterTest(t)()
  //t.Skip("TODO(bram): #7056")

Remove.


storage/replica.go, line 206 [r1] (raw file):

      // Last index persisted to the raft log (not necessarily committed).
      lastIndex uint64
      // The size of the persisted raft log.

More words. Someone who wasn't in the room when the semantics were discussed will have no idea how this field behaves.


Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jun 27, 2016

Collaborator

Reviewed 1 of 1 files at r4, 1 of 1 files at r5.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


Comments from Reviewable

Collaborator

tamird commented Jun 27, 2016

Reviewed 1 of 1 files at r4, 1 of 1 files at r5.
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jun 29, 2016

Contributor

Added.


Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 169 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Whatever you're returning here must never be > raftStatus.Commit. Pretty sure that's not violated (because that means something being in all logs, but not having been committed, and Raft would have updated the commit before it got here), but let's be paranoid and enforce smallestMatch <= Commit.

Done. Changed the upper bound from `math.MaxUint64` to `raftStatus.Commit`.

storage/raft_log_queue.go, line 73 [r3] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

via the raft log (due to a previous truncation),

Done.

storage/raft_log_queue_test.go, line 63 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Remove.

Done.

storage/replica.go, line 206 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

More words. Someone who wasn't in the room when the semantics were discussed will have no idea how this field behaves.

Done.

Comments from Reviewable

Contributor

d4l3k commented Jun 29, 2016

Added.


Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 169 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Whatever you're returning here must never be > raftStatus.Commit. Pretty sure that's not violated (because that means something being in all logs, but not having been committed, and Raft would have updated the commit before it got here), but let's be paranoid and enforce smallestMatch <= Commit.

Done. Changed the upper bound from `math.MaxUint64` to `raftStatus.Commit`.

storage/raft_log_queue.go, line 73 [r3] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

via the raft log (due to a previous truncation),

Done.

storage/raft_log_queue_test.go, line 63 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Remove.

Done.

storage/replica.go, line 206 [r1] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

More words. Someone who wasn't in the room when the semantics were discussed will have no idea how this field behaves.

Done.

Comments from Reviewable

@tschottdorf

This comment has been minimized.

Show comment
Hide comment
@tschottdorf

tschottdorf Jun 29, 2016

Member

Reviewed 11 of 11 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

Member

tschottdorf commented Jun 29, 2016

Reviewed 11 of 11 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jun 29, 2016

Collaborator

Reviewed 11 of 11 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

Collaborator

tamird commented Jun 29, 2016

Reviewed 11 of 11 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@bdarnell

This comment has been minimized.

Show comment
Hide comment
@bdarnell

bdarnell Jul 3, 2016

Member

:lgtm:


Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 105 [r6] (raw file):

  // Truncate raft logs to the quorum committed index if the raft log is too
  // big. If the most behind node's committed index is before the first raft log

This is fine for three-replica configurations, but is suboptimal for five replicas. In a five-replica configuration with one node far behind and one node slightly behind, truncating to the commit index will require the node that is only slightly behind to apply a snapshot too. I think we should aim to only cut off one node per truncation: iterate over raftStatus.Progress to find the smallest match index which is greater than firstIndex and use that for the truncation.

It's possible that we're still over the limit after doing this (if there are two far-behind nodes), but the next iteration of the queue will truncate the log further.


storage/replica.go, line 210 [r6] (raw file):

      // raftLogSize is the approximate size in bytes of the persisted raft log.
      // On server restart, this value is assumed to be zero to avoid costly scans
      // of the raft log. After the first raft log truncation it will be correct.

A single truncation won't necessarily make this correct (see the check for negative values in TruncateLog). Instead, it is correct when all the remaining entries have been written during the lifetime of this process.


storage/storagebase/state.proto, line 62 [r6] (raw file):

  uint64 num_pending = 3;
  util.hlc.Timestamp last_verification = 4 [(gogoproto.nullable) = false];
  int64 raft_log_size = 5;

Comment that this may be inaccurate on nodes which have recently restarted.


Comments from Reviewable

Member

bdarnell commented Jul 3, 2016

:lgtm:


Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 105 [r6] (raw file):

  // Truncate raft logs to the quorum committed index if the raft log is too
  // big. If the most behind node's committed index is before the first raft log

This is fine for three-replica configurations, but is suboptimal for five replicas. In a five-replica configuration with one node far behind and one node slightly behind, truncating to the commit index will require the node that is only slightly behind to apply a snapshot too. I think we should aim to only cut off one node per truncation: iterate over raftStatus.Progress to find the smallest match index which is greater than firstIndex and use that for the truncation.

It's possible that we're still over the limit after doing this (if there are two far-behind nodes), but the next iteration of the queue will truncate the log further.


storage/replica.go, line 210 [r6] (raw file):

      // raftLogSize is the approximate size in bytes of the persisted raft log.
      // On server restart, this value is assumed to be zero to avoid costly scans
      // of the raft log. After the first raft log truncation it will be correct.

A single truncation won't necessarily make this correct (see the check for negative values in TruncateLog). Instead, it is correct when all the remaining entries have been written during the lifetime of this process.


storage/storagebase/state.proto, line 62 [r6] (raw file):

  uint64 num_pending = 3;
  util.hlc.Timestamp last_verification = 4 [(gogoproto.nullable) = false];
  int64 raft_log_size = 5;

Comment that this may be inaccurate on nodes which have recently restarted.


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jul 5, 2016

Contributor

Review status: 0 of 14 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 105 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This is fine for three-replica configurations, but is suboptimal for five replicas. In a five-replica configuration with one node far behind and one node slightly behind, truncating to the commit index will require the node that is only slightly behind to apply a snapshot too. I think we should aim to only cut off one node per truncation: iterate over raftStatus.Progress to find the smallest match index which is greater than firstIndex and use that for the truncation.

It's possible that we're still over the limit after doing this (if there are two far-behind nodes), but the next iteration of the queue will truncate the log further.

Done.

storage/replica.go, line 210 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

A single truncation won't necessarily make this correct (see the check for negative values in TruncateLog). Instead, it is correct when all the remaining entries have been written during the lifetime of this process.

Done.

storage/storagebase/state.proto, line 62 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Comment that this may be inaccurate on nodes which have recently restarted.

Done.

Comments from Reviewable

Contributor

d4l3k commented Jul 5, 2016

Review status: 0 of 14 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 105 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This is fine for three-replica configurations, but is suboptimal for five replicas. In a five-replica configuration with one node far behind and one node slightly behind, truncating to the commit index will require the node that is only slightly behind to apply a snapshot too. I think we should aim to only cut off one node per truncation: iterate over raftStatus.Progress to find the smallest match index which is greater than firstIndex and use that for the truncation.

It's possible that we're still over the limit after doing this (if there are two far-behind nodes), but the next iteration of the queue will truncate the log further.

Done.

storage/replica.go, line 210 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

A single truncation won't necessarily make this correct (see the check for negative values in TruncateLog). Instead, it is correct when all the remaining entries have been written during the lifetime of this process.

Done.

storage/storagebase/state.proto, line 62 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Comment that this may be inaccurate on nodes which have recently restarted.

Done.

Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jul 5, 2016

Collaborator

Reviewed 13 of 14 files at r7.
Review status: 13 of 14 files reviewed at latest revision, 9 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r7] (raw file):

// getTruncatableIndexes returns the number of truncatable indexes and the
// oldest index that cannot be pruned for the replica.

s/pruned/truncated/ unless we use pruned elsewhere.


storage/raft_log_queue.go, line 99 [r7] (raw file):

}

// computeTruncatableIndex returns the oldest index that cannot be pruned. If

ditto


storage/raft_log_queue.go, line 106 [r7] (raw file):

// bytes value (typically 64MB). When there are no nodes behind, or we can't
// catch them up via the raft log (due to a previous truncation) this returns
// the quorum committed index.

no longer true


storage/raft_log_queue.go, line 110 [r7] (raw file):

  truncatableIndex := raftStatus.Commit
  behindIndexes := getBehindIndexes(raftStatus)
  for _, i := range behindIndexes {

i is typically used as a loop variable, can you name this the singular of the slice being iterated?


storage/raft_log_queue.go, line 120 [r7] (raw file):

      // quorum committed index. This allows for multiple nodes to be behind and not
      // give up on the more recently behind node.
      if raftLogSize > targetSize && i <= firstIndex {

why can't this be:

if raftLogSize <= targetSize {
  truncatableIndex = i
  break
}

storage/replica.go, line 210 [r6] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

Done.

pedantic nit: "This will be correct when all log entries predating this process have been truncated."

storage/storagebase/state.proto, line 64 [r7] (raw file):

  uint64 num_dropped = 5;
  // raft_log_size may be initially inaccurate after a server restart.
  // See storage.Replica.mu.raftLogSize

period


Comments from Reviewable

Collaborator

tamird commented Jul 5, 2016

Reviewed 13 of 14 files at r7.
Review status: 13 of 14 files reviewed at latest revision, 9 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r7] (raw file):

// getTruncatableIndexes returns the number of truncatable indexes and the
// oldest index that cannot be pruned for the replica.

s/pruned/truncated/ unless we use pruned elsewhere.


storage/raft_log_queue.go, line 99 [r7] (raw file):

}

// computeTruncatableIndex returns the oldest index that cannot be pruned. If

ditto


storage/raft_log_queue.go, line 106 [r7] (raw file):

// bytes value (typically 64MB). When there are no nodes behind, or we can't
// catch them up via the raft log (due to a previous truncation) this returns
// the quorum committed index.

no longer true


storage/raft_log_queue.go, line 110 [r7] (raw file):

  truncatableIndex := raftStatus.Commit
  behindIndexes := getBehindIndexes(raftStatus)
  for _, i := range behindIndexes {

i is typically used as a loop variable, can you name this the singular of the slice being iterated?


storage/raft_log_queue.go, line 120 [r7] (raw file):

      // quorum committed index. This allows for multiple nodes to be behind and not
      // give up on the more recently behind node.
      if raftLogSize > targetSize && i <= firstIndex {

why can't this be:

if raftLogSize <= targetSize {
  truncatableIndex = i
  break
}

storage/replica.go, line 210 [r6] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

Done.

pedantic nit: "This will be correct when all log entries predating this process have been truncated."

storage/storagebase/state.proto, line 64 [r7] (raw file):

  uint64 num_dropped = 5;
  // raft_log_size may be initially inaccurate after a server restart.
  // See storage.Replica.mu.raftLogSize

period


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jul 5, 2016

Contributor

Review status: 10 of 14 files reviewed at latest revision, 9 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/pruned/truncated/ unless we use pruned elsewhere.

Done.

storage/raft_log_queue.go, line 99 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

ditto

Done.

storage/raft_log_queue.go, line 106 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

no longer true

Done.

storage/raft_log_queue.go, line 110 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i is typically used as a loop variable, can you name this the singular of the slice being iterated?

Done.

storage/raft_log_queue.go, line 120 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why can't this be:

if raftLogSize <= targetSize {
  truncatableIndex = i
  break
}
I simplified the logic a bit.

When the raftLogSize <= targetSize we want to keep the log as is when it's already been truncated to the furthest behind node.

Otherwise, we want to truncate to the next behind node.


storage/replica.go, line 210 [r6] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

pedantic nit: "This will be correct when all log entries predating this process have been truncated."

Done.

storage/storagebase/state.proto, line 64 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

period

Done.

Comments from Reviewable

Contributor

d4l3k commented Jul 5, 2016

Review status: 10 of 14 files reviewed at latest revision, 9 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 67 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/pruned/truncated/ unless we use pruned elsewhere.

Done.

storage/raft_log_queue.go, line 99 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

ditto

Done.

storage/raft_log_queue.go, line 106 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

no longer true

Done.

storage/raft_log_queue.go, line 110 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i is typically used as a loop variable, can you name this the singular of the slice being iterated?

Done.

storage/raft_log_queue.go, line 120 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why can't this be:

if raftLogSize <= targetSize {
  truncatableIndex = i
  break
}
I simplified the logic a bit.

When the raftLogSize <= targetSize we want to keep the log as is when it's already been truncated to the furthest behind node.

Otherwise, we want to truncate to the next behind node.


storage/replica.go, line 210 [r6] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

pedantic nit: "This will be correct when all log entries predating this process have been truncated."

Done.

storage/storagebase/state.proto, line 64 [r7] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

period

Done.

Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jul 5, 2016

Collaborator

Reviewed 3 of 3 files at r8.
Review status: 13 of 14 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 112 [r8] (raw file):

  behindIndexes := getBehindIndexes(raftStatus)
  for _, behindIndex := range behindIndexes {
      // If the behind node's committed index is before the first raft log

this comment is now hard to read given the code below is testing for the inverse condition.


storage/raft_log_queue_test.go, line 32 [r8] (raw file):

)

// TestGetBehindIndexes verifies that the correc

missing the rest of the sentence


Comments from Reviewable

Collaborator

tamird commented Jul 5, 2016

Reviewed 3 of 3 files at r8.
Review status: 13 of 14 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


storage/raft_log_queue.go, line 112 [r8] (raw file):

  behindIndexes := getBehindIndexes(raftStatus)
  for _, behindIndex := range behindIndexes {
      // If the behind node's committed index is before the first raft log

this comment is now hard to read given the code below is testing for the inverse condition.


storage/raft_log_queue_test.go, line 32 [r8] (raw file):

)

// TestGetBehindIndexes verifies that the correc

missing the rest of the sentence


Comments from Reviewable

@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jul 6, 2016

Contributor

Review status: 11 of 14 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 112 [r8] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this comment is now hard to read given the code below is testing for the inverse condition.

Done.

storage/raft_log_queue_test.go, line 32 [r8] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

missing the rest of the sentence

Done.

Comments from Reviewable

Contributor

d4l3k commented Jul 6, 2016

Review status: 11 of 14 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


storage/raft_log_queue.go, line 112 [r8] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this comment is now hard to read given the code below is testing for the inverse condition.

Done.

storage/raft_log_queue_test.go, line 32 [r8] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

missing the rest of the sentence

Done.

Comments from Reviewable

@tamird

This comment has been minimized.

Show comment
Hide comment
@tamird

tamird Jul 6, 2016

Collaborator

Reviewed 3 of 3 files at r9.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


Comments from Reviewable

Collaborator

tamird commented Jul 6, 2016

Reviewed 3 of 3 files at r9.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


Comments from Reviewable

storage: use raft log size (in bytes) as metric for truncation
The raft log currently truncates based of the number of entries (max 10000).
These entries can be of any size so it makes more sense to use the byte size of
the raft log as a metric for truncation.

If the raft log is greater than 64MB and there is a behind node, it will
truncate the log to the quorum committed index.

The in-memory size is the approximate size in bytes of the persisted raft log.
On server restart, this value is assumed to be zero to avoid costly scans of the
raft log. After the first raft log truncation it will be correct.

See #7065.
@d4l3k

This comment has been minimized.

Show comment
Hide comment
@d4l3k

d4l3k Jul 7, 2016

Contributor

Since the LGTMs are stale. Everyone happy with me merging this?

Contributor

d4l3k commented Jul 7, 2016

Since the LGTMs are stale. Everyone happy with me merging this?

@bdarnell

This comment has been minimized.

Show comment
Hide comment
@bdarnell

bdarnell Jul 7, 2016

Member

:lgtm:


Reviewed 7 of 14 files at r7, 1 of 3 files at r8, 3 of 3 files at r9, 3 of 3 files at r10.
Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Comments from Reviewable

Member

bdarnell commented Jul 7, 2016

:lgtm:


Reviewed 7 of 14 files at r7, 1 of 3 files at r8, 3 of 3 files at r9, 3 of 3 files at r10.
Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.


Comments from Reviewable

@d4l3k d4l3k merged commit 6b1738f into cockroachdb:master Jul 7, 2016

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
code-review/reviewable Review complete: 1 of 0 LGTMs obtained (and 1 stale)
Details
licence/cla Contributor License Agreement is signed.
Details

@d4l3k d4l3k deleted the d4l3k:truncate-bytes branch Jul 7, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment