Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vsr: synchronize op and commit_max updates on_prepare #1576

Merged
merged 3 commits into from Feb 20, 2024

Conversation

matklad
Copy link
Member

@matklad matklad commented Feb 20, 2024

Before, op was updated first, and commit_max at the end, breaking

assert(self.commit_max >= self.op -| constants.pipeline_prepare_queue_max);

invariant. Broken invariant was observed by, eg, op_repair_min (see the first commit).

To fix this:

  • rearrange the code a bit, to separate logical update to commit_max from attempting to commit stuff
  • move logical update forward in on_prepare flow

This currently fails: when appending a prepare, we set op and commit_max
separately. Will fix in the next commit.

Seed: 1648503712295043012
@matklad matklad added the vopr VOPR Hub testing is enabled label Feb 20, 2024
});
self.commit_max = self.commit_min;
}
self.advance_commit_max(self.commit_min, @src().fn_name);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels something suspicious. I think it would be better to update commit_max before the primary starts committing, not after it is done. Though I guess for solo mode updating later is actually desirable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I like the "on primary, commit_min == commit_max invariant" that we have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, this feels extra tricky --- it seems like we could be advancing commit_max here even if we are no longer primary, if view change happens while we are committing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 if we allow primary's commit_max to run ahead of commit_min, that might increase latency tolerance. If the primary is slow to commit, clients could get replies from backups which commit ahead of the primary.

@@ -1393,14 +1393,11 @@ pub fn ReplicaType(
assert(message.header.op > self.op);
assert(message.header.op > self.commit_min);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a assert(message.header.op > message.header.commit); to emphasize that the self.advance_commit_max(message.header.commit, @src().fn_name); below is ok even though we won't have loaded message.header into our journal yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

advancing commit_max should be ok even if we don't have a header? We allow maybe(self.op < self.commit_max)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point!

@@ -309,7 +309,6 @@ pub fn ReplicaType(
/// * never decreases.
/// Invariants (status=normal primary):
/// * `replica.commit_max = replica.commit_min`.
/// * `replica.commit_max = replica.op - pipeline.queue.prepare_queue.count`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this not hold?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, forgot to mention that I was confused by this. This almost never holds? For example, in an idle cluster replica.commit_max = replica.op. Or am I misreading this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pipeline.queue.prepare_queue.count is the number of items currently in the pipeline, not the capacity. So in an idle cluster pipeline.queue.prepare_queue.count == 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed, I misread that as pipeline_prepare_queue_max 🤦

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, now advance_commit_max actually checks that invariant (had to move pipeline prepare popping to commit_op for that)

@@ -784,7 +783,8 @@ pub fn ReplicaType(

if (self.solo()) {
if (self.commit_min < self.op) {
self.commit_journal(self.op);
self.advance_commit_max(self.op, @src().fn_name);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure whether I like @src().fn_name. On the one hand, it is cryptic and too clever. On the other hand, we did have bugs where we moved code around but didn't update call-sites.

If we double-down here, we might pass the whole @src() in even

@matklad matklad force-pushed the matklad/advance-commit-max branch 2 times, most recently from 763793e to 4196714 Compare February 20, 2024 17:36
Before, `commit_journal` played dual role:

* it advanced commit_max
* it tried executing everything up to commit_max

In parciular, we had funny calls `commit_journal(self.commit_max)`,
which neutered the first role in a somewhat roundabout way.

Now, we have two functions:

* "sensitive" `advance_commit_max` to be called when you have learned
  something that justifies _logically_ committing more
* "safe" `commit_journal`, which can be called whenever, and just
  executes locally available prepares up to commit_max

There shouldn't be any behavior changes here, but in the next commit
I'll tweak the logic in `on_prepare` to advance commit_max _before_
appending the prepare, so as not to break op/commit_max invariant even
temporarily.
Before, op was updated first, and commit_max at the end, breaking

```
assert(self.commit_max >= self.op -| constants.pipeline_prepare_queue_max);
```

invariant

Seed: 1648503712295043012
@matklad matklad removed the vopr VOPR Hub testing is enabled label Feb 20, 2024
@matklad
Copy link
Member Author

matklad commented Feb 20, 2024

Apparently, it's better not to forec-push VOPR-labeled pull requests!

assert(prepare.message.header.op == self.commit_max);
assert(prepare.ok_quorum_received);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did like how "pop prepare" was immediately followed by "pop request -> push prepare to pipeline". But commit_op is already pretty gigantic 😓

@matklad matklad added this pull request to the merge queue Feb 20, 2024
Merged via the queue into main with commit c318c49 Feb 20, 2024
25 checks passed
@matklad matklad deleted the matklad/advance-commit-max branch February 20, 2024 18:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants