[sinks] Generate updated as_of in rehydrating client #14785
This is ready for review! I was not able to reliably reproduce the panic on |
I think this should work. But I had a concern about race conditions. Might be that we have to merge as is.
In any case, this was some nice sleuthing! 😊
// The controller has the dependency recorded in it's `exported_collections` so this
// should not change at least until the sink is started up (because the storage
// controller will not downgrade the source's since).
let from_since = from_read_handle.since();
Now that I look at this I'm a bit concerned about races. The following might happen:
1. controller holds a read handle for the implied since `t`
2. the actual since of the shard is `t - 10` because someone else is still holding a handle
3. we set the `as_of` to `t - 10` because of this
4. that third party released their hold, shard since advances to `t`
5. the sink, in `storaged`, tries to read and fails
I think 1. and 2. can't currently happen together, because we know that the controller initializes the implied frontier to the shard since. But we might change that in the future, maybe by accident.
It would be nicer if we can thread through the exact since that the controller is holding, but that might not be easily feasible. In that case we should probably merge as is. Tricky ... 🙈
I think the correctness here is slightly more embedded than you describe: one of the first things the storage controller does is install a dependency between the `from` source / table / etc. and this newly created sink. That will keep it from updating the read capability of the source.
So, while someone else can definitely downgrade their handle, the handle managed by the storage controller should keep the since for the collection itself from being downgraded.
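A minimal sketch of the race discussed in this thread, assuming a toy model in which a shard's since is simply the minimum over all outstanding read holds and timestamps are plain `u64` values. `Shard`, `can_read_at`, and `race_scenario` are hypothetical names for illustration only and are not the persist or storage controller API.

```rust
use std::collections::BTreeMap;

/// Toy model (hypothetical, not the persist API): a shard whose since is
/// the minimum timestamp held back by any outstanding reader.
struct Shard {
    holds: BTreeMap<&'static str, u64>,
}

impl Shard {
    /// The shard's since. With no holds at all, everything may be compacted,
    /// which this sketch models as `u64::MAX`.
    fn since(&self) -> u64 {
        self.holds.values().copied().min().unwrap_or(u64::MAX)
    }

    /// A read at `as_of` is only possible if `as_of` is not behind the since.
    fn can_read_at(&self, as_of: u64) -> bool {
        as_of >= self.since()
    }
}

/// Replays the sequence from the review comment and returns whether a read
/// succeeds for (as_of taken from the shard since, as_of taken from the
/// controller's own hold).
fn race_scenario() -> (bool, bool) {
    let t = 100;
    let mut shard = Shard { holds: BTreeMap::new() };
    shard.holds.insert("controller", t); // controller holds `t`
    shard.holds.insert("third_party", t - 10); // someone else holds `t - 10`

    // as_of derived from the shard's actual since (`t - 10`).
    let as_of_from_shard = shard.since();
    // as_of derived from the hold the controller itself owns (`t`).
    let as_of_from_controller = shard.holds["controller"];

    // The third party releases its hold; the since advances to `t`.
    shard.holds.remove("third_party");

    (
        shard.can_read_at(as_of_from_shard),
        shard.can_read_at(as_of_from_controller),
    )
}

fn main() {
    let (from_shard_ok, from_controller_ok) = race_scenario();
    println!("as_of from shard since readable: {from_shard_ok}");
    println!("as_of from controller hold readable: {from_controller_ok}");
}
```

In this model the read at the shard-derived `as_of` is rejected once the third party lets go, while an `as_of` taken from the controller's own hold stays readable for as long as the controller keeps that hold. That is the argument for threading through the exact since the controller is holding.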
@@ -183,6 +185,39 @@ where
.await;
}

for export in self.exports.values_mut() {
The rehydration logic is also used when the controller restarts? I'm asking because we removed the logic from there.
it is, that's correct!
Previously, we always used the same as_of when the storaged instance was rehydrated. This meant that the following sequence would result in a panic:
- A sink is created on a source `s` (which has as_of `t_source`) at `now() = t_sink > t_source`.
- `s` is compacted so that `t_source_1 > t_sink`.
- The sink is restarted at `t_sink` and tries to read from `s`.
We handled the case when the source was recreated on restart of `environmentd` by looking at the current since of the source in the storage controller. However, in this case, the storage controller doesn't instruct the client to restart; the rehydrating task does. So we simply move the logic down a layer.
Motivation
Fixes #14555
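As a rough illustration of the fix described above, the sketch below recomputes a sink's as_of at rehydration time instead of reusing the original one. It assumes totally ordered `u64` timestamps, so that advancing an as_of to a since is just `max`. `Export` and `rehydrated_as_of` are hypothetical names and do not reflect the actual types in this PR.

```rust
/// Hypothetical, reduced description of a sink export.
#[derive(Debug, Clone, PartialEq)]
struct Export {
    /// The as_of the sink currently starts reading from (initially `t_sink`).
    as_of: u64,
}

/// Pick the as_of to use when (re)starting a sink: never behind the source's
/// current since, and never behind the as_of the sink already had.
/// Assumption: timestamps are totally ordered, so the join is `max`.
fn rehydrated_as_of(original_as_of: u64, from_since: u64) -> u64 {
    original_as_of.max(from_since)
}

fn main() {
    let t_sink = 20;
    let mut export = Export { as_of: t_sink };

    // The source has been compacted in the meantime: t_source_1 > t_sink.
    let from_since = 35;

    // Old behaviour: reusing `export.as_of` here would read behind the since.
    // New behaviour: refresh the as_of every time the client rehydrates.
    export.as_of = rehydrated_as_of(export.as_of, from_since);
    println!("rehydrating sink with as_of = {}", export.as_of);
}
```

Because the refresh happens in the rehydrating task itself, it applies whether the restart was triggered by the storage controller or by the client reconnecting on its own.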
Checklist
This PR has adequate test coverage / QA involvement has been duly considered.
This PR evolves an existing `$T ⇔ Proto$T` mapping (possibly in a backwards-incompatible way) and therefore is tagged with a `T-protobuf` label.
This PR includes the following user-facing behavior changes: