-
Notifications
You must be signed in to change notification settings - Fork 459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: simplify upsert state management into read and write halves #18878
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(with my very limited knowledge) lgtm! And I will rebase my WIP PR after this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something seems off with this implementation. I would expect that we iterate over the pending commands and for every key we get the current value, if any, from state
. That would form the command_state
HashMap
.
Then we would run the upsert logic exactly as it was before but replace state
with command_state
.
Then, after we're done emitting data we would write all the changes of command_state
back to state
. The last part needs some care so that we preserve deletions.
b10c156
to
a542568
Compare
7f74438
to
0fa28c4
Compare
src/storage/src/render/upsert.rs
Outdated
let state_value = state.get(key); | ||
commands_state | ||
.entry(*key) | ||
.or_insert_with(|| state_value.cloned()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not move the whole state.get(key).cloned()
in the closure?
src/storage/src/render/upsert.rs
Outdated
match value { | ||
Some(value) => { | ||
if let Some(old_value) = state.insert(key, value.clone()) { | ||
if let Some(old_value) = command_state.take() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: command_state.replace(value.clone())
exists in Option
and does the same as the insert
on the hashmap!
src/storage/src/render/upsert.rs
Outdated
output_updates.push((old_value, ts, -1)); | ||
} | ||
*command_state = None; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is superfluous
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good modulo these last comments
0fa28c4
to
a058ac2
Compare
This pr is in service of simplifying: https://github.com/MaterializeInc/materialize/pull/18810/files. In the upsert case, we require a single hashmap lookup per-key, which the current upsert implementation does not do correctly. We also want to simplify the traitification of
upsert
, by making it so the trait interface can just be a multiget, and a multiput. In this pr, we do this by using a temporary (but re-used) hashmap to perform the upsert logic on.This pr will cause a known regression on feature benchmarking for upsert. This is considered unimportant, because upsert is already significantly faster than it has to be, we are replacing it with io, and we prefer a simpler implementation. Its slower because we:
Clone
an extraRow
(for the previous value)Motivation
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way) and therefore is tagged with aT-proto
label.