Dispatch reduce tasks for all unmaterialized entries during start up #441

Closed
adzialocha opened this issue Jul 7, 2023 · 0 comments · Fixed by #623

Comments

Member

adzialocha commented Jul 7, 2023

Our materializer has two kinds of "events" which must be re-attempted when a node quits prematurely, to ensure we don't lose data:

  1. Re-attempt tasks
  2. Re-attempt unmaterialized operations

They seem related but are actually independent of each other: tasks do not necessarily represent arriving operations. Let's say an operation arrives for the first time and kicks off a reduce task, followed by a dependency task. Now the node is shut down before that dependency task finishes. On restart we send that operation again to re-attempt the flow, but the reduce task quits early, saying it already did its work last time. No dependency task is dispatched, and we have lost data.

The same is true the other way around: tasks come too late in some race conditions, where an operation was stored successfully but the node quit before the reduce task was created. We've lost data again.
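To make the first scenario concrete, here is a toy model of it. This is only a sketch and not the actual materializer: `ToyNode`, `receive`, `run_pending` and the idea that pending tasks live purely in memory are all assumptions made for illustration.

```python
class ToyNode:
    """Toy model: `reduced` survives a restart, `pending_tasks` does not."""

    def __init__(self, reduced=None):
        # Hypothetical persisted state: operations the reduce task has handled.
        self.reduced = set(reduced or ())
        # Hypothetical in-memory queue, lost when the node shuts down.
        self.pending_tasks = []
        self.dependencies_done = set()

    def receive(self, operation_id):
        # The reduce task quits early if it already did its work last time.
        if operation_id in self.reduced:
            return "reduce skipped"
        self.reduced.add(operation_id)
        # Only a reduce that did work dispatches the follow-up task.
        self.pending_tasks.append(("dependency", operation_id))
        return "reduce done"

    def run_pending(self):
        while self.pending_tasks:
            _, operation_id = self.pending_tasks.pop(0)
            self.dependencies_done.add(operation_id)


# First run: the operation arrives, reduce runs, a dependency task is queued.
first_run = ToyNode()
first_result = first_run.receive("op1")
# The node shuts down here, before run_pending() is ever called.

# Restart: persisted state is kept, the in-memory queue is gone.
restarted = ToyNode(reduced=first_run.reduced)
second_result = restarted.receive("op1")
restarted.run_pending()
```

After the restart, `second_result` is `"reduce skipped"` and `restarted.dependencies_done` stays empty, which is the lost dependency task described above.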

The first point (tasks) is already solved, but we also need to account for unmaterialized operations. This was not possible until now, since it wasn't easy to distinguish in our database whether an operation had been materialized or not. Now we have a sorted_index which represents that state, see: #438

On node startup we should check which operations have sorted_index = None and then issue reduce tasks for them.
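A minimal sketch of that startup check, using an in-memory SQLite database. The table layout, the function names and the choice to queue one reduce task per affected document are assumptions for illustration only; the real schema and task input may look different.

```python
import sqlite3
from collections import deque

# Hypothetical schema: the real table and column layout may differ.
SCHEMA = """
CREATE TABLE operations (
    operation_id TEXT PRIMARY KEY,
    document_id  TEXT NOT NULL,
    sorted_index INTEGER  -- NULL until the operation has been materialized
)
"""


def unmaterialized_documents(conn):
    """Return ids of documents with at least one operation lacking a sorted_index."""
    rows = conn.execute(
        "SELECT DISTINCT document_id FROM operations "
        "WHERE sorted_index IS NULL ORDER BY document_id"
    )
    return [row[0] for row in rows]


def dispatch_reduce_tasks_on_startup(conn, queue):
    """Queue one reduce task per document that still has unmaterialized operations."""
    for document_id in unmaterialized_documents(conn):
        queue.append(("reduce", document_id))
    return queue


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO operations VALUES (?, ?, ?)",
    [
        ("op1", "doc_a", 0),     # already materialized
        ("op2", "doc_a", None),  # stored, but never reduced
        ("op3", "doc_b", None),  # stored, but never reduced
        ("op4", "doc_c", 0),     # already materialized
    ],
)

tasks = dispatch_reduce_tasks_on_startup(conn, deque())
```

Here `tasks` ends up holding reduce tasks for `doc_a` and `doc_b` only; `doc_c` is skipped because all of its operations already have a `sorted_index`.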
