Timely worker synchronization for new dataflow creation #6

rjnn · 2019-03-13T19:19:57Z

All timely workers need to create new dataflows in the exact same order. This is arguably an implementation flaw: timely workers could instead have a namespace against which new dataflows register. But today, there is a single global counter from which channels are assigned for dataflows. Thus, every timely worker must create new dataflows in the same order.
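
To make the constraint concrete, here is a minimal sketch in plain Rust. It is not timely's actual allocator: `Worker`, `create_dataflow`, and `channels_of` are hypothetical names. The only thing taken from the description above is that channel identifiers come from a single counter, so the identifiers a dataflow receives depend on what was created before it.

```rust
/// Hypothetical stand-in for one timely worker's channel allocator.
struct Worker {
    /// The single counter from which all channel ids are drawn.
    next_channel: usize,
    /// Dataflow name paired with the channel ids it was assigned.
    assigned: Vec<(String, Vec<usize>)>,
}

impl Worker {
    fn new() -> Self {
        Worker { next_channel: 0, assigned: Vec::new() }
    }

    /// Create a dataflow that needs `channels` channels, taking the next
    /// ids from the counter.
    fn create_dataflow(&mut self, name: &str, channels: usize) {
        let ids: Vec<usize> = (self.next_channel..self.next_channel + channels).collect();
        self.next_channel += channels;
        self.assigned.push((name.to_string(), ids));
    }

    /// Look up the channel ids given to a named dataflow, if it exists.
    fn channels_of(&self, name: &str) -> Option<&[usize]> {
        self.assigned
            .iter()
            .find(|(n, _)| n.as_str() == name)
            .map(|(_, ids)| ids.as_slice())
    }
}
```

Under this model, if worker 0 creates `a` (two channels) and then `b` (one channel) while worker 1 creates `b` and then `a`, worker 0 gives `a` channels 0 and 1 but worker 1 gives it channels 1 and 2, so the two workers no longer agree on which channels belong to `a`.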

In order to facilitate this ordering, the interactive crate has a meta-dataflow in which workers insert their intents to create new dataflows. This meta-dataflow assigns an ordering, which workers use to coordinate.
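
One way such an ordering could be assigned is sketched below in plain Rust. This is an assumption for illustration, not the interactive crate's code: each intent is tagged with a time, the submitting worker's index, and a per-worker counter, and every worker sorts the same set of intents by that key. `Intent` and `agreed_order` are hypothetical names.

```rust
/// Hypothetical record of one worker's intent to create a dataflow.
/// The derived `Ord` compares fields in declaration order:
/// time, then worker index, then per-worker counter.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Intent {
    time: u64,
    worker: usize,
    counter: usize,
    command: String,
}

/// Given the same set of intents in any arrival order, return the
/// commands in one deterministic order.
fn agreed_order(mut intents: Vec<Intent>) -> Vec<String> {
    intents.sort();
    intents.into_iter().map(|intent| intent.command).collect()
}
```

Because the sort key is unique per intent, two workers that receive the same intents in different arrival orders still end up creating dataflows in the same sequence.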

One disadvantage of this hack is that dataflows are not persistent. If we want to be able to recover from crashes, we need to be able to see the history of dataflows we created (i.e., the views). Replacing the meta-dataflow framework with a consensus-replicated durable log would fulfill this goal while still satisfying the original desideratum of a consistent shared ordering.

We should rip out the meta-dataflow, and instead put all dataflow creation commands through a shared durable log (e.g., on top of ZooKeeper or etcd).

frankmcsherry · 2019-03-13T20:07:45Z

Btw, it is probably worth spec'ing out which parts of Materialize exist in which offerings. My sense is:

1. There is a very reasonable, non-durable, non-autoscaling, non-cluster-managed thing that each user could try out on their laptop / in a browser / etc., which is a simple binary download without GBs of Apache cruft, and which gets them up and running in minutes.
2. There is also a very reasonable durable, autoscaling, cluster-managed thing that wraps around this that is worth paying for, and that serious people would agree that they need, even though the above did not crash during their demo runs.

That's one hypothetical partitioning of value, but with something like that in mind we can more clearly determine which features need to go where. At the moment interactive is the closest thing to (1) above, but it isn't obvious (to me) that we want to add all of the features into it, both because they may make the initial experience more awkward, and because it may be "useful" to hold them back.

rjnn · 2019-03-13T20:10:22Z

Currently, @benesch is working on an intermediate API, 'metastore'. The intention is for metastore to be backed by ZooKeeper for (2). But it's equally possible to have a shim 'ZooKeeper' that's just a linked list, which satisfies (1) with no additional Apache cruft.

ZooKeeper is such an attractive option because, if the primary streaming data ingest layer is Kafka, then Kafka users typically have ZooKeeper running anyhow.
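
A rough sketch of what such an intermediate API with a swappable backing might look like follows. The trait, its methods, and `InMemoryMetastore` are all hypothetical and are not the actual metastore interface; the linked list simply mirrors the shim described above (a `Vec` would do equally well).

```rust
use std::collections::LinkedList;

/// Hypothetical interface: an append-only record of dataflow creation
/// commands. A ZooKeeper-backed implementation would sit behind the
/// same trait.
trait Metastore {
    /// Append a command and return its position in the log.
    fn append(&mut self, command: String) -> usize;
    /// Return every command at position `from` or later, in log order.
    fn read_from(&self, from: usize) -> Vec<String>;
}

/// In-process backing with no external service. Contents are lost when
/// the process exits, which matches the non-durable offering.
struct InMemoryMetastore {
    log: LinkedList<String>,
}

impl InMemoryMetastore {
    fn new() -> Self {
        InMemoryMetastore { log: LinkedList::new() }
    }
}

impl Metastore for InMemoryMetastore {
    fn append(&mut self, command: String) -> usize {
        self.log.push_back(command);
        self.log.len() - 1
    }

    fn read_from(&self, from: usize) -> Vec<String> {
        self.log.iter().skip(from).cloned().collect()
    }
}

/// Callers depend only on the trait, so the backing can be swapped
/// without touching them.
fn replay<M: Metastore>(store: &M) -> Vec<String> {
    store.read_from(0)
}
```

The point of the indirection is that code such as `replay` is written once against the trait, and the choice between an in-memory shim and a durable service is made where the store is constructed.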

benesch · 2019-04-01T14:52:54Z

Closing this out. I'm pretty happy with the end result, which uses ZooKeeper to store the metadata, but then the Chosen Node (worker node 0) pushes them through a timely sequencer for a consistent ordering.

benesch · 2019-04-01T14:54:13Z

For posterity: getting ZooKeeper to expose a consistent stream of events is literally impossible. It's not built for that. I think it might be possible with etcd, but even if it is, it's definitely awkward. If we want that, we'd need to bundle a Raft implementation, and that seems like serious overkill. The solution of using ZooKeeper for persistence and timely's sequencer for ordering actually seems like the best option.
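
The split between persistence and ordering can be illustrated with a small std-only sketch. It is an assumption-laden model, not the code that landed: a `HashMap` stands in for ZooKeeper (it stores entries but promises no order), `std::sync::mpsc` channels stand in for timely's sequencer, and `fan_out` is a hypothetical name. A single chosen node reads the store and sends each command, tagged with a sequence number, to every worker.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Worker 0 reads the persisted commands and fans them out to every
/// worker. Returns the sequence of commands each worker observed.
fn fan_out(persisted: &HashMap<String, String>, workers: usize) -> Vec<Vec<String>> {
    let mut senders = Vec::with_capacity(workers);
    let mut handles = Vec::with_capacity(workers);

    for _ in 0..workers {
        let (tx, rx) = mpsc::channel::<(usize, String)>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            let mut seen = Vec::new();
            // The loop ends once the sender has been dropped.
            for (seq, command) in rx {
                // Each worker checks that it sees no gaps or reordering.
                assert_eq!(seq, seen.len());
                seen.push(command);
            }
            seen
        }));
    }

    // The chosen node picks one order (here, whatever order the map
    // happens to iterate in) and sends that same order to everyone.
    for (seq, command) in persisted.values().enumerate() {
        for tx in &senders {
            tx.send((seq, command.clone())).expect("worker hung up");
        }
    }
    drop(senders);

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .collect()
}
```

The store never has to expose an ordered event stream in this arrangement. Whatever order the single sequencing point chooses is the order all workers see, which is the property dataflow creation needs.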

benesch closed this as completed Apr 1, 2019

benesch mentioned this issue Nov 21, 2019

rdkafka crashes when loaded on mac #1049

Closed

krishmanoh2 mentioned this issue Nov 2, 2020

OOM issue reported by user #4205

Closed

philip-stoev mentioned this issue May 24, 2021

sql+: apd using heap-allocated type #6784

Merged

mgree mentioned this issue May 2, 2023

MIR type checking #18666

Merged
