Allow for configuring concurrency and batch size for all sinks in self-hosted #1088
Closed as not planned
Description
All sinks have the following tunable config settings:
- concurrency
- batch_size
- batch wait (timeout)
We specify defaults for all sinks here:
https://github.com/sequinstream/sequin/blob/main/lib/sequin/runtime/sink_pipeline.ex#L205-L237
Processors do the CPU-intensive work, like preparing JSON. Batchers collect one or more messages from processors and do the I/O work of writing to sink destinations.
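To illustrate how batch_size and batch wait interact in a batcher, here is a minimal Python sketch. It is not Sequin's implementation: `BatcherSettings` and `group_into_batches` are hypothetical names, the wait is assumed to be in milliseconds, and the timeout is simulated from arrival timestamps rather than driven by a real timer.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class BatcherSettings:
    batch_size: int  # flush once this many messages are buffered
    batch_wait: int  # flush once the batch has been open this long (ms, assumed unit)


def group_into_batches(
    events: Iterable[Tuple[int, str]], settings: BatcherSettings
) -> List[List[str]]:
    """Group (arrival_ms, message) pairs, given in arrival order, into batches."""
    batches: List[List[str]] = []
    current: List[str] = []
    opened_at = 0
    for arrival_ms, message in events:
        # A message arriving after the wait window closes the pending batch first.
        if current and arrival_ms - opened_at >= settings.batch_wait:
            batches.append(current)
            current = []
        if not current:
            opened_at = arrival_ms
        current.append(message)
        # Reaching batch_size flushes immediately, without waiting for the timeout.
        if len(current) >= settings.batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches
```

A larger batch_size means fewer writes to the destination, while a shorter batch wait bounds how long a message can sit in a partially filled batch.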
Sink modules can overwrite these settings, like this:
https://github.com/sequinstream/sequin/blob/main/lib/sequin/runtime/kafka_pipeline.ex#L30-L38
Note the hack there to make Kafka super concurrent!
What we should do instead:
- sink_consumers already has a batch_size column/setting
- we should add processor_concurrency and batcher_concurrency and batch_wait to sink_consumers
- we can set what we think are reasonable defaults for each one (like this)
- in self-hosted Sequin, all of these should be configurable. @acco can determine the copy, as well as where each field ought to live.
For cloud, we'll tackle separately. So, these fields should only be shown on self-hosted for now.
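A rough sketch of how the per-consumer fields could fall back to defaults, written in Python for illustration only. The class and function names are hypothetical, the default values are made up (they are not the ones in the linked files), and batch_wait is assumed to be in milliseconds.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PipelineSettings:
    processor_concurrency: int
    batcher_concurrency: int
    batch_size: int
    batch_wait: int  # milliseconds (assumed unit)


# Hypothetical defaults, for illustration only.
DEFAULTS = PipelineSettings(
    processor_concurrency=4, batcher_concurrency=2, batch_size=10, batch_wait=50
)


@dataclass(frozen=True)
class SinkConsumer:
    """Stand-in for a sink_consumers row; None means the user did not set the field."""

    batch_size: Optional[int] = None
    processor_concurrency: Optional[int] = None
    batcher_concurrency: Optional[int] = None
    batch_wait: Optional[int] = None


def resolve_settings(
    consumer: SinkConsumer, defaults: PipelineSettings = DEFAULTS
) -> PipelineSettings:
    """Use each value set on the consumer, falling back to the default otherwise."""
    overrides = {
        name: value for name, value in vars(consumer).items() if value is not None
    }
    for name, value in overrides.items():
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    return replace(defaults, **overrides)
```

With nullable columns, a sink module could still supply its own defaults (as the Kafka pipeline does today) by passing a different `defaults` value, and anything the user sets explicitly would take precedence.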