Be adaptive to stream speed #32

shawncao · 2020-11-26T18:52:11Z

Today, for real-time streaming data sources such as Kafka, Nebula creates each data block from a start and end offset, seals the block once all records in that range have arrived, and then puts the sealed block into the pool of queryable blocks. Queries scan the blocks that are in the pool.
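To make the behavior above concrete, here is a minimal Python sketch of seal-on-full ingestion. It is a toy model, not Nebula's actual code: `Block`, `SealOnFullIngester`, `batch_size`, and `query_count` are hypothetical names, and it assumes a query sees nothing outside the sealed pool.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    start_offset: int
    end_offset: int  # exclusive; the block covers [start_offset, end_offset)
    records: list = field(default_factory=list)

    def is_full(self) -> bool:
        # All records in the offset range have arrived.
        return len(self.records) == self.end_offset - self.start_offset


class SealOnFullIngester:
    """Hypothetical model: a block becomes queryable only after it is sealed."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.pool = []  # sealed, queryable blocks
        self.current = Block(0, batch_size)

    def append(self, record) -> None:
        self.current.records.append(record)
        if self.current.is_full():
            # Seal the block and start the next offset range.
            self.pool.append(self.current)
            nxt = self.current.end_offset
            self.current = Block(nxt, nxt + self.batch_size)

    def query_count(self) -> int:
        # Assumption: queries scan only the sealed pool.
        return sum(len(b.records) for b in self.pool)
```

In this model, records sitting in `current` stay invisible to `query_count` until the block fills, which is where a slow stream pays its latency.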

For busy streams this may not be an issue, since the system places data blocks very often. For slow streams, however, it may take a few minutes to fill a sizable batch. Our current solution is to decrease the batch size for that stream, but this leaves more blocks to manage and does not adapt to stream speed, since some traffic speeds up and slows down over different periods.

This issue asks for an improvement that makes ingestion adaptive to stream speed while still maintaining an ideal block size. Options are:

keep latest block open for both query and append
make a copy of a progressive block until it's sealed.
open for other designs
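One way the first two options could fit together is sketched below in Python: the latest block stays open for appends, and each query scans the sealed blocks plus a point-in-time copy of the open block. This is only an illustration under assumptions, not a proposed Nebula implementation. `AdaptiveIngester`, `target_size`, and `scan` are hypothetical names, and a single lock stands in for whatever concurrency control the engine would actually use.

```python
import threading


class AdaptiveIngester:
    """Hypothetical sketch: the open block is queryable before it is sealed."""

    def __init__(self, target_size: int):
        self.target_size = target_size  # ideal block size, independent of stream speed
        self.sealed = []                # immutable sealed blocks
        self.open_block = []            # latest block, still accepting appends
        self._lock = threading.Lock()

    def append(self, record) -> None:
        with self._lock:
            self.open_block.append(record)
            if len(self.open_block) >= self.target_size:
                # Seal at the ideal size no matter how long it took to fill.
                self.sealed.append(tuple(self.open_block))
                self.open_block = []

    def scan(self) -> list:
        with self._lock:
            blocks = list(self.sealed)
            if self.open_block:
                # Copy the in-progress block so the query sees a stable snapshot.
                blocks.append(tuple(self.open_block))
        return [record for block in blocks for record in block]
```

With this shape, a slow stream is visible to queries as soon as records arrive, while block size stays at `target_size`. The cost is a copy of the open block per query, which a real design would likely want to bound or avoid.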

shawncao added the engine, enhancement (New feature or request), help wanted (Extra attention is needed), and perf (performance related issues) labels on Nov 29, 2020
