Allow running MergeRollupTask for specific days and/or segments #14138

@saifat29

Currently, MergeRollupTask only moves forward in time: segments that have already been processed are not processed again, even after the watermark in ZooKeeper is modified.
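To make the behavior concrete, here is a toy model of what I am seeing. This is not Pinot's actual scheduling code; the `Segment` class, the `merged` flag, and `pick_segments` are all hypothetical names, and the only assumption taken from the observed behavior is that already-processed segments are skipped regardless of the watermark.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    day: int              # day index the segment's data falls in
    merged: bool = False  # hypothetical "already processed" marker


def pick_segments(segments, watermark_day):
    """Toy selection rule: on or after the watermark AND not yet merged."""
    return [s for s in segments if s.day >= watermark_day and not s.merged]


segments = [
    Segment("seg_a", day=1, merged=True),
    Segment("seg_b", day=2, merged=True),
    Segment("seg_c", day=3),
]

# Normal forward run: only the unprocessed day is eligible.
eligible_now = pick_segments(segments, watermark_day=3)

# Moving the watermark back to day 1 does not bring seg_a or seg_b back,
# because the merged flag still excludes them.
eligible_after_rewind = pick_segments(segments, watermark_day=1)
```

In this model, rewinding the watermark changes nothing, which matches what happens in practice: the earlier days stay as they were.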

A possible use case: when ingestion is not uniform, some days receive far more events than others. Rollup for a low-volume day might produce, say, 10 segments, while a high-volume day is rolled up into 1000 segments.
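A rough sketch of the arithmetic behind those numbers, assuming the number of output segments per day is driven by a fixed cap on records per segment. The cap value, the record counts, and the function name are made up for illustration.

```python
def segments_for_day(num_records: int, max_records_per_segment: int) -> int:
    """Ceiling division: how many segments are needed to hold num_records."""
    return (num_records + max_records_per_segment - 1) // max_records_per_segment


MAX_RECORDS_PER_SEGMENT = 1_000_000  # hypothetical cap, same for every day

quiet_day = segments_for_day(10_000_000, MAX_RECORDS_PER_SEGMENT)
busy_day = segments_for_day(1_000_000_000, MAX_RECORDS_PER_SEGMENT)

# Re-running only the busy day with a 10x larger cap would shrink it
# without touching the quiet day.
busy_day_rerun = segments_for_day(1_000_000_000, 10 * MAX_RECORDS_PER_SEGMENT)
```

With one cap for all days, the quiet day ends up with 10 segments and the busy day with 1000; a per-day re-run with a different cap would bring the busy day down to 100.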

Ideally, MergeRollupTask could be run again with a new configuration for just those affected days.

Druid has a really useful feature called Reindex that does this.

Labels: feature (New functionality), ingestion (Related to data ingestion pipeline)
