index_parallel not creating subtask for a specific interval  #11348

@eeren0

Affected Version

0.20.0

Description

To provide some background: we run a batch ingestion from datasource A into datasource A_tmp, while a kafka_indexer also ingests into A_tmp. We then run index_parallel again to copy A_tmp back into A.

However, we are seeing something odd: with a granularity of 'MONTH', index_parallel does not generate a subtask for one specific interval (2019-12-01 to 2020-01-01 in this case). We have the exact same setup in a different environment, and it does not show this issue.

The only notable difference in the index_parallel log is the line below, which appears only in the run that failed:
ParallelIndexPhaseRunner - There's no input split to process

In the other environment, with the same setup, a subtask for the interval 2019-12-01 to 2020-01-01 is generated and submitted.

We can see segments for 2019-12-01 to 2021-01-01 being generated in the A_tmp datasource.

No errors or exceptions were observed on the historicals or the coordinator.
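
One check that may help narrow this down is sketched below. If the copy task reads A_tmp through a segment-based input source (an assumption, since the task spec is not included here), then "no input split to process" would be consistent with no used segments overlapping that month at the time the task ran. The helper takes a list of segment interval strings, however they are obtained (for example from the Coordinator API or the metadata store), and reports which calendar months in a range are not covered by any of them. The function names and the sample intervals are hypothetical and only illustrate the check.

```python
from datetime import datetime


def parse_interval(interval):
    """Split an ISO 'start/end' interval string into two datetimes.

    Only the date part (first 10 characters) of each side is used, which is
    enough for a month-level check and avoids timezone suffix parsing.
    """
    start, end = interval.split("/")
    fmt = "%Y-%m-%d"
    return datetime.strptime(start[:10], fmt), datetime.strptime(end[:10], fmt)


def month_buckets(start, end):
    """Yield (bucket_start, bucket_end) for each calendar month touching [start, end)."""
    current = datetime(start.year, start.month, 1)
    while current < end:
        # Roll over to January of the next year after December.
        nxt = datetime(current.year + (current.month == 12), current.month % 12 + 1, 1)
        yield current, nxt
        current = nxt


def months_without_segments(segment_intervals, start, end):
    """Return the month buckets in [start, end) that no segment interval overlaps."""
    parsed = [parse_interval(i) for i in segment_intervals]
    missing = []
    for b_start, b_end in month_buckets(start, end):
        # Two half-open intervals overlap when each starts before the other ends.
        if not any(s < b_end and e > b_start for s, e in parsed):
            missing.append(f"{b_start:%Y-%m-%d}/{b_end:%Y-%m-%d}")
    return missing


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the affected cluster.
    sample = [
        "2019-11-01T00:00:00.000Z/2019-12-01T00:00:00.000Z",
        "2020-01-01T00:00:00.000Z/2020-02-01T00:00:00.000Z",
    ]
    print(months_without_segments(sample, datetime(2019, 11, 1), datetime(2020, 2, 1)))
```

Running the same check against the segment lists from both environments would show whether the 2019-12 bucket is actually missing from the input in the failing one, or whether the segments exist and the split is lost somewhere else.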
