Infinite Loop in ingest_v2 When Record Payload Exceeds batch_num_bytes #5240

@Stool233

Describe the bug

When attempting to use ingest_v2, I noticed that sometimes the CPU usage spikes, with one core being utilized at 100%. It appears to trigger some sort of infinite loop.

Upon debugging, I found that the issue occurs when a record payload slightly exceeds batch_num_bytes (I used the default configuration DEFAULT_BATCH_NUM_BYTES=1 MiB).

Specifically, this issue is in quickwit/quickwit-ingest/src/ingest_v2/fetch.rs:

    if mrecord_buffer.len() + payload.len() > mrecord_buffer.capacity() {
        has_drained_queue = false;
        break;
    }

Because the payload alone is larger than batch_num_bytes, this condition is true even when mrecord_buffer is empty. has_drained_queue is set to false, the break exits the record loop, and the program loops back to the start:

    if has_drained_queue && self.shard_status_rx.changed().await.is_err() {

Since has_drained_queue is false, this condition short-circuits without waiting on shard_status_rx, and the program proceeds to the subsequent code block, where it reaches the same check for the same record:

    if mrecord_buffer.len() + payload.len() > mrecord_buffer.capacity() {
        has_drained_queue = false;
        break;
    }

The oversized record is never consumed, so has_drained_queue is set to false on every iteration and the loop spins forever without making progress.
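The control flow described above can be modeled in a few lines. The following is only a simplified sketch, not the code in fetch.rs: records are plain byte vectors, `simulate_fetch` and `max_iterations` are hypothetical names, and the iteration cap exists only so that the model terminates. It assumes the buffer capacity equals batch_num_bytes and that a new, empty buffer is used for each batch.

```rust
/// Simplified model of the fetch loop (hypothetical, not the Quickwit code).
/// Returns `(records_delivered, loop_iterations)`. The real loop has no
/// `max_iterations` bound; it is here only so the model terminates.
fn simulate_fetch(
    records: &[Vec<u8>],
    batch_num_bytes: usize,
    max_iterations: usize,
) -> (usize, usize) {
    let mut next = 0; // index of the next record to fetch
    let mut has_drained_queue = false;
    let mut iterations = 0;

    while iterations < max_iterations {
        if has_drained_queue {
            // The real task would wait for a shard status change here.
            // The model simply stops once the queue is drained.
            break;
        }
        iterations += 1;
        has_drained_queue = true;

        // A fresh, empty buffer for each batch. `batch_num_bytes` is used
        // directly as the capacity limit.
        let mut buffer: Vec<u8> = Vec::with_capacity(batch_num_bytes);
        let mut num_in_batch = 0;

        for payload in &records[next..] {
            if buffer.len() + payload.len() > batch_num_bytes {
                // Same check as in the issue: the batch is "full".
                has_drained_queue = false;
                break;
            }
            buffer.extend_from_slice(payload);
            num_in_batch += 1;
        }
        // Only records that were copied into the batch are consumed.
        next += num_in_batch;
    }
    (next, iterations)
}
```

With two 4-byte records followed by one 10-byte record and a limit of 8 bytes, the model delivers the first two records and then spends every remaining iteration on the third one without consuming it, which matches the 100% CPU symptom.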

Steps to Reproduce (if applicable)

Ingest a record whose payload is greater than batch_num_bytes. This triggers the issue.

I wrote a test to reproduce this issue:
Stool233@33ea614
Here are the action run results:
https://github.com/Stool233/quickwit/actions/runs/10021273109/job/27699744997#step:10:2061

Expected behavior

When encountering a record payload greater than batch_num_bytes, ingest_v2 should either temporarily increase the size of batch_num_bytes to handle the record or reject the record, rather than entering an infinite loop.

I wrote a patch that temporarily increases the size of batch_num_bytes to handle such records, which works in my scenario:

If the community finds it appropriate, I can submit a related PR.
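For illustration, here is a minimal sketch of the two behaviors proposed under "Expected behavior". It is not the patch mentioned above and does not use Quickwit's types: `build_batch`, `OversizedPolicy`, and `BatchError` are hypothetical names, and records are modeled as plain byte vectors. The idea is to treat "batch is full" and "record can never fit" as separate cases by checking whether the buffer is still empty.

```rust
/// What to do with a record that does not fit even into an empty batch.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum OversizedPolicy {
    /// Let this one batch exceed `batch_num_bytes` to carry the record.
    GrowBatch,
    /// Return an error instead of looping.
    Reject,
}

#[derive(Debug, PartialEq)]
enum BatchError {
    RecordTooLarge { index: usize, num_bytes: usize },
}

/// Builds one batch from `records[start..]` (requires `start <= records.len()`).
/// Returns the batch bytes and the number of records consumed.
fn build_batch(
    records: &[Vec<u8>],
    start: usize,
    batch_num_bytes: usize,
    policy: OversizedPolicy,
) -> Result<(Vec<u8>, usize), BatchError> {
    let mut buffer: Vec<u8> = Vec::with_capacity(batch_num_bytes);
    let mut consumed = 0;

    for (offset, payload) in records[start..].iter().enumerate() {
        if buffer.len() + payload.len() > batch_num_bytes {
            if !buffer.is_empty() {
                // Ordinary case: the batch is full. Send it and let the
                // next call start at this record.
                break;
            }
            // The buffer is empty and the record still does not fit, so
            // retrying can never succeed. Handle it explicitly.
            match policy {
                OversizedPolicy::Reject => {
                    return Err(BatchError::RecordTooLarge {
                        index: start + offset,
                        num_bytes: payload.len(),
                    });
                }
                OversizedPolicy::GrowBatch => {
                    // The Vec grows past `batch_num_bytes` for this batch only.
                    buffer.extend_from_slice(payload);
                    consumed += 1;
                    break;
                }
            }
        }
        buffer.extend_from_slice(payload);
        consumed += 1;
    }
    Ok((buffer, consumed))
}
```

In both cases the caller always makes progress: it either consumes at least one record or receives an error it can act on. Whether growing the batch or rejecting the record is preferable depends on how downstream consumers treat batches larger than batch_num_bytes, which this sketch does not address.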

Configuration:

  1. Output of quickwit --version

    • Compiled from the latest main (8e6dc17)
  2. The index_config.yaml

    QW_ENABLE_INGEST_V2=true

    indexer:
      enable_cooperative_indexing: true
