Closed
Labels
bug: Something isn't working
Description
Describe the bug
Kinesis source cannot be created or started on streams that have undergone resharding (split/merge operations). The source repeatedly fails with "channel closed" errors and enters an infinite restart loop.
Based on code investigation, the likely cause is:
- No differentiation between closed and active shards: When calling list_shards, Quickwit receives both closed (parent) shards and active (child) shards after resharding, but attempts to create consumers for all of them
- Attempting to read from closed shards: The source tries to get shard iterators and read data from shards that have ending_sequence_number set (indicating they are closed)
- Missing shard lineage tracking: Quickwit doesn't track parent-child relationships between shards, so it cannot properly transition from closed parent shards to their active children
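The distinction described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not Quickwit's actual code: `ShardInfo`, `open_shard_ids`, and `children_by_parent` are hypothetical names, and the struct keeps only the fields needed to tell closed shards from open ones and to follow parent-to-child links. `AdjacentParentShardId` (set on merges) is ignored here because it is null for every shard in this report.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a Kinesis shard (not Quickwit's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct ShardInfo {
    pub shard_id: String,
    pub parent_shard_id: Option<String>,
    /// `Some(..)` means the shard is closed and will receive no new records.
    pub ending_sequence_number: Option<String>,
}

impl ShardInfo {
    pub fn new(shard_id: &str, parent: Option<&str>, ending_seq: Option<&str>) -> Self {
        Self {
            shard_id: shard_id.to_string(),
            parent_shard_id: parent.map(str::to_string),
            ending_sequence_number: ending_seq.map(str::to_string),
        }
    }

    /// A shard is treated as closed when its ending sequence number is set.
    pub fn is_closed(&self) -> bool {
        self.ending_sequence_number.is_some()
    }
}

/// Returns the IDs of the shards that are still open, in input order.
pub fn open_shard_ids(shards: &[ShardInfo]) -> Vec<String> {
    shards
        .iter()
        .filter(|shard| !shard.is_closed())
        .map(|shard| shard.shard_id.clone())
        .collect()
}

/// Maps each parent shard ID to the IDs of its child shards, so that a
/// consumer could move on to the children once a closed parent is drained.
pub fn children_by_parent(shards: &[ShardInfo]) -> HashMap<String, Vec<String>> {
    let mut children: HashMap<String, Vec<String>> = HashMap::new();
    for shard in shards {
        if let Some(parent) = &shard.parent_shard_id {
            children
                .entry(parent.clone())
                .or_default()
                .push(shard.shard_id.clone());
        }
    }
    children
}
```

Applied to the shard list in this report, `open_shard_ids` yields shardId-000000000003 through shardId-000000000006, and `children_by_parent` links shardId-000000000000 to shards 1 and 2. Whether closed shards should be skipped entirely or drained before their children depends on retention and ordering requirements, which this sketch does not decide.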
Steps to reproduce (if applicable)
- Have a Kinesis stream that has undergone resharding (split or merge operations)
- Attempt to create a new Quickwit source for this stream
- Observe that the source starts failing with repeated restarts
Expected behavior
The Kinesis source should handle shard resharding gracefully and continue processing data from the new shard configuration.
Actual Behavior
- Cannot successfully create a working source on a resharded stream
- Source enters an infinite failure loop immediately after creation
- All KinesisShardConsumer actors terminate with Failure(channel closed) errors
- Source restarts approximately every minute but never succeeds
- Deleting and recreating the source does not resolve the issue
Configuration:
- Quickwit Version: qw-airmail-20250522-hotfix
- Shard configuration:
  - Stream has 7 shards (shardId-000000000000 to shardId-000000000006)
  - Stream previously underwent resharding (1 -> 2 -> 4)
Logs
Pattern repeats every ~60 seconds
2025-09-15T09:00:22.391Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-falling-mwVX exit_status=killed
2025-09-15T09:00:22.328Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-twilight-Q4N4 exit_status=killed
2025-09-15T09:00:22.321Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-ancient-ixPf exit_status=killed
2025-09-15T09:00:22.294Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-bitter-pE4p exit_status=killed
2025-09-15T08:59:22.476Z ERROR quickwit_actors::actor_context: exit activating-kill-switch actor=KinesisShardConsumer-floral-WDmb exit_status=Failure(channel closed)
2025-09-15T08:59:22.476Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-floral-WDmb exit_status=failure(cause=channel closed)
2025-09-15T08:59:22.278Z ERROR quickwit_actors::actor_context: exit activating-kill-switch actor=KinesisShardConsumer-little-4Tut exit_status=Failure(channel closed)
2025-09-15T08:59:22.278Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-little-4Tut exit_status=failure(cause=channel closed)
2025-09-15T08:59:22.276Z ERROR quickwit_actors::actor_context: exit activating-kill-switch actor=KinesisShardConsumer-holy-xeKo exit_status=Failure(channel closed)
2025-09-15T08:59:22.276Z INFO quickwit_actors::spawn_builder: actor-exit actor_id=KinesisShardConsumer-holy-xeKo exit_status=failure(cause=channel closed)
2025-09-15T08:59:22.238Z INFO quickwit_indexing::source::kinesis::kinesis_source: Starting Kinesis source. stream_name=earlbread-kinesis-test-stream assigned_shards=shardId-000000000000, shardId-000000000001, shardId-000000000002, shardId-000000000003, shardId-000000000004, shardId-000000000005, shardId-000000000006
Stream Info
{
"StreamDescriptionSummary": {
"StreamName": "earlbread-kinesis-test-stream",
"StreamARN": "arn:aws:kinesis:ap-northeast-2:314695318048:stream/earlbread-kinesis-test-stream",
"StreamStatus": "ACTIVE",
"StreamModeDetails": {
"StreamMode": "PROVISIONED"
},
"RetentionPeriodHours": 24,
"StreamCreationTimestamp": "2025-09-15T17:39:46+09:00",
"EnhancedMonitoring": [
{
"ShardLevelMetrics": []
}
],
"EncryptionType": "NONE",
"OpenShardCount": 4,
"ConsumerCount": 0
}
}
Shard List
{
"ShardId": "shardId-000000000000",
"Status": "CLOSED",
"ParentShardId": null,
"AdjacentParentShardId": null,
"StartingHashKey": "0",
"EndingHashKey": "340282366920938463463374607431768211455",
"StartingSequenceNumber": "49667106115779825810338104373766199978273704206997127170",
"EndingSequenceNumber": "49667106115790976182937369685335758911590473956837031938"
}
{
"ShardId": "shardId-000000000001",
"Status": "CLOSED",
"ParentShardId": "shardId-000000000000",
"AdjacentParentShardId": null,
"StartingHashKey": "0",
"EndingHashKey": "170141183460469231731687303715884105727",
"StartingSequenceNumber": "49667106428726183181318338918936934498348221487401402386",
"EndingSequenceNumber": "49667106428737333553917604230506493431664933512880848914"
}
{
"ShardId": "shardId-000000000002",
"Status": "CLOSED",
"ParentShardId": "shardId-000000000000",
"AdjacentParentShardId": null,
"StartingHashKey": "170141183460469231731687303715884105728",
"EndingHashKey": "340282366920938463463374607431768211455",
"StartingSequenceNumber": "49667106428748483926516869542078470216620869848907382818",
"EndingSequenceNumber": "49667106428759634299116134853648029149937581874386829346"
}
{
"ShardId": "shardId-000000000003",
"Status": "OPEN",
"ParentShardId": "shardId-000000000001",
"AdjacentParentShardId": null,
"StartingHashKey": "0",
"EndingHashKey": "85070591730234615865843651857942052863",
"StartingSequenceNumber": "49667106441972825829245529065009151152301350764574408754",
"EndingSequenceNumber": null
}
{
"ShardId": "shardId-000000000004",
"Status": "OPEN",
"ParentShardId": "shardId-000000000001",
"AdjacentParentShardId": null,
"StartingHashKey": "85070591730234615865843651857942052864",
"EndingHashKey": "170141183460469231731687303715884105727",
"StartingSequenceNumber": "49667106441995126574444059688150686870573999126080389186",
"EndingSequenceNumber": null
}
{
"ShardId": "shardId-000000000005",
"Status": "OPEN",
"ParentShardId": "shardId-000000000002",
"AdjacentParentShardId": null,
"StartingHashKey": "170141183460469231731687303715884105728",
"EndingHashKey": "255211775190703847597530955573826158591",
"StartingSequenceNumber": "49667106442017427319642590311292222588846647487586369618",
"EndingSequenceNumber": null
}
{
"ShardId": "shardId-000000000006",
"Status": "OPEN",
"ParentShardId": "shardId-000000000002",
"AdjacentParentShardId": null,
"StartingHashKey": "255211775190703847597530955573826158592",
"EndingHashKey": "340282366920938463463374607431768211455",
"StartingSequenceNumber": "49667106442039728064841120934433758307119295849092350050",
"EndingSequenceNumber": null
}
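As a sanity check on the shard list above, the hash-key ranges of the four OPEN shards should tile the full 128-bit key space with no gaps or overlaps, which is why new records can only land in those shards. A minimal Rust sketch of that check follows; `HashKeyRange`, `parse_range`, and `covers_key_space` are hypothetical helpers written for this illustration and are not part of Quickwit or the AWS SDK.

```rust
use std::num::ParseIntError;

/// Hypothetical hash-key range of a shard, inclusive on both ends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HashKeyRange {
    pub start: u128,
    pub end: u128,
}

/// Parses the decimal strings found in `StartingHashKey` / `EndingHashKey`.
pub fn parse_range(start: &str, end: &str) -> Result<HashKeyRange, ParseIntError> {
    Ok(HashKeyRange {
        start: start.parse()?,
        end: end.parse()?,
    })
}

/// Returns true if the ranges exactly tile [0, u128::MAX]:
/// no gap, no overlap, and the last range ends at the maximum key.
pub fn covers_key_space(ranges: &[HashKeyRange]) -> bool {
    let mut sorted = ranges.to_vec();
    sorted.sort_by_key(|range| range.start);

    // `next_start` is the key the next range must begin at;
    // `None` means the end of the key space has already been reached.
    let mut next_start: Option<u128> = Some(0);
    for range in &sorted {
        match next_start {
            Some(expected) if range.start == expected && range.end >= range.start => {
                // `checked_add` yields `None` when `end` is u128::MAX.
                next_start = range.end.checked_add(1);
            }
            _ => return false,
        }
    }
    next_start.is_none()
}
```

With the ranges of shardId-000000000003 through shardId-000000000006 the check passes, while passing all seven shards fails because the closed parents overlap their children.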