Blob read fails due to stale blob_id→location index when GC runs before log replay

SH test found a read_verify failed with "Invalid shard_id". 
The reason is that the blob_id→location index was stale: during crash recovery, GC moved the blob’s data to a new chunk before log replay completed, and the subsequent replay updated the index back to the old location. Reads then targeted the wrong block, finding a header with a different shard_id.

For example:
  - T1: shard=1, blob=100 committed at LSN 100.
  - T2: SM crashed with cp_lsn=90 and dc_lsn=110.
  - T3: On restart, GC manager started and run before replication svc start completed.
  - T4: GC moved the blob’s chunk (location changed, e.g., from chunk=40 to chunk=95).
  - T5: Log replay applied LSN 100 and reset the blob’s index to the old location (chunk=40), resulting in reads targeting stale blocks that no longer contain the correct shard’s data.

Relevant Logs:
  - Blob get failure (invalid shard_id):
  ```
[02/05/26 23:51:59.708] [storage_mgr] [debug] [402416] [hs_blob_manager.cpp:322:_get_blob] [traceID=8132044346085171088,shardID=0x48,pg=0,shard=0x48,blob=3099880] Blob Get request: pg=0, group=b9af911a-6c1b-4ff4-94fa-c10aa39f0d0e, shard=0x48, blob=3099880, offset=0, len=0
  [02/05/26 23:51:59.708] [storage_mgr] [debug] [402416] [hs_blob_manager.cpp:360:_get_blob_data] [traceID=8132044346085171088,shardID=0x48,pg=0,shard=0x48,blob=3099880] Reading from blkid=[{blk#=287643 count=2 chunk=40},] to buf=0x582e379e2000
  [02/05/26 23:51:59.718] [storage_mgr] [error] [18] [hs_blob_manager.cpp:708:do_verify_blob] Invalid shard_id in header: [header=magic=0x21fdffdba8d68fc6 version=2 shard=0x57 blob_size=4096 user_size=32 algo=1 hash=65 2a 49 ae ..., user_key=storage_hammer:nuobject2-sh-1007]
```

  - Blob put and commit to chunk 40:
 ```
 [02/02/26 09:18:47.222] [storage_mgr] [debug] [98] [hs_blob_manager.cpp:118:_put_blob] [traceID=15106702424960586969,shardID=0x48,pg=0,shard=0x48,blob=3099880] Blob Put request: pg=0, group=b9af911a-6c1b-4ff4-94fa-c10aa39f0d0e, shard=0x48, length=4096
  [02/02/26 09:18:47.222] [storage_mgr] [trace] [98] [hs_blob_manager.cpp:193:_put_blob] ... header=... shard=0x48 ...
  [02/02/26 09:18:47.222] [storage_mgr] [debug] [98] [hs_blob_manager.cpp:497:blob_put_get_blk_alloc_hints] ... Picked p_chunk_id=40 ...
  [02/02/26 09:18:49.269] [storage_mgr] [trace] [79] [hs_blob_manager.cpp:257:on_blob_put_commit] blob put commit lsn=4920918, pbas=[{blk#=287643 count=2 chunk=40},]
  [02/02/26 09:18:49.269] [storage_mgr] [debug] [79] ... Put blob success blkid=[{blk#=287643 count=2 chunk=40},]
```

  - Checkpoint flush during put:
 ` [02/02/26 09:18:48.923] [storage_mgr] [debug] [21] [raft_repl_dev.cpp:2456:cp_flush] ... cp flush ... lsn=4920910, clsn=3910000 ...`

  - Shard sealed:
  ```
[02/02/26 09:18:56.311] [storage_mgr] [debug] [99] [hs_shard_manager.cpp:256:_seal_shard] ... Seal shard request: is_open=true
  [02/02/26 09:18:56.313] [storage_mgr] [debug] [79] ... Shard sealed success, is_open=false
```

  - Restart order: GC starts before log replay completes:
  ```
[02/02/26 09:21:44.725] [storage_mgr] [info] [10] [raft_repl_dev.cpp:88:RaftReplDev] ... commited_lsn=4923664, checkpoint_lsn:4920910 ...
  [02/02/26 09:21:44.728] [storage_mgr] [info] [10] [hs_homeobject.cpp:358:operator()] Starting GC manager
  [02/02/26 09:21:44.728] [storage_mgr] [info] [10] [raft_repl_service.cpp:171:start] Starting LogStore service, fist_boot = false
```

  - GC moves data from chunk 40 to 95:
 ```
 [02/02/26 09:22:44.728] [storage_mgr] [debug] [69] [gc_manager.cpp:1206:process_gc_task] ... start process gc task for move_from_chunk=40 ...
  [02/02/26 09:24:50.015] [storage_mgr] [info] [69] [gc_manager.cpp:1307:process_gc_task] ... task for move_from_chunk=40 to move_to_chunk=95 ... completed!
```

  - Log replay re-applies old location (chunk=40):
```
  [02/02/26 09:28:04.506] [storage_mgr] [trace] [10] [raft_repl_dev.cpp:2583:on_log_found] ... Applying ... lsn=4920918 ...
  [02/02/26 09:28:04.506] [storage_mgr] [debug] [10] [raft_repl_dev.cpp:2638:on_log_found] Replay log on restart, ... local_blkid=[[{blk#=287643 count=2 chunk=40},] ]
  [02/02/26 09:28:04.506] [storage_mgr] [debug] [10] [raft_repl_dev.cpp:1631:handle_commit] ... Commit ... local_blkid=[[{blk#=287643 count=2 chunk=40},] ]
  [02/02/26 09:28:04.506] [storage_mgr] [trace] [10] [hs_blob_manager.cpp:222:local_add_blob_info] ... blob put commit ... pbas=[{blk#=287643 count=2 chunk=40},]
...
  [02/02/26 09:28:04.834] [storage_mgr] [info] [10] [raft_repl_service.cpp:173:start] Started LogStore service, log replay should already done till this point
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Blob read fails due to stale blob_id→location index when GC runs before log replay #388

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Blob read fails due to stale blob_id→location index when GC runs before log replay #388

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions