vLLM emits a state dict that maps fully qualified names (FQNs) to tensors.
This state dict does not contain the sharding details of the tensors to be loaded (for example, it does not include the start index of each shard).
What we do right now is manually calculate the sharding details in sharding::_calculate_tensor_shard().
This code is fragile: it recomputes the sharding outside vLLM's own sharder, using a manual routine.
Ideally, we should get the sharding info when we call model.state_dict() on the vLLM-loaded model.
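
To make concrete what kind of information is missing from the state dict, here is a minimal sketch of a manual shard calculation. It is not the actual body of sharding::_calculate_tensor_shard(); the names ShardInfo and even_shard_info are hypothetical, and it assumes the tensor is split evenly along a single dimension across ranks, which may not hold for every vLLM layer.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ShardInfo:
    """Hypothetical container: where a local shard sits inside the full tensor."""
    offsets: Tuple[int, ...]      # start index of the shard along each dimension
    local_shape: Tuple[int, ...]  # shape of the shard held by this rank


def even_shard_info(
    global_shape: Sequence[int],
    shard_dim: int,
    rank: int,
    world_size: int,
) -> ShardInfo:
    """Compute the shard offsets for an even split along one dimension.

    Assumption (not taken from vLLM): the tensor is split into `world_size`
    equal chunks along `shard_dim`, and rank r owns the r-th chunk.
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    if not 0 <= shard_dim < len(global_shape):
        raise ValueError(f"shard_dim {shard_dim} is out of range")
    dim_size = global_shape[shard_dim]
    if dim_size % world_size != 0:
        raise ValueError("this sketch only handles evenly divisible dimensions")

    chunk = dim_size // world_size

    # Only the sharded dimension has a non-zero start index.
    offsets = [0] * len(global_shape)
    offsets[shard_dim] = rank * chunk

    # The local shape matches the global shape except along the sharded dimension.
    local_shape = list(global_shape)
    local_shape[shard_dim] = chunk

    return ShardInfo(tuple(offsets), tuple(local_shape))


# Example: a (4096, 1024) weight split along dim 0 across 4 ranks, seen from rank 1.
info = even_shard_info((4096, 1024), shard_dim=0, rank=1, world_size=4)
print(info.offsets, info.local_shape)
```

Any routine like this has to guess the shard dimension and split policy per parameter, which is why having the state dict carry the offsets directly would be more robust.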