When I use `tensor_parallel` to run model inference on two GPUs (the model is served by Flask with pywsgi), I also use gunicorn to manage the Flask app so that it can accept many requests and run inference concurrently. This works fine on a single GPU, but as soon as I use the `tensor_parallel` library, I get an error: `tensor_parallel/cross_device_ops.py", line 78, in forward inputs = tuple(map(torch.Tensor.contiguous, inputs)) TypeError: descriptor 'contiguous' for 'torch._C._TensorBase' objects doesn't apply to a 'NoneType' object`. It also raises this exception: `TypeError: Caught TypeError in replica 0 on device 0.` What should I do? Could you analyze the problem?
@zoubaihan sadly, tensor_parallel models can't be used concurrently because otherwise the communications break.
For example, the code @linfan provided shows how this happens: two concurrent calls reach the same round of communication, both trying to broadcast the first part of some tensor. They overwrite each other and leave the second part as None, yet both consider the broadcast complete because two parts were communicated in total.
For your specific case, I'd recommend finding a library that batches requests, which also uses the GPUs more efficiently. The forward calls themselves must not run concurrently.
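To illustrate the batching suggestion, here is a minimal sketch using only the Python standard library. It is not part of `tensor_parallel`: `BatchingWorker`, `model_fn`, `max_batch_size` and `max_wait_s` are hypothetical names chosen for this example, and `model_fn` is assumed to be any callable that takes a list of inputs and returns a list of outputs in the same order. Request threads only enqueue work, and a single background thread makes every forward call, so two calls can never overlap.

```python
import queue
import threading
from concurrent.futures import Future


class BatchingWorker:
    """Funnels requests from many threads into one thread that calls the model."""

    def __init__(self, model_fn, max_batch_size=8, max_wait_s=0.01):
        # model_fn is an assumption: list of inputs -> list of outputs, same order.
        self._model_fn = model_fn
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._requests = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, item):
        """Called from request handlers; returns a Future holding the result."""
        future = Future()
        self._requests.put((item, future))
        return future

    def _loop(self):
        while True:
            # Block until at least one request arrives.
            batch = [self._requests.get()]
            # Collect more requests until the batch is full or the queue
            # stays empty for max_wait_s.
            try:
                while len(batch) < self._max_batch_size:
                    batch.append(self._requests.get(timeout=self._max_wait_s))
            except queue.Empty:
                pass
            items = [item for item, _ in batch]
            try:
                # The only place the model is called, always from this thread.
                outputs = self._model_fn(items)
                for (_, future), output in zip(batch, outputs):
                    future.set_result(output)
            except Exception as exc:
                # Report the failure to every caller waiting on this batch.
                for _, future in batch:
                    future.set_exception(exc)
```

A request handler would then call something like `worker.submit(inputs).result()` instead of calling the model directly. Note that this only serializes calls within one process. If gunicorn is started with several worker processes, each process would hold its own copy of the model, so this sketch assumes a single worker process that handles requests with threads.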