Closed
Description
Currently there is no way to use large models: there is no support for 8-bit quantization and, more importantly, no support for device mapping across multiple GPUs.
As you can see, the first GPU is filled while the second GPU is left unallocated.
Here is the error message:
OutOfMemoryError: CUDA out of memory. Tried to allocate 270.00 MiB (GPU 0; 23.70 GiB total capacity; 22.40 GiB already allocated; 247.50 MiB free; 22.40 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
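
For reference, a minimal sketch of what is being asked for, using the Hugging Face transformers + accelerate + bitsandbytes pattern as an example (the model name is a placeholder and this is not this repository's API):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-13b"  # placeholder for any large checkpoint

# device_map="auto" lets accelerate shard layers across all visible GPUs
# (spilling to CPU/disk if needed); load_in_8bit=True quantizes weights
# to int8 via bitsandbytes, roughly halving GPU memory versus fp16.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

With support like this, the 24 GiB card in the error above would not need to hold the whole model: layers would be spread over both GPUs instead of exhausting GPU 0.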