
torch.cuda.OutOfMemoryError #29

Closed

iamblue opened this issue Apr 2, 2023 · 3 comments

iamblue commented Apr 2, 2023

Using the 13B model with the following command:

CUDA_VISIBLE_DEVICES=1 python generate.py --model_path "decapoda-research/llama-13b-hf" --lora_path "Chinese-Vicuna/Chinese-Vicuna-lora-13b-belle-and-guanaco" --use_local 1

it eventually fails with this error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 68.00 MiB (GPU 0; 10.75 GiB total capacity; 10.17 GiB already allocated; 47.94 MiB free; 10.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The GPU is an RTX 2080 with 11 GB of VRAM.

I have already set:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:32

but it still doesn't help.

I also tried setting batch_size = 2 in generate.py, but that didn't work either.

Any suggestions?

Facico (Owner) commented Apr 2, 2023

@iamblue For inference on a 2080 Ti we recommend the 7B model, as shown in this generate script. For the 13B model we recommend a GPU with more VRAM, or running inference on the CPU (if you have enough system RAM).
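For reference, a minimal sketch of both options (this is not the repo's generate.py; the 8-bit path assumes bitsandbytes is installed, and the 7B LoRA name is assumed to mirror the 13B one from the command above):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

BASE = "decapoda-research/llama-7b-hf"  # 7B instead of 13B
LORA = "Chinese-Vicuna/Chinese-Vicuna-lora-7b-belle-and-guanaco"  # assumed 7B counterpart

if torch.cuda.is_available():
    # 8-bit weights need roughly 7 GB for a 7B model, which fits an 11 GB card.
    model = LlamaForCausalLM.from_pretrained(
        BASE, load_in_8bit=True, device_map="auto"
    )
else:
    # CPU fallback: fp32 weights, so expect tens of GB of system RAM.
    model = LlamaForCausalLM.from_pretrained(
        BASE, torch_dtype=torch.float32, low_cpu_mem_usage=True
    )

model = PeftModel.from_pretrained(model, LORA)  # attach the LoRA adapter
tokenizer = LlamaTokenizer.from_pretrained(BASE)
```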

iamblue (Author) commented Apr 4, 2023

@Facico OK. Testing on CPU, it needs close to 54 GB of memory. Is there any chance the memory footprint will be optimized later on?
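That figure lines up with plain fp32 weights: 13B parameters at 4 bytes each is about 52 GB before activations and the KV cache. A quick back-of-the-envelope check, assuming weight storage dominates:

```python
# Approximate weight memory for a 13B-parameter model at common precisions.
params = 13e9
for dtype, bytes_per in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{dtype}: {params * bytes_per / 1e9:.0f} GB")
# fp32: 52 GB   fp16: 26 GB   int8: 13 GB   int4: 7 GB
```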

Facico (Owner) commented Apr 4, 2023

We also support GPTQ quantization (although the quantization process itself requires fairly large VRAM). Since that quantization method currently gives poor results, we haven't uploaded a quantized model; we will keep working on this.
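For context, a rough sketch of 4-bit GPTQ quantization, written here against the AutoGPTQ library as a stand-in (an assumption; the repo's own GPTQ tooling may differ). The quantization pass runs the full-precision model over calibration data layer by layer, which is why it needs a lot of VRAM:

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base = "decapoda-research/llama-13b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)

# 4-bit weights with group-wise scales; common GPTQ settings.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
model = AutoGPTQForCausalLM.from_pretrained(base, quantize_config)

# Calibration samples drive GPTQ's error minimization; real use would
# pass a few hundred samples rather than one.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.",
                      return_tensors="pt")]
model.quantize(examples)
model.save_quantized("llama-13b-4bit-gptq")  # hypothetical output dir
```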

Facico closed this as completed Apr 11, 2023