
torch.cuda.OutOfMemoryError #29

Closed

iamblue opened this issue Apr 2, 2023 · 3 comments

iamblue commented Apr 2, 2023

Using the 13B model with the following command:

CUDA_VISIBLE_DEVICES=1 python generate.py --model_path "decapoda-research/llama-13b-hf" --lora_path "Chinese-Vicuna/Chinese-Vicuna-lora-13b-belle-and-guanaco" --use_local 1

it eventually fails with this error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 68.00 MiB (GPU 0; 10.75 GiB total capacity; 10.17 GiB already allocated; 47.94 MiB free; 10.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The GPU is an RTX 2080 with 11 GB of VRAM.

I have already set:

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:32

but it still doesn't help.

I also tried setting batch_size = 2 in generate.py, but that didn't work either.

Any suggestions?

Facico (Owner) commented Apr 2, 2023

@iamblue For inference on a 2080 Ti we recommend the 7B model, as shown in this generate script. For the 13B model we recommend a GPU with more VRAM, or running inference on the CPU (if you have enough system RAM).
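For reference, a minimal sketch of both options (this is not the repo's generate.py; the 8-bit path assumes bitsandbytes is installed, and the 7B LoRA name is assumed to mirror the 13B one from the command above):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

BASE = "decapoda-research/llama-7b-hf"  # 7B instead of 13B
LORA = "Chinese-Vicuna/Chinese-Vicuna-lora-7b-belle-and-guanaco"  # assumed 7B counterpart

if torch.cuda.is_available():
    # 8-bit weights need roughly 7 GB for a 7B model, which fits an 11 GB card.
    model = LlamaForCausalLM.from_pretrained(
        BASE, load_in_8bit=True, device_map="auto"
    )
else:
    # CPU fallback: fp32 weights, so expect tens of GB of system RAM.
    model = LlamaForCausalLM.from_pretrained(
        BASE, torch_dtype=torch.float32, low_cpu_mem_usage=True
    )

model = PeftModel.from_pretrained(model, LORA)  # attach the LoRA adapter
tokenizer = LlamaTokenizer.from_pretrained(BASE)
```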

iamblue (Author) commented Apr 4, 2023

@Facico OK. Testing on CPU, it needs close to 54 GB of memory. Is there any chance the memory footprint will be optimized later on?
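That figure lines up with plain fp32 weights: 13B parameters at 4 bytes each is about 52 GB before activations and the KV cache. A quick back-of-the-envelope check, assuming weight storage dominates:

```python
# Approximate weight memory for a 13B-parameter model at common precisions.
params = 13e9
for dtype, bytes_per in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{dtype}: {params * bytes_per / 1e9:.0f} GB")
# fp32: 52 GB   fp16: 26 GB   int8: 13 GB   int4: 7 GB
```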

Facico (Owner) commented Apr 4, 2023

We also support GPTQ quantization (although the quantization process itself requires fairly large VRAM). Since that quantization method currently gives poor results, we haven't uploaded a quantized model; we will keep working on this.
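For context, a rough sketch of 4-bit GPTQ quantization, written here against the AutoGPTQ library as a stand-in (an assumption; the repo's own GPTQ tooling may differ). The quantization pass runs the full-precision model over calibration data layer by layer, which is why it needs a lot of VRAM:

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base = "decapoda-research/llama-13b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)

# 4-bit weights with group-wise scales; common GPTQ settings.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128)
model = AutoGPTQForCausalLM.from_pretrained(base, quantize_config)

# Calibration samples drive GPTQ's error minimization; real use would
# pass a few hundred samples rather than one.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.",
                      return_tensors="pt")]
model.quantize(examples)
model.save_quantized("llama-13b-4bit-gptq")  # hypothetical output dir
```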

Facico closed this as completed Apr 11, 2023