
OutOfMemoryError: CUDA out of memory #5

Open
Sicmatr1x opened this issue Apr 9, 2023 · 5 comments

Comments

@Sicmatr1x

  File "C:\Users\sicma/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 277, in attention_fn
    attention_scores = attention_scores * query_key_layer_scaling_coeff
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 116.00 MiB (GPU 0; 6.00 GiB total capacity; 5.07 GiB already allocated; 0 bytes free; 5.31 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My card is an NVIDIA GeForce RTX 2060 with 6144 MB of memory.
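The error message itself suggests one mitigation: setting `max_split_size_mb` to reduce allocator fragmentation. A minimal sketch of how that environment variable is set (the 128 MB value is an illustrative assumption, not a tuned recommendation):

```python
import os

# PyTorch's caching allocator reads PYTORCH_CUDA_ALLOC_CONF, so set it
# before the first CUDA allocation (safest: before `import torch`).
# 128 is an example value; see the PyTorch memory-management docs.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

This only reduces fragmentation overhead; it cannot make a model fit that genuinely needs more than 6 GB.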

@josStorer
Owner

Try closing some GPU-memory-hungry programs. For ChatGLM, 6 GB of VRAM is only just enough to run.
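If closing other programs is not enough, the upstream ChatGLM-6B README also documents loading the model with 4-bit weight quantization, which substantially reduces VRAM use. A hedged sketch (the model id and helper name are assumptions for illustration; `.quantize(4)` follows the ChatGLM-6B README usage):

```python
from transformers import AutoTokenizer, AutoModel

def load_chatglm_int4(model_id: str = "THUDM/chatglm-6b"):
    """Load ChatGLM with 4-bit weight quantization to reduce VRAM use."""
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = (
        AutoModel.from_pretrained(model_id, trust_remote_code=True)
        .half()
        .quantize(4)  # int4 quantization, per the ChatGLM-6B README
        .cuda()
    )
    return tokenizer, model.eval()
```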

@Terrency

Hello, during multi-turn conversations the GPU runs out of memory. Is there a good way to limit this in main.py?
Once the multi-turn conversation exceeds a certain length, the program crashes and has to be restarted.
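One simple way to keep multi-turn memory bounded, if main.py builds each prompt from a history list of (question, answer) pairs as ChatGLM's `chat()` convention does, is to truncate the history to the last few turns before every call. A hypothetical sketch (`MAX_TURNS` and `trim_history` are illustrative names, not existing main.py code):

```python
MAX_TURNS = 5  # illustrative cap; tune to your VRAM budget

def trim_history(history):
    """Keep only the most recent MAX_TURNS (question, answer) pairs so the
    prompt (and attention memory) stops growing without bound."""
    return history[-MAX_TURNS:]

# e.g. pass trim_history(history) instead of the full history into model.chat(...)
```

Dropping older turns loses long-range context, but it trades that for a hard upper bound on per-call memory.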

@josStorer
Owner

josStorer commented May 11, 2023

@Terrency I'm developing RWKV-Runner to provide the best experience and avoid running out of VRAM, supporting GPUs with anywhere from 2 GB to 20 GB of memory. The model can be used commercially, is very flexible, and its architecture has a lot of potential. An initial version should be out in about a week.

@Terrency

@josStorer You're amazing. Looking forward to your latest work; I just want to set this up so my team can use it internally to look things up.

@josStorer
Owner

josStorer commented May 22, 2023
