
torch.cuda.OutOfMemoryError to infer for a large protein #4

Closed
yingnan-hou opened this issue Apr 11, 2024 · 1 comment

Comments

@yingnan-hou

Hi, when I attempted to infer (not train) a protein with a sequence length of 525 using Str2Str, the following memory error occurred:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.19 GiB. GPU 0 has a total capacity of 44.40 GiB of which 4.20 GiB is free. Including non-PyTorch memory, this process has 40.20 GiB memory in use. Of the allocated memory 39.85 GiB is allocated by PyTorch, and 40.98 MiB is reserved by PyTorch but unallocated.
The code ran on an NVIDIA A40-Xeon-48GB GPU. Is there any way I can successfully infer this protein using Str2Str within the constraints of this computing resource? Looking forward to your reply. Thanks.

@lujiarui
Owner

Hi @yingnan-hou, for CUDA memory issues in general, you can use lower precision such as fp16 to run on the same device without changing any code. Also note that the inference script uses a default inference batch size larger than one (see this); if that applies to you, reduce the batch size. Hopefully this addresses your concern.
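For readers hitting the same error, the two suggestions above can be sketched in plain PyTorch. This is an illustrative snippet, not Str2Str's actual inference code: the `torch.nn.Linear` stand-in model, the batch size of 8, and the tensor shapes are all hypothetical placeholders, and the autocast pattern shown is the generic PyTorch mechanism for half-precision inference.

```python
import torch

# Hypothetical stand-in for the real model; Str2Str's module is not shown here.
model = torch.nn.Linear(64, 64)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()

# Suggestion 1: use a smaller inference batch (8 here, purely illustrative).
x = torch.randn(8, 64, device=device)

# Suggestion 2: run the forward pass under autocast so matmuls execute in
# half precision (fp16 on CUDA, bf16 on CPU), roughly halving activation
# memory versus fp32. no_grad() also avoids storing the autograd graph.
dtype = torch.float16 if device == "cuda" else torch.bfloat16
with torch.no_grad(), torch.autocast(device_type=device, dtype=dtype):
    y = model(x)

print(tuple(y.shape))
```

If the OOM persists, combining both changes (halving the batch size and running under autocast) compounds the savings, since activation memory scales linearly with batch size.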
