Handle GPU memory management #20
This was handled in NVcaffe v0.13.1 with the CNMeM memory manager. Caffe now automatically uses ~90% of the available GPU memory (NVIDIA/caffe#12). It's not a complete solution to the issue reported above - sometimes you still need to lower your batch size manually - but it solves most of the problem.
@lukeyeager thank you for handling this annoying issue. I came across this cuda mem issue recently and I noticed that in fact caffe allocates memory for BOTH train and test nets. Is there a way to "swap" these nets in and out of the gpu memory? That is, during training only allocate and work on the training net. Then when starting a test phase, swap the training net out of GPU and allocate for the test net?
@shaibagon that seems like a good idea to me, but I may be unaware of some limitation that makes it hard/impossible. Either way, the DIGITS team wouldn't be involved in making that sort of a change, and we probably wouldn't do it in NVcaffe either. This would probably need to be changed in BVLC/caffe and then we'd pull it into NVcaffe for our next release at some point.
FTR, one way to slightly diminish the test net's memory load is to set its batch_size to 1, and to set the test interval at or above the total number of iterations. Very inconvenient and not a real fix, but it's a trick I use with my own version of Caffe. Swapping would indeed be great, and could also be done without too much difficulty by working directly with the prototxt and loading only the portions needed.
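To make the workaround above concrete, a solver sketch along these lines might look as follows (the field names are standard Caffe solver parameters; the values are illustrative, and this assumes "testing iteration" refers to the test interval):

```
# solver.prototxt (illustrative values)
test_iter: 1            # only one forward pass per test phase
test_interval: 100000   # at or above max_iter, so the test phase effectively never runs
max_iter: 100000
```

with the test net's data layer set to `batch_size: 1` in the net prototxt, so whatever test-net memory is allocated stays small.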
@lukeyeager - thank you for your reply.
@homah what does your question have to do with this thread? Also, we would prefer for you to use our user group for questions, and to use GitHub only to report bugs and feature requests. This is clearly explained in our README:
See #18.
DIGITS should handle GPU memory allocation for the user automatically. This could be done in a few ways:
- There is a "Memory required for data" line in caffe's output, but it seems totally unrelated to the amount of memory used on the GPU.
- The CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT option could be used to say "use the fastest algorithm that fits within the specified memory budget." This would be a change to caffe, not to DIGITS, and it's not a complete solution since many people will be using caffe without cuDNN and maybe even without CUDA.
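As a rough illustration of the kind of estimate a tool like DIGITS could make itself, one could sum the sizes of a net's blobs for a given batch size. This is only a sketch: the layer shapes below are made up for the example, and real GPU usage also includes weights, cuDNN workspace, and allocator overhead.

```python
# Sketch: estimate the GPU memory a net's activation blobs would need,
# so a tool could warn before the batch size is too large.
# Layer shapes below are illustrative, not taken from a real network.

def blob_bytes(shape, dtype_size=4):
    """Bytes for one blob of float32 values."""
    n = 1
    for dim in shape:
        n *= dim
    return n * dtype_size

def estimate_training_bytes(blob_shapes):
    # During training, Caffe keeps both data and gradients (diff)
    # for every blob, roughly doubling the activation memory.
    return 2 * sum(blob_bytes(s) for s in blob_shapes)

if __name__ == "__main__":
    batch = 64
    shapes = [
        (batch, 3, 224, 224),   # input images
        (batch, 96, 55, 55),    # conv1 output
        (batch, 256, 27, 27),   # conv2 output
    ]
    total = estimate_training_bytes(shapes)
    print("~%.1f MiB" % (total / 2**20))
```

Since the estimate scales linearly with the leading batch dimension, the same function could also be run in reverse to suggest the largest batch size that fits a given budget.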