out of memory! Could you please tell me your GPU card type? #33
Comments
I reduced the 768 images to 160 images, and it works fine for me with the 8.5G of memory I have left. But 768 images is nearly 5 times larger, so I guess I would need 40G to 50G of GPU memory, and it is difficult to run pycaffe with multiple GPUs. Could you please help me? I am new to action detection. Really appreciate it! |
@sijun-zhou Hello, I have met the same problem. Have you solved it? I am also new to action detection. Thanks a lot. |
@yanqian123 I used a single 1080 Ti. My problem was solved when I enabled CUDNN for the project. |
@sijun-zhou You mean setting CUDNN == 1 in Makefile.config, right? Thanks again. |
@yanqian123 yes |
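For reference, and assuming the project's Caffe build follows the stock BVLC Makefile.config layout: cuDNN is enabled by uncommenting the line USE_CUDNN := 1 in Makefile.config and then rebuilding, e.g. make clean, make all, and make pycaffe.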
I am sorry to say it did not work. My GPU is a 1050, but even when I set CUDNN == 1 I could not solve the problem. Could you give me some advice? |
@sijun-zhou thank you again |
@yanqian123 As far as I remember, if you do not change the batch size (700+? I don't remember it clearly), it consumes approximately 5-6G of GPU memory. Obviously a 1050 cannot support that. |
I am getting the same out-of-memory error while testing: F1115 21:40:12.954958 25933 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory. Is there any way to handle this, such as reducing the batch size or the number of frames? |
@viswalal Hello, I have met the same problem. I think it's due to a mismatch in the GPU ID: I only have one GPU, but the config says 'GPU_ID: 1' when it should be 'GPU_ID: 0'. However, I can't find the config file. Have you solved it? |
@Xchangjiang Hi, I also have only one GPU. For me the GPU ID comes up as 0 in the log while running script_test.sh, and I am not able to resolve the error. While running the test I checked the GPU usage: it keeps increasing and the run crashes when the memory is full. I am not able to reduce the batch size; actually, I am not able to identify where to change it. |
You can change GPU_ID in the file “script_train.sh”.
|
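As an illustration only (the exact variable name inside script_train.sh / script_test.sh is an assumption here): with a single card the script should point at device 0, for example by setting GPU_ID=0 near the top of the script. Exporting CUDA_VISIBLE_DEVICES=0 before running the script is another way to make only that one card visible to Caffe.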
@huijuan88 Hello. I think GPU ID 0 is correct for me. Since my GPU has only 4 GB, it crashes. I want to change the batch size for running script_test.sh, the way we set 'batch_size' in the network definition prototxt file (or maybe reducing the number of frames it loads at a time would help). |
You can change the buffer length in “td_cnn_end2end.yml”: LENGTH: [768].
You also need to change the data processing file to keep everything consistent.
|
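As a concrete sketch of that change (an untested example value, not a recommendation): editing the line in td_cnn_end2end.yml from LENGTH: [768] to, say, LENGTH: [192] cuts the number of frames per buffer to a quarter, and the same window length then has to be used in the data generation step (generate_roidb_512.py mentioned below) so that the roidb and the config stay consistent.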
@huijuan88 thank you.. I will try that |
@huijuan88 Hi, I have tried length = 256, 128, 64, and 32 and changed the data generation as well (by editing generate_roidb_512.py and re-running it), but I am still getting the same error. I am stuck at this point. |
The error is about memory, but it should fit with such a small length, e.g. 32.
|
@viswalal Hi, I have met this problem too, and the error is still there although I tried length = 256, 128, 64, 32, and 16 in generate_roidb_512.py. Have you solved it? Really appreciate it! |
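A hypothetical sketch of what keeping the data generation consistent can look like (this is not the project's actual generate_roidb_512.py; the names WIN_LENGTH and generate_windows are made up for illustration): the window length is defined once and reused when slicing each video into fixed-length buffers, so the generated roidb matches LENGTH in td_cnn_end2end.yml.

# Hypothetical illustration, not the real generate_roidb_512.py.
WIN_LENGTH = 256          # must equal the value of LENGTH in td_cnn_end2end.yml
STRIDE = WIN_LENGTH // 2  # assumed 50% overlap between consecutive windows

def generate_windows(num_frames):
    # Return (start, end) frame indices of the fixed-length windows for one video.
    windows = []
    start = 0
    while start + WIN_LENGTH <= num_frames:
        windows.append((start, start + WIN_LENGTH))
        start += STRIDE
    if not windows:
        # Video shorter than one window: keep a single truncated window.
        windows.append((0, num_frames))
    return windows

print(generate_windows(768))
# [(0, 256), (128, 384), (256, 512), (384, 640), (512, 768)]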
Hi Huijuan @huijuan88,
I am using a 1080 Ti card with 11G of memory, but 2.5G is used by other students, so I am left with only 8.5G of GPU memory. When I run the ActivityNet test with your provided script, it loads only one video's frames (768 images), but it runs out of memory at this step:
blobs_out = net.forward(**forward_kwargs)
"""
F0713 15:08:15.452706 22317 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
"""
So could you please tell me what your GPU type is and how many GPUs you used when testing and training this code?
Thanks in advance!
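For completeness, a heavily hedged sketch of one way to keep the forward pass within a small memory budget: split the long frame buffer into shorter chunks and call net.forward() once per chunk. This assumes a plain pycaffe net with a 5-D input blob named 'data' (num x channels x length x height x width); the repository's real test code builds forward_kwargs differently, and chunking a detection network also means the per-chunk detections have to be merged afterwards, so treat this only as an illustration of the memory trade-off.

def forward_in_chunks(net, frames, chunk_len=192):
    # net: a caffe.Net created by the caller, e.g. caffe.Net(prototxt, weights, caffe.TEST).
    # frames: numpy array of shape (channels, length, height, width) for one video.
    outputs = []
    c, total_len, h, w = frames.shape
    for start in range(0, total_len, chunk_len):
        chunk = frames[:, start:start + chunk_len]
        # Reshape the assumed 'data' input blob to the smaller chunk and forward it;
        # peak activation memory now scales with chunk_len instead of the full length.
        net.blobs['data'].reshape(1, c, chunk.shape[1], h, w)
        net.blobs['data'].data[0] = chunk
        # Copy the outputs, since net.forward() returns views into the net's own memory.
        outputs.append({k: v.copy() for k, v in net.forward().items()})
    return outputs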