out of memory! Could you please tell me your GPU card type? #33

sijun-zhou · 2018-07-13T08:19:43Z

Hi Huijuan @huijuan88,
I am using a 1080 Ti with 11 GB of memory, but 2.5 GB is used by other students, so only 8.5 GB of GPU memory is left for me. When I run the test on ActivityNet with your provided script, it loads the frames of only one video (768 images) and then runs out of memory at this step:
blobs_out = net.forward(**forward_kwargs)
"""
F0713 15:08:15.452706 22317 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
"""

So could you please tell me which GPU type you have, and how many GPUs you used when testing and training this code?
Thanks in advance!

sijun-zhou · 2018-07-13T09:52:14Z

I reduced the 768 images to 160 images, and it works fine with the 8.5 GB of memory I have left. But 768 images is nearly 5 times as many, so I guess I would need 40 GB to 50 GB of GPU memory. It is also difficult to run pycaffe on multiple GPUs. Could you please help me? I am new to action detection. Much appreciated!
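
The 40 GB to 50 GB guess above comes from assuming that GPU memory grows linearly with the number of frames. A minimal sketch of that arithmetic follows; the function name is hypothetical, and linear scaling is only an assumption (fixed costs such as model weights do not grow with frame count, and the 160-frame run may not have used the full 8.5 GB), so the result is best read as a rough upper estimate.

```python
def estimate_gpu_memory_gb(known_frames, known_gb, target_frames):
    """Linearly extrapolate GPU memory from one measured configuration.

    Assumes memory is proportional to the number of frames, which
    ignores fixed costs and is therefore only a rough estimate.
    """
    if known_frames <= 0:
        raise ValueError("known_frames must be positive")
    return known_gb * target_frames / known_frames


# Numbers from the comment above: 160 frames fit in 8.5 GB; target is 768 frames.
estimate = estimate_gpu_memory_gb(known_frames=160, known_gb=8.5, target_frames=768)
print(round(estimate, 1))
```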

YanYan0716 · 2018-08-31T08:29:56Z

@sijun-zhou Hello, I have met the same problem. Have you solved it? I am also new to action detection. Thanks a lot.

sijun-zhou · 2018-08-31T08:33:09Z

@yanqian123 I used a single 1080 Ti. My problem was solved when I enabled cuDNN for the project.

YanYan0716 · 2018-08-31T10:10:59Z

Do you mean CUDNN==1 in Makefile.config? Thanks again.

YanYan0716 · 2018-08-31T10:11:16Z

@sijun-zhou Do you mean CUDNN==1 in Makefile.config? Thanks again.

sijun-zhou · 2018-08-31T10:15:56Z

@yanqian123 yes

YanYan0716 · 2018-09-03T08:59:38Z

I am sorry to say it did not work. My GPU is a 1050, and setting CUDNN==1 did not solve my problem. Could you give me some advice?

YanYan0716 · 2018-09-03T08:59:56Z

@sijun-zhou thank you again

sijun-zhou · 2018-09-03T09:32:18Z

@yanqian123 As far as I remember, if you do not change the batch size (700+? I don't remember it clearly), it will consume approximately 5-6 GB of GPU memory. Obviously a 1050 cannot support that.

viswalal · 2018-11-15T16:45:02Z

I am getting the same out of memory error while testing.

F1115 21:40:12.954958 25933 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***

Is there any way to handle this, such as reducing the batch size or the number of frames?
GPU: GeForce 940MX (4 GB only)

Xchangjiang · 2018-11-26T03:54:55Z

@viswalal Hello, I have met the same problem. I think it is due to a mismatch in the number of GPUs. I only have one GPU, but the setting is 'GPU_ID: 1' when it should be 'GPU_ID: 0'. However, I can't find the config file. Have you solved it?

viswalal · 2018-11-26T17:56:08Z

@Xchangjiang Hi, I also have only one GPU. For me, the GPU ID shows as 0 in the log when running script_test.sh, and I have not been able to resolve the error. I checked GPU usage while running the test: it keeps increasing, and the process crashes when memory is full. I am not able to reduce the batch size because I cannot identify where to change it.
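
Watching GPU memory climb until the crash, as described above, can be done by polling nvidia-smi. This is a rough sketch that assumes nvidia-smi is on the PATH; the helper names parse_memory_line and query_gpu_memory are hypothetical, not part of this repository.

```python
import subprocess


def parse_memory_line(line):
    """Parse one 'used, total' CSV line (values in MiB) from nvidia-smi."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def query_gpu_memory():
    """Return a list of (used_mib, total_mib) tuples, one per visible GPU."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return [parse_memory_line(l) for l in out.splitlines() if l.strip()]
```

Calling query_gpu_memory() in a loop with a short sleep while the test script runs shows how close usage gets to the card's total before the check failure.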

huijuan88 · 2018-11-27T06:39:31Z

You can change GPU_ID in the file “script_train.sh”.

viswalal · 2018-11-27T06:49:55Z

@huijuan88 Hello. I think GPU ID 0 is correct for me. Since my GPU has only 4 GB, the process crashes. I want to change the batch size for running script_test.sh, the way we set 'batch_size' in the network definition prototxt file (or maybe reducing the number of frames it loads at a time will help).

huijuan88 · 2018-11-27T07:03:20Z

You can change the buffer size in “td_cnn_end2end.yml”: LENGTH: [768]. You also need to change the data processing file to make everything consistent.
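
As an illustration of the LENGTH change mentioned above, the sketch below rewrites every LENGTH: [N] entry in a config string. The helper set_length is hypothetical, and the sample layout (TRAIN and TEST keys) is an assumption, since the thread only shows the line LENGTH: [768]. The data processing step still has to be regenerated with the same value.

```python
import re


def set_length(config_text, new_length):
    """Replace the number in every 'LENGTH: [N]' entry of a config string."""
    pattern = re.compile(r"(LENGTH:\s*\[)\s*\d+\s*(\])")
    return pattern.sub(
        lambda m: "{}{}{}".format(m.group(1), int(new_length), m.group(2)),
        config_text,
    )


# Assumed layout for illustration only; the real file may differ.
sample = "TRAIN:\n  LENGTH: [768]\nTEST:\n  LENGTH: [768]\n"
updated = set_length(sample, 256)
print(updated)
```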

viswalal · 2018-11-27T07:04:35Z

@huijuan88 thank you.. I will try that

viswalal · 2018-11-28T06:36:43Z

@huijuan88 Hi, I have tried length = 256, 128, 64, and 32 and changed the data generation as well (by editing generate_roidb_512.py and rerunning it), but I still get the same error. I am stuck at this point.

huijuan88 · 2018-11-30T03:50:00Z

The error is about memory, but it should fit with such a small length, e.g. 32.

mxguo · 2018-12-26T01:52:27Z

@viswalal Hi, I have also met this problem, and the error persists even though I tried length = 256, 128, 64, 32, and 16 in generate_roidb_512.py. Have you solved it? Much appreciated!
