
RuntimeError: CUDA error: out of memory #19

Closed
albertchristian92 opened this issue Jan 11, 2021 · 2 comments

albertchristian92 commented Jan 11, 2021

Hi, thank you for your work. I am interested in this project, but when I tried to start training your code using Docker, I ran into RuntimeError: CUDA error: out of memory, as shown here:
[screenshot: out-of-memory traceback]

I am using multiple GeForce GTX 1080 GPUs, as shown below:
[screenshot: nvidia-smi output]

Here is how I run your code:

python3 main.py \
    ddd \
    --exp_id centerfusion \
    --shuffle_train \
    --train_split mini_train \
    --val_split mini_val \
    --val_intervals 1 \
    --run_dataset_eval \
    --nuscenes_att \
    --velocity \
    --batch_size 4 \
    --lr 2.5e-4 \
    --num_epochs 60 \
    --lr_step 50 \
    --save_point 20,40,50 \
    --gpus 0,2,3 \
    --not_rand_crop \
    --flip 0.5 \
    --shift 0.1 \
    --pointcloud \
    --radar_sweeps 6 \
    --pc_z_offset 0.0 \
    --pillar_dims 1.0,0.2,0.2 \
    --max_pc_dist 60.0 \
    --num_workers 0 \
    --load_model ../models/centerfusion_e60.pth

Please give any suggestions regarding this issue. Thank you very much.
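A general first check (not from the original report, just a common cause of this error with a multi-GPU --gpus 0,2,3 setup) is to confirm that the selected GPUs actually have free memory before launching training, since another process already occupying one of them can trigger the same RuntimeError regardless of the batch size:

# Shows per-GPU memory usage; purely informational, changes nothing.
nvidia-smi --query-gpu=index,name,memory.used,memory.free --format=csv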

fabrizioschiano (Contributor) commented Oct 8, 2021

Hi @albertchristian92, how did you solve this problem?

My solution was to decrease the batch_size parameter to 4, but I see that you have already done that.
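If a batch size of 4 still does not fit, a further reduction and/or training on a single idle GPU is the usual next step. The sketch below is only illustrative: the GPU index and batch size are assumptions, and the other options from the original command are omitted for brevity and can be appended unchanged.

# Illustrative sketch: expose one (assumed idle) physical GPU to the process and
# halve the batch size; inside the process that GPU is then addressed as index 0.
CUDA_VISIBLE_DEVICES=1 \
python3 main.py ddd \
    --exp_id centerfusion \
    --batch_size 2 \
    --gpus 0 \
    --num_workers 0 \
    --load_model ../models/centerfusion_e60.pth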

fabrizioschiano commented

The train.sh script is now running, and I have the following configuration:

nvidia-smi
Fri Oct  8 17:22:38 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 470.74       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   68C    P0    74W /  N/A |   6350MiB /  7973MiB |     93%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1043      G   /usr/lib/xorg/Xorg                102MiB |
|    0   N/A  N/A      1656      G   /usr/lib/xorg/Xorg                442MiB |
|    0   N/A  N/A      1787      G   /usr/bin/gnome-shell               54MiB |
|    0   N/A  N/A     25109      G   ...Webex/bin/CiscoCollabHost       20MiB |
|    0   N/A  N/A     88863      G   .../debug.log --shared-files       13MiB |
|    0   N/A  N/A     88929      G   ...AAAAAAAAA= --shared-files      117MiB |
|    0   N/A  N/A    113326      C   python                           5579MiB |
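One reading of this output (my interpretation, not stated in the thread): the card has roughly 8 GiB in total, and the desktop processes (Xorg, gnome-shell, a browser and Webex) already hold several hundred MiB, so the python training process at 5579 MiB leaves little headroom. A simple way to watch the remaining margin while training runs, assuming a standard Linux install with watch available:

# Refresh per-GPU memory usage every 2 seconds in a second terminal.
watch -n 2 'nvidia-smi --query-gpu=index,memory.used,memory.free --format=csv'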
