
How much memory is needed for infer? #51

Open
huangluyao opened this issue Jun 7, 2021 · 3 comments

@huangluyao

My graphics card is a GTX 1660 Ti with 6 GB of memory.
When I run the code, it reports this error:
RuntimeError: CUDA out of memory. Tried to allocate 1.00 GiB (GPU 0; 5.81 GiB total capacity; 2.90 GiB already allocated; 420.50 MiB free; 3.84 GiB reserved in total by PyTorch)

@selimlouis

I have a similar problem. I just want to test the whole thing on my GTX 970 with 4 GB of memory.

I get:

Traceback (most recent call last):
  File "fsod_train_net.py", line 118, in <module>
    args=(args,),
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "fsod_train_net.py", line 106, in main
    return trainer.train()
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 431, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 138, in train
    self.run_step()
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 441, in run_step
    self._trainer.run_step()
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 232, in run_step
    loss_dict = self.model(data)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/selim/FewShot/FewX/fewx/modeling/fsod/fsod_rcnn.py", line 153, in forward
    support_features = self.backbone(support_images)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/modeling/backbone/resnet.py", line 444, in forward
    x = self.stem(x)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/modeling/backbone/resnet.py", line 355, in forward
    x = self.conv1(x)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/layers/wrappers.py", line 88, in forward
    x = self.norm(x)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/detectron2/layers/batch_norm.py", line 65, in forward
    eps=self.eps,
  File "/home/selim/anaconda3/envs/FewX/lib/python3.7/site-packages/torch/nn/functional.py", line 2058, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 1000.00 MiB (GPU 0; 3.94 GiB total capacity; 2.15 GiB already allocated; 340.25 MiB free; 2.79 GiB reserved in total by PyTorch)

I tried halving the BATCH_SIZE_PER_IMAGE and IMS_PER_BATCH settings in the config, but I still get memory problems. I don't want to make them too small, since I think that would lead to bad results. Not an expert though.
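For reference, those are standard detectron2 config keys. A minimal sketch of what halving them can look like when the config is handled in Python; the defaults of 16 and 512 are detectron2's usual values and an assumption here, and in FewX the config should really be loaded through the project's own setup (fsod_train_net.py) rather than this bare snippet:

from detectron2.config import get_cfg

# Minimal sketch, not the FewX entry point: halve the settings mentioned above.
# BATCH_SIZE_PER_IMAGE exists under both MODEL.RPN and MODEL.ROI_HEADS in
# detectron2; the ROI_HEADS one is shown here.
cfg = get_cfg()
cfg.SOLVER.IMS_PER_BATCH = 8                    # images per batch (detectron2 default: 16)
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 256  # RoIs sampled per image (default: 512)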

Did anyone find a solution?

@selimlouis

OK, so I continued trying to get this to work.

I found success when setting SOLVER.IMS_PER_BATCH to 1 in configs/fsod/Base-FSOD-C4.yaml.

I did not run a complete training process, since it would have taken me 2 days and 11 hours, but it started training without issues.
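If you would rather not edit the shared yaml, the same override can presumably also be applied in code once the config has been loaded, assuming it is a standard detectron2/yacs CfgNode as in stock detectron2:

# Equivalent to writing IMS_PER_BATCH: 1 under SOLVER in
# configs/fsod/Base-FSOD-C4.yaml, applied on an already-loaded cfg object.
cfg.merge_from_list(["SOLVER.IMS_PER_BATCH", 1])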
Hope this helps someone else too

@xiaohei1001

It depends on your support set. Maybe you can try making RPN.POST_NMS_TOPK_TEST smaller.
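For context, MODEL.RPN.POST_NMS_TOPK_TEST is the number of proposals detectron2 keeps after NMS at test time (1000 by default), so lowering it shrinks the tensors that reach the RoI heads. A hedged sketch, with 500 as an arbitrary example value:

# Keep fewer proposals after NMS at inference time; fewer proposals means
# smaller per-image tensors in the box head and therefore less GPU memory.
# 500 is only an example value, not a recommendation from this thread.
cfg.MODEL.RPN.POST_NMS_TOPK_TEST = 500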
