This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

Reduce Memory Use of GPUs in one line code. #34

Open
yumi-cn opened this issue Dec 11, 2020 · 5 comments

Comments


yumi-cn commented Dec 11, 2020

I tried to run this project's code on 4x RTX 2080 Ti (11 GB each).

The original args, "--view-per-batch 4 --pixel-per-view 2048", cause a CUDA OOM error within just 2 iterations, so I reduced the batch size to "--view-per-batch 4 --pixel-per-view 128", which works well for the first 5000 iterations, and "--view-per-batch 2 --pixel-per-view 128" works well for the first 25000 iterations.

Both eventually hit the OOM error at the voxel split step (just a guess). So I checked the memory-management code and did not find any call that releases PyTorch's unused cache, such as:

torch.cuda.empty_cache()

so I tried adding this call at the end of "NSVFModel.clean_caches" in "fairnr/models/nsvf.py":

    def clean_caches(self, reset=False):
        self.encoder.clean_runtime_caches()
        if reset:
            self.encoder.reset_runtime_caches()
        torch.cuda.empty_cache()  # release cached memory once the model is done

This really helps me get through more split steps (but still not the split after 75000 iterations).
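Why `empty_cache()` helps can be illustrated with a toy model of PyTorch's caching allocator (a simplified sketch for intuition only, not the real implementation): freed tensors go back into a cache the allocator keeps for reuse, so the memory the device sees as reserved stays high even after tensors are freed; `empty_cache()` returns those cached blocks to the device, which matters exactly when the voxel split requests one big new allocation.

```python
class ToyCachingAllocator:
    """Toy model of a caching GPU allocator (hypothetical, illustration only)."""

    def __init__(self, capacity_mb):
        self.capacity = capacity_mb
        self.allocated = 0  # memory held by live tensors
        self.cached = 0     # freed blocks kept around for reuse

    @property
    def reserved(self):
        # what the device sees as occupied by this process
        return self.allocated + self.cached

    def malloc(self, size):
        if self.cached >= size:
            # satisfy the request from cached blocks
            self.cached -= size
        elif self.reserved + size > self.capacity:
            raise MemoryError("CUDA out of memory (toy model)")
        self.allocated += size
        return size

    def free(self, size):
        # freed memory goes to the cache, not back to the device
        self.allocated -= size
        self.cached += size

    def empty_cache(self):
        # analogue of torch.cuda.empty_cache(): give cached blocks back
        self.cached = 0


a = ToyCachingAllocator(capacity_mb=10_000)
old = a.malloc(4000)   # current voxel grid
a.malloc(5000)         # larger grid built by the split
a.free(old)            # old grid freed, but its block stays cached
try:
    a.malloc(5000)     # reserved 9000 + 5000 > 10000 -> fails
except MemoryError:
    pass
a.empty_cache()        # release cached blocks to the device
a.malloc(5000)         # reserved 5000 + 5000 <= 10000 -> fits now
```

The same pattern explains the numbers above: without the cache release, the old grid's memory still counts against the device during the next split.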

Before adding this line:

Memory use on the CUDA device: 4000 MB -> (voxel split) 8000 MB -> (voxel split) OOM error

After adding this line:

Memory use on the CUDA device: 4000 MB -> (voxel split) 6800 MB -> (voxel split) 9900 MB -> (voxel split) OOM error

And I haven't noticed any negative effect on the results so far.

I also tried other ways to avoid the OOM, such as adding "--fp16" to enable fp16 mode via the apex module (which is said to reduce memory use thanks to float16), but that just raises the error I reported in issue #33.

If you are interested in running this code on other CUDA devices (especially ones without as much GPU memory as a 32 GB V100), this one-line change and the bug report may be useful to you.

Thanks for replying.

@ghasemikasra39

On which dataset and on which object of that dataset are you training?


yumi-cn commented Dec 13, 2020

> On which dataset and on which object of that dataset are you training?

I have tested on the Synthetic-NSVF dataset, e.g. the Bike and the Palace scenes.


yyeboah commented Dec 16, 2020

@yumi-cn Thanks for sharing your insights. For those that cannot make use of half precision, and haven’t got 32 GB of GPU memory, is there any other way to get past sub-division at 75K ?


yumi-cn commented Dec 16, 2020

> @yumi-cn Thanks for sharing your insights. For those that cannot make use of half precision, and haven't got 32 GB of GPU memory, is there any other way to get past sub-division at 75K ?

Actually I do have some ideas about this, but I can't share them yet (they may go into a paper). Also, I find that sub-dividing at 25K and training for about 40K iterations is usable for most scenes at ordinary quality; if you don't need very high precision, you don't need the sub-division at 75K.
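For reference, a sketch of how to truncate the schedule so training never attempts the 75K split, assuming the flag names from this repo's README (`--half-voxel-size-at`, `--reduce-step-size-at`) are the ones controlling sub-division in your version; double-check against the training command you are using:

```shell
# Drop 75000 from the sub-division schedule so the last split is at 25K.
python -u train.py ${DATASET} \
    --half-voxel-size-at "5000,25000" \
    --reduce-step-size-at "5000,25000" \
    --view-per-batch 2 --pixel-per-view 128
```

This trades some final resolution for a memory footprint that stays within an 11 GB card, consistent with the schedule described above.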

@MultiPath (Contributor)

Also, maybe the initial voxel size is too small; you could try making it bigger.
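A sketch of this suggestion, assuming the initial voxel size is set with the `--voxel-size` flag as in the repo's example commands (the actual value is scene-dependent, so the number below is only illustrative):

```shell
# Coarser initial grid -> fewer voxels -> less memory after each split.
# Try roughly doubling your current value, e.g. 0.4 -> 0.8 (scene units).
python -u train.py ${DATASET} --voxel-size 0.8 \
    --view-per-batch 2 --pixel-per-view 128
```

Since each sub-division roughly halves the voxel size, a coarser start shifts the whole memory curve down at every stage of training.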
