Cuda out of memory #31

Closed
lulianLiu opened this issue Oct 27, 2022 · 12 comments

Comments

@lulianLiu

Dear author, you said to use a smaller 2D backbone by changing the basemodel_name and num_features (the pretrained model names are here), and that EfficientNet-B5 can reduce the memory. I want to know which B5 weights to use and the value of num_features?

@anhquancao
Collaborator

You can try "tf_efficientnet_b5_ns", and num_features should be 2048 if I remember correctly.
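If you want to double-check that number, a quick sanity check with timm (assuming timm is installed; newer timm versions may have renamed the model, e.g. to "tf_efficientnet_b5.ns_jft_in1k"):

import timm

# the backbone's head width is what num_features should be set to
m = timm.create_model("tf_efficientnet_b5_ns", pretrained=False)
print(m.num_features)  # 2048 for the B5 variants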

@anhquancao
Collaborator

You also need to adapt skip_input= for all the upsampling layers.
https://github.com/cv-rits/MonoScene/blob/master/monoscene/models/unet2d.py#L76
for example

self.up16 = UpSampleBN(
                skip_input=features + 224, output_features=self.feature_1_16
            )

You need to change "+ 224" to another number to match the encoder feature dimension.
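If it helps, a small sketch to find those numbers empirically (this assumes the decoder skip connections take the encoder's intermediate stage outputs, so their channel counts are what gets added to features):

import timm

enc = timm.create_model(
    "tf_efficientnet_b5_ns", pretrained=False, features_only=True
)
print(enc.feature_info.channels())   # per-stage channels, e.g. [24, 40, 64, 176, 512] for B5
print(enc.feature_info.reduction())  # corresponding strides: [2, 4, 8, 16, 32]

The channel count at stride 16 (176 for B5, 224 for B7) would then be the number to use in place of "+ 224" for up16, and similarly for the other scales.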

@lulianLiu
Author

You can try "tf_efficientnet_b5_ns", and num_features should be 2048 if I remember correctly.

I have tried it, but the error is: RuntimeError: Given groups=1, weight of size [1024, 2272, 3, 3], expected input[1, 2224, 30, 40] to have 2272 channels, but got 2224 channels instead

You can try "tf_efficientnet_b5_ns", and num_features should be 2048 if I remember correctly.

You also need to adapt skip_input= for all the upsampling layers. https://github.com/cv-rits/MonoScene/blob/master/monoscene/models/unet2d.py#L76 for example

self.up16 = UpSampleBN(
                skip_input=features + 224, output_features=self.feature_1_16
            )

You need to change "+ 224" to another number to match the encoder feature dimension.

How do I work out the number to add for skip_input?

@anhquancao
Collaborator

anhquancao commented Oct 27, 2022

From the error message, the input dim is 2224, so you need to change the skip_input value to features + some number so that the total matches 2224. With features=2048, as when you use the b5, that number should be 2224 - 2048 = 176. You should do the same for the other upsampling layers.
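Concretely, a sketch of that change for up16 with the B5 backbone:

self.up16 = UpSampleBN(
    skip_input=features + 176,  # 2224 - 2048 = 176 with features=2048 (B5)
    output_features=self.feature_1_16,
)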

@lulianLiu
Author

lulianLiu commented Oct 27, 2022

From the error message, the input dim is 2224, so you need to change the skip_input value to features + some number so that the total matches 2224. With features=2048, as when you use the b5, that number should be 2224 - 2048 = 176. You should do the same for the other upsampling layers.

Sorry, I have not understood. Are there any approaches to reduce the CUDA memory? I have used 5 GPUs, but it still doesn't work.
I added 176 for this layer, but I don't know how to calculate the next one:
RuntimeError: Given groups=1, weight of size [512, 1072, 3, 3], expected input[1, 1088, 60, 80] to have 1072 channels, but got 1088 channels instead

@anhquancao
Collaborator

You can simply set skip_input=2224, as:

self.up16 = UpSampleBN(
    skip_input=2224, output_features=self.feature_1_16
)

and for the others (up8, up4, ...) you can look at the error messages and change accordingly
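For instance, using the two error messages in this thread (a sketch; I assume the decoder defines self.feature_1_8 the same way it defines self.feature_1_16):

self.up16 = UpSampleBN(
    skip_input=2224, output_features=self.feature_1_16  # from the first error message
)
self.up8 = UpSampleBN(
    skip_input=1088, output_features=self.feature_1_8   # from the second error message
)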

@anhquancao
Collaborator

Another, simpler approach is to keep the b7 and set this features variable to a smaller number like features=256:
https://github.com/cv-rits/MonoScene/blob/master/monoscene/models/unet2d.py#L40
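Roughly the idea (the exact assignment on that line may differ; the point is to hard-code a smaller decoder width, and the skip_input values then have to be adjusted to match it):

features = 256  # decoder width; a smaller value means smaller upsampling layers and less memory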

@lulianLiu
Author

You can simply set skip_input=2224, as:

self.up16 = UpSampleBN(
    skip_input=2224, output_features=self.feature_1_16
)

and for the others (up8, up4, ...) you can look at the error messages and change accordingly

I solved it, but it still runs out of CUDA memory. Is there any approach to reduce it?

@lulianLiu
Author

Another, simpler approach is to keep the b7 and set this features variable to a smaller number like features=256 https://github.com/cv-rits/MonoScene/blob/master/monoscene/models/unet2d.py#L40

It also didn't work:
RuntimeError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 2; 14.62 GiB total capacity; 10.56 GiB already allocated; 9.00 MiB free; 10.74 GiB reserved in total by PyTorch)

@anhquancao
Collaborator

anhquancao commented Oct 28, 2022

The model uses a full 32G GPU, so you need to reduce the size of the network even more to fit into 14G. You can also reduce the input image size by half; be careful to scale the 3D->2D projection accordingly.
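The general idea for the projection: the camera intrinsics scale linearly with the image size, so halving the image means halving fx, fy, cx, cy. A sketch only (where MonoScene actually builds the 3D->2D projection, e.g. precomputed projected pixel coordinates in the dataloader, depends on your setup, and those precomputed pixel indices need the same scaling):

import numpy as np

def scale_intrinsics(K, scale=0.5):
    # Scale a 3x3 pinhole intrinsic matrix for a resized image:
    # fx, fy, cx, cy all scale linearly with the resolution.
    K = np.array(K, dtype=np.float64)
    K[0, 0] *= scale  # fx
    K[1, 1] *= scale  # fy
    K[0, 2] *= scale  # cx
    K[1, 2] *= scale  # cy
    return K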

@lulianLiu
Author

The model uses a full 32G GPU, so you need to reduce the size of the network even more to fit into 14G. You can also reduce the input image size by half; be careful to scale the 3D->2D projection accordingly.

Thanks. I used B1 and reduced the image size by half; now it works.

@anhquancao
Collaborator

Great! I'm glad to hear that.
