Hello, I'm impressed by your block squeezing and block linearization ideas, but the memory behavior of BN as implemented by PyTorch is somewhat counterintuitive: it is not in-place, and it allocates an output buffer the same size as the input tensor, which doubles the memory consumption. Please refer to https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cudnn/BatchNorm.cpp. So I think your comparison is, to some extent, not fair.
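
For reference, a minimal sketch (my own illustration; shapes are arbitrary and a CUDA device is assumed) that demonstrates the extra input-sized allocation during a BN forward pass:

```python
# Sketch: show that nn.BatchNorm2d is out-of-place, i.e. its forward pass
# allocates a fresh output buffer roughly the size of the input tensor.
import torch
import torch.nn as nn

x = torch.randn(64, 256, 56, 56, device="cuda")  # illustrative shape
bn = nn.BatchNorm2d(256).cuda()

torch.cuda.reset_peak_memory_stats()
base = torch.cuda.memory_allocated()
y = bn(x)  # y is a new tensor, not a view of x
peak = torch.cuda.max_memory_allocated()

print(f"input tensor size: {x.numel() * x.element_size() / 2**20:.1f} MiB")
# Extra peak memory is roughly one input-sized output buffer
# (plus small per-channel saved statistics in training mode).
print(f"extra memory at peak: {(peak - base) / 2**20:.1f} MiB")
print("output shares storage with input:", y.data_ptr() == x.data_ptr())  # False
```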
Thanks for your interest, WangS! I agree that in-place BN layers are more memory-efficient. When we were conducting this work we followed the implementations of Ding et al. and did not take in-place BN layers into consideration, neither in DBB/RepVGG nor in ours. This does lead to higher extra memory, and to a more severe extent for DBB. However, I cannot provide results for in-place variants at the moment, since I am currently working on other projects.
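
For context, the standard conv-BN fusion used by structural re-parameterization methods at deploy time can be sketched roughly as follows (my own illustration under standard BN semantics, not code from this repo or from RepVGG/DBB). After fusion, inference needs neither the BN layer nor its extra output buffer:

```python
# Sketch: absorb an eval-mode BatchNorm2d into the preceding Conv2d.
# Per output channel: scale = gamma / sqrt(running_var + eps),
# w' = w * scale, b' = (b - running_mean) * scale + beta.
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride,
                      conv.padding, conv.dilation,
                      conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = (bias - bn.running_mean) * scale + bn.bias.data
    return fused

conv, bn = nn.Conv2d(8, 16, 3, padding=1), nn.BatchNorm2d(16)
conv.eval(); bn.eval()  # fusion is valid only with fixed running stats
x = torch.randn(1, 8, 32, 32)
fused = fuse_conv_bn(conv, bn)
print(torch.allclose(bn(conv(x)), fused(x), atol=1e-5))  # True
```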