Is it normal that MXNet consumes much more system memory than Caffe during training in GPU mode? #2111
Comments
Memory cost is related to batch size. What you describe is not normal, because at the same batch size MXNet should use much less memory. |
Thanks for the reply @antinucleon. Forgot to mention that I set the |
I think @EasonD3 means the CPU memory consumption. This could be due to the memory needed by the recordIO pipeline under the current setting of the caching queues. We tuned the queue size for faster prefetching and decoding, so the setting may be a bit large and use more RAM than necessary. |
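To get a feel for how much RAM buffered batches could account for, here is a rough back-of-the-envelope sketch. It assumes decoded batches are held as float32 arrays of 224x224 color images, and `queue_depth` is a hypothetical name for the number of batches buffered at once; neither detail is confirmed in this thread, and the real pipeline also holds undecoded records and per-thread buffers that this estimate ignores.

```python
# Rough estimate of host RAM held by buffered, decoded batches.
# Assumptions (not confirmed by the thread): values are float32 (4 bytes),
# and `queue_depth` is a hypothetical count of batches buffered at once.

def batch_bytes(batch_size, channels=3, height=224, width=224, bytes_per_value=4):
    """Bytes needed for one decoded batch of images."""
    return batch_size * channels * height * width * bytes_per_value


def queued_bytes(batch_size, queue_depth, **shape):
    """Bytes needed when `queue_depth` decoded batches are buffered."""
    return queue_depth * batch_bytes(batch_size, **shape)


if __name__ == "__main__":
    for depth in (1, 4, 16):
        mib = queued_bytes(batch_size=20, queue_depth=depth) / 2**20
        print(f"queue_depth={depth}: {mib:.1f} MiB")
```

The estimate scales linearly with both batch size and queue depth, so halving either one halves this component of the footprint.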
@tqchen Yes. I just checked GPU feature memory in Inception BN is 861 MB when batch size is 20. |
@antinucleon Thanks for the number. GPU-wise, the memory consumption is roughly the same on my side. But my issue is with the system memory; I should have stated that more clearly in my post. @tqchen Thanks. If I'd like to tune the RAM usage, can you advise how to do that? |
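One possible starting point for tuning, sketched under assumptions: the helper below builds `mx.io.ImageRecordIter` arguments with a small `prefetch_buffer` and a single `preprocess_threads` worker. Both parameters appear in the documented iterator interface, but whether they control the queues mentioned above, and whether your MXNet build accepts them, is an assumption to verify. The helper names and the `train.rec` path are hypothetical.

```python
# Sketch: iterator arguments that keep the input buffers small.
# `low_memory_iter_kwargs` and `make_iter` are hypothetical helper names.

def low_memory_iter_kwargs(path_imgrec, batch_size, data_shape=(3, 224, 224)):
    """Return keyword arguments for ImageRecordIter with small buffers."""
    return dict(
        path_imgrec=path_imgrec,
        data_shape=data_shape,
        batch_size=batch_size,
        prefetch_buffer=1,     # assumed: number of batches prefetched ahead
        preprocess_threads=1,  # assumed: number of decoding threads
    )


def make_iter(path_imgrec, batch_size):
    """Build the iterator; MXNet is imported only when this is called."""
    import mxnet as mx
    return mx.io.ImageRecordIter(**low_memory_iter_kwargs(path_imgrec, batch_size))


if __name__ == "__main__":
    # "train.rec" is a placeholder path, not a file from the thread.
    print(low_memory_iter_kwargs("train.rec", batch_size=20))
```

Smaller buffers and fewer threads will likely slow data loading, so it is worth comparing both RAM usage and throughput before and after the change.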
@tqchen Speaking of the queue size you pointed out, I also observe that the latest MXNet code consumes about 50% more RAM than an older version from 2 to 3 months ago. |
Thanks for pointing it out. There was a recent refactor of the IO code. I will check it after I finish work today. |
Some quick things to try
|
@tqchen Thanks for the hints. I just tested by training 15000 color images of size 224x224 with the Inception-BN model. I set |
I get the same problem when training SSD or ImageNet models using a .rec file! |
This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks! |
I just switched from Caffe to MXNet. When training a GoogLeNet model using the provided Python scripts, I observe that MXNet always consumes much more system memory than Caffe in GPU mode. For instance, MXNet can easily use 10 GB of RAM during training, while Caffe takes less than 1 GB.
I'm not sure whether I compiled the MXNet code correctly. Before compilation, the only change I made in config.mk was to enable CUDA. Could anyone comment on that? Is there anything I need to set for MXNet in order to reduce the memory usage?