Memory GPU leak, possibly with Caffe services #12

beniz · 2015-06-11T05:59:32Z

There is a weird leak behavior with Caffe:

the first service running Caffe almost determines the leak by leaving a leak behind. The size of the leak is directly proportional to the size of the training dataset
the leak appears to be independent from the input connector (witnessed for both CSV and text)
leak grow slowly with every new service, proportionally to initial size. This suggests that a structure somewhere is being incremented, with initial size set by the first run service.

First set of investigations appears to rule out the Caffe's net destruction, and first valgrind pass does not reveal much yet.

beniz · 2015-06-12T11:10:11Z

Two points:

leak is only visible with GPU
in CPU-only mode, valgrind reports no leak
in GPU-only mode, valgrind reports large chunks held by CUDA (libcuda.so), this is also visible while running with nvidia-smi.

Typical used memory values on a 4GB GPU during a run:
before start: 152MB
training start: 198MB
training: 589MB
training finished, net is up for prediction: 436MB
service destruction: 200MB
server termination: 152MB

Caffe is allocating some GPU data and not releasing it or there's a way to clear the GPU memory overhead after using it within a net with Caffe.

beniz · 2015-10-06T15:18:05Z

There were a stream of recent changes in Caffe memory allocation and I cannot reproduce this issue anymore. Closing for now.

beniz added the type:bug label Jun 11, 2015

beniz self-assigned this Jun 11, 2015

beniz added the mllib:caffe label Jun 11, 2015

beniz added type:wontfix kind:cuda labels Jun 12, 2015

beniz changed the title ~~Memory leak, possibly with Caffe services~~ Memory GPU leak, possibly with Caffe services Jun 12, 2015

beniz closed this as completed Oct 6, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory GPU leak, possibly with Caffe services #12

Memory GPU leak, possibly with Caffe services #12

beniz commented Jun 11, 2015

beniz commented Jun 12, 2015

beniz commented Oct 6, 2015

Memory GPU leak, possibly with Caffe services #12

Memory GPU leak, possibly with Caffe services #12

Comments

beniz commented Jun 11, 2015

beniz commented Jun 12, 2015

beniz commented Oct 6, 2015