[Proposal] Strategy to reduce memory usage #29
Comments
#31 follows Strategy 1. I implemented a memory pool that is transparent to callers. The pool is also thread-local and can share memory between networks in the same thread, which may be helpful in some situations. With the current implementation, the R-FCN example, which uses a ResNet-50 model, drops from 2 GB to 500 MB of memory usage on the CPU context. This is reasonable, since the conv3x3 pattern is repeated and all memory can be reused between layers.
More memory analysis is needed to improve the pool implementation. I can already see some fixed patterns in the memory requests that could help improve the pool's memory management; for example, a Least Frequently Used eviction policy could reduce memory further. Some of these memory patterns can be viewed in #31.
Is the memory for model parameters also managed by the memory pool? If multiple images pass through the same model, do you keep the model's memory fixed?
Yes, parameters are also managed by the pool. However, parameter memory won't be released unless the Net object is destructed. Clearing the memory pool only releases memory that is not in use: temporary memory and the network's internal buffers are marked as unused during the forward pass, but model parameters are not.
Thanks. Another question: don't the buffers allocated for feature maps (in/out) share the same memory block during the forward pass?
The whole idea is pretty simple: determine the lifetime of every Blob in the network, layer by layer; the network definition itself is already in topological order. If no later layer uses a Blob as input, we can mark that Blob reusable and share its memory. The input Blob can naturally be shared/reused as Forward goes on. However, I think it's more convenient not to reuse the input Blob, so that at the next Forward you don't need to call
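The lifetime analysis described above can be sketched in a few lines. This is a hypothetical illustration, not Mini-Caffe's actual code: `Layer` here holds only blob names, and `plan_reuse` is an invented helper that walks the layers in topological order, records each blob's last consumer, and reports which blobs become reusable after each layer's Forward.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Illustrative stand-in for a network layer: just its input (bottom)
// and output (top) blob names.
struct Layer {
  std::vector<std::string> bottoms;
  std::vector<std::string> tops;
};

// For each layer index, return the blobs whose memory becomes reusable
// right after that layer's Forward finishes (i.e. blobs whose last
// consumer is that layer). Layers must be given in topological order.
std::vector<std::vector<std::string>> plan_reuse(
    const std::vector<Layer>& net) {
  std::map<std::string, int> last_use;  // blob name -> last consuming layer
  for (int i = 0; i < (int)net.size(); ++i)
    for (const auto& b : net[i].bottoms) last_use[b] = i;
  std::vector<std::vector<std::string>> freed(net.size());
  for (const auto& kv : last_use) freed[kv.second].push_back(kv.first);
  return freed;
}
```

For a three-layer chain where layer 2 also consumes layer 0's output, the planner frees "data" after layer 0 and both intermediates only after layer 2, matching the "no later layer uses it as input" rule.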
Static memory planning is implemented in #70.
Caffe currently consumes too much memory during the Forward phase. This is mainly because of the internal temporary buffers held by each Layer: e.g., a Convolution Layer needs to cache its im2col result for the GEMM operation. These temporary buffers are not shared between layers, which causes excessive memory usage. Second, since we don't perform the backward pass, the network's internal buffers can also be reused or freed as soon as no other layer needs them.
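To get a feel for the scale of the problem, here is a rough estimate of one convolution layer's im2col buffer. The im2col transform stores channels × kernel_h × kernel_w values per output pixel; the concrete shape below is an assumed example, not taken from a specific model.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed by an im2col column buffer for a square KxK convolution:
// one column of C*K*K floats per output pixel.
size_t im2col_bytes(size_t channels, size_t k, size_t out_h, size_t out_w) {
  return channels * k * k * out_h * out_w * sizeof(float);
}
```

A single 3x3 convolution over a 256-channel, 56x56 output already needs roughly 28 MB, and every such layer holds its own copy, which is exactly the duplication the proposal targets.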
Mini-Caffe should change this situation without breaking any high-level API exposed in include (though we may add some APIs). Some ideas:
Strategy 1. Layers that need temporary memory should request it from a global memory manager. The manager itself holds the data and lends buffers to whoever requests them. A memory pool can be implemented, or we can simply reuse one buffer and resize it when a request is too big. The network's internal buffers can also be requested through the manager, but we then need to track the dependencies of each named blob and return its memory to the manager when no other layer needs it. This strategy operates within every Forward phase.
Strategy 2. Since a Caffe network graph is static, we can plan the memory before forwarding the graph. Some Layer API changes would help here: a Layer should only tell the network how much memory it needs, and the network should hold the memory and lend it to the Layer during Forward. This covers bottom, top, and temporary memory. Change the Reshape function of every layer, count the dependencies of the network's internal blobs, plan the memory, and reuse the internal blobs. This strategy runs before every Forward phase.