Local dep object pool #5953
CHECK_OR_RETURN(!local_dep_object_pool->empty());
size_t pool_size = local_dep_object_pool->size();
static thread_local int64_t index = 0;
return local_dep_object_pool->at(index++ % pool_size).Mutable();
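Reusing LocalDepObjects causes no problems and is actually beneficial. For example, the cuda_h2d device is given only 2 LocalDepObjects, so the whole compute stream runs in double-buffer mode.
You can think of LocalDepObject as a flow-control mechanism.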
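As a rough illustration of the round-robin reuse in the snippet above, here is a minimal, self-contained sketch. `ToyDepObject`, `ToyDepObjectPool`, and `ThirdRequestReusesFirstSlot` are hypothetical names made up for this example, not OneFlow classes. The sketch models only the `index++ % pool_size` slot selection; it keeps the counter as a member rather than a `static thread_local`, and it does not model how the scheduler waits on a dep object that is still in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a LocalDepObject. It carries no state here;
// only its identity (address) matters for the illustration.
struct ToyDepObject {};

// Hypothetical fixed-size pool that hands out slots in round-robin order.
class ToyDepObjectPool {
 public:
  explicit ToyDepObjectPool(std::size_t size) : pool_(size) {
    assert(size > 0);  // mirrors the non-empty check in the diff
  }

  // Returns the next slot, wrapping around once every slot has been handed out.
  ToyDepObject* Next() { return &pool_[index_++ % pool_.size()]; }

  std::size_t size() const { return pool_.size(); }

 private:
  std::vector<ToyDepObject> pool_;
  std::uint64_t index_ = 0;
};

// With a pool of 2 (the double-buffer case), the third request gets the
// same object as the first, so it would have to wait for the first user.
inline bool ThirdRequestReusesFirstSlot() {
  ToyDepObjectPool pool(2);
  ToyDepObject* first = pool.Next();
  ToyDepObject* second = pool.Next();
  ToyDepObject* third = pool.Next();
  return first != second && third == first;
}
```

Under these assumptions, a pool of size 2 means that at most two distinct dep objects exist for the device, which is what makes the reuse act like double buffering.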
Maybe<size_t> Device::instr_local_dep_object_pool_size() const {
static const size_t kDoubleBufferPoolSize = 2;
static const HashMap<std::string, size_t> type2pool_size{
{"cpu", GetInstructionHighWaterMark()}, {"cuda", GetInstructionHighWaterMark()},
Setting the pool size to GetInstructionHighWaterMark() here means there is effectively no serialization, right?
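Of course not. You can think of the pool size here as having been infinite before, and serialization still happened in that case. Serialization is controlled by the device's LocalDepObject member.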
Oh, I see. I got that wrong.
Fixes the excessive memory overhead caused by cuda_h2d.