Add Global Average Pool #89
Force-pushed from f797ad3 to fcdc5f5
}

template<typename Tdata, typename TIdata, typename Ldata, typename LIdata>
void launch_global_avg_pool_folding(GlobalAvgPoolCudaDescriptor_t desc, void *y, void const *x, void *workspace, uint64_t workspace_size, void *stream, unsigned pack_size) {
Why does this choose which kernel to use based on workspace_size? Isn't workspace_size the requirement that the kernel itself reports?
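Yes. This is an optimization based on whether a workspace is needed at all, because the workspace requirement varies with the data type. For fp32 and fp64, for example, no workspace is needed, so the version that takes a workspace does not have to be called. For a type that does need one, such as fp16, the version with a workspace is called automatically.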
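To make the dispatch concrete, here is a minimal host-side sketch of the idea, not the PR's actual code. `DType`, `KernelVariant`, `required_workspace_size`, and `select_kernel` are hypothetical names, and the sizing rule (one fp32 accumulator per output for fp16) is an assumption for illustration; the thread only says that fp16 needs a workspace while fp32 and fp64 do not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical data-type tag; the PR's real descriptor layout is not shown here.
enum class DType { F16, F32, F64 };

// Which (hypothetical) kernel variant the launcher would pick.
enum class KernelVariant { Direct, WithWorkspace };

// Assumed sizing rule, for illustration only: fp16 uses one fp32 accumulator
// per output element, while fp32 and fp64 need no scratch space.
inline uint64_t required_workspace_size(DType dtype, uint64_t num_outputs) {
    return dtype == DType::F16 ? num_outputs * sizeof(float) : 0;
}

// Dispatch on the size reported by the query above: a size of zero means the
// data type needs no scratch buffer, so the plain variant is enough.
inline KernelVariant select_kernel(void *workspace, uint64_t workspace_size) {
    if (workspace_size == 0) {
        return KernelVariant::Direct;
    }
    assert(workspace != nullptr && "a nonzero workspace_size needs a buffer");
    return KernelVariant::WithWorkspace;
}
```

Under these assumptions, the caller queries `required_workspace_size` once, allocates that many bytes (possibly zero), and passes both values to the launcher, so `workspace_size` acts as a proxy for the data type rather than as an independent tuning knob.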
… remove some test cases, etc.
No description provided.