@Ziminli Ziminli commented Oct 28, 2024

No description provided.

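@Ziminli Ziminli added the labels Module: Operator (add a new operator), Progress: In Progress (still under development, do not merge), and Category: Development (new feature development) on Oct 28, 2024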
@Ziminli Ziminli self-assigned this Oct 28, 2024
@Ziminli Ziminli changed the title from "fp16 and fp32 support for global avg pool (initial commit)" to "Add Global Average Pool" on Oct 28, 2024
@Ziminli Ziminli force-pushed the add_global_avg_pool branch from f797ad3 to fcdc5f5 on October 28, 2024 05:17
}

template<typename Tdata, typename TIdata, typename Ldata, typename LIdata>
void launch_global_avg_pool_folding(GlobalAvgPoolCudaDescriptor_t desc, void *y, void const *x, void *workspace, uint64_t workspace_size, void *stream, unsigned pack_size) {
Collaborator

Why is the kernel chosen based on workspace_size here? Isn't workspace_size a requirement that the kernel itself reports?

Collaborator Author

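Yes. This is an optimization based on whether a workspace is needed. The workspace requirement varies with the data type: for fp32 and fp64, for example, no workspace is actually needed, so the workspace version does not have to be called. If a workspace is needed, as with fp16, the workspace version is called automatically.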
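
The dispatch described in this reply might look roughly like the host-side sketch below. It is not the code from this PR: `Half`, `required_workspace_size`, `KernelPath`, and `select_path` are hypothetical names, and the assumption that the fp16 workspace holds one `float` accumulator per output element is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a 16-bit float type such as CUDA's half.
struct Half { std::uint16_t bits; };

// Hypothetical query: bytes of scratch space the operator asks for.
// Assumption: fp16 accumulates in float, one accumulator per output element;
// fp32 and fp64 accumulate in their own type and need no scratch space.
template <typename Tdata>
std::uint64_t required_workspace_size(std::uint64_t y_elems) {
    if (std::is_same<Tdata, Half>::value) {
        return y_elems * sizeof(float);
    }
    return 0;
}

enum class KernelPath { NoWorkspace, WithWorkspace };

// Hypothetical dispatcher: the workspace size reported above decides
// which kernel variant the launcher would call.
inline KernelPath select_path(void *workspace, std::uint64_t workspace_size) {
    if (workspace_size == 0) {
        return KernelPath::NoWorkspace;
    }
    // A non-zero size only makes sense with a valid buffer behind it.
    assert(workspace != nullptr);
    return KernelPath::WithWorkspace;
}
```

Under these assumptions, a caller would pass the value returned by `required_workspace_size<Tdata>` straight to `select_path`, so fp32 and fp64 inputs take the `NoWorkspace` path and fp16 inputs take the `WithWorkspace` path without any extra flag.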

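@Ziminli Ziminli added the label Progress: Done (development complete, awaiting review and merge) and removed the label Progress: In Progress (still under development, do not merge) on Nov 6, 2024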
@PanZezhong1725 PanZezhong1725 merged commit fdbf030 into dev Nov 6, 2024
1 check passed
@PanZezhong1725 PanZezhong1725 deleted the add_global_avg_pool branch November 6, 2024 08:46