Using a GPU algorithm selection file to speed up model initialization: a corner case is not accelerated #55

Open
chillingche opened this issue Jul 20, 2021 · 1 comment

Comments

@chillingche

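The GPU algorithm file contains algorithmMap and kernelThreadMap. When a model contains only simple OPs (eltwise, power, etc.), there is no need to search over parameters such as tiling, so algorithmMap is empty, while kernelThreadMap still holds the local search results for these OPs.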

This gives rise to a corner case: algorithmMap.size() == 0 && kernelThreadMap.size() > 0

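In this case void saveMapToFile() hits a bug, and the local search results for such a model are not saved to the algorithm file. As a result, the next time the model is initialized, the local search has to be repeated even though the algorithm file is linked, and the model's first execution is very slow. The symptom is a very noticeable difference in execution time between -w 0 and -w 1.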

@yunfanxiao
Contributor

Thanks for the feedback. This problem does exist. You can try removing the if (targetMap.size() > 0) check at line 377 of common/uni/include/algorithm_map.h.
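
For readers without the source at hand, here is a minimal sketch of how a size guard of this kind could drop the kernelThreadMap entries. The actual body of saveMapToFile() is not shown in this thread, so everything below is an assumption for illustration: the helper names writeMap, serializeGuarded and serializeUnguarded, the text layout, and the sample key are all hypothetical and not taken from the project.

```cpp
#include <map>
#include <sstream>
#include <string>

using StringMap = std::map<std::string, std::string>;

// Hypothetical layout: entry count on one line, then one "key value" line per entry.
static void writeMap(std::ostringstream &out, const StringMap &m) {
    out << m.size() << '\n';
    for (const auto &kv : m) {
        out << kv.first << ' ' << kv.second << '\n';
    }
}

// Assumed shape of the bug: a guard on the first map also skips the second one,
// so an empty algorithmMap means nothing at all is written.
std::string serializeGuarded(const StringMap &algorithmMap, const StringMap &kernelThreadMap) {
    std::ostringstream out;
    if (algorithmMap.size() > 0) {
        writeMap(out, algorithmMap);
        writeMap(out, kernelThreadMap);
    }
    return out.str();
}

// With the guard removed, both maps are always written; an empty map is
// simply recorded as a count of 0.
std::string serializeUnguarded(const StringMap &algorithmMap, const StringMap &kernelThreadMap) {
    std::ostringstream out;
    writeMap(out, algorithmMap);
    writeMap(out, kernelThreadMap);
    return out.str();
}
```

In the corner case above (empty algorithmMap, non-empty kernelThreadMap), the guarded variant produces an empty string, while the unguarded one still records the local search results. Whether removing the check in the real header is safe for readers of the file depends on how the loading side handles a zero count, which this sketch does not cover.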
