2.3.6 code: results after adjusting batch_size #20
-
Why is it fastest at 64? Does that mean the L1 cache is only 0.25 KB (64 x 4 = 256 B = 0.25 KB)?
-
https://tvm.apache.org/docs/how_to/optimize_operators/opt_gemm.html#blocking
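Setting it to 32 gives a block of 32 * 32 * 4 = 4 KB (a 32 x 32 tile of 4-byte elements), so the footprint grows with the square of the block size, not linearly.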
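To make the arithmetic concrete, here is a minimal sketch that prints the footprint of one square tile for a few block sizes. It assumes 4-byte (float32) elements and a square tile, as in the calculation above; `tile_bytes` is a hypothetical helper name, not something from the book's code or from TVM.

```python
def tile_bytes(block_size: int, elem_bytes: int = 4) -> int:
    """Bytes occupied by one square block_size x block_size tile.

    Assumption: elements are 4-byte floats unless elem_bytes says otherwise.
    """
    return block_size * block_size * elem_bytes


if __name__ == "__main__":
    # Print the tile footprint in KB for several candidate block sizes.
    for bs in (8, 16, 32, 64, 128):
        print(f"block_size={bs:>3}: {tile_bytes(bs) / 1024:g} KB")
```

By this estimate a block size of 64 corresponds to a 16 KB tile rather than 0.25 KB. Whether that fits in L1 depends on the cache size of the machine the experiment ran on, which is not stated in this thread.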
-
Just uploading my experimental results; the granularity may not be fine enough.