Question about the results of running 07/03_prefetch/06 #31

Open
rickif opened this issue Jun 9, 2024 · 2 comments

Comments


rickif commented Jun 9, 2024

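Hi, Teacher Peng (小彭老师). I have some questions about the results of running the 07/03_prefetch/06 example and would appreciate any corrections.
My platform is an Intel i5-13500, Ubuntu 24.04, gcc version 13.2.0.
When I run the 07/03_prefetch/06 example,
I only get results similar to those shown in the course after removing #pragma omp parallel for from the example. I am not sure whether #pragma omp parallel for applies any optimization other than parallelization.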

Results of the original version

As the results show, BM_write_stream_then_read and BM_write_streamed take about the same time, so the read does not seem to affect the stream instruction.

-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_read                        25228152 ns     18180668 ns           38
BM_write                       32696238 ns     25309548 ns           33
BM_write_streamed              19530899 ns     17132181 ns           36
BM_write_stream_then_read      19586335 ns     17525509 ns           43
BM_write_streamed_ps           19550735 ns     14485110 ns           39
BM_write_streamed_ps_skipped   37094026 ns     26238143 ns           26
BM_read_and_write              36829027 ns     33520956 ns           22


Results with #pragma omp parallel for removed

As the results show, BM_write_stream_then_read takes significantly longer than BM_write_streamed.

-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_read                        38213301 ns     38207623 ns           19
BM_write                       52209723 ns     52203705 ns           13
BM_write_streamed              34738316 ns     34735390 ns           20
BM_write_stream_then_read      40930259 ns     40927256 ns           17
BM_write_streamed_ps           17725541 ns     17724305 ns           36
BM_write_streamed_ps_skipped   36891533 ns     36889477 ns           19
BM_read_and_write              44972351 ns     44969916 ns           12
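
To put numbers on "about the same" versus "significantly longer", the ratio of the two wall-clock times can be computed directly from the Time column of the two tables. This is only a small sketch: the `Run` struct, the `slowdown` helper, and the constant names are invented here for illustration and are not part of the course code.

```cpp
#include <cassert>

// Wall-clock times in nanoseconds, copied from the Time column
// of the two benchmark tables in this issue.
struct Run {
    const char *label;                 // which build the numbers come from
    double write_streamed_ns;          // BM_write_streamed
    double write_stream_then_read_ns;  // BM_write_stream_then_read
};

// Hypothetical names; the values are the ones reported above.
const Run kWithOmp = {"with #pragma omp parallel for", 19530899.0, 19586335.0};
const Run kWithoutOmp = {"without #pragma omp parallel for", 34738316.0, 40930259.0};

// Ratio > 1 means adding the read made the streamed write slower.
double slowdown(const Run &r) {
    assert(r.write_streamed_ns > 0.0);
    return r.write_stream_then_read_ns / r.write_streamed_ns;
}
```

For the figures reported here, `slowdown(kWithOmp)` comes out at roughly 1.003 and `slowdown(kWithoutOmp)` at roughly 1.178, which matches the description: almost no difference with the pragma, and about 18% slower without it.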


Collaborator

archibate commented Jun 9, 2024 via email

Author

rickif commented Jun 9, 2024

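As the results show, BM_write_stream_then_read and BM_write_streamed take about the same time, so the read does not seem to affect the stream instruction.

  1. In the results of the original version, BM_write_stream_then_read and BM_write_streamed take about the same time, so the read does not seem to affect the stream instruction.
  2. In the results of the version with omp parallel for removed, BM_write_stream_then_read takes significantly longer than BM_write_streamed.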
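
For readers without the course repository at hand, the two kernels being compared can be sketched roughly as below. This is not the actual 07/03_prefetch/06 source, which is not quoted in this issue: the function names only mirror the benchmark names, the use of `_mm_stream_si32` is an assumption about what "stream instruction" refers to, and `stream_store` and `check_filled` are hypothetical helpers. On targets without SSE2 the sketch falls back to an ordinary store so that it still compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // _mm_stream_si32, _mm_sfence
#endif

// Hypothetical helper: store one int with a non-temporal (cache-bypassing)
// store where SSE2 is available, otherwise with an ordinary store.
inline void stream_store(int *p, int v) {
#if defined(__SSE2__)
    _mm_stream_si32(p, v);
#else
    *p = v;
#endif
}

// Write-only kernel: every element is written with a streaming store.
void write_streamed(std::vector<int> &a) {
    // #pragma omp parallel for   // the line toggled in this issue
    for (std::size_t i = 0; i < a.size(); ++i)
        stream_store(&a[i], 1);
#if defined(__SSE2__)
    _mm_sfence();  // make the streamed data globally visible
#endif
}

// Same stores, but each iteration also reads the element it just wrote.
// Returning the sum keeps the reads from being optimized away.
long long write_stream_then_read(std::vector<int> &a) {
    long long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        stream_store(&a[i], 1);
        sum += a[i];
    }
#if defined(__SSE2__)
    _mm_sfence();
#endif
    return sum;
}

// Sanity check that every store landed.
void check_filled(const std::vector<int> &a) {
    for (int v : a)
        assert(v == 1);
}
```

In a real Google Benchmark case the read would more likely be kept alive with `benchmark::DoNotOptimize` than by returning a sum. To separate the effect of parallelism from any other effect of the pragma, one option is to keep the pragma and run with the standard OpenMP environment variable `OMP_NUM_THREADS=1`, then compare against the build with the pragma removed.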
