Question about the results of running 07/03_prefetch/06 #31
Comments
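Similar in what way? Similarity in absolute numbers is meaningless; what matters is the ratios. I am on a 9700, which is far slower than your machine, so it is perfectly normal that you have to turn off parallelism before your numbers match mine.
Not being able to breathe deeply and freely is the best proof of being alive
--- Original message ---
From: ***@***.***>
Sent: Sunday, June 9, 2024, 8:09 PM
To: ***@***.***>;
Cc: ***@***.***>;
Subject: [parallel101/course] Question about the results of running 07/03_prefetch/06 (Issue #31)
Hi, 小彭老师. I have some questions about the results of running the 07/03_prefetch/06 example and would appreciate your corrections.
My platform is Intel i5-13500, Ubuntu 24.04, gcc version 13.2.0
When I run the 07/03_prefetch/06 example,
I only get results similar to those shown in the course after removing the #pragma omp parallel for from the example. I am not sure whether #pragma omp parallel for applies any optimization other than parallelization.
Results of the original version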
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_read                        25228152 ns     18180668 ns           38
BM_write                       32696238 ns     25309548 ns           33
BM_write_streamed              19530899 ns     17132181 ns           36
BM_write_stream_then_read      19586335 ns     17525509 ns           43
BM_write_streamed_ps           19550735 ns     14485110 ns           39
BM_write_streamed_ps_skipped   37094026 ns     26238143 ns           26
BM_read_and_write              36829027 ns     33520956 ns           22
Results with #pragma omp parallel for removed
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_read                        38213301 ns     38207623 ns           19
BM_write                       52209723 ns     52203705 ns           13
BM_write_streamed              34738316 ns     34735390 ns           20
BM_write_stream_then_read      40930259 ns     40927256 ns           17
BM_write_streamed_ps           17725541 ns     17724305 ns           36
BM_write_streamed_ps_skipped   36891533 ns     36889477 ns           19
BM_read_and_write              44972351 ns     44969916 ns           12
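Results of the original version
From these results, BM_write_stream_then_read and BM_write_streamed take about the same time, so the read does not seem to affect the stream instruction.
Results with #pragma omp parallel for removed
From these results, BM_write_stream_then_read takes significantly longer than BM_write_streamed.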
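Since the comparison is about ratios rather than absolute times, here is a minimal C++ sketch that divides the BM_write_stream_then_read time by the BM_write_streamed time for each of the two runs. The helper `time_ratio` and the constant names are made up for illustration and are not part of the course code; the numbers are the Time column values reported in this issue.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the course repository):
// ratio of two benchmark times given in nanoseconds.
inline double time_ratio(double numerator_ns, double denominator_ns) {
    assert(denominator_ns > 0.0);
    return numerator_ns / denominator_ns;
}

// "Time" column values (ns) as reported in this issue.
// Run with #pragma omp parallel for kept:
constexpr double kOmpStreamed = 19530899.0;         // BM_write_streamed
constexpr double kOmpStreamThenRead = 19586335.0;   // BM_write_stream_then_read
// Run with #pragma omp parallel for removed:
constexpr double kSerialStreamed = 34738316.0;        // BM_write_streamed
constexpr double kSerialStreamThenRead = 40930259.0;  // BM_write_stream_then_read

// Print how much slower "stream then read" is than "streamed" in each run.
inline void print_ratios() {
    std::printf("with omp parallel for:    %.3f\n",
                time_ratio(kOmpStreamThenRead, kOmpStreamed));
    std::printf("without omp parallel for: %.3f\n",
                time_ratio(kSerialStreamThenRead, kSerialStreamed));
}
```

Calling `print_ratios()` gives a ratio of about 1.003 for the run with the pragma and about 1.178 for the run without it, which puts the two observations above into numbers.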