timings in cosma_miniapp #106

Closed
airmler opened this issue Jul 1, 2022 · 2 comments

airmler commented Jul 1, 2022

I am running the COSMA miniapp on a 72-core Xeon machine with the following parameters:
$parallel_cosma -m 8688 -n 8688 -k 8688 -r 3
The last line of the stdout reads:
COSMA TIMES [ms] = 458 460 771

I am curious about the large spread between the fastest and the slowest multiplication. The fastest number corresponds to about 40 GFLOP/s per core, which is a good number for this machine. The slowest number would imply only about 24 GFLOP/s per core.
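For reference, the per-core figures can be reproduced from the reported times. This is a minimal sketch, assuming the usual 2·m·n·k flop count for a dense real matrix multiplication and that all 72 cores take part; the function name `gflops_per_core` is illustrative and not part of COSMA.

```python
def gflops_per_core(m, n, k, time_ms, cores):
    """Per-core GFLOP/s for one m x n x k multiplication.

    Assumes 2*m*n*k floating-point operations (one multiply and one add
    per inner-product term) and that every core contributes equally.
    """
    flops = 2.0 * m * n * k
    seconds = time_ms / 1000.0
    return flops / seconds / cores / 1e9


# Times taken from the "COSMA TIMES [ms]" line quoted above.
for t in (458, 460, 771):
    rate = gflops_per_core(8688, 8688, 8688, t, 72)
    print(f"{t} ms -> {rate:.1f} GFLOP/s per core")
```

Under these assumptions the fastest run works out to about 39.8 GFLOP/s per core and the slowest to about 23.6.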

Am I right that there is a 300 ms overhead for finding the optimal "parallelization strategy"? Which of the two numbers would be fair to compare with other libraries such as ScaLAPACK?

I am aware that this is a very extreme example. But a spread of 10-20% between fastest and slowest number is very typical.

Collaborator

rasolca commented Jul 1, 2022

> Am I right that there is a 300 ms overhead finding the optimal "parallelization strategy"?

No, the overhead is very likely due to library initializations during the first run in the miniapp.
Multithreaded MKL is usually the library that introduces the most overhead, as it has to initialize the OpenMP environment and allocate some memory during the first library calls. On certain systems, MPI also introduces some overhead during the first communications.
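One common way to keep such first-call costs out of a measurement is to run the kernel once untimed before the timed repetitions. The sketch below illustrates that pattern in general; it is not how the COSMA miniapp is implemented, and `time_runs` and the stand-in workload are hypothetical names used only for illustration.

```python
import time


def time_runs(fn, repeats=3, warmup=1):
    """Call fn() `warmup` times untimed, then return `repeats` wall-clock times in ms."""
    for _ in range(warmup):
        fn()  # absorbs one-time costs such as thread-pool or library start-up
    times_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return times_ms


def workload():
    # Stand-in for the real multiplication; any callable works here.
    return sum(i * i for i in range(100_000))


print(time_runs(workload, repeats=3, warmup=1))
```

With a warm-up call in place, the remaining spread between repetitions reflects run-to-run noise rather than initialization.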

Author

airmler commented Jul 1, 2022

Thanks for the fast clarification.
I conclude that the correct approach is to neglect the slowest number.
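When comparing against other libraries, the reported times can be reduced to statistics that are insensitive to a single slow run, such as the minimum or the median. A minimal sketch, with `summarize` as a hypothetical helper name:

```python
import statistics


def summarize(times_ms):
    """Return the fastest time, the median, and the relative spread (max - min) / min."""
    fastest = min(times_ms)
    slowest = max(times_ms)
    return {
        "min_ms": fastest,
        "median_ms": statistics.median(times_ms),
        "spread": (slowest - fastest) / fastest,
    }


# Times from the "COSMA TIMES [ms]" line in the original report.
print(summarize([458, 460, 771]))
```

For the three times in the report, both the minimum and the median ignore the 771 ms outlier.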

airmler closed this as completed Jul 1, 2022