[OpenMP] Use half of available logical processors for collapse tests #88319

Merged: 6 commits, Apr 19, 2024

@xingxue-ibm xingxue-ibm commented Apr 10, 2024

The new collapse test cases define MAX_THREADS as 256 and use all available threads/logical processors on the system. This triples the testing time on an AIX machine that has 128 logical processors. This patch changes the tests to use half of the available logical processors to avoid oversubscription, because other libomp tests run at the same time, including 2 other such collapse tests.

Slowest Tests:
--------------------------------------------------------------------------
971.92s: libomp :: worksharing/for/omp_collapse_many_LTLEGE_int.c
971.62s: libomp :: worksharing/for/omp_collapse_many_GTGEGT_int.c
965.29s: libomp :: worksharing/for/omp_collapse_many_GELTGT_int.c
332.97s: libomp :: tasking/omp_taskloop_num_tasks.c
233.26s: libomp :: worksharing/single/omp_single.c
207.42s: libomp :: tasking/omp_taskloop_grainsize.c
178.41s: libomp :: tasking/hidden_helper_task/depend.cpp
172.02s: libomp :: worksharing/for/omp_collapse_many_int.c
....

@xingxue-ibm xingxue-ibm added the openmp:libomp OpenMP host runtime label Apr 10, 2024
@xingxue-ibm xingxue-ibm self-assigned this Apr 10, 2024
@jprotze

jprotze commented Apr 11, 2024

The test should not use more than 128 threads on your system (the default value returned by omp_get_max_threads is the number of available cores). Since we usually run multiple tests in parallel, I think it makes sense to modify the test to never use more than half of the available cores:

-  unsigned num_threads = omp_get_max_threads();
+  unsigned num_threads = omp_get_max_threads()/2;
  if (num_threads > MAX_THREADS)
    num_threads = MAX_THREADS;
  omp_set_num_threads(num_threads);

Furthermore, I suggest moving the omp_get_thread_num call out of the loop:

#pragma omp parallel num_threads(num_threads)
  {
+    unsigned gtid = omp_get_thread_num();
#pragma omp for collapse(3) private(i, j, k)
    LOOP {
      unsigned count;
-      unsigned gtid = omp_get_thread_num();
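
A minimal sketch of the hoisting suggested above, not the code from collapse_test.inc: the thread id is queried once per thread inside the parallel region and then reused in every iteration of the collapsed loop nest. The helper name count_collapsed_iterations, its parameters, and the per_thread counters are assumptions made for illustration. The fallback for builds without OpenMP is also an assumption, so the sketch compiles either way.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not from the patch): counts how many iterations of a
   collapsed n*n*n loop nest each thread executes and returns the total.
   per_thread must have at least num_threads zero-initialized entries. */
unsigned long count_collapsed_iterations(int n, int num_threads,
                                         unsigned long *per_thread) {
  assert(n >= 0 && num_threads > 0 && per_thread != 0);
#pragma omp parallel num_threads(num_threads)
  {
    /* Queried once per thread, outside the worksharing loop. */
#ifdef _OPENMP
    unsigned gtid = (unsigned)omp_get_thread_num();
#else
    unsigned gtid = 0; /* serial build: a single thread with id 0 */
#endif
#pragma omp for collapse(3)
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
        for (int k = 0; k < n; k++)
          per_thread[gtid]++; /* each thread updates only its own slot */
  }
  /* Sum the per-thread counters after the parallel region has ended. */
  unsigned long total = 0;
  for (int t = 0; t < num_threads; t++)
    total += per_thread[t];
  return total;
}
```

Called with n = 5 and a zeroed array of four counters, the helper should return 125 regardless of how the runtime distributes the iterations, because every iteration increments exactly one counter.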

@xingxue-ibm

Good point, thanks @jprotze! It makes sense not to use more than half of the available logical processors to avoid over-subscribing, since other libomp test cases run in parallel, including 2 other collapse test cases that use collapse_test.inc. Changed as per your suggestion. I also removed omp_set_num_threads(num_threads); because it would make the call to omp_get_max_threads() in the next iteration return half of the original value.
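
The interaction described above can be sketched as follows. This is an illustration under stated assumptions, not the test code: threads_in_last_round, its parameters, and the write_back switch are hypothetical names. When the file is built without OpenMP, the two omp_* functions are replaced by a small stand-in that only models one stored value being written by omp_set_num_threads() and read back by omp_get_max_threads().

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Stand-in for builds without OpenMP (an assumption for illustration only):
   models the single value that omp_set_num_threads() writes and
   omp_get_max_threads() reads. */
static int nthreads_model = 1;
static void omp_set_num_threads(int n) { nthreads_model = n; }
static int omp_get_max_threads(void) { return nthreads_model; }
#endif

/* Hypothetical helper: starts from `initial` threads, runs `rounds` test
   iterations that each take half of omp_get_max_threads(), and returns the
   thread count chosen in the last round. When write_back is nonzero, each
   round also stores its choice with omp_set_num_threads(). */
int threads_in_last_round(int initial, int rounds, int write_back) {
  assert(initial > 0 && rounds >= 0);
  int num_threads = initial;
  omp_set_num_threads(initial);
  for (int r = 0; r < rounds; r++) {
    num_threads = omp_get_max_threads() / 2;
    if (num_threads < 1)
      num_threads = 1; /* never ask for zero threads */
    if (write_back)
      omp_set_num_threads(num_threads); /* shrinks the next query */
  }
  return num_threads;
}
```

Starting from 8 threads and running three rounds, the write-back variant is expected to shrink to 4, 2, and then 1 thread, while the variant without the write-back keeps choosing 4 in every round. That is the effect the removal of omp_set_num_threads(num_threads) avoids.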

@jprotze

jprotze commented Apr 11, 2024

Did you push the changes?

@xingxue-ibm
Having a problem pushing the changes. Working on it...

@xingxue-ibm
Changes are pushed.

@xingxue-ibm xingxue-ibm changed the title from "[OpenMP][AIX] lower max threads to reduce collapse testing time" to "[OpenMP] Use half of available logical processors for collapse tests" Apr 11, 2024
* Use half of available threads/logical processors to avoid over-subscribing
* Fix the PRINTF macro to get rid of warnings
* add free()
* make sure num_threads is not 0
* make chunkSizesOpenmp type 'unsigned long'
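
The thread-count selection these commits describe (half of the available logical processors, never 0, never above MAX_THREADS) might look like the sketch below. It is not the merged code: pick_num_threads is a hypothetical helper, and its max_threads argument stands in for the value that omp_get_max_threads() would return, so the logic can be checked without an OpenMP runtime.

```c
#include <assert.h>

#define MAX_THREADS 256 /* the cap the collapse tests are described as using */

/* Hypothetical helper: max_threads stands in for the value returned by
   omp_get_max_threads(). Returns half of it, clamped to [1, MAX_THREADS]. */
unsigned pick_num_threads(int max_threads) {
  unsigned num_threads = max_threads > 0 ? (unsigned)max_threads / 2 : 0;
  if (num_threads == 0)
    num_threads = 1; /* a single logical processor would otherwise give 0 */
  if (num_threads > MAX_THREADS)
    num_threads = MAX_THREADS;
  assert(num_threads >= 1 && num_threads <= MAX_THREADS);
  return num_threads;
}
```

On the 128-processor AIX machine mentioned in the description, this would select 64 threads. A single-processor system would still get 1 thread instead of 0.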

@jprotze jprotze left a comment

Lgtm now

@xingxue-ibm xingxue-ibm merged commit 0a8cd1e into llvm:main Apr 19, 2024
4 checks passed
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 21, 2024
…lvm#88319)

@xingxue-ibm xingxue-ibm deleted the collapse-test branch May 24, 2024 21:16