Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flang][OpenMP][Driver][Test] Fix the omp-driver-offload-test commands #66926

Merged
merged 1 commit into from
Sep 26, 2023

Conversation

agozillon
Copy link
Contributor

Since the changes to -### to correctly return error codes in D156363, some additions of the not lit keyword and perhaps some changes to how fopenmp-targets is handled; the test omp-driver-offload.f90 has become a little susceptible to what appears to be sporadic failures.

This was because if certain environment variables are set for AMD devices some of the commands succeed happily and the not check causes the test to fail. Whereas, if they were not set the tests would fail and the not test would succeed, allowing the test to pass provided you were not targeting AMD hardware while running the test.

When the environment variables for AMD hardware are set the compiler could find the correct offload architecture to compile for, allowing the command to succeed. However, when not set it'd fail as it can't detect the offload architecture. The solution to this is to specify an AMD architecture via --offload-arch so it always passes, the architecture doesn't really matter, I've chosen to select gfx90a.

Another command was just slightly incorrect, although it emitted the expected output, it'd return an error code as it was trying to emit multiple outputs to one file (had to remove the -o).

@llvmbot llvmbot added flang:driver flang Flang issues not falling into any other category labels Sep 20, 2023
@agozillon agozillon added flang:openmp flang:driver flang Flang issues not falling into any other category and removed flang:driver flang Flang issues not falling into any other category labels Sep 20, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 20, 2023

@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-flang-driver

Changes

Since the changes to -### to correctly return error codes in D156363, some additions of the not lit keyword and perhaps some changes to how fopenmp-targets is handled; the test omp-driver-offload.f90 has become a little susceptible to what appears to be sporadic failures.

This was because if certain environment variables are set for AMD devices some of the commands succeed happily and the not check causes the test to fail. Whereas, if they were not set the tests would fail and the not test would succeed, allowing the test to pass provided you were not targeting AMD hardware while running the test.

When the environment variables for AMD hardware are set the compiler could find the correct offload architecture to compile for, allowing the command to succeed. However, when not set it'd fail as it can't detect the offload architecture. The solution to this is to specify an AMD architecture via --offload-arch so it always passes, the architecture doesn't really matter, I've chosen to select gfx90a.

Another command was just slightly incorrect, although it emitted the expected output, it'd return an error code as it was trying to emit multiple outputs to one file (had to remove the -o).


Full diff: https://github.com/llvm/llvm-project/pull/66926.diff

1 Files Affected:

  • (modified) flang/test/Driver/omp-driver-offload.f90 (+31-19)
diff --git a/flang/test/Driver/omp-driver-offload.f90 b/flang/test/Driver/omp-driver-offload.f90
index 74ed0f9006e81c3..bfdc3f6f4d4726b 100644
--- a/flang/test/Driver/omp-driver-offload.f90
+++ b/flang/test/Driver/omp-driver-offload.f90
@@ -1,6 +1,10 @@
 ! Test that flang-new OpenMP and OpenMP offload related 
 ! commands forward or expand to the appropriate commands 
-! for flang-new -fc1 as expected.
+! for flang-new -fc1 as expected. Assumes a gfx90a, aarch64,
+! and sm_70 architecture, but doesn't require one to be 
+! installed or compiled for, just testing the appropriate 
+! generation of jobs are created with the correct 
+! corresponding arguments.
 
 ! Test regular -fopenmp with no offload
 ! RUN: %flang -### -fopenmp %s 2>&1 | FileCheck --check-prefixes=CHECK-OPENMP %s
@@ -33,7 +37,7 @@
 ! OFFLOAD-HOST-NOT: "-triple" "nvptx64-nvidia-cuda"
 ! OFFLOAD-HOST-NOT: "{{[^"]*}}flang-new" "-fc1" "-triple" "aarch64-unknown-linux-gnu"
 
-! RUN: not %flang -S -### %s -o %t 2>&1 \
+! RUN: %flang -S -### %s 2>&1 \
 ! RUN: -fopenmp --offload-arch=gfx90a --offload-arch=sm_70 --offload-device-only \
 ! RUN: --target=aarch64-unknown-linux-gnu \
 ! RUN:   | FileCheck %s --check-prefix=OFFLOAD-DEVICE
@@ -44,7 +48,7 @@
 ! OFFLOAD-DEVICE-NOT: "{{[^"]*}}flang-new" "-fc1" "-triple" "aarch64-unknown-linux-gnu"
 
 ! Test regular -fopenmp with offload for basic fopenmp-is-target-device flag addition and correct fopenmp 
-! RUN: not %flang -### -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa %s 2>&1 | FileCheck --check-prefixes=CHECK-OPENMP-IS-TARGET-DEVICE %s
+! RUN: %flang -### -fopenmp --offload-arch=gfx90a -fopenmp-targets=amdgcn-amd-amdhsa %s 2>&1 | FileCheck --check-prefixes=CHECK-OPENMP-IS-TARGET-DEVICE %s
 ! CHECK-OPENMP-IS-TARGET-DEVICE: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" {{.*}}.f90"
 
 ! Testing appropriate flags are gnerated and appropriately assigned by the driver when offloading
@@ -58,44 +62,51 @@
 ! OPENMP-OFFLOAD-ARGS-NEXT: "{{[^"]*}}flang-new" "-fc1" "-triple" "aarch64-unknown-linux-gnu" {{.*}} "-fopenmp" {{.*}} "-fembed-offload-object={{.*}}.out" {{.*}}.bc"
 
 ! Test -fopenmp with offload for RTL Flag Options
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-assume-threads-oversubscription \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-THREADS-OVS
 ! CHECK-THREADS-OVS: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" "-fopenmp-assume-threads-oversubscription" {{.*}}.f90"
 
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-assume-teams-oversubscription  \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-TEAMS-OVS
 ! CHECK-TEAMS-OVS: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" "-fopenmp-assume-teams-oversubscription" {{.*}}.f90"
 
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-assume-no-nested-parallelism  \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-NEST-PAR
 ! CHECK-NEST-PAR: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" "-fopenmp-assume-no-nested-parallelism" {{.*}}.f90"
 
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-assume-no-thread-state \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-THREAD-STATE
 ! CHECK-THREAD-STATE: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" "-fopenmp-assume-no-thread-state" {{.*}}.f90"
 
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-target-debug \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-TARGET-DEBUG
 ! CHECK-TARGET-DEBUG: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" "-fopenmp-target-debug" {{.*}}.f90"
 
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-target-debug \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-TARGET-DEBUG
 ! CHECK-TARGET-DEBUG-EQ: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" {{.*}} "-fopenmp-is-target-device" "-fopenmp-target-debug=111" {{.*}}.f90"
 
-! RUN: not %flang -S -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -S -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-target-debug -fopenmp-assume-threads-oversubscription \
 ! RUN: -fopenmp-assume-teams-oversubscription -fopenmp-assume-no-nested-parallelism \
 ! RUN: -fopenmp-assume-no-thread-state \
@@ -104,8 +115,9 @@
 ! CHECK-RTL-ALL: "-fopenmp-assume-threads-oversubscription" "-fopenmp-assume-no-thread-state" "-fopenmp-assume-no-nested-parallelism"
 ! CHECK-RTL-ALL: {{.*}}.f90"
 
-! RUN: not %flang -### %s -o %t 2>&1 \
-! RUN: -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \
+! RUN: %flang -### %s -o %t 2>&1 \
+! RUN: -fopenmp --offload-arch=gfx90a \
+! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \
 ! RUN: -fopenmp-version=45 \
 ! RUN: | FileCheck %s --check-prefixes=CHECK-OPENMP-VERSION
 ! CHECK-OPENMP-VERSION: "{{[^"]*}}flang-new" "-fc1" {{.*}} "-fopenmp" "-fopenmp-version=45" {{.*}}.f90"

Copy link
Contributor

@banach-space banach-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending this! TBH, it's hard to follow without knowing/understanding the semantics of the relevant options. Looking at:

! RUN: %flang -### %s -o %t 2>&1 \
! RUN: -fopenmp --offload-arch=gfx90a \
! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \

what is significant about --offload-arch=gfx90a that makes the driver not fail? IIUC, that's the main thing that's changing here.

@@ -33,7 +37,7 @@
! OFFLOAD-HOST-NOT: "-triple" "nvptx64-nvidia-cuda"
! OFFLOAD-HOST-NOT: "{{[^"]*}}flang-new" "-fc1" "-triple" "aarch64-unknown-linux-gnu"

! RUN: not %flang -S -### %s -o %t 2>&1 \
! RUN: %flang -S -### %s 2>&1 \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move to a dedicate patch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do, but this is still sort of the same set of changes trying to fix the test back to a state of normalcy, just slightly differently to the rest. In this case it's failing because the command expects to output multiple inputs rather than a single so the -o segment of the command breaks it and the not is then required. Although, we can leave it as an incorrect command and it'd still be able to check the correct results and yield positive! This change mostly just allows us to remove the not from the front allowing all not's to be removed from the file, making understanding of the test a little easier.

So do please let me know if you still wish it in a seperate PR and I'll remove it and move it to a new one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do please let me know if you still wish it in a seperate PR and I'll remove it and move it to a new one.

Yes, please. The rationale for this particular change is straightforward and unrelated to the other changes.

Copy link
Contributor Author

@agozillon agozillon Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, I'll do that soon!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New PR with the moved test is here: #67335 I'll update this patch shortly to exclude the change in the other PR.

@agozillon
Copy link
Contributor Author

agozillon commented Sep 21, 2023

Thanks for sending this! TBH, it's hard to follow without knowing/understanding the semantics of the relevant options.
Looking at:

No problem! Thank you for reviewing it. And yes it is, sorry about that, it's mostly just specifying the GPU triple and specific hardware architecture.

! RUN: %flang -### %s -o %t 2>&1 \
! RUN: -fopenmp --offload-arch=gfx90a \
! RUN: -fopenmp-targets=amdgcn-amd-amdhsa \

what is significant about --offload-arch=gfx90a that makes the driver not fail? IIUC, that's the main thing that's changing here.

I believe the amdgcn-amd-amdhsa command is more of a: "we're compiling for this generic amd gpu hardware triple,", so we don't know the specific GPU architecture just yet. Whereas defining --offload-arch=gfx90a is defining the architecture we're specifically targeting. Without the --offload-arch=gfx90a command it appears that with the right combination of AMD libraries in tow in your environment the GPU on your system can be identified (which on AMD systems like mine is a gfx of some variety) and the command will succeed. But the test has been trained in a past commit to expect failure for the return code, which is correct in most cases, unless you have the correct libraries in the environment and an AMD GPU (or other AMD offload arch) as it turns out. So it creates a scenario where we'll pass most of the time, but occasionally it'll fail when the stars align, even though the test is finding the correct information. So specifying the offload-arch just means the information is always available regardless of environment, so no chance failures.

Maybe I'm incorrect, this is just my understanding from diagnosing the test failures @jhuber6 can likely correct me if I'm wrong.

Copy link
Member

@TIFitis TIFitis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@banach-space
Copy link
Contributor

banach-space commented Sep 24, 2023

I don't want to come across as picky, but changing not %flang to %flang is significant - it basically means that in one case the end-user will see their compiler failing and in the other succeeding.

This was because if certain environment variables are set for AMD devices some of the commands succeed happily and the not check causes the test to fail.

IIUC, this test does not depend on any env variables and it should succeed regardless of the env variables? If it does depend on env variables then that should be documented. Is it? Similar note on various libraries being present or not.

As per PR title:

[Flang][OpenMP][Driver][Test] Fix the omp-driver-offload-test commands

This test works for me fine and from my perspective it is being "refined" rather than "fixed". Perhaps there's an issue that affects some folks and this change resolves that, but from your description it sounds like a "downstream" issue? Are there any public buildbots affected by this?

"fix" hints to me that there's a bug and in such cases I feel that I should understand what the issue is before I approve a patch. In this case I don't (*), but am happy to rely on your expertise and the expertise of other reviewers. The whole discussion seems to assume a good deal of background on OpenMP offloading for AMD GPUs, which I don't have.

(*) With the exception of the -o changes.

@agozillon
Copy link
Contributor Author

agozillon commented Sep 25, 2023

I don't want to come across as picky, but changing not %flang to %flang is significant - it basically means that in one case the end-user will see their compiler failing and in the other succeeding.

Not picky at all, always happy to have questions/attempt to clarify things!

The not %flang was added as a sweeping set of changes in this commit: f39c399 as what appears to be a band-aid fix, prior to it they weren't existent and they appear to have been added under the presumption that certain tests will need fixed based on the changes to -### returning error/success codes correctly (e.g. the test improvement list in the commit message, albeit there is none relevant to this test in particular). The tests passed fine all the time previously because they didn't ever fail, now they will fail on certain occasions (whenever they have no arch specified and no library present).

This was because if certain environment variables are set for AMD devices some of the commands succeed happily and the not check causes the test to fail.

IIUC, this test does not depend on any env variables and it should succeed regardless of the env variables? If it does depend on env variables then that should be documented. Is it? Similar note on various libraries being present or not.

The test itself doesn't depend on any environment variables (or at least I didn't write the initial version with that knowledge/intent), but the success or failure of the commands as they are currently written depends on having rocm's library folder in your path. Defining the architecture and defining the tests around this should avoid this behavior as we're no longer dependent on the rocm library for picking up the native architecture. Perhaps I'm incorrect though it's a part of the driver I'm unfamiliar with @jhuber6 can probably fact check me/correct me in this case.

As per PR title:

[Flang][OpenMP][Driver][Test] Fix the omp-driver-offload-test commands

This test works for me fine and from my perspective it is being "refined" rather than "fixed". Perhaps there's an issue that affects some folks and this change resolves that, but from your description it sounds like a "downstream" issue?

I'm happy with whatever classification we want to give it a refine or fix! I chose fix as the commands are failing currently, which wasn't the intention of them when they were originally added. I don't think it's a downstream issue, I believe it's an issue for anyone developing for AMD GPU in conjunction with a Flang build I think (this test fails in this manner on an unmodified llvm project), the community just happens to be quite small at the moment as it's in development (and it's been noted by @jsjodin and @TIFitis to fail arbitrarily as well, I just happened to stumble across what I believe is the reason why while fixing an unrelated patch).

Are there any public buildbots affected by this?

There are currently no public buildbots that run Flang+AMDGPU+OpenMP tests at the same time as far as I am aware, and this particular test is tied to Flang, so it's unfortunately fallen under the radar for a little while I think . the closest is this buildbot which tests AMDGPU + OpenMP with a focus on the runtime, but it doesn't have a flang build: https://lab.llvm.org/buildbot/#/builders/193/builds/39058/steps/6/logs/stdio (feels like a bit of a gap as we're beginning to add Flang+OpenMP runtime tests and I don't know who to discuss with about perhaps adding a Flang build for this to get test coverage for these, but might not be worth the extra build time at the moment).

"fix" hints to me that there's a bug and in such cases I feel that I should understand what the issue is before I approve a patch. In this case I don't (*), but am happy to rely on your expertise and the expertise of other reviewers. The whole discussion seems to assume a good deal of background on OpenMP offloading for AMD GPUs, which I don't have.

(*) With the exception of the -o changes.

That's understandable. Perhaps it's not necessarily a bug, just a not so future proof test originally written by me I believe so the test is in need of an adjustment. So perhaps fix is a bit strong, I don't mind what wording we use though, so feel free to suggest any changes to the commit message and title you wish!

…ands to be more consistent

Since the changes to -### to correctly return error codes
in D156363, some additions of the not lit keyword and
perhaps some changes to how fopenmp-targets is
handled the test omp-driver-offload.f90 has become a little
susceptible to what appears to be sporadic failures (they're not).

This was because if the rocm libraries from AMD devices are
findable in a users environment some of the commands
succeed happily and the not check causes the test to fail.
Whereas, if they were not set the tests would fail and the
not test would succeed, allowing the test to pass provided
you were not targetting AMD hardware while running the test.

When the environment variables for AMD hardware are set the
compiler could find the correct offload architecture to compile
for, allowing the command to succeed. However, when not set
it'd fail as it can't detect the offload architecture. The solution to
this is to specify an AMD architecture via --offload-arch so it always
passes, the architecture doesn't really matter, I've chosen to select
gfx90a.
@banach-space
Copy link
Contributor

I don't want to come across as picky, but changing not %flang to %flang is significant - it basically means that in one case the end-user will see their compiler failing and in the other succeeding.

Not picky at all, always happy to have questions/attempt to clarify things!

The not %flang was added as a sweeping set of changes in this commit: f39c399 as what appears to be a band-aid fix, prior to it they weren't existent and they appear to have been added under the presumption that certain tests will need fixed based on the changes to -### returning error/success codes correctly (e.g. the test improvement list in the commit message, albeit there is none relevant to this test in particular). The tests passed fine all the time previously because they didn't ever fail, now they will fail on certain occasions (whenever they have no arch specified and no library present).

This was because if certain environment variables are set for AMD devices some of the commands succeed happily and the not check causes the test to fail.

IIUC, this test does not depend on any env variables and it should succeed regardless of the env variables? If it does depend on env variables then that should be documented. Is it? Similar note on various libraries being present or not.

The test itself doesn't depend on any environment variables (or at least I didn't write the initial version with that knowledge/intent), but the success or failure of the commands as they are currently written depends on having rocm's library folder in your path. Defining the architecture and defining the tests around this should avoid this behavior as we're no longer dependent on the rocm library for picking up the native architecture. Perhaps I'm incorrect though it's a part of the driver I'm unfamiliar with @jhuber6 can probably fact check me/correct me in this case.

As per PR title:

[Flang][OpenMP][Driver][Test] Fix the omp-driver-offload-test commands

This test works for me fine and from my perspective it is being "refined" rather than "fixed". Perhaps there's an issue that affects some folks and this change resolves that, but from your description it sounds like a "downstream" issue?

I'm happy with whatever classification we want to give it a refine or fix! I chose fix as the commands are failing currently, which wasn't the intention of them when they were originally added. I don't think it's a downstream issue, I believe it's an issue for anyone developing for AMD GPU in conjunction with a Flang build I think (this test fails in this manner on an unmodified llvm project), the community just happens to be quite small at the moment as it's in development (and it's been noted by @jsjodin and @TIFitis to fail arbitrarily as well, I just happened to stumble across what I believe is the reason why while fixing an unrelated patch).

Are there any public buildbots affected by this?

There are currently no public buildbots that run Flang+AMDGPU+OpenMP tests at the same time as far as I am aware, and this particular test is tied to Flang, so it's unfortunately fallen under the radar for a little while I think . the closest is this buildbot which tests AMDGPU + OpenMP with a focus on the runtime, but it doesn't have a flang build: https://lab.llvm.org/buildbot/#/builders/193/builds/39058/steps/6/logs/stdio (feels like a bit of a gap as we're beginning to add Flang+OpenMP runtime tests and I don't know who to discuss with about perhaps adding a Flang build for this to get test coverage for these, but might not be worth the extra build time at the moment).

"fix" hints to me that there's a bug and in such cases I feel that I should understand what the issue is before I approve a patch. In this case I don't (), but am happy to rely on your expertise and the expertise of other reviewers. The whole discussion seems to assume a good deal of background on OpenMP offloading for AMD GPUs, which I don't have.
(
) With the exception of the -o changes.

That's understandable. Perhaps it's not necessarily a bug, just a not so future proof test originally written by me I believe so the test is in need of an adjustment. So perhaps fix is a bit strong, I don't mind what wording we use though, so feel free to suggest any changes to the commit message and title you wish!

Thank you for this comprehensive explanation - the additional context helps a lot. Also thank your taking the initiative to improve this - that's greatly appreciated!

LGTM

@agozillon
Copy link
Contributor Author

I don't want to come across as picky, but changing not %flang to %flang is significant - it basically means that in one case the end-user will see their compiler failing and in the other succeeding.

Not picky at all, always happy to have questions/attempt to clarify things!
The not %flang was added as a sweeping set of changes in this commit: f39c399 as what appears to be a band-aid fix, prior to it they weren't existent and they appear to have been added under the presumption that certain tests will need fixed based on the changes to -### returning error/success codes correctly (e.g. the test improvement list in the commit message, albeit there is none relevant to this test in particular). The tests passed fine all the time previously because they didn't ever fail, now they will fail on certain occasions (whenever they have no arch specified and no library present).

This was because if certain environment variables are set for AMD devices some of the commands succeed happily and the not check causes the test to fail.

IIUC, this test does not depend on any env variables and it should succeed regardless of the env variables? If it does depend on env variables then that should be documented. Is it? Similar note on various libraries being present or not.

The test itself doesn't depend on any environment variables (or at least I didn't write the initial version with that knowledge/intent), but the success or failure of the commands as they are currently written depends on having rocm's library folder in your path. Defining the architecture and defining the tests around this should avoid this behavior as we're no longer dependent on the rocm library for picking up the native architecture. Perhaps I'm incorrect though it's a part of the driver I'm unfamiliar with @jhuber6 can probably fact check me/correct me in this case.

As per PR title:

[Flang][OpenMP][Driver][Test] Fix the omp-driver-offload-test commands

This test works for me fine and from my perspective it is being "refined" rather than "fixed". Perhaps there's an issue that affects some folks and this change resolves that, but from your description it sounds like a "downstream" issue?

I'm happy with whatever classification we want to give it a refine or fix! I chose fix as the commands are failing currently, which wasn't the intention of them when they were originally added. I don't think it's a downstream issue, I believe it's an issue for anyone developing for AMD GPU in conjunction with a Flang build I think (this test fails in this manner on an unmodified llvm project), the community just happens to be quite small at the moment as it's in development (and it's been noted by @jsjodin and @TIFitis to fail arbitrarily as well, I just happened to stumble across what I believe is the reason why while fixing an unrelated patch).

Are there any public buildbots affected by this?

There are currently no public buildbots that run Flang+AMDGPU+OpenMP tests at the same time as far as I am aware, and this particular test is tied to Flang, so it's unfortunately fallen under the radar for a little while I think . the closest is this buildbot which tests AMDGPU + OpenMP with a focus on the runtime, but it doesn't have a flang build: https://lab.llvm.org/buildbot/#/builders/193/builds/39058/steps/6/logs/stdio (feels like a bit of a gap as we're beginning to add Flang+OpenMP runtime tests and I don't know who to discuss with about perhaps adding a Flang build for this to get test coverage for these, but might not be worth the extra build time at the moment).

"fix" hints to me that there's a bug and in such cases I feel that I should understand what the issue is before I approve a patch. In this case I don't (), but am happy to rely on your expertise and the expertise of other reviewers. The whole discussion seems to assume a good deal of background on OpenMP offloading for AMD GPUs, which I don't have.
(
) With the exception of the -o changes.

That's understandable. Perhaps it's not necessarily a bug, just a not so future proof test originally written by me I believe so the test is in need of an adjustment. So perhaps fix is a bit strong, I don't mind what wording we use though, so feel free to suggest any changes to the commit message and title you wish!

Thank you for this comprehensive explanation - the additional context helps a lot. Also thank your taking the initiative to improve this - that's greatly appreciated!

LGTM

No problem, always happy to try and explain, I know I'm not great at it sometimes, so I appreciate your patience. Thank you very much for the review and your time!

@agozillon agozillon merged commit 9971aef into llvm:main Sep 26, 2023
3 checks passed
legrosbuffle pushed a commit to legrosbuffle/llvm-project that referenced this pull request Sep 29, 2023
llvm#66926)

Since the changes to -### to correctly return error codes in D156363,
some additions of the not lit keyword and perhaps some changes to how
fopenmp-targets is handled; the test omp-driver-offload.f90 has become a
little susceptible to what appears to be sporadic failures.

This was because if certain environment variables are set for AMD
devices some of the commands succeed happily and the not check causes
the test to fail. Whereas, if they were not set the tests would fail and
the not test would succeed, allowing the test to pass provided you were
not targeting AMD hardware while running the test.

When the environment variables for AMD hardware are set the compiler
could find the correct offload architecture to compile for, allowing the
command to succeed. However, when not set it'd fail as it can't detect
the offload architecture. The solution to this is to specify an AMD
architecture via --offload-arch so it always passes, the architecture
doesn't really matter, I've chosen to select gfx90a.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:driver flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants