
Fix compilation warnings regarding std::fn on device #70

Merged
bartgol merged 4 commits into master from Fix-8
Oct 21, 2020

Conversation

@bartgol
Contributor

@bartgol bartgol commented Oct 21, 2020

Motivation

On CUDA, compiling the pack math functions produced a bunch of warnings about std::blah functions not being device fcns yet being called inside a __host__ __device__ function. I am not 100% sure why this warning was generated and yet never turned into a real problem.

This PR fixes the warning by providing 3 implementations of all the math functions: one each for double, float, and integer types.

Related Issues

Testing

These modifications affect only the implementation of pack operations, which continue to be tested as they were before.

The error appeared to be benign, but it generated several warnings.
To get around it, we added the MathFcn class, which dispatches the
function call to the proper implementation (cuda vs std).
For some reason this never caused compilation errors,
but it started after the previous commit.
@bartgol bartgol added the AT: AUTOMERGE, pkg: pack (Related to the pack package), and code cleanup labels Oct 21, 2020
@bartgol bartgol self-assigned this Oct 21, 2020
@E3SM-Bot
Collaborator

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 94
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 13d6b89
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS AT: AUTOMERGE;Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 89
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 13d6b89
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS AT: AUTOMERGE;Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 12
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 13d6b89
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS AT: AUTOMERGE;Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Using Repos:

Repo: EKAT (E3SM-Project/EKAT)
  • Branch: Fix-8
  • SHA: 13d6b89
  • Mode: TEST_REPO

Pull Request Author: bartgol

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 94
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 13d6b89
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS AT: AUTOMERGE;Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 89
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 13d6b89
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS AT: AUTOMERGE;Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 12
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 13d6b89
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS AT: AUTOMERGE;Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

@E3SM-Bot
Collaborator

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
WARNING: NO REVIEWERS HAVE BEEN REQUESTED FOR THIS PULL REQUEST!

@E3SM-Bot
Collaborator

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@bartgol bartgol requested review from ambrad and jgfouca October 21, 2020 18:51
@bartgol
Contributor Author

bartgol commented Oct 21, 2020

@jgfouca @ambrad I pinged you two cause you implemented most of the pack stuff. Feel free to take your name off if you'd rather pass.

@ambrad
Member

ambrad commented Oct 21, 2020

This moves some code and then for testing only replaces a macro with a function? Or is there more?

@bartgol
Contributor Author

bartgol commented Oct 21, 2020

> This moves some code and then for testing only replaces a macro with a function? Or is there more?

I moved the pack math overloads to their own file, cause ekat_pack.hpp was a bit long. For testing, nothing has changed, it's the same as before. What changed is how the different math fcns overloads are generated: the macro instantiating the pack overloads now calls MathFcn<ScalarT>::fcn, which takes care of doing a different impl (std::blah vs blah or blahf) depending on a) ScalarT, and b) whether we're inside a CUDA kernel.
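To make the dispatch described above concrete, here is a minimal host-compilable sketch. The name MathFcn comes from this thread; everything else (the stripped-down Pack, the macro name sketch_pack_gen_unary_fn, and the body of MathFcn) is an assumption for illustration and is not the code from this PR.

```cpp
#include <cmath>

namespace sketch {

// Hypothetical, stripped-down stand-in for ekat's Pack (assumption).
template <typename ScalarT, int N>
struct Pack {
  ScalarT d[N];
  ScalarT& operator[](int i) { return d[i]; }
  const ScalarT& operator[](int i) const { return d[i]; }
};

// Helper that selects the scalar implementation. Only the name is taken
// from the discussion; the contents are guessed.
template <typename ScalarT>
struct MathFcn {
  static ScalarT sqrt(const ScalarT& x) {
#ifdef __CUDA_ARCH__
    return ::sqrt(x);     // device pass: CUDA's global-namespace overload
#else
    return std::sqrt(x);  // host pass: standard library overload
#endif
  }
};

// The macro that generates the pack overload forwards each entry to the
// helper instead of naming std::fn directly.
#define sketch_pack_gen_unary_fn(fn)                        \
  template <typename ScalarT, int N>                        \
  inline Pack<ScalarT,N> fn(const Pack<ScalarT,N>& p) {     \
    Pack<ScalarT,N> s;                                      \
    for (int i = 0; i < N; ++i) {                           \
      s[i] = MathFcn<ScalarT>::fn(p[i]);                    \
    }                                                       \
    return s;                                               \
  }

sketch_pack_gen_unary_fn(sqrt)

} // namespace sketch
```

With this layout, a call such as sketch::sqrt(p) on a Pack<double,2> goes through one extra (inlinable) call per entry, which is the cost discussed in the following comments.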

@ambrad
Member

ambrad commented Oct 21, 2020

Sorry for being dense, but is this introducing one extra level of function call for things like pow?

@E3SM-Bot
Collaborator

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@E3SM-Bot
Collaborator

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@bartgol
Contributor Author

bartgol commented Oct 21, 2020

@ambrad yes, although it should be inlined (I may replace KOKKOS_INLINE_FUNCTION with KOKKOS_FORCEINLINE_FUNCTION if you prefer).

jgfouca previously approved these changes Oct 21, 2020
@ambrad
Member

ambrad commented Oct 21, 2020

Ok. Yes, please at least use KOKKOS_FORCEINLINE_FUNCTION. We have clear evidence from studies ~2 years ago that if one isn't very careful, inline isn't enough and non-inlined function calls can destroy vectorization.

Also, I don't see why an extra fn call is necessary. Why not just use macros whose def depends on __CUDA_ARCH__? This is super low-level code, so worrying about things like fn calls destroying vectorization is important.

@bartgol
Contributor Author

bartgol commented Oct 21, 2020

To give more context: things were also (almost) working with something like this


#ifdef __CUDA_ARCH__
#define ekat_pack_gen_unary_stdfn(fn)               \
  template <typename ScalarT, int N>                \
  KOKKOS_INLINE_FUNCTION                            \
  Pack<ScalarT,N> fn (const Pack<ScalarT,N>& p) {   \
    Pack<ScalarT,N> s;                              \
    vector_simd                                     \
    for (int i = 0; i < N; ++i) {                   \
      s[i] = fn(p[i]);                              \
    }                                               \
    return s;                                       \
  }
#else
#define ekat_pack_gen_unary_stdfn(fn)               \
  template <typename ScalarT, int N>                \
  KOKKOS_INLINE_FUNCTION                            \
  Pack<ScalarT,N> fn (const Pack<ScalarT,N>& p) {   \
    Pack<ScalarT,N> s;                              \
    vector_simd                                     \
    for (int i = 0; i < N; ++i) {                   \
      s[i] = std::fn(p[i]);                         \
    }                                               \
    return s;                                       \
  }
#endif

since CUDA overloads blah for both double and float. However, the compilation would fail when trying to test on int/long.

Edit: that's why I was asking on slack about ints. The only workaround I could find was to explicitly cast integers to double, but to do that without messing up the sp build, I had to do multiple impl. And partial specialization wasn't possible, so in the end I used a helper struct for the specialization of the core function on integer/floating.
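For readers unfamiliar with the helper-struct trick mentioned above: function templates cannot be partially specialized, but class templates can, so the integer/floating split can live in a struct. The following is a sketch under stated assumptions; the names MathImpl and scalar_sqrt are hypothetical, and returning double for integer inputs is a guess, not necessarily what the PR did.

```cpp
#include <cmath>
#include <type_traits>

namespace sketch {

// Primary template: used for floating-point scalars, so a single-precision
// build keeps calling the float overload.
template <typename ScalarT,
          bool IsIntegral = std::is_integral<ScalarT>::value>
struct MathImpl {
  static ScalarT sqrt(ScalarT x) { return std::sqrt(x); }
};

// Partial specialization for integer scalars: cast to double explicitly
// (assumption: the result is returned as double).
template <typename ScalarT>
struct MathImpl<ScalarT, true> {
  static double sqrt(ScalarT x) {
    return std::sqrt(static_cast<double>(x));
  }
};

// Thin function wrapper; the struct above does the actual selection.
template <typename ScalarT>
inline auto scalar_sqrt(ScalarT x) -> decltype(MathImpl<ScalarT>::sqrt(x)) {
  return MathImpl<ScalarT>::sqrt(x);
}

} // namespace sketch
```

In this sketch scalar_sqrt(2.25f) stays in float, while scalar_sqrt(16) goes through the explicit cast to double.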

@jgfouca
Member

jgfouca commented Oct 21, 2020

Just a semi-related question, if KOKKOS_FORCEINLINE_FUNCTION is more reliable than KOKKOS_INLINE_FUNCTION, should we just always use KOKKOS_FORCEINLINE? If not, when to prefer one over the other?

@E3SM-Bot
Collaborator

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jgfouca ]!

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@bartgol bartgol marked this pull request as draft October 21, 2020 19:47
@bartgol
Contributor Author

bartgol commented Oct 21, 2020

> Just a semi-related question, if KOKKOS_FORCEINLINE_FUNCTION is more reliable than KOKKOS_INLINE_FUNCTION, should we just always use KOKKOS_FORCEINLINE? If not, when to prefer one over the other?

I think the point is that you don't want to inline the whole library. For key functions, you want to be sure it is inlined, e.g., for vectorization reasons, as Andrew pointed out. But in other less crucial cases, it's probably best to let the compiler decide.

@ambrad
Member

ambrad commented Oct 21, 2020

@jgfouca, I agree strongly with Luca. FORCEINLINE must be used very carefully. It's pretty much meant only for dealing with things like this.

- use FORCEINLINE instead of INLINE (crucial for vectorization)
- use only 2 specializations, since blah(x) on CUDA works
  for both double and float.
@E3SM-Bot
Collaborator

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 95
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 11c524b
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 90
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 11c524b
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 13
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 11c524b
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Using Repos:

Repo: EKAT (E3SM-Project/EKAT)
  • Branch: Fix-8
  • SHA: 11c524b
  • Mode: TEST_REPO

Pull Request Author: bartgol

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 95
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 11c524b
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 90
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 11c524b
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 13
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA 11c524b
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

@E3SM-Bot
Collaborator

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS NOT BEEN REVIEWED YET!

@E3SM-Bot
Collaborator

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@bartgol
Contributor Author

bartgol commented Oct 21, 2020

Ok, I removed testing of math functions for int/long. Math fcn calls on packs of integers will now result in a compiler error on CUDA (except for abs, which is overloaded for int/long). This also removes the need for MathFcn, and makes the FORCEINLINE discussion moot for the pack math fcns.

Note: I did however replace INLINE with FORCEINLINE on the max/min functions in ekat_math_utils.hpp, since it is possible they might get called inside vectorizable loops.
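As a rough illustration of what a force-inline annotation on a small min/max utility looks like, here is a sketch that compiles with a plain host compiler. The macro SKETCH_FORCEINLINE and the functions max_val and max_of are hypothetical; this is not Kokkos's definition of KOKKOS_FORCEINLINE_FUNCTION nor the contents of ekat_math_utils.hpp.

```cpp
// Hypothetical force-inline macro (assumption, not the Kokkos definition).
#if defined(__CUDACC__)
  #define SKETCH_FORCEINLINE __forceinline__ __host__ __device__
#elif defined(_MSC_VER)
  #define SKETCH_FORCEINLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
  #define SKETCH_FORCEINLINE inline __attribute__((always_inline))
#else
  #define SKETCH_FORCEINLINE inline
#endif

namespace sketch {

// Small utility that may end up in the body of a vectorizable loop.
template <typename T>
SKETCH_FORCEINLINE const T& max_val(const T& a, const T& b) {
  return a < b ? b : a;
}

// Reduction loop calling the utility; assumes n > 0.
inline double max_of(const double* x, int n) {
  double m = x[0];
  for (int i = 1; i < n; ++i) {
    m = max_val(m, x[i]);
  }
  return m;
}

} // namespace sketch
```

A plain inline is only a hint, so the compiler may still emit a call inside the loop; the forced variant removes that choice, which is why it is reserved for small, hot helpers like this one.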

@bartgol bartgol marked this pull request as ready for review October 21, 2020 20:40
This comes at the (small) price of not providing overloads of
standard math functions for packs of integers.
@E3SM-Bot
Copy link
Copy Markdown
Collaborator

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 96
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA d9a0710
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 91
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA d9a0710
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 14
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA d9a0710
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Using Repos:

Repo: EKAT (E3SM-Project/EKAT)
  • Branch: Fix-8
  • SHA: d9a0710
  • Mode: TEST_REPO

Pull Request Author: bartgol

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 96
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA d9a0710
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 91
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA d9a0710
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 14
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
EKAT_SOURCE_BRANCH Fix-8
EKAT_SOURCE_REPO https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA d9a0710
EKAT_TARGET_BRANCH master
EKAT_TARGET_REPO https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA b66923d
PR_LABELS Packs;code cleanup
PULLREQUESTNUM 70
TEST_REPO_ALIAS EKAT

@E3SM-Bot
Collaborator

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jgfouca ambrad ]!

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@bartgol bartgol merged commit 20ab2e7 into master Oct 21, 2020
@bartgol bartgol deleted the Fix-8 branch October 21, 2020 20:57

Labels

code cleanup, pkg: pack (Related to the pack package)


Development

Successfully merging this pull request may close these issues.

Fix warning regarding calling host function from device in ekat_pack.hpp

4 participants