Fix STF compilation with nvc++ as host compiler #8230
@davebayer how can we trust the CI if nvhpc is not enabled?
/ok to test 4a37550
I've tested it locally
caugonnet left a comment
This looks good, but it's untested in the current CI afaik?
I'm trying to finally get nvhpc back to the CI, so I need to patch these things 😅
Sounds great, anyway at some point we must use faith :)
  void operator->*(Fun&& f)
  {
- # if __NVCOMPILER
+ # if _CCCL_CUDA_COMPILER(NVHPC)
I believe this is the wrong check as __NVCOMPILER checks only whether nvc++ is the host compiler
- # if _CCCL_CUDA_COMPILER(NVHPC)
+ # if _CCCL_COMPILER(NVHPC)
And that's the problem that I'm fixing 😅
Because the condition should only be true for NVC++ in CUDA mode
this can be host code too, not just device, if we launch on the host
But then shouldn't the title say cuda compiler?
> this can be host code too, not just device, if we launch on the host
But that's fine. The thing is that nvc++ doesn't implement any of the __nv_is_extended_meow_lambda_closure_type traits, because it doesn't need them.
> But then shouldn't the title say cuda compiler?
No, because the problem occurs only when compiling a CUDA source file with nvcc using nvc++ as the host compiler. In that case __NVCOMPILER is defined and we assume that all lambdas are host/device, which is not true.
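To make the distinction concrete, here is a rough sketch of the three toolchain setups being discussed. The `DEMO_*` macros and both functions are hypothetical stand-ins, not the actual CCCL definitions, and the sketch assumes that nvc++ in CUDA mode defines `_NVHPC_CUDA` or `__NVCOMPILER_CUDA__` and that nvcc defines `__NVCC__`.

```cpp
#include <string>

// Hypothetical stand-in for a host-compiler check: true whenever nvc++
// parses the code, including when nvcc merely uses it as host compiler.
#if defined(__NVCOMPILER)
#  define DEMO_COMPILER_NVHPC 1
#else
#  define DEMO_COMPILER_NVHPC 0
#endif

// Hypothetical stand-in for a CUDA-compiler check: true only when nvc++
// itself compiles the CUDA code (assumed macros, see the note above).
#if defined(_NVHPC_CUDA) || defined(__NVCOMPILER_CUDA__)
#  define DEMO_CUDA_COMPILER_NVHPC 1
#else
#  define DEMO_CUDA_COMPILER_NVHPC 0
#endif

// nvcc is the CUDA compiler, whatever the host compiler is.
#if defined(__NVCC__)
#  define DEMO_CUDA_COMPILER_NVCC 1
#else
#  define DEMO_CUDA_COMPILER_NVCC 0
#endif

// Names the toolchain setup this translation unit was built with.
inline std::string demo_toolchain()
{
#if DEMO_CUDA_COMPILER_NVHPC
  return "nvc++ in CUDA mode";
#elif DEMO_CUDA_COMPILER_NVCC && DEMO_COMPILER_NVHPC
  return "nvcc with nvc++ as host compiler"; // the case this PR fixes
#elif DEMO_CUDA_COMPILER_NVCC
  return "nvcc with another host compiler";
#elif DEMO_COMPILER_NVHPC
  return "nvc++ without CUDA";
#else
  return "no NVIDIA compiler involved";
#endif
}

// Per the discussion above, assuming that every lambda is host/device is
// only valid when nvc++ is the CUDA compiler, not when it is just the host.
constexpr bool demo_all_lambdas_are_host_device()
{
  return DEMO_CUDA_COMPILER_NVHPC == 1;
}
```

With a bare `__NVCOMPILER` check, the "nvcc with nvc++ as host compiler" case would be treated the same as "nvc++ in CUDA mode", which is the mix-up described above.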
🥳 CI Workflow Results
🟩 Finished in 45m 39s: Pass: 100%/48 | Total: 8h 29m | Max: 17m 18s | Hits: 98%/26291
Using __NVCOMPILER caused some serious issues with extended lambdas in this case.