Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Fix <complex> on NVRTC #109

Merged
merged 4 commits into from
Feb 19, 2021
Merged

Fix <complex> on NVRTC #109

merged 4 commits into from
Feb 19, 2021

Conversation

wmaxey
Copy link
Member

@wmaxey wmaxey commented Jan 29, 2021

Changes made in this patchset to fix #101:

  • <cmath> exposes a subset of cuda::std:: math functions under NVRTC.
  • <cmath> creates defines for INFINITY and NAN when NVRTC is detected
  • Tests were reworked to no longer rely on a global test case struct.
  • A couple .fail tests required NVRTC exceptions.

@wmaxey wmaxey added this to the 2.0.0 milestone Jan 29, 2021
@wmaxey wmaxey self-assigned this Jan 29, 2021
@brycelelbach brycelelbach modified the milestones: 2.0.0, 1.4.1 Feb 11, 2021
@wmaxey wmaxey force-pushed the bugfix/nvrtc_complex branch 2 times, most recently from f2ea8f4 to 64b2727 Compare February 17, 2021 05:09
@wmaxey wmaxey changed the title WIP: Fix <complex> on NVRTC Fix <complex> on NVRTC Feb 17, 2021
@wmaxey wmaxey marked this pull request as ready for review February 17, 2021 05:16
@@ -16,166 +16,200 @@
#include <cuda/std/complex>
#include <cuda/std/cassert>

#ifdef __CUDA_ARCH__
Copy link
Member Author

@wmaxey wmaxey Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be brought to attention, __CUDA_ARCH__ globals are not going to be supportable with NVC++. So this change serves a dual purpose in the future.

Immediately, however, NVRTC cannot initialize the testcases constant using the dynamic floating point values from the likes of CUDART_INF_F.

@@ -312,33 +312,19 @@ long double truncl(long double x);
#pragma GCC system_header
#endif

#ifdef _LIBCUDACXX_COMPILER_NVRTC
#include <math_constants.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI: I am not sure what's the release strategy of libcu++, so maybe this can be ignored, but including this header has caused trouble in CuPy when a user runs CUDA + NVRTC + CuPy in a "runtime" environment (such as the CUDA Runtime Docker, or RAPIDS's favorite Conda env), because none of the standard CUDA headers are present there due to EULA.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Leo!

That's insightful, I'm recalling now similar issues with cooperative_groups being unavailable in NVRTC.

I have a more preferable solution now that I think about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wesley, the change in 6696f20 is cool 😄

Copy link
Collaborator

@griwes griwes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve pending resolution of the comments.

@wmaxey wmaxey merged commit d3334a6 into main Feb 19, 2021
@wmaxey wmaxey deleted the bugfix/nvrtc_complex branch February 19, 2021 01:27
@github-pages github-pages bot temporarily deployed to github-pages February 19, 2021 01:27 Inactive
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cuda::std::complex doesn't work with NVRTC because some <cmath> functions are missing
4 participants