New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turn CUDA off by default #4406
Turn CUDA off by default #4406
Conversation
rwy7
commented
Oct 3, 2019
•
edited
edited
- Stop automatically enabling OMR_OPT_CUDA, so a user has to specifically opt-in for CUDA support.
- Stop using the built-in FindCUDA package, because it requires a complete installation of CUDA. We only need a small subset of the CUDA SDK.
- Enable OMR_OPT_CUDA explicitly in X86-64 builds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in principle. One question and a few hopefully quick changes requested before we merge.
CMakeLists.txt
Outdated
@@ -68,15 +68,22 @@ set(OMR_INSTALL_DATA_DIR ${CMAKE_INSTALL_DATADIR}/${PROJECT_NAME} CACHE PATH "In | |||
### | |||
|
|||
if (OMR_ENV_DATA64) | |||
find_package(Threads) | |||
find_package(Threads QUIET) | |||
if(Threads_FOUND) | |||
# Threads mush be found before we can look for CUDA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe correct this typo "mush" while you're here?
CMakeLists.txt
Outdated
@@ -68,15 +68,22 @@ set(OMR_INSTALL_DATA_DIR ${CMAKE_INSTALL_DATADIR}/${PROJECT_NAME} CACHE PATH "In | |||
### | |||
|
|||
if (OMR_ENV_DATA64) | |||
find_package(Threads) | |||
find_package(Threads QUIET) | |||
if(Threads_FOUND) | |||
# Threads mush be found before we can look for CUDA. | |||
# FindCuda will error out if Threads is missing, even though CUDA itself is optional. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also should correct this comment as it no longer matches what will happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what needs to be corrected here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you added "QUIET" it's no longer true that "FindCuda will error out" is it? Or am I reading too much into this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see what you're saying. FindCUDA is already optional, now we're just making it quiet too. The comment is documenting what I consider a bug in FindCUDA module. FindCUDA treats FindThreads as required, even if FindCUDA itself is optional. So we have to ensure that FindThreads exists before we use FindCUDA.
CMakeLists.txt
Outdated
set(CUDA_FOUND OFF CACHE BOOL "CUDA is disabled (threads not found)") | ||
endif() | ||
else() | ||
message(STATUS "OMR: CUDA disabled (unsupported platform)") | ||
set(CUDA_FOUND OFF CACHE BOOL "CUDA is disabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this message also have the "(unsupported platform)" added to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
CMakeLists.txt
Outdated
find_package(CUDA) | ||
find_package(CUDA QUIET) | ||
if (CUDA_FOUND) | ||
message(STATUS "OMR: CUDA enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone requests to explicitly disable CUDA support, will this code still run (and print "OMR: CUDA enabled" if all the libraries are present)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Want that changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If CUDA is explicitly disabled, there should be no messages about any attempts to find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the flip side, if CUDA is explicitly enabled and directed to use a specific version, this should not fail if nvcc
is not found (OMR doesn't need it). See [1] which only makes the necessary header files available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, is there a specific CUDA header I can check for in the build system? And I suppose this means there are no CUDA libraries we need to link? On what platforms do we support CUDA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OMR uses few of the header files from CUDA: I think checking for include/cuda.h
is probably sufficient.
Based on feedback on this PR, and in the OMR architecture meeting, I'm planning on making the following change: OMR_OPT_CUDA will be off by default, and must be explicitly enabled by the user. Since OMR doesn't require a complete CUDA SDK installed, I'm going to stop using the built-in FindCUDA package, and only search for what we need. |
8ecd87c
to
ef21c8c
Compare
OK I've pushed a new version of this commit, that just checks for |
@genie-omr build all |
31f4f6c
to
e19d2be
Compare
cmake/modules/FindOmrCuda.cmake
Outdated
find_path(OmrCuda_CUDA_H_DIR | ||
NAMES cuda.h | ||
PATHS | ||
${OmrCuda_SEARCH_PATH} | ||
PATH_SUFFIXES | ||
include | ||
DOC "The cuda.h include directory" | ||
) | ||
|
||
find_path(OmrCuda_CUDA_RUNTIME_H_DIR | ||
NAMES cuda_runime.h | ||
PATHS | ||
${OmrCuda_SEARCH_PATH} | ||
PATH_SUFFIXES | ||
include | ||
DOC "The cuda_runtime.h include directory" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OmrCuda_CUDA_H_DIR
and OmrCuda_CUDA_RUNTIME_H_DIR
should always be the same: I think it more appropriate to check for both header files in a single directory (and just add that one to the include path).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I've made it so we look for cuda_runtime.h
in whichever directory cuda.h
is in.
@genie-omr build xlinux |
e19d2be
to
3e5958a
Compare
@genie-omr build xlinux |
3e5958a
to
ae9c3d4
Compare
@genie-omr build xlinux |
cmake/modules/FindOmrCuda.cmake
Outdated
############################################################################# | ||
|
||
# | ||
# This package locates a minimum set of cuda resources, required by the OMR_OPT_CUDA flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 'CUDA' is an acronym which should be in upper-case.
@dnakamura do you want to review this? |
cmake/modules/FindOmrCuda.cmake
Outdated
"$ENV{CUDA_HOME}" | ||
"$ENV{CUDA_PATH}" | ||
"$ENV{CUDA_INC_PATH}" | ||
"$ENV{NVSDKCOMPUTE_ROOT}/C" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the /C
suffix here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stolen from FindCUDA.cmake. Apparently, after CUDA 3.0, the C headers were moved under the C directory. If you know this is wrong, please let me know--I am not familiar with CUDA, I just tried to replicate the FindCUDA search order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it makes sense to include NVSDKCOMPUTE_ROOT
; if it's empty, the effect is to consider /C
which probably isn't what we want.
I'd suggest we simplify this so it just checks that the user has specified (a reasonable value for) OMR_CUDA_HOME
if OMR_OPT_CUDA
is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or make it behave like the documentation for OMR_OPT_CUDA
says:
Path to the CUDA SDK. Takes precedence over CUDA_HOME in OMR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's reasonable to try to automatically locate CUDA. I would rather improve the search path than remove the functionality outright.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a cmake build on zLinux yesterday: the search for CUDA failed. CMake is configured explicitly with -DJ9VM_OPT_CUDA=OFF
which OpenJ9 maps to OMR_OPT_CUDA=OFF
. We shouldn't even be trying to find CUDA in that case. I don't know if there's a way to tell the difference between OFF by default and OFF on the command-line: if not, then I suggest we have to scale this back so only validate a path supplied in CUDA_BIN_PATH
(or whatever CMake variables we settle on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, with this PR, cmake will search for CUDA only if OMR_OPT_CUDA
is enabled. We can still provide a default search path for CUDA headers, and there is no need to differentiate why the feature is off. Does that sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the caller specifies OMR_CUDA_HOME
, this should not consider any other location that might happen to be an acceptable CUDA installation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me 👍
f129bc9
to
6e3011b
Compare
cmake/modules/FindOmrCuda.cmake
Outdated
elseif(ENV{CUDA_HOME}) | ||
set(OmrCuda_SEARCH_DIR "$ENV{CUDA_HOME}") | ||
else() | ||
message(WARNING "CUDA support requested, but OMR_CUDA_HOME/CUDA_HOME are not set.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not be an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was torn. If we make this an error, and the user doesn't specify REQUIRED
in find_package(OmrCuda REQUIRED)
, then we will have a hard error when there shouldn't be. However, this package doesn't function as a traditional "find package" module, and I find it hard to believe anyone would use this package without OMR_OPT_CUDA
. Making this an error is probably fine.
If we leave this as a warning, the build will still error out when it's misconfigured, because OMR makes OmrCuda
REQUIRED
when OMR_OPT_CUDA
is enabled.
I could go either way, so let me know what you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot about the (optional)REQUIRED
argument to FindOmrCuda
. I think it should be an error in the required case and I'm fine with a warning otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 👍
6e3011b
to
8f92d94
Compare
cmake/config.cmake
Outdated
@@ -202,6 +202,6 @@ set(OMR_NOTIFY_POLICY_CONTROL OFF CACHE BOOL "TODO: Document") | |||
|
|||
set(OMR_ENV_GCC OFF CACHE BOOL "TODO: Document") | |||
|
|||
set(OMR_OPT_CUDA ${CUDA_FOUND} CACHE BOOL "Enable CUDA support in OMR") | |||
set(OMR_OPT_CUDA OFF CACHE BOOL "Enable CUDA support in OMR") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps OMR_CUDA_HOME
should be set (and documented) here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes more sense to keep OMR_CUDA_HOME
in the FindOmrCuda
module, but I could add a note to OMR_OPT_CUDA
saying users should set OMR_CUDA_HOME
. Would that be OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the set(OMR_CUDA_HOME)
call that would appear here would also specify the type (directory or path?), but mentioning it in the description of OMR_OPT_CUDA
works for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've added the mention, and I've changed the type of OMR_CUDA_HOME
to PATH
(that means "path to a directory", it was just STRING
before).
8f92d94
to
42321d0
Compare
@rwy0717 @mstoodle What's the next step on this PR? There's an OpenJ9 cmake PR that depends on this so I'd like to see it merged unless there's something that still needs to be done |
When this is merged, the Java extension repos will need to change as well. Previously, |
1. Stop automatically enabling OMR_OPT_CUDA. This flag must now be enabled by the user. 2. Only search for CUDA resources if the feature has been enabled. 3. Only search for CUDA headers, not all resources. 4. Require that the user specify where CUDA is located via the OMR_CUDA_HOME directory. Do not search standard paths for CUDA resources. Signed-off-by: Robert Young <rwy0717@gmail.com>
42321d0
to
91890a1
Compare
Just rebased. |
+1 The extensions repos changes look to cover all repos with cmake so far. @keithc-ca Any further concerns? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're good to go.
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm