Refactor device selection at initialization #5211
Summary:
@dalg24 You said: When we say "device id", it is implied to be the id with respect to the enumeration given by, e.g., nvidia-smi, right?
I was going for
I like
Yes, in whatever order the vendor runtime detects them.
Co-Authored-By: Christian Trott <crtrott@sandia.gov>
…nd KOKKOS_DEVICE_ID
looks good
@@ -780,6 +840,7 @@ void Kokkos::Impl::parse_command_line_arguments(
                 !kokkos_num_devices_found) {
        num_devices = std::stoi(num1_only);
        settings.set_num_devices(num_devices);
        settings.set_map_device_id_by("mpi_rank");
  int use_gpu = settings.has_device_id() ? settings.get_device_id() : -1;
  const int ndevices = [](int num_devices) -> int {
    if (num_devices > 0) return num_devices;
#if defined(KOKKOS_ENABLE_CUDA)
    return Cuda::detect_device_count();
#elif defined(KOKKOS_ENABLE_HIP)
    return Experimental::HIP::detect_device_count();
#elif defined(KOKKOS_ENABLE_SYCL)
    return sycl::device::get_devices(sycl::info::device_type::gpu).size();
#else
    return num_devices;
#endif
  }(settings.has_num_devices() ? settings.get_num_devices() : -1);
  const int skip_device =
      settings.has_skip_device() ? settings.get_skip_device() : 9999;

  // if the exact device is not set, but ndevices was given, assign round-robin
  // using on-node MPI rank
  if (use_gpu < 0) {
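The excerpt stops where the round-robin assignment begins. As a rough illustration of what such an assignment could look like, here is a minimal sketch. The function `select_device_round_robin` is a hypothetical helper, not part of Kokkos, and its handling of the skipped device is an assumption rather than the code under review. A `skip_device` value outside `[0, num_devices)`, such as the 9999 sentinel used above, is treated as "skip nothing".

```cpp
#include <cassert>

// Hypothetical helper (not Kokkos API): choose a device id for a process from
// its on-node MPI rank, cycling over the devices that are not skipped.
// A skip_device outside [0, num_devices) means no device is skipped.
int select_device_round_robin(int local_rank, int num_devices,
                              int skip_device) {
  assert(local_rank >= 0 && num_devices > 0);
  // Only skip when the id is valid and at least one other device remains.
  const bool skipping =
      skip_device >= 0 && skip_device < num_devices && num_devices > 1;
  const int usable = skipping ? num_devices - 1 : num_devices;
  int id = local_rank % usable;
  // Shift ids at or above the skipped one so the skipped id is never returned.
  if (skipping && id >= skip_device) ++id;
  return id;
}
```

With four devices and device 1 skipped, local ranks 0, 1, 2, 3 map to devices 0, 2, 3, 0 under this sketch.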
- Deprecate num_devices and skip_device settings, --kokkos-num-devices command line argument, KOKKOS_NUM_DEVICES, KOKKOS_SKIP_DEVICES, and KOKKOS_RAND_DEVICES environment variable.
- Introduce as a replacement:
  - map_device_id_by setting along with --kokkos-map-device-id-by and KOKKOS_MAP_DEVICE_ID_BY, which can be set to "random" or "mpi_rank".
  - KOKKOS_VISIBLE_DEVICES environment variable to specify device ids as a comma-separated sequence of integers
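
To make the replacement settings concrete, the following is a minimal sketch of how a comma-separated device list and a "random" or "mpi_rank" mapping might combine. Both `parse_visible_devices` and `map_device_id` are hypothetical names introduced for illustration. How the mapping interacts with the visible-device list is an assumption here, not something taken from the PR's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parse a value such as "2,0,3" into {2, 0, 3}.
// Throws std::invalid_argument on an empty, non-numeric, or negative entry.
std::vector<int> parse_visible_devices(const std::string& value) {
  std::vector<int> ids;
  std::istringstream in(value);
  std::string token;
  while (std::getline(in, token, ',')) {
    std::size_t pos = 0;
    const int id = std::stoi(token, &pos);  // throws if token is not a number
    if (pos != token.size() || id < 0)
      throw std::invalid_argument("bad device id: " + token);
    ids.push_back(id);
  }
  if (ids.empty()) throw std::invalid_argument("no device ids given");
  return ids;
}

// Hypothetical helper: pick one id from the visible list, either round-robin
// by on-node MPI rank ("mpi_rank") or uniformly at random ("random").
int map_device_id(const std::vector<int>& visible, const std::string& map_by,
                  int local_rank, std::mt19937& rng) {
  assert(!visible.empty() && local_rank >= 0);
  if (map_by == "mpi_rank")
    return visible[static_cast<std::size_t>(local_rank) % visible.size()];
  if (map_by == "random") {
    std::uniform_int_distribution<std::size_t> pick(0, visible.size() - 1);
    return visible[pick(rng)];
  }
  throw std::invalid_argument("unknown map_device_id_by value: " + map_by);
}
```

In this sketch the visible list defines both which devices may be used and the order in which ranks cycle through them, so "2,0,3" with local rank 4 selects device 0.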