Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kokkos::DualView: Incorrect deduction of "not device type" #1659

Closed
mhoemmen opened this issue Jun 5, 2018 · 14 comments
Closed

Kokkos::DualView: Incorrect deduction of "not device type" #1659

mhoemmen opened this issue Jun 5, 2018 · 14 comments
Assignees
Labels
Enhancement Improve existing capability; will potentially require voting
Milestone

Comments

@mhoemmen
Copy link
Contributor

mhoemmen commented Jun 5, 2018

If I create a DualView<..., Device<Cuda, CudaUVMSpace>>, and I call view, sync, modify, or need_sync with Device<Cuda, CudaSpace>, DualView incorrectly deduces that I meant the host space. Vice versa for a DualView on CudaSpace.

This is a problem when interacting with Trilinos especially, because Trilinos forces Kokkos to set Cuda's default memory space to CudaUVMSpace. Thus, whether or not the following two code examples are correct depends merely on whether building Kokkos through Trilinos or by itself.

Example 1:

const int N = 100;
DualView<int*, CudaSpace> dv ("dv", N);
auto should_be_device_view = dv.sync<Cuda>();

Example 2:

const int N = 100;
DualView<int*, Cuda> dv ("dv", N);
auto should_be_device_view = dv.sync<CudaSpace>();

DualView uses code like the following to select which memory space to access:

template< class Device >

I propose this instead:

constexpr bool is_host_space = 
  std::is_same<typename Device::memory_space, Kokkos::HostSpace>::value;
@mhoemmen
Copy link
Contributor Author

mhoemmen commented Jun 5, 2018

FYI I'm happy to fix this myself but I just want to check if this makes sense.

@ibaned ibaned added the Enhancement Improve existing capability; will potentially require voting label Jun 5, 2018
@ibaned ibaned added this to the 2018 July milestone Jun 5, 2018
@ibaned
Copy link
Contributor

ibaned commented Jun 5, 2018

Well, this does beg the question of why the member functions are being called with template parameters other than the one that was used to define the DualView itself. Why are CudaUVMSpace and CudaSpace being mixed? If the DualView's "device" view is in UVM space, it seems ill-formed to ask it to sync to CudaSpace, which is not the same thing.

@mhoemmen
Copy link
Contributor Author

mhoemmen commented Jun 5, 2018

@ibaned SPARC has an option to build with or without Trilinos. In the without-Trilinos build, SPARC uses Kokkos' default behavior of setting Cuda::memory_space = CudaSpace. In the with-Trilinos build, Trilinos compels Kokkos to set Cuda::memory_space = CudaUVMSpace.

@mhoemmen
Copy link
Contributor Author

mhoemmen commented Jun 5, 2018

Micah, Travis, and I debated whether it would be better to force the correct memory space via static_assert. I'm OK with either option. It just seems wrong to return the host memory space without an error in this case. It led to a lot of debugging.

@ibaned
Copy link
Contributor

ibaned commented Jun 5, 2018

Yea, I marked this as an enhancement because stronger error checks need to be put in place (static_assert sounds right to me), but at the same time I think SPARC should do something like

using DeviceSpace = Kokkos::Cuda::memory_space

I think the interface should require exactly one of the two spaces involved, and compile error if the space is something else.

@mhoemmen
Copy link
Contributor Author

mhoemmen commented Jun 5, 2018

@ibaned wrote:

I think the interface should require exactly one of the two spaces involved, and compile error if the space is something else.

I think that's a good idea.

SPARC has plans to do something like what you mentioned.

@ibaned
Copy link
Contributor

ibaned commented Jun 5, 2018

Awesome. On second thought, if SPARC doesn't need UVM for their DualViews, they should probably instead define a Device<Cuda, CudaSpace>, that may be better for performance. Either way, I think we're in agreement. This issue stays open until I add some static_asserts to save future developers tons of debugging time.

@ibaned ibaned self-assigned this Jun 5, 2018
@mhoemmen
Copy link
Contributor Author

mhoemmen commented Jun 5, 2018

@ibaned wrote:

... they should probably instead define a Device<Cuda, CudaSpace>, that may be better for performance.

They already do that :-) The issue is interactions with Trilinos' types.

@mhoemmen
Copy link
Contributor Author

mhoemmen commented Jun 5, 2018

@ibaned wrote:

This issue stays open until I add some static_asserts to save future developers tons of debugging time.

You're a dear; thanks :D

@crtrott
Copy link
Member

crtrott commented Jun 7, 2018

I concur with Dans solution.

@crtrott
Copy link
Member

crtrott commented Aug 14, 2018

Found some issues with it, which I am fixing.

@crtrott
Copy link
Member

crtrott commented Aug 14, 2018

This is the new deduction mechanism - BEHOLD the power of LOGIC:

  template<class Device>
  static int get_device_side() {
    constexpr bool device_is_memspace  = std::is_same<Device,typename Device::memory_space>::value;
    constexpr bool device_is_execspace = std::is_same<Device,typename Device::execution_space>::value;
    constexpr bool device_exec_is_t_dev_exec  = std::is_same<typename Device::execution_space,typename t_dev::execution_space>::value;
    constexpr bool device_mem_is_t_dev_mem    = std::is_same<typename Device::memory_space,typename t_dev::memory_space>::value;
    constexpr bool device_exec_is_t_host_exec  = std::is_same<typename Device::execution_space,typename t_host::execution_space>::value;
    constexpr bool device_mem_is_t_host_mem    = std::is_same<typename Device::memory_space,typename t_host::memory_space>::value;
    constexpr bool device_is_t_host_device  = std::is_same<typename Device::execution_space,typename t_host::device_type>::value;
    constexpr bool device_is_t_dev_device    = std::is_same<typename Device::memory_space,typename t_host::device_type>::value;

    #ifndef KOKKOS_ENABLE_DEPRECATED_CODE
    static_assert(
        device_is_t_dev_device || device_is_t_host_device ||
        (device_is_memspace  && (device_mem_is_t_dev_mem   || device_mem_is_t_host_mem) ) ||
        (device_is_execspace && (device_exec_is_t_dev_exec || device_exec_is_t_host_exec) ) ||
        (
          (!device_is_execspace && !device_is_memspace) && (
            (device_mem_is_t_dev_mem   || device_mem_is_t_host_mem)  ||
            (device_exec_is_t_dev_exec || device_exec_is_t_host_exec)
          )
        )
        ,
        "Template parameter to .sync() must exactly match one of the DualView's device types or one of the execution or memory spaces");
    #endif

    int dev = -1;
    if(device_is_t_dev_device) dev = 1;
    else if(device_is_t_host_device) dev = 0;
    else {
      if(device_is_memspace) {
        if(device_mem_is_t_dev_mem) dev = 1;
        if(device_mem_is_t_host_mem) dev = 0;
        if(device_mem_is_t_host_mem && device_mem_is_t_dev_mem) dev = -1;
      }
      if(device_is_execspace) {
        if(device_exec_is_t_dev_exec) dev = 1;
        if(device_exec_is_t_host_exec) dev = 0;
        if(device_exec_is_t_host_exec && device_exec_is_t_dev_exec) dev = -1;
      }
      if(!device_is_execspace && !device_is_memspace) {
        if(device_mem_is_t_dev_mem) dev = 1;
        if(device_mem_is_t_host_mem) dev = 0;
        if(device_mem_is_t_host_mem && device_mem_is_t_dev_mem) dev = -1;
        if(device_exec_is_t_dev_exec) dev = 1;
        if(device_exec_is_t_host_exec) dev = 0;
        if(device_exec_is_t_host_exec && device_exec_is_t_dev_exec) dev = -1;
      }
    }
    return dev;
  }

@mhoemmen
Copy link
Contributor Author

L O G I C

@mhoemmen
Copy link
Contributor Author

Thanks @crtrott :D

crtrott added a commit that referenced this issue Aug 15, 2018
Improved fix for DualView synd deduction issue #1659
@crtrott crtrott closed this as completed Nov 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Improve existing capability; will potentially require voting
Projects
None yet
Development

No branches or pull requests

3 participants