Create class templates for RAJA linear algebra objects parametrized by execution policies #241

pelesh · 2021-05-31T03:03:17Z

Currently, RAJA execution policies in HiOp are set at build configuration time, and the user cannot change them at runtime. Furthermore, all RAJA linear algebra objects use the same set of RAJA execution policies.

Consider creating class templates parametrized by execution policies. That would allow the user to create CPU and GPU linear algebra objects at runtime like this:

hiopVectorRajaPar<hiop::CudaPolicy> x;
hiopVectorRajaPar<hiop::OmpPolicy>  y;

This could also be helpful on heterogeneous machines with, e.g., both NVIDIA and AMD GPUs.
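One way the runtime choice could be exposed is a small factory that maps a backend name to a policy-parametrized type and returns it through the abstract base class. The sketch below is only an illustration of that idea: `Vector`, `VectorImpl`, `HostPolicy`, `DevicePolicy` and `make_vector` are hypothetical stand-ins, not HiOp or RAJA names, and the "kernel" is a plain loop.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical policy tags standing in for hiop::CudaPolicy / hiop::OmpPolicy.
struct HostPolicy   { static const char* name() { return "host"; } };
struct DevicePolicy { static const char* name() { return "device"; } };

// Minimal stand-in for the abstract hiopVector interface.
class Vector
{
public:
  virtual ~Vector() = default;
  virtual void add_const(double c) = 0;
  virtual double at(std::size_t i) const = 0;
  virtual std::string backend() const = 0;
};

// One class template, parametrized by the policy type.
template <class Policy>
class VectorImpl : public Vector
{
public:
  explicit VectorImpl(std::size_t n) : data_(n, 0.0) {}

  void add_const(double c) override
  {
    // A real backend would launch a policy-specific kernel here.
    for (double& v : data_) v += c;
  }

  double at(std::size_t i) const override
  {
    assert(i < data_.size());
    return data_[i];
  }

  std::string backend() const override { return Policy::name(); }

private:
  std::vector<double> data_;
};

// Maps a runtime string to one of the compile-time policies.
std::unique_ptr<Vector> make_vector(const std::string& backend, std::size_t n)
{
  if (backend == "host")   return std::make_unique<VectorImpl<HostPolicy>>(n);
  if (backend == "device") return std::make_unique<VectorImpl<DevicePolicy>>(n);
  throw std::invalid_argument("unknown backend: " + backend);
}
```

With something like this, application code would call `make_vector("device", n)` or `make_vector("host", n)` and use the result only through the base-class interface.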

CC @ashermancinelli @cameronrutherford @nychiang @cnpetra @jwang125

pelesh · 2021-05-31T03:04:35Z

Addressing #240 could simplify creating class templates for RAJA linear algebra objects.

ashermancinelli · 2021-05-31T03:31:37Z

This is a great idea. +1 from me.

pelesh · 2021-05-31T05:06:12Z

I believe something like this could work:

Keep the pure virtual base class as is:

// hiopVector.hpp
class hiopVector
{
public:
  hiopVector(){}
  virtual ~hiopVector(){}
  virtual void add_const(double c) = 0;
  // more code ...
};

RAJA vector class template inherits from hiopVector:

// file hiopVectorRaja.hpp

#include "hiopVector.hpp"

template <class T>
class hiopVectorRaja : public hiopVector
{
public:
  hiopVectorRaja(/* argument list */);
  virtual ~hiopVectorRaja(){}

  virtual void add_const(double c);
  // more code ...
private:
  double* data_dev_;
  int n_local_;
  // more private members ...
};

The implementation of the vector kernels goes in the hiopVectorRajaImpl.hpp header file:

// file hiopVectorRajaImpl.hpp

// some code ...

template<class T>
void hiopVectorRaja<T>::add_const(double c)
{
  // T is only known inside the template, and the nested type is a
  // dependent name, so the alias lives here and needs 'typename'.
  using hiop_raja_exec = typename T::hiop_raja_exec;

  double *yd = data_dev_;
  RAJA::forall< hiop_raja_exec >( RAJA::RangeSegment(0, n_local_),
    RAJA_LAMBDA(RAJA::Index_type i)
    {
      yd[i] += c;
    });
}

// more methods implementations

With this, most of the RAJA code is reused. Small pieces of execution-policy-specific code can then be implemented in separate source files. For example, to add a CUDA backend one could implement something like this:

// file hiopVectorRajaCuda.cpp
#include <cassert>
#include <cuda.h>
#include <hiopVectorRaja.hpp>

#define RAJA_LAMBDA [=] __device__

template<class T>
hiopVectorRaja<T>::hiopVectorRaja(/* argument list */)
{
  assert(mem_space_ != "HOST");
  // Constructor implementation does not include
  // preprocessor directives anymore
}

#include "hiopVectorRajaImpl.hpp"
template hiopVectorRaja<hiopCudaPolicies>::hiopVectorRaja(/* argument list */);

This lets me keep the backend-specific code and the RAJA_LAMBDA definition in a separate source file. An OpenMP implementation would look similar:

// file hiopVectorRajaOmp.cpp
#include <cassert>
#include <hiopVectorRaja.hpp>

#define RAJA_LAMBDA [=]

template<class T>
hiopVectorRaja<T>::hiopVectorRaja(/* argument list */)
{
  assert(mem_space_ == "HOST");
  // Constructor implementation does not include
  // preprocessor directives anymore
}

#include "hiopVectorRajaImpl.hpp"
template hiopVectorRaja<hiopOmpPolicies>::hiopVectorRaja(/* argument list */);

If both backends are built, a HiOp application could instantiate both CUDA and OpenMP vectors:

// file main.cpp

// some code ..

hiopVectorRaja<hiop::CudaPolicy> x;
hiopVectorRaja<hiop::OmpPolicy>  y;

I tried this in a small prototype (without RAJA) and it seems this could work.
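A prototype along those lines might look like the single-file sketch below. It is not the prototype mentioned above, only an illustration under stated assumptions: `toy_forall`, `forward_exec`, `backward_exec`, `ForwardPolicies`, `BackwardPolicies`, `VectorBase` and `VectorT` are hypothetical stand-ins for `RAJA::forall`, the RAJA execution policies, the policy bundles and the HiOp classes. The section comments mark which file each part would live in. The sketch also uses whole-class explicit instantiation (`template class ...;`), which is a swapped-in choice rather than the constructor-only instantiation shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- stand-ins for RAJA execution policies and RAJA::forall (hypothetical) ---
struct forward_exec {};
struct backward_exec {};

template <class Body>
void toy_forall(forward_exec, std::size_t n, Body body)
{
  for (std::size_t i = 0; i < n; ++i) body(i);
}

template <class Body>
void toy_forall(backward_exec, std::size_t n, Body body)
{
  for (std::size_t i = n; i-- > 0;) body(i);
}

// --- policy bundles, analogous to hiopCudaPolicies / hiopOmpPolicies ---
struct ForwardPolicies  { using hiop_raja_exec = forward_exec; };
struct BackwardPolicies { using hiop_raja_exec = backward_exec; };

// --- "hiopVector.hpp": abstract interface ---
class VectorBase
{
public:
  virtual ~VectorBase() {}
  virtual void add_const(double c) = 0;
  virtual double get(std::size_t i) const = 0;
};

// --- "hiopVectorRaja.hpp": declaration only ---
template <class T>
class VectorT : public VectorBase
{
public:
  explicit VectorT(std::size_t n);
  void add_const(double c) override;
  double get(std::size_t i) const override;
private:
  std::vector<double> data_;
};

// --- "hiopVectorRajaImpl.hpp": kernels shared by all backends ---
template <class T>
VectorT<T>::VectorT(std::size_t n) : data_(n, 0.0) {}

template <class T>
void VectorT<T>::add_const(double c)
{
  using exec = typename T::hiop_raja_exec;  // dependent name needs 'typename'
  double* yd = data_.data();
  toy_forall(exec{}, data_.size(), [=](std::size_t i) { yd[i] += c; });
}

template <class T>
double VectorT<T>::get(std::size_t i) const
{
  assert(i < data_.size());
  return data_[i];
}

// --- end of each backend source file: explicit instantiation ---
template class VectorT<ForwardPolicies>;
template class VectorT<BackwardPolicies>;
```

Both `VectorT<ForwardPolicies>` and `VectorT<BackwardPolicies>` can then be created in the same program and used through `VectorBase`, which is the property the proposal relies on.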

CC @davidbeckingsale

cnpetra · 2021-05-31T05:26:21Z

I find this design quite nice. One thing to keep in mind is that we may still need to specialize/subclass the hiopVectorRaja (same for the matrix classes) to provide platform-dependent implementations for the (supposedly small number of) methods that require different algorithms on different hardware. That may make other designs appealing; I can think of at least one that would do the same without templates, not that I have anything against templates.

pelesh · 2021-05-31T05:45:10Z

> One thing to keep in mind is that we may still need to specialize/subclass the hiopVectorRaja (same for the matrix classes) to provide platform-dependent implementation for (supposedly small number of) methods that require implementing different algorithms for different hardware...

I believe all you need to do in such a case is to remove that method from hiopVectorRajaImpl.hpp and implement it in the backend-specific cpp files. I am really (ab)using templates here only for function name mangling.
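As a rough illustration of that point, the sketch below keeps one method shared and gives another a different algorithm per backend. Because everything sits in one translation unit here, the per-backend definitions are written as explicit specializations of the member function; with separate cpp files, each file would simply carry its own definition. `VectorT`, `PolicyA`, `PolicyB`, `scale` and `sum` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

struct PolicyA {};  // hypothetical backend tags
struct PolicyB {};

template <class T>
class VectorT
{
public:
  explicit VectorT(std::vector<double> v) : data_(std::move(v)) {}
  void scale(double a);   // shared implementation for all policies
  double sum() const;     // declared only; each backend supplies its own
private:
  std::vector<double> data_;
};

// Shared kernel, as it would live in the common Impl header.
template <class T>
void VectorT<T>::scale(double a)
{
  for (double& v : data_) v *= a;
}

// Backend A: straightforward left-to-right accumulation.
template <>
double VectorT<PolicyA>::sum() const
{
  return std::accumulate(data_.begin(), data_.end(), 0.0);
}

// Backend B: repeatedly fold the upper half onto the lower half,
// a different algorithm that produces the same total.
template <>
double VectorT<PolicyB>::sum() const
{
  std::vector<double> work(data_);
  for (std::size_t n = work.size(); n > 1; n = (n + 1) / 2)
  {
    const std::size_t half = n / 2;
    for (std::size_t i = 0; i < half; ++i)
      work[i] += work[n - 1 - i];
  }
  return work.empty() ? 0.0 : work[0];
}
```

The shared `scale` is written once, while `sum` exists only in its backend-specific forms, so a policy without a `sum` definition would fail at link time rather than silently fall back to a generic version.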

cnpetra · 2022-12-19T19:47:58Z

closed by #543

pelesh added enhancement porting labels May 31, 2021

pelesh self-assigned this May 31, 2021

pelesh mentioned this issue May 31, 2021

Separate RAJA execution policies specification from linear algebra classes implementations #240

Closed

pelesh mentioned this issue May 31, 2021

Memory space can only be set statically #131

Closed

pelesh mentioned this issue Aug 2, 2022

Amd direct solver #521

Merged

cameronrutherford mentioned this issue Oct 21, 2022

Support configuration of OpenMP in backend for HiOp #563

Open

6 tasks

cnpetra closed this as completed Dec 19, 2022
