Skip to content

Commit

Permalink
[SYCL] Add sycl_kernel attribute for accelerated code outlining
Browse files Browse the repository at this point in the history
SYCL is single source offload programming model relying on compiler to
separate device code (i.e. offloaded to an accelerator) from the code
executed on the host.

Here is code example of the SYCL program to demonstrate compiler
outlining work:

```
int foo(int x) { return ++x; }
int bar(int x) { throw std::exception("CPU code only!"); }
...
using namespace cl::sycl;
queue Q;
buffer<int, 1> a(range<1>{1024});
Q.submit([&](handler& cgh) {
  auto A = a.get_access<access::mode::write>(cgh);
  cgh.parallel_for<init_a>(range<1>{1024}, [=](id<1> index) {
    A[index] = index[0] + foo(42);
  });
}
...
```

SYCL device compiler must compile lambda expression passed to
cl::sycl::handler::parallel_for method and function foo called from this
lambda expression for an "accelerator". SYCL device compiler also must
ignore bar function as it's not required for offloaded code execution.

This patch adds the sycl_kernel attribute, which is used to mark code
passed to cl::sycl::handler::parallel_for as "accelerated code".

Attribute must be applied to function templates which parameters include
at least "kernel name" and "kernel function object". These parameters
will be used to establish an ABI between the host application and
offloaded part.

Reviewers: jlebar, keryell, Naghasan, ABataev, Anastasia, bader, aaron.ballman, rjmccall, rsmith

Reviewed By: keryell, bader

Subscribers: mgorny, OlegM, ArturGainullin, agozillon, aaron.ballman, ebevhan, Anastasia, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60455

Signed-off-by: Alexey Bader <alexey.bader@intel.com>
  • Loading branch information
Fznamznon authored and bader committed Dec 3, 2019
1 parent 0e9b0b6 commit c094e7d
Show file tree
Hide file tree
Showing 6 changed files with 201 additions and 0 deletions.
13 changes: 13 additions & 0 deletions clang/include/clang/Basic/Attr.td
Expand Up @@ -121,6 +121,11 @@ def GlobalVar : SubsetSubject<Var,
def InlineFunction : SubsetSubject<Function,
[{S->isInlineSpecified()}], "inline functions">;

def FunctionTmpl
: SubsetSubject<Function, [{S->getTemplatedKind() ==
FunctionDecl::TK_FunctionTemplate}],
"function templates">;

// FIXME: this hack is needed because DeclNodes.td defines the base Decl node
// type to be a class, not a definition. This makes it impossible to create an
// attribute subject which accepts a Decl. Normally, this is not a problem,
Expand Down Expand Up @@ -296,6 +301,7 @@ def MicrosoftExt : LangOpt<"MicrosoftExt">;
def Borland : LangOpt<"Borland">;
def CUDA : LangOpt<"CUDA">;
def HIP : LangOpt<"HIP">;
def SYCL : LangOpt<"SYCLIsDevice">;
def COnly : LangOpt<"COnly", "!LangOpts.CPlusPlus">;
def CPlusPlus : LangOpt<"CPlusPlus">;
def OpenCL : LangOpt<"OpenCL">;
Expand Down Expand Up @@ -1056,6 +1062,13 @@ def CUDAShared : InheritableAttr {
let Documentation = [Undocumented];
}

def SYCLKernel : InheritableAttr {
let Spellings = [Clang<"sycl_kernel">];
let Subjects = SubjectList<[FunctionTmpl]>;
let LangOpts = [SYCL];
let Documentation = [SYCLKernelDocs];
}

def C11NoReturn : InheritableAttr {
let Spellings = [Keyword<"_Noreturn">];
let Subjects = SubjectList<[Function], ErrorDiag>;
Expand Down
73 changes: 73 additions & 0 deletions clang/include/clang/Basic/AttrDocs.td
Expand Up @@ -253,6 +253,79 @@ any option of a multiversioned function is undefined.
}];
}

def SYCLKernelDocs : Documentation {
let Category = DocCatFunction;
let Content = [{
The ``sycl_kernel`` attribute specifies that a function template will be used
to outline device code and to generate an OpenCL kernel.
Here is a code example of the SYCL program, which demonstrates the compiler's
outlining job:
.. code-block:: c++

int foo(int x) { return ++x; }

using namespace cl::sycl;
queue Q;
buffer<int, 1> a(range<1>{1024});
Q.submit([&](handler& cgh) {
auto A = a.get_access<access::mode::write>(cgh);
cgh.parallel_for<init_a>(range<1>{1024}, [=](id<1> index) {
A[index] = index[0] + foo(42);
});
}

A C++ function object passed to the ``parallel_for`` is called a "SYCL kernel".
A SYCL kernel defines the entry point to the "device part" of the code. The
compiler will emit all symbols accessible from a "kernel". In this code
example, the compiler will emit "foo" function. More details about the
compilation of functions for the device part can be found in the SYCL 1.2.1
specification Section 6.4.
To show to the compiler entry point to the "device part" of the code, the SYCL
runtime can use the ``sycl_kernel`` attribute in the following way:
.. code-block:: c++
namespace cl {
namespace sycl {
class handler {
template <typename KernelName, typename KernelType/*, ...*/>
__attribute__((sycl_kernel)) void sycl_kernel_function(KernelType KernelFuncObj) {
// ...
KernelFuncObj();
}

template <typename KernelName, typename KernelType, int Dims>
void parallel_for(range<Dims> NumWorkItems, KernelType KernelFunc) {
#ifdef __SYCL_DEVICE_ONLY__
sycl_kernel_function<KernelName, KernelType, Dims>(KernelFunc);
#else
// Host implementation
#endif
}
};
} // namespace sycl
} // namespace cl

The compiler will also generate an OpenCL kernel using the function marked with
the ``sycl_kernel`` attribute.
Here is the list of SYCL device compiler expectations with regard to the
function marked with the ``sycl_kernel`` attribute:

- The function must be a template with at least two type template parameters.
The compiler generates an OpenCL kernel and uses the first template parameter
as a unique name for the generated OpenCL kernel. The host application uses
this unique name to invoke the OpenCL kernel generated for the SYCL kernel
specialized by this name and second template parameter ``KernelType`` (which
might be an unnamed function object type).
- The function must have at least one parameter. The first parameter is
required to be a function object type (named or unnamed i.e. lambda). The
compiler uses function object type fields to generate OpenCL kernel
parameters.
- The function must return void. The compiler reuses the body of marked functions to
generate the OpenCL kernel body, and the OpenCL kernel must return `void`.

The SYCL kernel in the previous code sample meets these expectations.
}];
}

def C11NoReturnDocs : Documentation {
let Category = DocCatFunction;
let Content = [{
Expand Down
15 changes: 15 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Expand Up @@ -10112,4 +10112,19 @@ def err_bit_cast_non_trivially_copyable : Error<
"__builtin_bit_cast %select{source|destination}0 type must be trivially copyable">;
def err_bit_cast_type_size_mismatch : Error<
"__builtin_bit_cast source size does not equal destination size (%0 vs %1)">;

// SYCL-specific diagnostics
def warn_sycl_kernel_num_of_template_params : Warning<
"'sycl_kernel' attribute only applies to a function template with at least"
" two template parameters">, InGroup<IgnoredAttributes>;
def warn_sycl_kernel_invalid_template_param_type : Warning<
"template parameter of a function template with the 'sycl_kernel' attribute"
" cannot be a non-type template parameter">, InGroup<IgnoredAttributes>;
def warn_sycl_kernel_num_of_function_params : Warning<
"function template with 'sycl_kernel' attribute must have a single parameter">,
InGroup<IgnoredAttributes>;
def warn_sycl_kernel_return_type : Warning<
"function template with 'sycl_kernel' attribute must have a 'void' return type">,
InGroup<IgnoredAttributes>;

} // end of sema component.
42 changes: 42 additions & 0 deletions clang/lib/Sema/SemaDeclAttr.cpp
Expand Up @@ -6412,6 +6412,45 @@ static void handleOpenCLAccessAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
D->addAttr(::new (S.Context) OpenCLAccessAttr(S.Context, AL));
}

static void handleSYCLKernelAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
// The 'sycl_kernel' attribute applies only to function templates.
const auto *FD = cast<FunctionDecl>(D);
const FunctionTemplateDecl *FT = FD->getDescribedFunctionTemplate();
assert(FT && "Function template is expected");

// Function template must have at least two template parameters.
const TemplateParameterList *TL = FT->getTemplateParameters();
if (TL->size() < 2) {
S.Diag(FT->getLocation(), diag::warn_sycl_kernel_num_of_template_params);
return;
}

// Template parameters must be typenames.
for (unsigned I = 0; I < 2; ++I) {
const NamedDecl *TParam = TL->getParam(I);
if (isa<NonTypeTemplateParmDecl>(TParam)) {
S.Diag(FT->getLocation(),
diag::warn_sycl_kernel_invalid_template_param_type);
return;
}
}

// Function must have at least one argument.
if (getFunctionOrMethodNumParams(D) != 1) {
S.Diag(FT->getLocation(), diag::warn_sycl_kernel_num_of_function_params);
return;
}

// Function must return void.
QualType RetTy = getFunctionOrMethodResultType(D);
if (!RetTy->isVoidType()) {
S.Diag(FT->getLocation(), diag::warn_sycl_kernel_return_type);
return;
}

handleSimpleAttribute<SYCLKernelAttr>(S, D, AL);
}

static void handleDestroyAttr(Sema &S, Decl *D, const ParsedAttr &A) {
if (!cast<VarDecl>(D)->hasGlobalStorage()) {
S.Diag(D->getLocation(), diag::err_destroy_attr_on_non_static_var)
Expand Down Expand Up @@ -6739,6 +6778,9 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
case ParsedAttr::AT_Flatten:
handleSimpleAttribute<FlattenAttr>(S, D, AL);
break;
case ParsedAttr::AT_SYCLKernel:
handleSYCLKernelAttr(S, D, AL);
break;
case ParsedAttr::AT_Format:
handleFormatAttr(S, D, AL);
break;
Expand Down
14 changes: 14 additions & 0 deletions clang/test/SemaSYCL/kernel-attribute-on-non-sycl.cpp
@@ -0,0 +1,14 @@
// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fsycl-is-device -verify %s
// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify -x c++ %s

#ifndef __SYCL_DEVICE_ONLY__
// expected-warning@+7 {{'sycl_kernel' attribute ignored}}
// expected-warning@+8 {{'sycl_kernel' attribute ignored}}
#else
// expected-no-diagnostics
#endif

template <typename T, typename A, int B>
__attribute__((sycl_kernel)) void foo(T P);
template <typename T, typename A, int B>
[[clang::sycl_kernel]] void foo1(T P);
44 changes: 44 additions & 0 deletions clang/test/SemaSYCL/kernel-attribute.cpp
@@ -0,0 +1,44 @@
// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fsycl-is-device -verify %s

// Only function templates
[[clang::sycl_kernel]] int gv2 = 0; // expected-warning {{'sycl_kernel' attribute only applies to function templates}}
__attribute__((sycl_kernel)) int gv3 = 0; // expected-warning {{'sycl_kernel' attribute only applies to function templates}}

__attribute__((sycl_kernel)) void foo(); // expected-warning {{'sycl_kernel' attribute only applies to function templates}}
[[clang::sycl_kernel]] void foo1(); // expected-warning {{'sycl_kernel' attribute only applies to function templates}}

// Attribute takes no arguments
template <typename T, typename A>
__attribute__((sycl_kernel(1))) void foo(T P); // expected-error {{'sycl_kernel' attribute takes no arguments}}
template <typename T, typename A, int I>
[[clang::sycl_kernel(1)]] void foo1(T P);// expected-error {{'sycl_kernel' attribute takes no arguments}}

// At least two template parameters
template <typename T>
__attribute__((sycl_kernel)) void foo(T P); // expected-warning {{'sycl_kernel' attribute only applies to a function template with at least two template parameters}}
template <typename T>
[[clang::sycl_kernel]] void foo1(T P); // expected-warning {{'sycl_kernel' attribute only applies to a function template with at least two template parameters}}

// First two template parameters cannot be non-type template parameters
template <typename T, int A>
__attribute__((sycl_kernel)) void foo(T P); // expected-warning {{template parameter of a function template with the 'sycl_kernel' attribute cannot be a non-type template parameter}}
template <int A, typename T>
[[clang::sycl_kernel]] void foo1(T P); // expected-warning {{template parameter of a function template with the 'sycl_kernel' attribute cannot be a non-type template parameter}}

// Must return void
template <typename T, typename A>
__attribute__((sycl_kernel)) int foo(T P); // expected-warning {{function template with 'sycl_kernel' attribute must have a 'void' return type}}
template <typename T, typename A>
[[clang::sycl_kernel]] int foo1(T P); // expected-warning {{function template with 'sycl_kernel' attribute must have a 'void' return type}}

// Must take at least one argument
template <typename T, typename A>
__attribute__((sycl_kernel)) void foo(); // expected-warning {{function template with 'sycl_kernel' attribute must have a single parameter}}
template <typename T, typename A>
[[clang::sycl_kernel]] void foo1(T t, A a); // expected-warning {{function template with 'sycl_kernel' attribute must have a single parameter}}

// No diagnostics
template <typename T, typename A>
__attribute__((sycl_kernel)) void foo(T P);
template <typename T, typename A, int I>
[[clang::sycl_kernel]] void foo1(T P);

0 comments on commit c094e7d

Please sign in to comment.