@aelovikov-intel
Contributor

  • std::function is affecting compile time too much.
  • Our queue::submit is a synchronous operation, so we don't need to make any copy of the callable. std::function_ref might be a good choice here, but that's C++26.
  • Convert the typed CGFO (command group function object) into a type-erased version as early as possible to limit the number of template instantiations the front end (FE) needs to perform.
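The idea in the bullets above can be sketched in C++17 as a non-owning, type-erased callable reference. This is a minimal sketch under stated assumptions, not the PR's actual code: `handler` is a stand-in for `sycl::handler`, and `cgf_ref`, `submit`, and `demo` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for sycl::handler so the sketch builds without SYCL.
struct handler {
  int commands = 0;
};

// Non-owning, type-erased reference to a callable invocable as f(handler&).
// Illustrative names only; not the identifiers used in the PR.
class cgf_ref {
  const void *object;                        // points at the caller's callable
  void (*invoker)(const void *, handler &);  // recovers the type, then calls

public:
  template <typename F,
            typename = std::enable_if_t<
                !std::is_same_v<std::decay_t<F>, cgf_ref> &&
                std::is_invocable_r_v<void, const std::remove_reference_t<F> &,
                                      handler &>>>
  cgf_ref(F &&f)
      : object(std::addressof(f)),
        invoker([](const void *obj, handler &h) {
          (*static_cast<const std::remove_reference_t<F> *>(obj))(h);
        }) {}

  void operator()(handler &h) const { invoker(object, h); }
};

// Non-template entry point: compiled once, whatever the lambda type is.
int submit(cgf_ref cgf) {
  handler h;
  cgf(h);  // synchronous call, so the referenced callable is still alive
  return h.commands;
}

int demo() {
  int x = 41;
  int result = submit([&](handler &h) { h.commands = x + 1; });
  assert(result == 42);
  return result;
}
```

Only the small converting constructor is instantiated per callable type; everything behind `submit` sees a single concrete type. Because the wrapper stores just an address, it is only safe while the call is synchronous, which is the property the description relies on. The sketch assumes the callable is an object (for example a lambda); a plain function passed by reference would need separate handling.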

Comment on lines +2862 to +2864
Contributor Author

@steffenlarsen, any idea if

/// NOTE: Function is dependent to prevent the fallback kernels from
/// materializing without the use of the function.
is somehow related?

Contributor

The "fallback kernel" in this case is not referring to fallback asserts. It's because the implementation of ext_oneapi_memcpy2d has a kernel it uses if the operation isn't natively supported, but we don't want that kernel to pop up unless the user calls ext_oneapi_memcpy2d in their code, so we make it dependent.
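The mechanism described above can be illustrated in plain C++: a function template's body is only instantiated when the function is used, so anything it depends on stays dormant until then. The following is a minimal sketch under stated assumptions; `fallback_kernel`, `queue_like`, and `memcpy2d_like` are hypothetical names, and the `static_assert` is only a tripwire that makes accidental instantiation visible in this demo (a real fallback kernel would be valid code, just unwanted).

```cpp
#include <cassert>
#include <type_traits>

template <typename> struct always_false : std::false_type {};

// Hypothetical stand-in for the fallback kernel. Instantiating it is an error
// here, so the program only compiles if the kernel is never materialized.
template <typename T> struct fallback_kernel {
  static_assert(always_false<T>::value, "fallback kernel was materialized");
  static int launch() { return 1; }
};

struct queue_like {
  // Dependent on Dummy: the body is instantiated only if this is called.
  template <typename Dummy = void> int memcpy2d_like() {
    return fallback_kernel<Dummy>::launch();
  }

  int unrelated() const { return 0; }
};

int demo() {
  queue_like q;
  // memcpy2d_like is never called, so fallback_kernel is never instantiated.
  assert(q.unrelated() == 0);
  return q.unrelated();
}
```

If `memcpy2d_like` were an ordinary member function referring to a concrete `fallback_kernel<void>`, the kernel would be instantiated as soon as the class definition was parsed, whether or not any caller used it.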

@aelovikov-intel aelovikov-intel added the run-perf-tests Run performance tests in pre-commit (normally part of post-commit only) label Jan 16, 2025
Comment on lines 172 to 173
// TODO: Is that true?
// As such, we know that it can't be a [member] function pointer.
Contributor Author

@gmlueck, is this true?

Contributor

In general, a member function pointer cannot be called via operator() because the compiler also needs a pointer to the object. The syntax for calling through a member function pointer pmem is (obj.*pmem)(), which is different from operator(). Is that your question?

I think the spec would technically allow the command group function object to be a plain function pointer or a plain (non-member) function because these can be called via operator(). However, this seems of limited use because you wouldn't be able to capture any local variables, so it would be difficult, for example, to pass parameters to the kernel. I suppose it might make sense if you were enqueuing a kernel that didn't take any parameters. Or, you could store the kernel parameter values in a global variable, and read the variable from the function (yuck!).
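The distinction drawn in this reply can be checked with standard type traits. This is a minimal sketch, assuming a stand-in `handler` type in place of `sycl::handler`; `recorder`, `plain_cgf`, and `demo` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for sycl::handler.
struct handler {
  int commands = 0;
};

struct recorder {
  void record(handler &h) { ++h.commands; }
};

// A plain (non-member) function with the command-group signature.
void plain_cgf(handler &h) { h.commands += 10; }

using pmem_t = void (recorder::*)(handler &);
using pfn_t = void (*)(handler &);

// A plain function pointer is invocable with just a handler&.
static_assert(std::is_invocable_v<pfn_t, handler &>);
// A member function pointer is not: it also needs an object.
static_assert(!std::is_invocable_v<pmem_t, handler &>);
static_assert(std::is_invocable_v<pmem_t, recorder &, handler &>);

int demo() {
  handler h;
  recorder obj;

  pmem_t pmem = &recorder::record;
  (obj.*pmem)(h);  // parentheses are required: the call binds tighter than .*
  assert(h.commands == 1);

  pfn_t pfn = &plain_cgf;
  pfn(h);  // ordinary call syntax works for a plain function pointer
  return h.commands;
}
```

Since `plain_cgf` cannot capture anything, any kernel arguments would have to come from globals, which is the limitation the reply points out.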


void operator()(sycl::handler &cgh) const { invoker_f(object, cgh); }
};

Contributor

When I first saw this I was wondering if we could use std::invocable instead, but I think that'd change the ABI, plus it might have the same too-much-templating problem as std::function.

Contributor Author

I don't see how it's applicable here. std::invocable is a concept, not a data type, so it's more like the std::is_invocable_r trait that we are already using.

@aelovikov-intel aelovikov-intel merged commit 8af1eb3 into intel:sycl Jan 22, 2025
20 checks passed
@aelovikov-intel aelovikov-intel deleted the cgfo branch January 22, 2025 16:38