Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward the explicitly given result_type in the hpx invoke #2404

Merged

Conversation

Projects
None yet
4 participants
@atrantan
Copy link
Contributor

commented Nov 22, 2016

This pull request tries to solve the issue when result_of cannot deduce the return type of the callable object, i.e. it happens when the callable object is an HPX_DEVICE lambda.

@K-ballo

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

What is the actual problem (why can't a return type be deduced? why would it have to be deduced in the first place? how is it related to HPX_DEVICE?), and how do these changes address it (seem to be introducing some kind of void_guard)?

@atrantan

This comment has been minimized.

Copy link
Contributor Author

commented Nov 22, 2016

In https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/compute/cuda/detail/launch.hpp line 69
hpx::util::invoke_fused(f_, args_);
Cuda Clang fails in compiling this when f_ is a device lambda

@K-ballo

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

Cuda Clang fails in compiling this when f_ is a device lambda

How? Why?

@atrantan

This comment has been minimized.

Copy link
Contributor Author

commented Nov 22, 2016

Here is the error:

/home/atrantan/hpx-fork/hpx/util/invoke_fused.hpp:83:5: note: candidate template ignored: substitution failure [with F = (lambda at /home/atrantan/hpx-fork/work/cudaclang_fix/cudaclang_fix.cu:58:9) &, Tuple = hpx::util::tuple<int *, int *, int *> &]: no type named 'type' in 'hpx::util::detail::fused_result_of<(lambda at /home/atrantan/hpx-fork/work/cudaclang_fix/cudaclang_fix.cu:58:9) &(hpx::util::tuple<int *, int *, int *> &)>'
invoke_fused(F&& f, Tuple&& t)

@K-ballo

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

That looks like a note, a bit of explanation that might be related to the error, not the actual error. While the error message itself would certainly be useful and appreciated, I would like to know what and where the issue has determined to be, and how the changes proposed here address it.

@hkaiser

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

@K-ballo the issue is that the types used for invoking the invoke_fused on the host are different from invoking it on the device. Clang/Cuda tries to instantiate both (erroneously, I believe, as the operator()() is explicitly marked as device only, see: https://github.com/atrantan/hpx/blob/master/hpx/compute/cuda/detail/launch.hpp#L67). The patch proposed here adds a workaround for this case as it circumvents the result type computation which is causing trouble in the first place.

@K-ballo

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

The patch proposed here adds a workaround for this case as it circumvents the result type computation which is causing trouble in the first place.

We already have a way to circumvent the result type computation, which is what the code seems to be using in the first place, hence my confusion.

  • why can't a return type be deduced?
  • why would it have to be deduced in the first place?
@K-ballo

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

We already have a way to circumvent the result type computation

Actually, we don't anymore. We did have that initially and then we decided to go the other direction since it was hurting compilation time.

If we must go there again, then I would suggest to avoid the duplication and several steps of deduction proposed here. Instead, have invoke_impl take the return type as an explicit template argument, then have invoke feed it result_of and invoke_r feed it R.

@hkaiser

This comment has been minimized.

Copy link
Member

commented Nov 22, 2016

If we must go there again, then I would suggest to avoid the duplication and several steps of deduction as proposed here. Instead, have invoke_impl take the return type as an explicit template argument, then have invoke feed it result_of and invoke_r feed it R.

Fair point. That's a much better solution avoiding the code duplication.

@mcopik

This comment has been minimized.

Copy link
Contributor

commented Nov 23, 2016

@hkaiser Why types are different on host and device? Isn't the case in CUDA that we always operate on plain pointers and iterator/pointer structure is the same in both compilation modes?

I have have dozens of problems with device compilers (HC, SYCL) with them not being able to deduce type in invoke and invoke_fused. Most of them were related to problem of losing constness and lambdas called on device not having the mutable attribute.

@atrantan

This comment has been minimized.

Copy link
Contributor Author

commented Nov 23, 2016

If we must go there again, then I would suggest to avoid the duplication and several steps of deduction proposed here. Instead, have invoke_impl take the return type as an explicit template argument, then have invoke feed it result_of and invoke_r feed it R.

Done in commit bdc82a0

@hkaiser

This comment has been minimized.

Copy link
Member

commented Nov 23, 2016

@atrantan please fix the inspect problem reported on CircleCI (https://5435-4455628-gh.circle-artifacts.com/0/tmp/circle-artifacts.esgRjWz/hpx_inspect_report.html). The patch looks good otherwise, I'd like to merge it.

@atrantan atrantan force-pushed the atrantan:invoke_with_explicit_result_type branch from bdc82a0 to 780c954 Nov 24, 2016

@hkaiser hkaiser merged commit d22a936 into STEllAR-GROUP:master Nov 24, 2016

1 check passed

ci/circleci Your tests passed on CircleCI!
Details

@atrantan atrantan deleted the atrantan:invoke_with_explicit_result_type branch Nov 28, 2016

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 5, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 5, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 6, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 6, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 7, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 8, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 9, 2019

K-ballo added a commit to K-ballo/hpx that referenced this pull request Feb 11, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.