
Less restrictive TeamPolicy reduction on Cuda #286

Closed
crtrott opened this issue May 18, 2016 · 1 comment

crtrott commented May 18, 2016

Allow arbitrary team sizes and vector_length > 1 for reductions using TeamPolicy.

@crtrott crtrott added the Feature Request Create new capability; will potentially require voting label May 18, 2016
@crtrott crtrott added this to the Spring 2016 milestone May 18, 2016
@crtrott crtrott self-assigned this May 18, 2016
@crtrott crtrott closed this as completed May 25, 2016
@mndevec

mndevec commented May 31, 2016

My shared memory allocation seems to be failing in this case on GPUs for parallel_reduce.

I set the shared memory size to:
size_t team_shmem_size (int team_size) const {
  return 16384;
}

Then I try to allocate:
char *all_shared_memory = (char *) (teamMember.team_shmem().get_shmem(16384));

This works for parallel_for, but when I use parallel_reduce, the returned pointer appears to be null. I am not sure whether it is really related to vector size, since it also fails when vector_size = 1.
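For reference, a minimal self-contained sketch of the pattern described above (assuming the Kokkos TeamPolicy API of that era; the league size of 32, the functor name, and the success-counting reduction body are illustrative, not taken from the reporter's code):

```cpp
#include <Kokkos_Core.hpp>

// Hedged sketch: the functor requests 16 KiB of team-level scratch memory via
// team_shmem_size(), then pulls it from the scratch pad inside operator().
// Per the report, the allocation succeeds under parallel_for but the pointer
// comes back null under parallel_reduce on Cuda.
struct ShmemReduceSketch {
  typedef Kokkos::TeamPolicy<>::member_type member_type;

  // Request 16 KiB of team shared (scratch) memory per team.
  size_t team_shmem_size (int /*team_size*/) const { return 16384; }

  KOKKOS_INLINE_FUNCTION
  void operator() (const member_type& teamMember, double& sum) const {
    char* all_shared_memory =
        (char*)(teamMember.team_shmem().get_shmem(16384));
    // Count teams whose scratch allocation succeeded.
    if (all_shared_memory != nullptr) sum += 1.0;
  }
};

int main (int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    double successes = 0.0;
    // AUTO lets Kokkos pick the team size for the active backend.
    Kokkos::parallel_reduce(Kokkos::TeamPolicy<>(32, Kokkos::AUTO),
                            ShmemReduceSketch(), successes);
    // Under the reported bug, successes would be 0 on Cuda.
  }
  Kokkos::finalize();
  return 0;
}
```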
