cuda & MPI_THREAD_MULTIPLE #627

@dribbroc

Description

This may not be a real issue, but it is at least unexpected behaviour.
Tested with CUDA 7 and Open MPI 1.8.5.

Background: On hosts equipped with only one GPU, you hardly ever call cudaSetDevice in your code, as all CUDA calls default to the only existing device (0).

When allocating CUDA device memory in the host thread and calling MPI_Send / MPI_Recv from other threads, MPI crashes with `CUDA: Error in cuMemcpy`.
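A minimal sketch of the failing pattern (hypothetical reproducer, assuming a CUDA-aware Open MPI build, two ranks, and one GPU per host; error handling omitted for brevity):

```c
#include <mpi.h>
#include <cuda_runtime.h>
#include <pthread.h>
#include <stddef.h>

static void *d_buf;   /* device buffer, allocated by the host thread */
static int rank;

static void *worker(void *arg) {
    /* No cudaSetDevice() here: this thread relies on the documented
     * default of device 0 -- this is where the crash shows up. */
    if (rank == 0)
        MPI_Send(d_buf, 1024, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
    else
        MPI_Recv(d_buf, 1024, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    return NULL;
}

int main(int argc, char **argv) {
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaMalloc(&d_buf, 1024);   /* allocated on device 0 in the host thread */

    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);  /* MPI call from another thread */
    pthread_join(t, NULL);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```

Run with something like `mpirun -np 2 ./repro` on the affected versions; the send/recv on device memory from the worker thread fails, while the same calls from the host thread work.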

A "workaround" is to set the CUDA device explicitly in every thread, which is definitely good advice in general.
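The workaround amounts to pinning the device at the start of each thread that touches CUDA or passes device pointers to MPI (sketch only; device 0 assumed, matching the single-GPU setup above):

```c
#include <cuda_runtime.h>

static void *worker(void *arg) {
    /* Explicitly select the GPU that owns the buffer before any
     * MPI call that dereferences device memory. */
    cudaSetDevice(0);

    /* ... MPI_Send / MPI_Recv on device memory as in the report ... */
    return NULL;
}
```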

Nevertheless, the CUDA documentation states that every thread's current device defaults to device 0 unless changed by the user.
This holds true for CUDA API calls made from different threads, but Open MPI seems to do some dark magic under the hood that changes this default device.
