Share code registry across MPI processes #72
Comments
Initial work to centralize and unify the memory allocation, including the executable buffer under Microsoft Windows. This is also related to issue #72 (Share code registry across MPI processes).
There are ways to use MPI for this that will make it portable without OS dependencies, if you are willing to assume MPI as a dependency.
I was planning to use shm_open (and the "equivalent" under Windows). This way I get full control and, for example, the option to make the code registry persistent (map an actual file).

Since you mention MPI: my main motivation for this issue was/is to lower the memory consumption on a per-node basis. However, due to internal restructuring, the size of the code registry has meanwhile dropped to only ~12 MB per process including a "typical" amount of kernels (12 MB including actual code size; btw. a typical kernel's code size is perhaps around 4 KB). So this issue is not as urgent as it was a while ago (in the early days, the registry was ~5x larger). Looking at KNL, I think it is fine there too, since the registry is sort of latency-bound and there is enough DDR4 memory (64 ranks on a single system would only[?] use ~768 MB).
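For illustration, here is a minimal POSIX sketch of what such an OS-level approach could look like. The shared-memory name, the size constant, and the overall structure are assumptions made for this sketch only, not LIBXSMM's actual implementation:

```c
/* Sketch: sharing a per-node buffer between processes via POSIX shared memory
 * (shm_open/mmap). REGISTRY_SHM_NAME and REGISTRY_SIZE are hypothetical. */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>

#define REGISTRY_SHM_NAME "/code_registry"  /* hypothetical object name */
#define REGISTRY_SIZE (12 << 20)            /* ~12 MB, per the comment above */

int main(void) {
  /* The first process creates the object; later processes open the same name. */
  int fd = shm_open(REGISTRY_SHM_NAME, O_CREAT | O_RDWR, 0600);
  if (fd < 0) { perror("shm_open"); return 1; }
  if (ftruncate(fd, REGISTRY_SIZE) != 0) { perror("ftruncate"); return 1; }

  /* MAP_SHARED makes the pages visible to all processes mapping this object;
   * PROT_EXEC would additionally be needed where JIT-generated code must be
   * executable directly from the mapping (subject to OS policy). */
  void *buf = mmap(NULL, REGISTRY_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
  if (buf == MAP_FAILED) { perror("mmap"); return 1; }

  /* ... place/look up registry entries in buf ... */

  munmap(buf, REGISTRY_SIZE);
  close(fd);
  /* shm_unlink(REGISTRY_SHM_NAME) removes the object once all users are done;
   * keeping it, or mapping an actual file instead, is what would make the
   * registry persistent as mentioned above. */
  return 0;
}
```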
@hfp Yeah, MPI_Comm_split_type with MPI_COMM_TYPE_SHARED is how you get a communicator for every node, and thus how you know what to pass to MPI_Win_allocate_shared. It is the MPI-3 portable way to get a shared-memory slab on every node. There are, of course, other ways to achieve this that do not require MPI.
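A minimal sketch of that MPI-3 route, for reference (the 12 MB slab size is simply taken from the registry size mentioned above; rank 0 on each node contributing the whole slab is one common convention, not a requirement):

```c
/* Sketch: per-node communicator via MPI_Comm_split_type, then one
 * node-shared slab via MPI_Win_allocate_shared. */
#include <mpi.h>
#include <stddef.h>

int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);

  /* One communicator per shared-memory domain (i.e., per node). */
  MPI_Comm node_comm;
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                      MPI_INFO_NULL, &node_comm);

  int node_rank;
  MPI_Comm_rank(node_comm, &node_rank);

  /* Rank 0 on each node contributes the whole slab; other ranks contribute
   * 0 bytes and query rank 0's base address afterwards. */
  MPI_Aint size = (node_rank == 0) ? (MPI_Aint)(12 << 20) : 0;
  void *base = NULL;
  MPI_Win win;
  MPI_Win_allocate_shared(size, 1, MPI_INFO_NULL, node_comm, &base, &win);

  MPI_Aint qsize;
  int disp_unit;
  MPI_Win_shared_query(win, 0, &qsize, &disp_unit, &base);
  /* base now points at the node-shared slab in every process on the node. */

  MPI_Win_free(&win);
  MPI_Comm_free(&node_comm);
  MPI_Finalize();
  return 0;
}
```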
Moved this issue to https://github.com/hfp/libxsmm/wiki/Development#longer-term-issues.
Share the code registry across (MPI-)processes to lower the total/per-process memory consumption, or to allow for a larger registry while still saving memory. This work does not introduce a dependency on the Message Passing Interface (MPI); rather, it uses OS primitives to achieve this effect when running under MPI.