Hybrid MPI (HMPI) is a research library for optimizing intra-node message-passing communication on modern multi-core systems.
To get started with HMPI, see the guide on the wiki.
The research behind HMPI is described in several publications. The most recent is Andrew Friedley's Ph.D. thesis:
Shared Memory Optimizations for Distributed Memory Programming Models
Publications include:
Hybrid MPI: Efficient Message Passing for Multi-core Systems
The paper above describes HMPI, particularly its current process-based design. The works below are based on an earlier thread-based design.
Ownership Passing: Efficient Distributed Memory Programming on Multi-core Systems
Compiling MPI for Many-Core Systems
Related work on collectives was also done using HMPI: