Optional use of tbbmalloc_proxy (to speedup macOS)? #126

@wds15

The Intel TBB library provided by RcppParallel has been used in RStan for a while now. In benchmarks we found that, specifically on macOS, using the tbbmalloc_proxy library speeds up Stan programs by ~20%. Loading tbbmalloc_proxy replaces all calls to the system malloc with the TBB replacement. The upside is that no source code needs to change at all in order to benefit from the TBB-provided malloc, which is designed to work well with threaded programs (see here for details).
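To illustrate the "no source changes" point: one way the TBB documentation describes for activating the proxy is to preload it through the dynamic loader, using `DYLD_INSERT_LIBRARIES` on macOS or `LD_PRELOAD` on Linux. The sketch below only builds the environment for such a launch. The helper name `preload_env` and any library path passed to it are hypothetical, and this is not necessarily how RStan or RcppParallel load the library.

```python
import os
import sys


def preload_env(proxy_path, platform=sys.platform, base_env=None):
    """Return a copy of the environment that asks the loader to preload proxy_path.

    proxy_path is an assumed location of the tbbmalloc_proxy shared library,
    e.g. a libtbbmalloc_proxy.dylib file on macOS.
    """
    env = dict(os.environ if base_env is None else base_env)
    if platform == "darwin":
        var = "DYLD_INSERT_LIBRARIES"  # macOS dynamic loader
    elif platform.startswith("linux"):
        var = "LD_PRELOAD"  # Linux dynamic loader
    else:
        raise ValueError(f"no preload variable known for platform {platform!r}")
    # Put the proxy first and keep anything that was already being preloaded.
    existing = env.get(var)
    env[var] = proxy_path if not existing else f"{proxy_path}:{existing}"
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when starting the program to be benchmarked, so the same binary can be timed with and without the TBB malloc.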

Benchmarks with Stan programs showed clear speedups of ~20% on macOS when using the TBB malloc, while other platforms did not really gain in speed. This is why the TBB malloc is only enabled for Stan programs on macOS, where the speed gains are really nice.

So I wonder whether there would be interest in enabling (maybe optionally) the loading of tbbmalloc_proxy with RcppParallel.

If that is an option, I am happy to provide a PR for such an optional feature. If there is anything to consider for such a feature, please let me know.
