Improve Function performance #489

galchinsky opened this Issue · 5 comments

3 participants


This is a feature request or an RFC.
Current C++ implementations of std::function usually hold small closures in the functor itself to avoid heap allocation. As far as I can see, Clay's Function always allocates on the heap, and this could be changed.


Modern allocators (like tcmalloc) are very efficient at allocating small objects, and Clay should probably have a core library that is easy to maintain rather than highly optimized.


The representation of Function could be improved and retain a high-level, easy-to-maintain structure by being a variant Function (FunctionWithSmallClosure, FunctionWithHeapClosure).


There's a simpler solution:

record SmallInPlace (
    data: Array[Byte, TypeSize(RawPointer)],

record LargeInHeap (
   data: RawPointer,

// not generic
variant MemoryHolderSmallInPlaceLargeInHeap (SmallInPlace, LargeInHeap);

allocateMemorySmallInPlaceLargeInHeap(T): MemoryHolderSmallInPlaceLargeInHeap =
    if (TypeSize(T) <= TypeSize(SmallInPlace().data))

and then

record Function[In, Out] (
    obj: MemoryHolderSmallInPlaceLargeInHeap,

Function has only one implementation.
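For comparison, here is a minimal C++ sketch of the same small-in-place / large-in-heap idea. It is not Clay's Function, and the names `MemoryHolder`, `target`, `storedInPlace` and `BigClosure` are made up for illustration. It assumes a move-less, copy-less wrapper to keep things short.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical non-generic holder: a pointer-sized buffer that contains
// either the closure itself or a pointer to a heap-allocated closure.
struct MemoryHolder {
    alignas(void*) unsigned char data[sizeof(void*)];
    bool inHeap;

    // Address of the stored closure. This is the extra branch paid on each call.
    void* target() {
        if (!inHeap) return data;
        void* p;
        std::memcpy(&p, data, sizeof p);
        return p;
    }
};

template <typename Out, typename... In>
class Function {
public:
    template <typename F>
    explicit Function(F f) : call_(&invoke<F>) {
        // Pick the storage strategy at compile time from F's size and alignment.
        store(std::move(f), std::integral_constant<bool, fitsInPlace<F>()>());
    }
    Function(const Function&) = delete;             // copying omitted in this sketch
    Function& operator=(const Function&) = delete;
    ~Function() { destroy_(obj_.target()); }

    Out operator()(In... args) {
        return call_(obj_.target(), std::forward<In>(args)...);
    }
    bool storedInPlace() const { return !obj_.inHeap; }

private:
    template <typename F>
    static constexpr bool fitsInPlace() {
        return sizeof(F) <= sizeof(void*) && alignof(F) <= alignof(void*);
    }
    template <typename F>
    void store(F f, std::true_type) {               // small closure: construct in the buffer
        new (obj_.data) F(std::move(f));
        obj_.inHeap = false;
        destroy_ = &destroyInPlace<F>;
    }
    template <typename F>
    void store(F f, std::false_type) {              // large closure: keep only a pointer
        F* p = new F(std::move(f));
        std::memcpy(obj_.data, &p, sizeof p);
        obj_.inHeap = true;
        destroy_ = &destroyInHeap<F>;
    }
    template <typename F>
    static Out invoke(void* p, In... args) {
        return (*static_cast<F*>(p))(std::forward<In>(args)...);
    }
    template <typename F>
    static void destroyInPlace(void* p) { static_cast<F*>(p)->~F(); }
    template <typename F>
    static void destroyInHeap(void* p) { delete static_cast<F*>(p); }

    MemoryHolder obj_;
    Out (*call_)(void*, In...);
    void (*destroy_)(void*);
};

// Example closure that is larger than a pointer, so it goes to the heap.
struct BigClosure {
    long long v[4];
    int operator()(int x) const { return static_cast<int>(v[0] + v[3]) + x; }
};
```

A lambda capturing a single int fits in the buffer and `storedInPlace()` reports true; `BigClosure` takes the heap path. The only generic parts are the three function pointers' targets, so the holder itself is instantiated once.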

Note that this implementation (as well as the variant Function implementation) won't always be faster than the current implementation, because malloc is cheap but branch misprediction isn't.


Indeed, factoring out the memory holder is a good idea to avoid needless instantiation. I would guess though that, even if you have a fast malloc, locality and heap size efficiency would end up being bigger factors than branch misprediction in a larger application. That's why libc++ favors size over speed and large C++ projects like LLVM and WebKit make heavy use of custom in-place SmallVector/SmallString/SmallDenseMap/etc. containers. In the case of Function, there's an indirect call to an underlying function pointer anyway, which will probably be opaque to the branch predictor no matter what.


Construction usually happens less often than invocation. That's why an object's cache friendliness often matters more than some extra creation overhead.
