This is a feature request, or an RFC.
Current C++ implementations of std::function usually store small closures inside the function object itself to avoid a heap allocation. As far as I can see, Clay's Function always allocates on the heap, and this could be changed.
That said, modern allocators (like tcmalloc) are very efficient at allocating small objects, and Clay should probably have a core library that is easy to maintain rather than highly optimized.
The representation of Function could be improved while retaining a high-level, easy-to-maintain structure by making it a variant: variant Function (FunctionWithSmallClosure, FunctionWithHeapClosure).
There's a simpler solution:
record SmallInPlace (
data: Array[Byte, TypeSize(RawPointer)],
record LargeInHeap (
// not generic
variant MemoryHolderSmallInPlaceLargeInHeap (SmallInPlace, LargeInHeap);
allocateMemorySmallInPlaceLargeInHeap(T): MemoryHolderSmallInPlaceLargeInHeap =
if (TypeSize(T) <= TypeSize(SmallInPlace().data))
record Function[In, Out] (
Function has only one implementation.
Note that this implementation (like the variant Function implementation) won't always be faster than the current implementation, because malloc is cheap but branch misprediction isn't.
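For comparison, here is a rough C++ sketch of the same idea: a non-generic memory holder that keeps small payloads in place and large ones on the heap, with a single Function built on top of it. This is only an illustration under assumptions, not Clay's or any C++ library's actual implementation. The names MemoryHolder, kInlineSize, storedInPlace, invokeImpl and destroyImpl are hypothetical, the in-place buffer is assumed to be one pointer wide (as in the Clay fragment above), and copying is left out to keep it short.

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Non-generic storage (hypothetical name): payloads that fit into a
// pointer-sized buffer live in place, everything else goes to the heap.
class MemoryHolder {
public:
    static constexpr std::size_t kInlineSize = sizeof(void*);

    MemoryHolder(std::size_t size, std::size_t align)
        : inPlace_(size <= kInlineSize && align <= alignof(void*)) {
        if (!inPlace_) {
            heap_ = ::operator new(size);  // raw, uninitialized block
        }
    }
    MemoryHolder(const MemoryHolder&) = delete;
    MemoryHolder& operator=(const MemoryHolder&) = delete;
    ~MemoryHolder() {
        if (!inPlace_) {
            ::operator delete(heap_);
        }
    }

    // The only branch on the "small or large" question is here.
    void* data() { return inPlace_ ? static_cast<void*>(inline_) : heap_; }
    bool inPlace() const { return inPlace_; }

private:
    union {
        alignas(void*) unsigned char inline_[kInlineSize];
        void* heap_;
    };
    bool inPlace_;
};

template <typename Signature>
class Function;

// One Function implementation; only the two small helpers below are
// instantiated per closure type.
template <typename Out, typename... In>
class Function<Out(In...)> {
public:
    template <typename F>
    explicit Function(F f)
        : holder_(sizeof(F), alignof(F)),
          invoke_(&invokeImpl<F>),
          destroy_(&destroyImpl<F>) {
        static_assert(alignof(F) <= alignof(std::max_align_t),
                      "over-aligned closures are not handled in this sketch");
        new (holder_.data()) F(std::move(f));  // construct closure in storage
    }
    Function(const Function&) = delete;
    Function& operator=(const Function&) = delete;
    ~Function() { destroy_(holder_.data()); }

    // The call itself is an indirect call through a function pointer.
    Out operator()(In... args) {
        return invoke_(holder_.data(), std::forward<In>(args)...);
    }
    bool storedInPlace() const { return holder_.inPlace(); }

private:
    template <typename F>
    static Out invokeImpl(void* p, In... args) {
        return (*static_cast<F*>(p))(std::forward<In>(args)...);
    }
    template <typename F>
    static void destroyImpl(void* p) {
        static_cast<F*>(p)->~F();
    }

    MemoryHolder holder_;
    Out (*invoke_)(void*, In...);
    void (*destroy_)(void*);
};
```

With this layout, a closure capturing a single int ends up in the in-place buffer, while one capturing an array of eight longs falls back to the heap. The holder itself is not a template, so its code is shared by every Function instantiation.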
Indeed, factoring out the memory holder is a good idea to avoid needless instantiation. I would guess though that, even if you have a fast malloc, locality and heap size efficiency would end up being bigger factors than branch misprediction in a larger application. That's why libc++ favors size over speed and large C++ projects like LLVM and WebKit make heavy use of custom in-place SmallVector/SmallString/SmallDenseMap/etc. containers. In the case of Function, there's an indirect call to an underlying function pointer anyway, which will probably be opaque to the branch predictor no matter what.
Construction is usually rarer than use. That's why the cache friendliness of an object often matters more than some creation overhead.