Improve Function performance #489

Open
galchinsky opened this Issue · 5 comments

3 participants

@galchinsky

This is a feature request or an RFC.
Current C++ implementations of std::function usually store small closures inside the functor object itself to avoid a heap allocation. As far as I can see, Clay's Function always allocates on the heap, and this could be changed.

@stepancheg
Collaborator

Modern allocators (like tcmalloc) are very efficient at allocating small objects, and Clay should probably have a core library that is easy to maintain rather than highly optimized.

@jckarter
Owner

The representation of Function could be improved and retain a high-level, easy-to-maintain structure by being a variant Function (FunctionWithSmallClosure, FunctionWithHeapClosure).

@stepancheg
Collaborator

There's a simpler solution:

record SmallInPlace (
    data: Array[Byte, TypeSize(RawPointer)],
);

record LargeInHeap (
    data: RawPointer,
);

// not generic
variant MemoryHolderSmallInPlaceLargeInHeap (SmallInPlace, LargeInHeap);

allocateMemorySmallInPlaceLargeInHeap(T): MemoryHolderSmallInPlaceLargeInHeap =
    if (TypeSize(T) <= TypeSize(SmallInPlace().data))
        SmallInPlace()
    else
        LargeInHeap(allocateRawMemory(TypeSize(T)));

and then

record Function[In, Out] (
    obj: MemoryHolderSmallInPlaceLargeInHeap,
    ...
);

Function then has only one implementation.

Note that this implementation (like the variant Function implementation) won't always be faster than the current implementation, because malloc is cheap, but branch misprediction isn't.
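
For readers more familiar with C++, the memory holder above could be sketched roughly as follows. This is only an illustration under assumptions, not Clay library code: the names MemoryHolder, kInlineSize, data and onHeap are hypothetical, and the sketch ignores over-aligned types and copying.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical C++ analogue of MemoryHolderSmallInPlaceLargeInHeap.
// It is not a template, so every closure type shares this one class.
class MemoryHolder {
public:
    // Inline capacity: one pointer's worth of bytes, as in the Clay sketch.
    static constexpr std::size_t kInlineSize = sizeof(void*);

    // Reserve `size` bytes: in place if they fit, otherwise on the heap.
    MemoryHolder(std::size_t size, std::size_t align) {
        if (size <= kInlineSize && align <= alignof(void*)) {
            onHeap_ = false;
        } else {
            onHeap_ = true;
            storage_.heap = std::malloc(size);
            if (storage_.heap == nullptr) throw std::bad_alloc();
        }
    }

    ~MemoryHolder() {
        if (onHeap_) std::free(storage_.heap);
    }

    // Copying is left out to keep the sketch short.
    MemoryHolder(const MemoryHolder&) = delete;
    MemoryHolder& operator=(const MemoryHolder&) = delete;

    // This is the branch mentioned above: every access must check the tag.
    void* data() {
        return onHeap_ ? storage_.heap
                       : static_cast<void*>(storage_.inlineBytes);
    }

    bool onHeap() const { return onHeap_; }

private:
    union Storage {
        unsigned char inlineBytes[kInlineSize];  // small payload lives here
        void* heap;                              // large payload lives behind this
    };
    Storage storage_;
    bool onHeap_;
};
```

A caller would construct the holder with sizeof(T) and alignof(T) and then placement-new the closure into data(); destroying the closure itself remains the caller's job in this sketch.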

@jckarter
Owner

Indeed, factoring out the memory holder is a good idea to avoid needless instantiation. I would guess though that, even if you have a fast malloc, locality and heap size efficiency would end up being bigger factors than branch misprediction in a larger application. That's why libc++ favors size over speed and large C++ projects like LLVM and WebKit make heavy use of custom in-place SmallVector/SmallString/SmallDenseMap/etc. containers. In the case of Function, there's an indirect call to an underlying function pointer anyway, which will probably be opaque to the branch predictor no matter what.
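
To make the point about the indirect call concrete, here is a minimal C++ sketch of a type-erased callable that keeps a small closure in place and dispatches through a stored function pointer. The class IntFunction and its members are hypothetical names for illustration, fixed to the signature int(int); it only accepts closures that fit the inline buffer and need no destructor, and it is not how Clay's Function or any std::function implementation is actually written.

```cpp
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical type-erased callable for the signature int(int).
class IntFunction {
public:
    template <typename F>
    IntFunction(F f) {
        // Assumption for this sketch: only small closures are supported.
        static_assert(sizeof(F) <= kInline, "closure too large for inline buffer");
        static_assert(alignof(F) <= alignof(std::max_align_t), "over-aligned closure");
        static_assert(std::is_trivially_destructible<F>::value,
                      "sketch never runs the closure's destructor");
        // Construct the closure directly inside the object: no heap allocation.
        new (buffer_) F(std::move(f));
        // Remember how to call it; F is erased behind this function pointer.
        invoke_ = [](void* p, int x) -> int { return (*static_cast<F*>(p))(x); };
    }

    // Copying would need a stored copy routine; omitted here.
    IntFunction(const IntFunction&) = delete;
    IntFunction& operator=(const IntFunction&) = delete;

    // Every call goes through the function pointer, whatever the storage strategy.
    int operator()(int x) { return invoke_(buffer_, x); }

private:
    static constexpr std::size_t kInline = 2 * sizeof(void*);
    alignas(std::max_align_t) unsigned char buffer_[kInline];
    int (*invoke_)(void*, int);
};
```

Because the closure bytes sit next to the function pointer, a call touches a single object rather than following a pointer to a separate heap block.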

@galchinsky

Construction is usually rarer than invocation. That's why an object's cache friendliness often matters more than some extra creation overhead.
