gnr::forwarder with std::bind -- static_assert(std::is_trivially_copyable<functor_type> ? #3

Closed
user706 opened this issue May 14, 2018 · 22 comments

Comments

@user706
Contributor

user706 commented May 14, 2018

Hi,

Currently gnr::forwarder cannot take a std::bind object, because the static_assert(std::is_trivially_copyable<functor_type>{}, "functor not trivially copyable") in forwarder.hpp (ref) fails for it.

Question: is that static_assert needed?
Thanks.

Details:
If you try the following code
https://github.com/user706/code_generic/blob/master/code/intern/forwarder_ex2.cpp
building as follows:

cd /tmp
git clone https://github.com/user706/code_generic
cd                                   code_generic
mkdir -p build
cd       build
cmake ..       # need at least cmake 3.8.2, which understands CMAKE_CXX_STANDARD of 17 https://cmake.org/cmake/help/v3.8/prop_tgt/CXX_STANDARD.html#prop_tgt:CXX_STANDARD
make -j8
./code/intern/forwarder_ex2

and then change the following snippet https://github.com/user706/code_generic/blob/master/code/intern/forwarder_ex2.cpp#L7
to use gnr::forwarder, as follows

// uncomment _one_ of the two lines below
//template <typename T, std::size_t N = 0> using FF = std::function<T>;
template <typename T, std::size_t N = default_size> using FF = gnr::forwarder<T, N>;  // this only works if commenting out lines with std::bind below, or changing forwarder.hpp

then you'll find that a rebuild with make will fail as follows

In file included from /tmp/code_generic/code/intern/forwarder_ex2.cpp:2:0:
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp: In instantiation of ‘void gnr::forwarder<R(A ...), N, NE>::assign(F&&) [with F = std::_Bind<void (*(int))(int)>; R = void; A = {}; long unsigned int N = 32ul; bool NE = false]’:
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:77:11:   required from ‘gnr::forwarder<R(A ...), N, NE>::forwarder(F&&) [with F = std::_Bind<void (*(int))(int)>; <template-parameter-2-2> = void; R = void; A = {}; long unsigned int N = 32ul; bool NE = false]’
/tmp/code_generic/code/intern/forwarder_ex2.cpp:46:60:   required from here
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:123:5: error: static assertion failed: functor not trivially copyable
     static_assert(std::is_trivially_copyable<functor_type>{},
     ^~~~~~~~~~~~~
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:128:11: error: cannot convert ‘gnr::forwarder<R(A ...), N, NE>::assign(F&&) [with F = std::_Bind<void (*(int))(int)>; R = void; A = {}; long unsigned int N = 32ul; bool NE = false]::<lambda(void*)>’ to ‘void (*)(void*)’ in assignment
     stub_ = [](void* const ptr, A&&... args) noexcept(noexcept(NE)) -> R
            
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp: In instantiation of ‘void gnr::forwarder<R(A ...), N, NE>::assign(F&&) [with F = std::_Bind<std::_Mem_fn<void (Foo::*)(int) const>(Foo, std::_Placeholder<1>)>; R = void; A = {int}; long unsigned int N = 32ul; bool NE = false]’:
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:77:11:   required from ‘gnr::forwarder<R(A ...), N, NE>::forwarder(F&&) [with F = std::_Bind<std::_Mem_fn<void (Foo::*)(int) const>(Foo, std::_Placeholder<1>)>; <template-parameter-2-2> = void; R = void; A = {int}; long unsigned int N = 32ul; bool NE = false]’
/tmp/code_generic/code/intern/forwarder_ex2.cpp:61:72:   required from here
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:123:5: error: static assertion failed: functor not trivially copyable
     static_assert(std::is_trivially_copyable<functor_type>{},
     ^~~~~~~~~~~~~
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:128:11: error: cannot convert ‘gnr::forwarder<R(A ...), N, NE>::assign(F&&) [with F = std::_Bind<std::_Mem_fn<void (Foo::*)(int) const>(Foo, std::_Placeholder<1>)>; R = void; A = {int}; long unsigned int N = 32ul; bool NE = false]::<lambda(void*, int&&)>’ to ‘void (*)(void*, int&&)’ in assignment
     stub_ = [](void* const ptr, A&&... args) noexcept(noexcept(NE)) -> R
            
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp: In instantiation of ‘void gnr::forwarder<R(A ...), N, NE>::assign(F&&) [with F = std::_Bind<std::_Mem_fn<void (Foo::*)(int) const>(const Foo*, std::_Placeholder<1>)>; R = void; A = {int}; long unsigned int N = 32ul; bool NE = false]’:
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:77:11:   required from ‘gnr::forwarder<R(A ...), N, NE>::forwarder(F&&) [with F = std::_Bind<std::_Mem_fn<void (Foo::*)(int) const>(const Foo*, std::_Placeholder<1>)>; <template-parameter-2-2> = void; R = void; A = {int}; long unsigned int N = 32ul; bool NE = false]’
/tmp/code_generic/code/intern/forwarder_ex2.cpp:65:73:   required from here
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:123:5: error: static assertion failed: functor not trivially copyable
     static_assert(std::is_trivially_copyable<functor_type>{},
     ^~~~~~~~~~~~~
/tmp/code_generic/code/extern/generic_cmake/generic/forwarder.hpp:128:11: error: cannot convert ‘gnr::forwarder<R(A ...), N, NE>::assign(F&&) [with F = std::_Bind<std::_Mem_fn<void (Foo::*)(int) const>(const Foo*, std::_Placeholder<1>)>; R = void; A = {int}; long unsigned int N = 32ul; bool NE = false]::<lambda(void*, int&&)>’ to ‘void (*)(void*, int&&)’ in assignment
     stub_ = [](void* const ptr, A&&... args) noexcept(noexcept(NE)) -> R
            
code/intern/CMakeFiles/forwarder_ex2.dir/build.make:62: recipe for target 'code/intern/CMakeFiles/forwarder_ex2.dir/forwarder_ex2.cpp.o' failed
make[2]: *** [code/intern/CMakeFiles/forwarder_ex2.dir/forwarder_ex2.cpp.o] Error 1
CMakeFiles/Makefile2:186: recipe for target 'code/intern/CMakeFiles/forwarder_ex2.dir/all' failed
make[1]: *** [code/intern/CMakeFiles/forwarder_ex2.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

The offending lines are the ones using std::bind: here, here and here.

However if one removes the following static_assert

    static_assert(std::is_trivially_copyable<functor_type>{},
      "functor not trivially copyable");

from code/extern/generic_cmake/generic/forwarder.hpp (ref), then it will compile.

Question: is that static_assert needed?
Thanks.

@user706
Contributor Author

user706 commented May 14, 2018

Basically:

#include <iostream>
#include <functional>

void print_num(int i)
{
    std::cout << i << '\n';
}

int main()
{
    auto bound = std::bind(print_num, 31337);
    static_assert(std::is_trivially_copyable<std::decay_t<decltype(bound)>>{}, 
                  "functor not trivially copyable");

    auto b2 = bound;
    
    return 0;
}

Above the static_assert will issue an error.
Do we need it in code/extern/generic_cmake/generic/forwarder.hpp (ref)?
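For the std::bind case above, one possible workaround is to replace the bind expression with a small hand-written functor. This is only a sketch, not something the library prescribes. BoundPrint, call_bound and last_printed are hypothetical names introduced for illustration; the static_assert is the same check that forwarder.hpp applies.

```cpp
#include <cassert>
#include <type_traits>

// Records the last value passed to print_num, so the effect is observable
// without console output (the original example printed to std::cout).
static int last_printed = 0;

void print_num(int i) { last_printed = i; }

// Hypothetical stand-in for std::bind(print_num, 31337): a plain aggregate
// holding the function pointer and the bound argument.
struct BoundPrint
{
    void (*fn)(int);
    int arg;

    void operator()() const { fn(arg); }
};

// Only scalar members and implicitly defined copy operations and destructor,
// so the type satisfies the check that rejects the std::bind object.
static_assert(std::is_trivially_copyable<BoundPrint>{},
              "functor not trivially copyable");

// Builds the functor, copies it, and invokes the copy.
int call_bound(int value)
{
    BoundPrint bound{print_num, value};
    BoundPrint b2 = bound;
    b2();
    assert(last_printed == value);
    return last_printed;
}
```

Whether a capturing lambda would pass the same check is left to the implementation by the standard, so a named struct is the more predictable choice here.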

@user1095108
Owner

Yes, the static_assert is necessary; it's what distinguishes gnr::forwarder from std::function. You may have noticed that gnr::forwarder has no destructor and does not invoke any destructors. This makes it perform better than std::function. It makes the invocation code compile to a handful of instructions in most cases, but it also makes the static_assert necessary. You see, gnr::forwarder is for speed fiends, who want some of the convenience that comes with std::function, but don't want the cost. There would be no reason for gnr::forwarder to exist if it were not for the static_assert.
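To make the trade-off concrete, here is a heavily simplified sketch of a wrapper in that spirit. It is not gnr::forwarder itself: tiny_forwarder, Scale and demo_scale are hypothetical names, and the signature is fixed to int(int) to keep it short. Because the stored functor must be trivially copyable (which also implies a trivial destructor), the wrapper needs no destructor and can be copied as plain bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical, simplified wrapper for callables of signature int(int).
template <std::size_t N>
class tiny_forwarder
{
    alignas(std::max_align_t) unsigned char store_[N];
    int (*stub_)(void*, int) = nullptr;

public:
    template <typename F>
    tiny_forwarder(F f)
    {
        static_assert(sizeof(F) <= N, "functor too large");
        static_assert(alignof(F) <= alignof(std::max_align_t),
                      "functor over-aligned");
        // Trivially copyable implies trivially destructible, so nothing is
        // lost by never running F's destructor.
        static_assert(std::is_trivially_copyable<F>{},
                      "functor not trivially copyable");

        ::new (static_cast<void*>(store_)) F(f);

        // The captureless lambda "remembers" F and converts to a plain
        // function pointer.
        stub_ = [](void* p, int x) -> int { return (*static_cast<F*>(p))(x); };
    }

    // No destructor and no user-written copy operations: the implicitly
    // generated ones just copy the buffer and the stub pointer.

    int operator()(int x)
    {
        assert(stub_ != nullptr);
        return stub_(store_, x);
    }
};

// Example functor: trivially copyable, multiplies by a stored factor.
struct Scale
{
    int factor;
    int operator()(int v) const { return v * factor; }
};

int demo_scale()
{
    tiny_forwarder<16> f{Scale{3}};
    tiny_forwarder<16> g = f;  // byte-wise copy of buffer and stub
    return g(14);
}
```

With this layout a call is one indirect jump through stub_, and construction, copy and destruction involve no bookkeeping at all.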

@user706
Contributor Author

user706 commented May 15, 2018

Hmm... ok I think I see where you're coming from...

But: Does a missing destructor produce faster code, even for trivially copyable objects, that don't really need a destructor?

Or is the static_assert there to avoid possible leaks (or late deallocation (in the case of smart-pointers being held)) since the destructor (corresponding to placement-new) is not called?

@user1095108
Owner

user1095108 commented May 15, 2018 via email

@user1095108
Owner

user1095108 commented May 15, 2018 via email

@user706
Contributor Author

user706 commented May 20, 2018

I think I know why you demand a type that does not need an explicit destructor.

Because std::function has type erasure. (Here functor_type is erased, as template parameter F is only known in assign(), but not in the class.)

So how does one handle destruction of a previously placement-new constructed type, when one does not know the type.

buffer->~MyType();    // but one does not know MyType anymore

?

The "simplest solution" is to bypass the problem by demanding a type whose destructor does nothing...

@user1095108
Owner

user1095108 commented May 20, 2018

The type of the object you erase is "remembered", since you instantiate a function, that acts as an invoker, i.e. the stub. Similarly you can instantiate functions that act as "deleters" and they remember the type too. Or you can instantiate a static class which remembers the type you erased and store a pointer to it. These are all complications, that affect performance, however. I saw from disassembly, that forwarder produced the fastest code of all alternatives, with caveats.
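As a rough illustration of the "deleter" alternative mentioned here, the sketch below stores a second function pointer that remembers the erased type and runs its destructor. owning_callable, Counted, destroyed and demo_deleter are hypothetical names, the signature is fixed to int(int), and copying is simply disabled to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical wrapper that, unlike a destructor-free design, also
// instantiates a "deleter" stub for the erased type.
template <std::size_t N>
class owning_callable
{
    alignas(std::max_align_t) unsigned char store_[N];
    int (*invoke_)(void*, int) = nullptr;
    void (*destroy_)(void*) = nullptr;

public:
    template <typename F>
    explicit owning_callable(F f)
    {
        static_assert(sizeof(F) <= N, "functor too large");
        ::new (static_cast<void*>(store_)) F(std::move(f));

        // Both stubs are instantiated per F, so each one knows the type.
        invoke_ = [](void* p, int x) -> int { return (*static_cast<F*>(p))(x); };
        destroy_ = [](void* p) { static_cast<F*>(p)->~F(); };
    }

    owning_callable(const owning_callable&) = delete;
    owning_callable& operator=(const owning_callable&) = delete;

    // The extra indirect call that a destructor-free wrapper avoids.
    ~owning_callable() { destroy_(store_); }

    int operator()(int x) { return invoke_(store_, x); }
};

// Counts destructor runs so the deleter's effect can be observed.
static int destroyed = 0;

struct Counted
{
    int factor;
    int operator()(int v) const { return v * factor; }
    ~Counted() { ++destroyed; }
};

int demo_deleter()
{
    int result = 0;
    {
        owning_callable<16> f{Counted{2}};
        result = f(21);
        destroyed = 0;  // ignore temporaries destroyed during construction
    }                   // f leaves scope: destroy_ runs ~Counted exactly once
    assert(destroyed == 1);
    return result;
}
```

The cost is a second pointer per object plus an indirect call on destruction; supporting copies would need a third stub in the same style.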

@user706
Contributor Author

user706 commented May 20, 2018

The type of the object you erase is "remembered", since you instantiate a function, that acts as an invoker, i.e. the stub. Similarly you can instantiate functions that act as "deleters" and they remember the type too. Or you can instantiate a static class which remembers the type you erased and store a pointer to it.

Ah yes, thanks!

These are all complications, that affect performance, however. I saw from disassembly, that forwarder produced the fastest code of all alternatives, with caveats.

Very nice!

However... this thing about the fastest code has caveats (as you write). It is very dependent on

  • platform/architecture (e.g. 64 bit vs 32 bit; arm vs intel, etc.)
  • compiler and c++standard used
  • optimization settings
  • what one is actually measuring (just function call, or including time of constructor and possible destructor)

Because to tell you the truth, in the following measurements gnr::forwarder is slower than std::function:
https://github.com/user706/CxxFunctionBenchmark#sample-result (was run on my machine [Intel Core i7-6700K], with -O3 -std=c++17 -DNDEBUG on gcc-8.1.0)

But yes, if I want no heap, then gnr::forwarder (and embxx::util::StaticFunction) is faster than using a wrapper (std::ref or gnr::memfun) in combination with std::function to avoid heap allocation.

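#include <functional>
// Also needs the gnr headers (forwarder.hpp, memfun.hpp) and the embxx header
// that provides embxx::util::StaticFunction.

struct A
{
    A(): a(2) {}

    int operator()(int val) { return val * a; }
    int times_a(int val) { return val * a; }  // member referenced via gnr::memfun below
    int arr[8] = {}; // pad it fat!
    int a;
};

int main()
{
    A a;
    std::function<int(int)> f1{std::ref(a)};  // std::function + std::ref: no heap

    auto binder = gnr::memfun<MEMFUN(A::times_a)>(a);
    std::function<int(int)> f2{binder};       // std::function + gnr::memfun: no heap

    gnr::forwarder<int(int), sizeof(A)> f3{a};

    embxx::util::StaticFunction<int(int), sizeof(A)+4> f4{a};

    for (int i = 0; i < 10; ++i) {
        f1(i); // slow
        f2(i); // slow
        f3(i); // fast
        f4(i); // fastest
    }
}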

On my machine, the benchmark here gives the following output:

[caller]
Perf< direct >:       0.5995013430 [s] {checksum: 0}
Perf< forw_memfun >:  1.5839021150 [s] {checksum: 0}
Perf< function_memfun >: 1.5768125770 [s] {checksum: 0}    // std::function with gnr::memfun
Perf< forw >:         1.5624859270 [s] {checksum: 0}       // gnr::forwarder
Perf< function_ref >: 1.5990735100 [s] {checksum: 0}       // std::function with std::ref
Perf< static_func >:  1.0868871510 [s] {checksum: 0}       // embxx::util::StaticFunction

\\\\\\\ memuse printout (#0) -----------
memory_tot                           = 0
memory_accumulation_since_last_print = 0
num_alloc_tot                        = 0
num_alloc_cur                        = 0

@user706
Contributor Author

user706 commented May 20, 2018

Because to tell you the truth, in the following measurements gnr::forwarder is slower than std::function:
https://github.com/user706/CxxFunctionBenchmark#sample-result (was run on my machine [Intel Core i7-6700K], with -O3 -std=c++17 -DNDEBUG on gcc-8.1.0)

What numbers do you get, if you do:

git clone https://github.com/user706/CxxFunctionBenchmark.git
cd                                   CxxFunctionBenchmark/
mkdir build/
cd    build/
cmake ..   # or if you want to use a custom boost:    cmake -DBOOST_ROOT=/path/to/boost ..
make -j4 VERBOSE=1
./various

?

@user706
Contributor Author

user706 commented May 20, 2018

I've run my benchmark again...

got new numbers...
https://pastebin.com/ETXb453D

So in addition to:

However... this thing about the fastest code has caveats (as you write). It is very dependent on

  • platform/architecture (e.g. 64 bit vs 32 bit; arm vs intel, etc.)
  • compiler and c++standard used
  • optimization settings
  • what one is actually measuring (just function call, or including time of constructor and possible destructor)

one needs to add

  • what else is running on your PC, while you're running the benchmark

This type of benchmarking stuff needs to be taken with a grain of salt, I think...
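One common way to reduce (not eliminate) that kind of noise is to repeat the measurement and keep the fastest run, while accumulating a checksum so the compiler has a reason to keep the calls. The sketch below is not taken from the benchmark repository; best_of and sum_calls are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <limits>

// Runs `body` `repeats` times and returns the fastest wall-clock time in
// nanoseconds. The minimum is less sensitive to other processes than the
// mean, but it still depends on machine, compiler and flags.
template <typename Body>
std::int64_t best_of(int repeats, Body body)
{
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;

    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        body();
        const auto stop = clock::now();
        const auto ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        best = std::min<std::int64_t>(best, ns);
    }
    return best;
}

// Calls f(i) for i in [0, n) and returns the sum of the results, which acts
// as a checksum tying the measured work to an observable value.
std::int64_t sum_calls(const std::function<int(int)>& f, int n)
{
    std::int64_t checksum = 0;
    for (int i = 0; i < n; ++i) {
        checksum += f(i);
    }
    return checksum;
}
```

A measurement would then look like best_of(20, [&] { total += sum_calls(f, 1000000); }), with total printed afterwards. Even so, the caveats listed above still apply.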

@user1095108
Owner

user1095108 commented May 20, 2018 via email

@user1095108
Owner

user1095108 commented May 20, 2018 via email

@user1095108
Owner

user1095108 commented May 20, 2018 via email

@user1095108
Owner

Anyway, I'm surprised, embxx_util_StaticFunction is more versatile than forwarder, produces more code, has virtual functions, yet is faster :) Maybe it's a warning to us, that the choice of a delegate is not as important as actually writing a useful app :)

@user706
Contributor Author

user706 commented May 23, 2018

But essentially, your tests are flawed in the sense, that you don't define NDEBUG while compiling, if there are asserts in the code, they will skew the results.

No, I do compile with -DNDEBUG. That's because the CMakeLists.txt sets CMAKE_BUILD_TYPE to Release (ref) and if you then do make VERBOSE=1 (ref) you'll see the -DNDEBUG flag passed.

@user1095108
Owner

Ha ha, never mind. If you have time, you can research why one delegate is better than another. It seems you are interested in this, but it really is not very relevant: you can't always get a perfect compiler/architecture fit. As for me, I'm allergic to everything virtual, so even if a delegate using virtual member functions is faster, I am not going to use it :) BTW: He should have qualified the virtuals as final.

@user706
Contributor Author

user706 commented May 23, 2018

Anyway, I'm surprised, embxx_util_StaticFunction is more versatile than forwarder, produces more code, has virtual functions, yet is faster :)

But it has a stricter license...

you can't always get a perfect compiler/architecture fit

yip!

@user706
Contributor Author

user706 commented May 23, 2018

If you have time, you can research why one delegate is better than another. It seems you are interested in this

Ha, well, partially, but it's a rabbit hole that leads into a gigantic cave of vast proportions, and in the end one does not know up from down.

You put it best here:

Maybe it's a warning to us, that the choice of a delegate is not as important as actually writing a useful app :)

@user1095108
Owner

user1095108 commented May 23, 2018 via email

@user706
Contributor Author

user706 commented May 23, 2018

If you publish an extensive benchmark people will take a look at it, since C++ delegates fascinate many people for some reason.

I don't think I'll go beyond the few pull-requests I've done here.

But just a thought...

What about a different approach: run-time code generation? (Some JIT approaches could perhaps be used; ref.)

One would generate custom invocation code that is guaranteed to be the fastest possible.
Perhaps in some circumstances the overhead of jumping to that generated invocation routine, and invoking the function, would be smaller than any other approach.

But that would need some time, and maybe I'm just dreaming and in the end it would not really be faster...

@user1095108
Owner

Runtime code generation is so fancy that you'd better do it in your own repository. It's also architecture-dependent, unless you generate code for some virtual machine (like the JVM). If you have an idea for your own delegate, just write one - I don't mind. I doubt, though, that JIT is the way towards a faster delegate; C++ compilers are simply too good. Better spend your time on something else. I'm satisfied with forwarder.hpp, callback.hpp (which you didn't test at all) and memfun.hpp. If you figure out why others are faster, be sure to tell me :)

@user1095108
Owner

Just some more thoughts: bothering with delegates isn't worth it. I suggest you find something worthwhile like graphics..., and implement stuff from that field. There's plenty of delegates to choose from, loool. Or maybe you could just use std::function<>.
