Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

README: mention that library is not thread-safe #498

Closed
wants to merge 1 commit into from

Conversation

Hi-Angel
Copy link

@Hi-Angel Hi-Angel commented May 4, 2018

It's such a big deal, it should be among the first things a person gotta know!
I certainly didn't, it led to hours of frustration for why when I have a bunch
of clients communicating with a server (a thread per client), sometimes I'm
getting packets duplicated, even though clients sent them correct. It costed
lots of energy to pin down to deserialization, and no, the CEREAL_THREAD_SAFE
didn't help — probably because, per the linked docs, you still not allowed to
access the same archive from multiple threads.
(upd: the threading problem turned to be irrelevant to Cereal. The point stands though.)

It's such a surprise — why would a serialization library, aiming to be
header-only through C++11 features, use a global state? It's a rhetoric
question, maybe there is a reason, but let's at least make sure it's less
likely for new peoples to fall into threading trap.

It's such a big deal, it should be among the first things a person gotta
know! I certainly didn't, it led to hours of frustration for why when I
have a bunch of clients communicating with a server (a thread per
client), sometimes I'm getting packets duplicated, even though clients
sent them correct. It costed lots of energy to pin down to
deserialization, and no, the CEREAL_THREAD_SAFE didn't help — probably
because, per the linked docs, you still not allowed to access the same
archive from multiple threads.

Signed-off-by: Konstantine Kharlamov <Hi-Angel@yandex.ru>
@jeffreyscottgraham
Copy link

What? Frack!

@Hi-Angel
Copy link
Author

Hi-Angel commented May 8, 2018

Interestingly, it seems, Cereal-like API is unlikely to be possible without a global object (i.e. thread-safe). I have tried making a backend to match the API, so I could use it without modifications to my code, and have come up with this crazy construction.

This doesn't work, because the compiler fails to infer the first of the variadic arguments. I don't see any ways to proceed.

#include <vector>

using namespace std;

struct MyStruct {
    int int1, int2;

    template<class Archive>
    void save(const Archive& archive) {
        archive(int1, int2);
    }
};

template<class ToSer, class T, typename... Args>
vector<char> serialize2(ToSer obj){
    vector<char> ret;
    struct SerInternals {
        // constructor and "vec" are to workaround "use of local variable with
        // automatic storage from containing function" error
        vector<char> vec;
        SerInternals(vector<char>& vec) : vec(vec) {}

        void archive(T t) {
            //todo: use std::copy instead
            for (unsigned i = 0; i < sizeof(t); ++i)
                vec.push_back(*(((char*)&t)+i));
        }

        void archive(T fst, Args... args) {
            //todo: use std::copy instead
            for (unsigned i = 0; i < sizeof(fst); ++i)
                vec.push_back(*(((char*)&fst)+i));
            archive(args...);
        }
    };
    SerInternals tmp{ret};
    obj.save([&](auto&&... a) { tmp.archive(a...); });
    return ret;
}

int main() {

    MyStruct s;
    serialize2(s);
}

$ g++ test2.cpp -o a -g3 -O0 -Wall -Wextra -Wsign-conversion -std=c++17 -pedantic
test2.cpp: In function ‘int main()’:
test2.cpp:44:17: error: no matching function for call to ‘serialize2(MyStruct&)’
	 serialize2(s);
				 ^
test2.cpp:15:14: note: candidate: template<class ToSer, class T, class ... Args> std::vector<char> serialize2(ToSer)
 vector<char> serialize2(ToSer obj){
			  ^~~~~~~~~~
test2.cpp:15:14: note:   template argument deduction/substitution failed:
test2.cpp:44:17: note:   couldn't deduce template parameter ‘T’
	 serialize2(s);

@furkanusta
Copy link
Contributor

I am not sure why you are declaring SerInternals locally but this works:

 
namespace detail {
struct SerInternals {
    vector<char> vec;
    SerInternals(vector<char>& vec) : vec(vec) {}
    template<typename T>
    void archive(T t) {
        //todo: use std::copy instead
        for (unsigned i = 0; i < sizeof(t); ++i)
            vec.push_back(*(((char*)&t)+i));
    }
    template<typename T, typename... Args>
    void archive(T fst, Args... args) {
        //todo: use std::copy instead
        for (unsigned i = 0; i < sizeof(fst); ++i)
            vec.push_back(*(((char*)&fst)+i));
        archive(args...);
    }
};
}

template<class ToSer>
vector<char> serialize2(ToSer obj){
    vector<char> ret;
    detail::SerInternals tmp{ret};
    obj.save([&](auto&&... a) { tmp.archive(a...); });
    return ret;
}

If you insist on keeping it local you can use recursive variadic lambda (tough it requires C++17). Without if constexpr it requires an overload, there were implementations that provide overloads on lambdas (e.g. std::visit) though I am not sure whether they can be implemented solely using C++11 or earlier.

template<class ToSer>
vector<char> serialize2(ToSer obj){
    vector<char> ret;
    auto archive = [&](auto&&... t){
        // The below is necessary for recursion
        auto a_impl = [&](auto& self, auto&& fst, auto&&... args) {
            for (unsigned i = 0; i < sizeof(fst); ++i)
                ret.push_back(*(((char*)&fst)+i));
            if constexpr (sizeof...(args)){
                    self(self, args...);
            }
        };
        return a_impl(a_impl, t...);
    };
    obj.save(archive);
    return ret;
}

@Hi-Angel
Copy link
Author

Hi-Angel commented May 9, 2018

I am not sure why you are declaring SerInternals locally but this works:

Ha! Thanks, cool, this indeed does! The reason I was declaring it locally is because it took a looong way there, and I simply missed the opportunity to declare it as usual, and then create an object — thus getting the thread-safe serialization.

@OvermindDL1
Copy link

there were implementations that provide overloads on lambdas (e.g. std::visit) though I am not sure whether they can be implemented solely using C++11 or earlier.

Boost.Phoenix was polymorphic templateable lambdas in the C++98 era, so yes it is possible. :-)

@Hi-Angel
Copy link
Author

Hi-Angel commented May 10, 2018

Okay, FWIW, I managed to build the serialization part. The next tricky problem was the need to serialize embedded structs with their own save() functions. Here's the code, hopefully somebody would find it useful:

#include <type_traits>
#include <cstdint>
#include <vector>
#include <cstdio>
#include <iostream>

using namespace std;

struct MyStruct {
    ~MyStruct() { if (m) delete m; }
    uint32_t a;
    enum : uint16_t {
        ERROR,
        BLOCKS_OFFER,
        BLOCKS_REQUEST,
        BLOCKS_ARRAY,
        NEW_MODULE
    } type;
    MyStruct* m;

    template<class Archive>
    void save(const Archive&& archive) const {
        archive(a, type);
        if (m)
            archive(m);
    }
};

// like std::decay, but removes both constantess, references, pointers.
template<class T>
struct decay2 {
    typedef typename std::remove_cv<T>::type nocv_T;
    typedef typename std::remove_reference<nocv_T>::type noref_nocv_T;
    typedef typename std::remove_pointer<noref_nocv_T>::type noref_nocv_noptr;
    typedef typename std::remove_reference<noref_nocv_noptr>::type noref_nocv_T2;
    typedef typename std::remove_pointer<noref_nocv_T2>::type noref_nocv_noptr2;
    typedef noref_nocv_noptr2 type;
};

struct SerInternals {
    vector<char>& vec;
    constexpr SerInternals(vector<char>& vec) : vec(vec) {}
    template<typename T>
    constexpr void archive(const T& t) {
        if constexpr (is_fundamental<typename decay2<T>::type>::value
                      || is_enum<typename decay2<T>::type>::value) {
            for (unsigned i = 0; i < sizeof(t); ++i)
                vec.push_back(*(((const char*)&t)+i));
        } else {
            SerInternals tmp{vec};
            if constexpr (is_pointer<T>::value)
                t->save([&tmp](auto&&... a) { tmp.archive(a...); });
            else // it's a reference
                t.save([&tmp](auto&&... a) { tmp.archive(a...); });
        }
    }
    template<typename T, typename... Args>
    constexpr void archive(const T& fst, const Args... args) {
        if constexpr (is_fundamental<typename decay2<T>::type>::value
                      || is_enum<typename decay2<T>::type>::value) {
            for (unsigned i = 0; i < sizeof(fst); ++i)
                vec.push_back(*(((const char*)&fst)+i));
        } else {
            SerInternals tmp{vec};
            if constexpr (is_pointer<T>::value)
                fst->save([&tmp](auto&&... a) { tmp.archive(a...); });
            else // it's a reference
                fst.save([&tmp](auto&&... a) { tmp.archive(a...); });
        }
        archive(args...);
    }
};

template<class ToSer>
const vector<char> serialize(const ToSer& obj){
    vector<char> ret;
    SerInternals tmp{ret};
    obj.save([&tmp](auto&&... a) { tmp.archive(a...); });
    return ret;
}

int main() {
    MyStruct s = MyStruct{1,MyStruct::ERROR, new MyStruct{4, MyStruct::ERROR, nullptr}};
    printf("size is %lu\n", serialize(s).size());
}

$ g++ test2.cpp -o a -std=c++17 && ./a
size is 12

upd: after wiring up into a real project I found problems with enums. So I fixed the code to deal with this stuff.

@Hi-Angel
Copy link
Author

Hi-Angel commented May 15, 2018

FTR, here's the acc. deserialization part, hopefully someone find it useful too. I wrote it on prev. week, but wanted to make sure everything works.

BTW, the duplication problem turned out to be a race irrelevant to (de)serialization (funny story btw), but still I glad I uncovered that Cereal is non-thread safe before the project grew up much bigger.

#include <iostream>
#include <type_traits>
#include <variant>
#include <vector>

using namespace std;

struct DeSerException: public exception {
    const char* explanation;

    DeSerException(const char* text) : explanation(text) {}
    virtual const char* what() const throw() {
        return explanation;
    }
};

// like std::decay, but removes both constantess, references, pointers.
template<class T>
struct decay2 {
    typedef typename std::remove_cv<T>::type nocv_T;
    typedef typename std::remove_reference<nocv_T>::type noref_nocv_T;
    typedef typename std::remove_pointer<noref_nocv_T>::type noref_nocv_noptr;
    typedef typename std::remove_reference<noref_nocv_noptr>::type noref_nocv_T2;
    typedef typename std::remove_pointer<noref_nocv_T2>::type noref_nocv_noptr2;
    typedef noref_nocv_noptr2 type;
};

class DeSerInternals {
    const char*& start, *&past_end;

    template<typename T>
    void deserialize_elem(T& elem) {
        if constexpr (is_fundamental<typename decay2<T>::type>::value
                      || is_enum<typename decay2<T>::type>::value) {
            if (0 > past_end - start || sizeof(elem) > (size_t) (past_end - start))
                throw DeSerException{"Size of object being deserialized is bigger than the byte-vector!"};
            auto fst = start, past_lst = start + sizeof(elem);
            copy(fst, past_lst, ((char*)&elem));
            start = past_lst;
        } else { // it's a struct
            DeSerInternals tmp{start, past_end};
            if constexpr (is_pointer<T>::value)
                elem->load([&tmp](auto&&... a) { tmp.archive(a...); });
            else
                elem.load([&tmp](auto&&... a) { tmp.archive(a...); });
        }
    }

public:
    constexpr DeSerInternals(const char*& start, const char*& past_end) : start(start), past_end(past_end) {}

    template<typename T>
    constexpr void archive(T& x) {
        deserialize_elem(x);
    }

    template<typename T, typename... Args>
    constexpr void archive(T& x, Args&... xs) {
        deserialize_elem(x);
        archive(xs...);
    }
};

template<class DeSer>
variant<monostate, DeSer> deserialize(const char dat[], uint sz_dat){
    const char* fst = dat, *past_lst = dat + sz_dat;
    DeSer ret;
    DeSerInternals tmp{fst, past_lst};
    try{ ret.load([&tmp](auto&&... a) { tmp.archive(a...); }); } catch (DeSerException) {
        return monostate{};
    }
    return {ret};
}

struct Foo {
    char a;
    uint16_t b;
    enum : uint8_t {
        enum_member1
    } c;

    template<class Archive>
    void load(const Archive&& archive) {
        archive(a, b, c);
    }
};

int main() {
    vector<char> raw = {1, 2,0, 3};
    variant<monostate, Foo> mb_Foo = deserialize<Foo>(raw.data(), raw.size());
    if (holds_alternative<monostate>(mb_Foo))
        puts("Deserialization failed!");
    else {
        Foo& obj = get<Foo>(mb_Foo);
        printf("a = %u, b = %u, c = %u\n", obj.a, obj.b, obj.c);
    }
}

$ g++ test.cpp -o a -g3 -O0 -Wall -Wextra -Wsign-conversion -std=c++17 -pedantic -fsanitize=address && ./a
a = 1, b = 2, c = 3

upd: repeating code moved out to a separate function, and also enabled syntax highlight

@AzothAmmo
Copy link
Contributor

I'll update the documentation to more clearly reflect the limitations of cereal and threading.

@AzothAmmo AzothAmmo added this to the v1.2.3 milestone May 24, 2018
@AzothAmmo AzothAmmo closed this Oct 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants