Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dependency on the use of spaces inside DEFINE_SERIALIZABLE #7

Merged
merged 9 commits into from
Nov 20, 2020

Conversation

bobsayshilol
Copy link
Contributor

I saw this library posted on reddit earlier and wondered how the member parsing worked, and upon inspection I spotted what looked like a bug due to the splitting of parameters on , in DEFINE_SERIALIZABLE(). It didn't look like you had any tests for that particular case so I set one up and it did indeed fail.

The issue in question was that DEFINE_SERIALIZABLE(a,b,c) and DEFINE_SERIALIZABLE(a, b, c) behaved differently because the spaces became part of the serialised name. This is fixed with 037e399.

I've also got it building again on Linux (ie with GCC and clang) since they weren't building, and removed some warnings that they were generating too. I've also given it a pass with ASAN and UBSan to check for any other issues and fixed the easy ones, but there are some follow up issues that need looking at, most notably serialisation failing for 64bit integer types (see ba66359). These issues are:

The build fails when using libc++ rather than libstdc++ (you do this by passing -stdlib=libc++ when compiling with clang):

tser/example/example2.cpp:114:19: error: invalid operands to binary expression ('std::__1::ostream' (aka 'basic_ostream<char>') and 'cpp_serializers_benchmark::Monster')
        std::cout << m << "\n";
        ~~~~~~~~~ ^  ~
tser/example/example2.cpp:43:9: note: candidate template ignored: requirement '!tser::is_detected_v<tser::has_outstream_op_t, cpp_serializers_benchmark::Monster>' was not satisfied [with OT = cpp_serializers_benchmark::Monster]
        DEFINE_SERIALIZABLE(Monster,pos,mana,hp,name,inventory,color,weapons,equipped,path)
        ^
tser/include/tser/tser.hpp:276:22: note: expanded from macro 'DEFINE_SERIALIZABLE'
friend std::ostream& operator<<(std::ostream& os, const OT& t) { tser::print(os, t); return os; }

GCC doesn't understand some of the compiler options and issues some warnings about them:

cc1plus: note: unrecognized command-line option ‘-Wno-shift-sign-overflow’
cc1plus: note: unrecognized command-line option ‘-Wno-exit-time-destructors’
cc1plus: note: unrecognized command-line option ‘-Wno-language-extension-token’
cc1plus: note: unrecognized command-line option ‘-Wno-global-constructors’
cc1plus: note: unrecognized command-line option ‘-Wno-c++98-compat-pedantic’
cc1plus: note: unrecognized command-line option ‘-Wno-c++98-compat’

UBSan has picked up an invalid shift:

tser/include/tser/varint_encoding.hpp:10:42: runtime error: left shift of negative value -1

I'm not actually using this library and it was curiosity that lead me here, so I guess those issues are in your hands now :P

Without this the 'else' branch that follows breaks because there isn't an overload for operator<<() taking a std::unique_ptr<T>.
GCC generates the diagnostic message:
  cc1plus: warning: command-line option ‘-Wno-missing-prototypes’ is valid for C/ObjC but not for C++
and since the code is entirely C++ we can safely remove this flag.
…ed as part of the names.

DEFINE_SERIALIZABLE(a,b,c) would serialise to the names a, b, and c as expected, but DEFINE_SERIALIZABLE(a, b, c) would not since the stringified version of the argument list was only being split at the commas.
In fixing this issue I've also refactored the name/data builders out into easily readable functions (vs being embedded in a macro) and added some tests for this bug (which were failing without the fix in tser.hpp).
UBSan spotted the issue:
  tser/include/tser/base64_encoding.hpp:15:24: runtime error: left shift of 16777216 by 8 places cannot be represented in type 'int'
so change the accumulator to be unsigned.
This tests more than just the signed ints and over a much larger range of values, however I've left the failing tests commented out for now.
@KonanM
Copy link
Owner

KonanM commented Nov 20, 2020

Thanks a lot for the review and the PR. I especially appreciate the changes to the tests and fixing UB on the varint encoding.

I was aware about the issue with the spaces, but wanted to keep the code simple. I think I had a version in between, which didn't have this issue and will check if I can find that (as that was shorter than your version).

I know this version is more ugly, but for me this is ok, because the code is executed at constexpr only anyways.
@KonanM KonanM merged commit dd6940d into KonanM:master Nov 20, 2020
@KonanM
Copy link
Owner

KonanM commented Nov 20, 2020

I will check what's up with the 64 bit variant encoding issues and try to get it to build with libc++. It seems to have to do with the way I implemented the SFINAE to allow for custom ostream operators.

@KonanM
Copy link
Owner

KonanM commented Nov 20, 2020

I fixed the issue with the 64 bit varint encoding. The encoding/decoding was actually correct, but I didn't resize the buffer in the BinaryArchieve correctly. Due to varint encoding a long long can take up to 10 bytes and I didn't account for the extra 2 bytes when resizing the buffer... this should be fixed now...

@bobsayshilol
Copy link
Contributor Author

I fixed the issue with the 64 bit varint encoding. The encoding/decoding was actually correct, but I didn't resize the buffer in the BinaryArchieve correctly. Due to varint encoding a long long can take up to 10 bytes and I didn't account for the extra 2 bytes when resizing the buffer... this should be fixed now...

Ah I didn't dig very deep into what was going on since I assumed it was tser/include/tser/varint_encoding.hpp:10:42: runtime error: left shift of negative value -1 but couldn't see an obvious cause.

Hmm, for c9e71e6 I wonder if that'll have any impact on compilation times. Now, rather than having a single function object in the compiler/IR that can be reused during compilation, every class that supports serialisation has a unique lambda that has to be built and evaluated. I guess it's not as much impact as #5 though.

@bobsayshilol bobsayshilol deleted the macro-fix branch November 20, 2020 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants