Serialization improvements #10785
Should probably be tested on big endian. :)
sipa reviewed Jul 10, 2017
Here is a self-review in which I point out some of the things reviewers may want to be aware of.
+ void Unserialize(Stream& s)
+ {
+ std::vector<uint64_t> tmp;
+ s >> blockhash >> VectorApply<CompactSizeWrapper>(tmp);
sipa
Jul 10, 2017
Owner
For ease of implementation, deserialization first happens into a std::vector<uint64_t>, and is then converted. This means a temporary is created and allocated, which is an overhead that the old implementation didn't have.
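The extra allocation comes from a two-step pattern: read wide values into a temporary, then narrow them into the real member. A minimal sketch of the second step, assuming a 16-bit target; `NarrowTo16` is a hypothetical helper for illustration and is not taken from the PR:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not from the PR: narrows values that were first
// deserialized into a temporary 64-bit vector into a 16-bit vector.
inline std::vector<uint16_t> NarrowTo16(const std::vector<uint64_t>& tmp)
{
    std::vector<uint16_t> out;
    out.reserve(tmp.size()); // this second allocation is the overhead in question
    for (uint64_t v : tmp) {
        if (v > std::numeric_limits<uint16_t>::max()) {
            throw std::out_of_range("value overflows 16 bits");
        }
        out.push_back(static_cast<uint16_t>(v));
    }
    return out;
}

// Usage: values that fit are copied unchanged.
inline void NarrowExample()
{
    std::vector<uint64_t> tmp{1, 2, 3};
    assert(NarrowTo16(tmp).size() == 3);
}
```

Deserializing directly into the narrow vector would avoid the temporary, at the cost of a more involved element wrapper.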
- READWRITEMANY(intval, boolval, stringval, FLATDATA(charstrval), txval);
+ SERIALIZE_METHODS(obj)
+ {
+ READWRITE(obj.intval, obj.boolval, obj.stringval, FlatData(obj.charstrval), obj.txval);

- inline void SerializationOp(Stream& s, Operation ser_action) {
- if (ser_action.ForRead())
- Init(NULL);
+ template<typename Stream>
sipa
Jul 10, 2017
Owner
This is one of the more involved changes, as it's both splitting the serializer into two versions, and the Serialize code no longer modifies mapValue in-place (wtf?).
- ReadOrderPos(nOrderPos, mapValue);
+ template<typename Stream>
sipa
Jul 10, 2017
Owner
Here is another big change, that avoids modifying mapValue and strAccount and then later fixing it up before returning (wtf?).
+ * V is not required to be an std::vector type. It works for any class that
+ * exposes a value_type, iteration, and resize method that behave like vectors.
+ */
+template<template <typename> class W, typename V> class VectorApplyWrapper
sipa
Jul 10, 2017
Owner
Notice the unusual construction of a template that takes a template as parameter here. See "Template template parameter" here: http://en.cppreference.com/w/cpp/language/template_parameters
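For readers unfamiliar with the construct, here is a minimal sketch of a template template parameter in isolation. `DecimalFormatter`, `HexFormatter`, and `ApplyToAll` are hypothetical names chosen for illustration and do not appear in the PR; the point is only that the caller passes the wrapper template itself, and the callee instantiates it with the container's `value_type`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical element wrapper: formats one value as decimal text.
template <typename T>
struct DecimalFormatter {
    static std::string Format(const T& value) { return std::to_string(value); }
};

// Hypothetical element wrapper: formats one value as lowercase hex.
template <typename T>
struct HexFormatter {
    static std::string Format(const T& value)
    {
        static const char* digits = "0123456789abcdef";
        std::string out;
        uint64_t v = static_cast<uint64_t>(value);
        do {
            out.insert(out.begin(), digits[v & 15]);
            v >>= 4;
        } while (v != 0);
        return out;
    }
};

// W is a template template parameter: it is a template, not a type.
// ApplyToAll chooses the element type itself, via V::value_type.
template <template <typename> class W, typename V>
std::vector<std::string> ApplyToAll(const V& values)
{
    std::vector<std::string> out;
    for (const auto& v : values) {
        out.push_back(W<typename V::value_type>::Format(v));
    }
    return out;
}

// Usage: the same container, two different per-element encodings.
inline void ApplyExample()
{
    std::vector<uint32_t> v{10, 255};
    assert(ApplyToAll<HexFormatter>(v)[1] == "ff");
    assert(ApplyToAll<DecimalFormatter>(v)[1] == "255");
}
```

Only `W` is named at the call site; `V` is deduced from the argument, which mirrors how `VectorApply<CompactSizeWrapper>(tmp)` reads in the diff above.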
  {
- ::Serialize(s, std::forward<Arg>(arg));
+ ::Serialize(s, arg);
sipa
Jul 10, 2017
Owner
The reason for removing the std::forward calls here is explained in the commit message (there is no benefit in passing down the rvalue-ness).
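One way to see why forwarding buys nothing here: if the callee only reads its argument through a const reference and has no rvalue overload, it cannot observe whether the caller passed an lvalue or an rvalue. A minimal sketch under that assumption; `Sink`, `WithForward`, and `WithoutForward` are hypothetical stand-ins, not functions from the PR:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a serialization function: it only reads `value`,
// so it takes a const reference and has no rvalue-reference overload.
inline std::size_t Sink(std::string& out, const std::string& value)
{
    out += value;
    return value.size();
}

template <typename Arg>
std::size_t WithForward(std::string& out, Arg&& arg)
{
    return Sink(out, std::forward<Arg>(arg)); // preserves rvalue-ness, unused by Sink
}

template <typename Arg>
std::size_t WithoutForward(std::string& out, Arg&& arg)
{
    return Sink(out, arg); // arg is an lvalue here; Sink behaves identically
}

// Usage: both helpers produce the same result for a temporary argument.
inline void ForwardExample()
{
    std::string a, b;
    assert(WithForward(a, std::string("xy")) == WithoutForward(b, std::string("xy")));
    assert(a == b);
}
```

Forwarding would matter only if some overload further down moved from its argument, which a pure serializer has no reason to do.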
laanwj requested a review from jonasschnelli Jul 11, 2017
laanwj added the Utils and libraries label Jul 11, 2017
Concept ACK. Will code-review soon.
sipa added some commits Jul 7, 2017
Made some changes to reduce the size of the overall diff.
sipa commented Jul 10, 2017
This PR improves correctness (removing potentially unsafe const_casts) and flexibility of the serialization code.

The main issue is that use of the current ADD_SERIALIZE_METHODS macro (which is the only way to not duplicate serialization and deserialization code) only expands to a single class method, and thus can only be qualified as either const or non-const, not both. In many cases, however, serialization needs to work on const objects, and preferably that is done without casts that could hide const-correctness bugs.

To deal with that, this PR introduces a new approach that includes a SERIALIZE_METHODS(obj) macro, where obj is a variable name. It expands to some boilerplate and a static method to which the object itself is an argument. The advantage is that its type can be templated, and be const when serializing.

Another issue is the various serialization-wrapping macros (VARINT, COMPACTSIZE, FLATDATA and LIMITED_STRING). They all const_cast their argument in order to construct a wrapper object, which supports both serialization and deserialization. This PR makes them templated in the underlying data type (for example, CompactSizeWrapper<uint64_t>). This has the advantage that we can make the template type const when invoked on a const variable (so it would be CompactSizeWrapper<const uint64_t> in that case).

A last issue is the persistent use of the REF macro to deal with temporary expressions being passed in. Since C++11, this is not needed anymore as temporaries are explicitly represented as rvalue references. Thus we can remove REF invocations and instead just make the various classes and helper functions deal correctly with references.

The above changes permit a fully const-correct version of all serialization code. However, it is cumbersome. Any existing ADD_SERIALIZE_METHODS instance that does more than just (conditionally) serialize/deserialize some fields (in particular, one that contains branches assigning to some of the variables) needs to be split up into explicit Serialize and Unserialize methods instead. In some cases this is inevitable (wallet serializers do some crazy transformations while serializing!), but in many cases it is just annoying duplication.

To improve upon this, a few more primitives that are currently inlined are turned into serialization wrappers:
- BigEndianWrapper: Serializes/deserializes an integer as big endian rather than little endian (only for 16-bit). This permits the CService serialization to become a one-liner.
- Uint48Wrapper: Serializes/deserializes only the lower 48 bits of an integer (used in BIP152 code).
- VectorApplyWrapper: Serializes/deserializes a vector while using a custom serializer for its elements. This simplifies the undo and blockencoding serializers a lot.

Best of all, it removes 147 lines of code while adding a bunch of comments (though the increased use of vararg READWRITE is probably cheating a bit).

The commits are ordered into 3 sections:
This may be too much to go in at once. I'm happy to split things up as needed.
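The static-method idea from the description can be sketched without any macros. This is an illustration under simplifying assumptions, not the PR's implementation: `ByteStream`, `WriteAction`, `ReadAction`, `Point`, and `SerializationOps` are hypothetical names, and only single-byte fields are handled. The object type is a template parameter, so it deduces to a const type when serializing and a non-const type when deserializing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal byte stream; not Bitcoin Core's stream classes.
struct ByteStream {
    std::vector<uint8_t> data;
    std::size_t pos = 0;
};

// One action per direction. Writing needs only a const reference;
// reading needs a mutable one.
struct WriteAction {
    void operator()(ByteStream& s, const uint8_t& v) const { s.data.push_back(v); }
};
struct ReadAction {
    void operator()(ByteStream& s, uint8_t& v) const { v = s.data.at(s.pos++); }
};

struct Point {
    uint8_t x = 0;
    uint8_t y = 0;

    // A single field list shared by both directions. Obj deduces to
    // `const Point` in Serialize and `Point` in Unserialize, so no
    // const_cast is needed, and reading into a const object fails to compile.
    template <typename Obj, typename Action>
    static void SerializationOps(Obj& obj, ByteStream& s, Action action)
    {
        action(s, obj.x);
        action(s, obj.y);
    }

    void Serialize(ByteStream& s) const { SerializationOps(*this, s, WriteAction{}); }
    void Unserialize(ByteStream& s) { SerializationOps(*this, s, ReadAction{}); }
};

// Usage: round-trip through a const reference.
inline void RoundTripExample()
{
    Point a;
    a.x = 7;
    a.y = 9;
    const Point& ca = a;
    ByteStream s;
    ca.Serialize(s);
    Point b;
    b.Unserialize(s);
    assert(b.x == 7);
    assert(b.y == 9);
}
```

Because the field list lives in one place, adding a member requires one new line rather than parallel edits to two methods, which is the duplication the description aims to avoid.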