Upstream serialization improvements #3180

Merged
zkbot merged 20 commits into zcash:master from str4d:transaction-serialization on Apr 19, 2018

Conversation

@str4d
Contributor

str4d commented Apr 17, 2018

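Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#5264
- bitcoin/bitcoin#6914
- bitcoin/bitcoin#6215
- bitcoin/bitcoin#8068
  - Only the `COMPACTSIZE` wrapper commit
- bitcoin/bitcoin#8658
- bitcoin/bitcoin#8708
  - Only the serializer variadics commit
- bitcoin/bitcoin#9039
- bitcoin/bitcoin#9125
  - Only the first two commits (the last two block on other upstream PRs)
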
Part of #2074.

jonasschnelli and others added some commits Jun 1, 2015

add bip32 pubkey serialization
CExtPubKey should be serializable like CPubKey
Resolve issue bitcoin/bitcoin#3166.
These changes decode valid SIGHASH types on signatures in assembly (asm) representations of scriptSig scripts.
This squashed commit incorporates substantial helpful feedback from jtimon, laanwj, and sipa.
[WIP] Remove unused statement in serialization
Zcash: Excludes changes to CBanEntry and CHDChain, which we don't have yet.
serialization: teach serializers variadics
Also add a variadic CDataStream ctor for ease-of-use.
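
A minimal sketch of what "teaching serializers variadics" can look like, assuming a toy stream and two toy per-type serializers. `ToyStream`, the two `Serialize` overloads, and `ExampleSerializeMany` are hypothetical names for illustration; at this point in the PR the real `SerializeMany` still carries `nType` and `nVersion` parameters, which are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical toy stream: appends raw bytes to a string.
struct ToyStream {
    std::string buf;
    void write(const char* p, std::size_t n) { buf.append(p, n); }
};

// Hypothetical per-type serializers (host byte order, illustration only).
inline void Serialize(ToyStream& s, std::uint8_t v) {
    s.write(reinterpret_cast<const char*>(&v), sizeof(v));
}
inline void Serialize(ToyStream& s, std::uint32_t v) {
    s.write(reinterpret_cast<const char*>(&v), sizeof(v));
}

// Base case: no arguments left to serialize.
template <typename Stream>
void SerializeMany(Stream&) {}

// Recursive case: serialize the first argument, then recurse on the rest.
template <typename Stream, typename Arg, typename... Args>
void SerializeMany(Stream& s, Arg&& arg, Args&&... args) {
    Serialize(s, std::forward<Arg>(arg));
    SerializeMany(s, std::forward<Args>(args)...);
}

// Usage: one call serializes three values (1 + 4 + 1 bytes).
inline std::size_t ExampleSerializeMany() {
    ToyStream s;
    SerializeMany(s, std::uint8_t{1}, std::uint32_t{2}, std::uint8_t{3});
    assert(s.buf.size() == 6);
    return s.buf.size();
}
```
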
Remove unused ReadVersion and WriteVersion
CDataStream and CAutoFile had a ReadVersion and WriteVersion method
that was never used. Remove them.
Make nType and nVersion private and sometimes const
Make the various stream implementations' nType and nVersion private
and const (except in CDataStream where we really need a setter).
Make streams' read and write return void
The stream implementations had two cascading layers (the upper one
with operator<< and operator>>, and a lower one with read and write).
The lower layer's functions are never cascaded (nor should they, as
they should only be used from the higher layer), so make them return
void instead.
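
The two layers described above can be sketched roughly as follows. `ToyDataStream` is a hypothetical stand-in, not the project's `CDataStream`: the lower-level `read` and `write` return `void`, while `operator<<` and `operator>>` return the stream so that only the upper layer cascades.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical in-memory stream, loosely modelled on the two-layer design.
class ToyDataStream {
    std::vector<char> data_;
    std::size_t pos_ = 0;

public:
    // Lower layer: raw byte access. Returns void, so it cannot be chained.
    void write(const char* p, std::size_t n) { data_.insert(data_.end(), p, p + n); }
    void read(char* p, std::size_t n) {
        if (n > data_.size() - pos_) throw std::out_of_range("read past end of stream");
        std::memcpy(p, data_.data() + pos_, n);
        pos_ += n;
    }

    // Upper layer: returns *this so that calls cascade.
    ToyDataStream& operator<<(std::uint32_t v) {
        write(reinterpret_cast<const char*>(&v), sizeof(v));
        return *this;
    }
    ToyDataStream& operator>>(std::uint32_t& v) {
        read(reinterpret_cast<char*>(&v), sizeof(v));
        return *this;
    }

    // Number of bytes not yet consumed.
    std::size_t size() const { return data_.size() - pos_; }
};

// Usage: write two values, read them back, return the number of bytes left.
inline std::size_t ExampleRoundTrip() {
    ToyDataStream s;
    s << std::uint32_t{7} << std::uint32_t{9};
    std::uint32_t a = 0, b = 0;
    s >> a >> b;
    assert(a == 7 && b == 9);
    return s.size();
}
```
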
Make GetSerializeSize a wrapper on top of CSizeComputer
Given that in default GetSerializeSize implementations created by
ADD_SERIALIZE_METHODS we're already using CSizeComputer(), get rid
of the specialized GetSerializeSize methods everywhere, and just use
CSizeComputer. This removes a lot of code which isn't actually used
anywhere.

For CCompactSize and CVarInt this actually removes a more efficient
size computing algorithm, which is brought back in a later commit.
Get rid of nType and nVersion
Remove the nType and nVersion as parameters to all serialization methods
and functions. There is only one place where it's read and has an impact
(in CAddress), and even there it does not impact any of the recursively
invoked serializers.

Instead, the few places that need nType or nVersion are changed to read
it directly from the stream object, through GetType() and GetVersion()
methods which are added to all stream classes.
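
As a rough illustration of reading the type and version from the stream instead of passing them as parameters, here is a sketch under stated assumptions: `ToyStream` and `ToyLocator` are hypothetical, and the flag values are illustrative rather than copied from the codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative flag values; the real constants live in the project's headers.
enum : int { SER_NETWORK = 1 << 0, SER_DISK = 1 << 1, SER_GETHASH = 1 << 2 };

// Hypothetical stream that carries its own type and version.
class ToyStream {
    const int nType;
    const int nVersion;
    std::vector<std::uint8_t> data;

public:
    ToyStream(int type, int version) : nType(type), nVersion(version) {}
    int GetType() const { return nType; }
    int GetVersion() const { return nVersion; }
    void write(const std::uint8_t* p, std::size_t n) { data.insert(data.end(), p, p + n); }
    std::size_t size() const { return data.size(); }
};

// Hypothetical object whose encoding depends on the stream type.
struct ToyLocator {
    std::uint8_t payload = 0x2a;

    template <typename Stream>
    void Serialize(Stream& s) const {
        // No nType/nVersion parameters: both are queried from the stream.
        if (!(s.GetType() & SER_GETHASH)) {
            const std::uint8_t v = static_cast<std::uint8_t>(s.GetVersion());
            s.write(&v, 1);
        }
        s.write(&payload, 1);
    }
};

// Usage: returns the number of bytes written for a given stream type.
inline std::size_t ExampleLocatorSize(int type) {
    ToyStream s(type, 3);
    ToyLocator{}.Serialize(s);
    return s.size();
}
```

With these toy definitions, a network stream receives the version byte plus the payload, while a hashing stream receives the payload only.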
Make CSerAction's ForRead() constexpr
The CSerAction's ForRead() method does not depend on any runtime
data, so guarantee that requests to it can be optimized out by
making it constexpr.

Suggested by Cory Fields.
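
A small sketch of why `constexpr` helps here, assuming two hypothetical action tag types (`ToyActionSerialize` and `ToyActionUnserialize`, stand-ins for the real `CSerAction` classes). Because `ForRead()` depends on no runtime data, the result can be checked at compile time and branches on it can be folded away.

```cpp
#include <cassert>
#include <string>

// Hypothetical action tags; ForRead() is a compile-time constant for each.
struct ToyActionSerialize {
    constexpr bool ForRead() const { return false; }
};
struct ToyActionUnserialize {
    constexpr bool ForRead() const { return true; }
};

// The values are usable in constant expressions.
static_assert(!ToyActionSerialize().ForRead(), "serialize action must not read");
static_assert(ToyActionUnserialize().ForRead(), "unserialize action must read");

// Hypothetical helper: the condition is constant for each instantiation,
// so the compiler is free to drop the untaken branch.
template <typename Operation>
std::string DescribeAction(Operation ser_action) {
    return ser_action.ForRead() ? "unserialize" : "serialize";
}
```
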
Add optimized CSizeComputer serializers
To get the advantages of faster GetSerializeSize() implementations
back that were removed in "Make GetSerializeSize a wrapper on top of
CSizeComputer", reintroduce them in the few places in the form of a
specialized Serialize() implementation. This actually gets us in a
better state than before, as these even get used when they're invoked
indirectly in the serialization of another object.
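
The idea of a specialized `Serialize()` path for the size computer can be sketched as below. All names (`ToySizeComputer`, `ToyByteStream`, `SizeOfCompactSize`, `WriteCompactSizeToy`) are hypothetical; the sketch only assumes the usual compact-size layout of 1, 3, 5, or 9 bytes. The generic template encodes the bytes, while the overload for the size computer just advances a counter, and overload resolution picks it even when the call comes from inside another object's serializer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical size-only stream: counts bytes instead of storing them.
class ToySizeComputer {
    std::size_t nSize = 0;

public:
    void write(const char*, std::size_t n) { nSize += n; }
    void seek(std::size_t n) { nSize += n; }  // pretend n bytes were written
    std::size_t size() const { return nSize; }
};

// Hypothetical byte-collecting stream, used to exercise the generic path.
struct ToyByteStream {
    std::string buf;
    void write(const char* p, std::size_t n) { buf.append(p, n); }
};

// Encoded length of a compact-size integer.
inline unsigned int SizeOfCompactSize(std::uint64_t n) {
    if (n < 253) return 1;
    if (n <= 0xFFFF) return 3;
    if (n <= 0xFFFFFFFFu) return 5;
    return 9;
}

// Generic path: build the marker byte and little-endian payload, then write.
template <typename Stream>
void WriteCompactSizeToy(Stream& s, std::uint64_t n) {
    unsigned char buf[9];
    std::size_t len = 0;
    int width = 0;  // payload bytes following the marker
    if (n < 253) { buf[len++] = static_cast<unsigned char>(n); }
    else if (n <= 0xFFFF) { buf[len++] = 253; width = 2; }
    else if (n <= 0xFFFFFFFFu) { buf[len++] = 254; width = 4; }
    else { buf[len++] = 255; width = 8; }
    for (int i = 0; i < width; ++i)
        buf[len++] = static_cast<unsigned char>((n >> (8 * i)) & 0xFF);
    s.write(reinterpret_cast<const char*>(buf), len);
}

// Optimized path: the size computer only needs the length, not the bytes.
inline void WriteCompactSizeToy(ToySizeComputer& s, std::uint64_t n) {
    s.seek(SizeOfCompactSize(n));
}
```
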
Use fixed preallocation instead of costly GetSerializeSize
Dbwrapper used GetSerializeSize() to compute the size of the buffer
to preallocate. For some cases (specifically: CCoins) this requires
a costly compression call. Avoid this by just using fixed size
preallocations instead.
Avoid -Wshadow errors
Suggested by Pavel Janik.

@str4d str4d added this to the v1.1.1 milestone Apr 17, 2018

@str4d str4d requested review from daira, ebfull and Eirik0 Apr 17, 2018

@str4d str4d changed the title from Transaction serialization to Upstream serialization improvements Apr 17, 2018

@str4d

Contributor

str4d commented Apr 17, 2018

@zkbot try

@zkbot

Contributor

zkbot commented Apr 17, 2018

⌛️ Trying commit c7d7198 with merge 3b68ab2...

zkbot added a commit that referenced this pull request Apr 17, 2018

Auto merge of #3180 - str4d:transaction-serialization, r=<try>
Upstream serialization improvements

Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#5264
- bitcoin/bitcoin#6914
- bitcoin/bitcoin#6215
- bitcoin/bitcoin#8068
  - Only the `COMPACTSIZE` wrapper commit
- bitcoin/bitcoin#8658
- bitcoin/bitcoin#8708
  - Only the serializer variadics commit
- bitcoin/bitcoin#9039
- bitcoin/bitcoin#9125
  - Only the first two commits (the last two block on other upstream PRs)

Part of #2074.
@str4d

Contributor

str4d commented Apr 17, 2018

Transient RPC test failure in one builder; all the other supported builders passed.

@str4d str4d requested a review from bitcartel Apr 17, 2018

@ebfull

Contributor

ebfull commented Apr 18, 2018

ACK

@ebfull

ebfull approved these changes Apr 18, 2018

@@ -123,7 +123,7 @@ std::string CTxIn::ToString() const
  if (prevout.IsNull())
  str += strprintf(", coinbase %s", HexStr(scriptSig));
  else
- str += strprintf(", scriptSig=%s", scriptSig.ToString().substr(0,24));
+ str += strprintf(", scriptSig=%s", HexStr(scriptSig).substr(0, 24));

@arielgabizon

arielgabizon Apr 18, 2018

Contributor

wondering if there was a reason HexStr wasn't used here before

@arielgabizon

arielgabizon Apr 18, 2018

Contributor

Asked about it on bitcoin, seems fine. bitcoin/bitcoin#5264 (comment)

@@ -106,11 +106,11 @@ TEST(founders_reward_test, general) {
  // address = t2ENg7hHVqqs9JwU5cgjvSbxnT2a9USNfhy
  // script.ToString() = OP_HASH160 55d64928e69829d9376c776550b6cc710d427153 OP_EQUAL
  // HexStr(script) = a91455d64928e69829d9376c776550b6cc710d42715387
- EXPECT_EQ(params.GetFoundersRewardScriptAtHeight(1), ParseHex("a914ef775f1f997f122a062fff1a2d7443abd1f9c64287"));
+ EXPECT_EQ(HexStr(params.GetFoundersRewardScriptAtHeight(1)), "a914ef775f1f997f122a062fff1a2d7443abd1f9c64287");

@arielgabizon

arielgabizon Apr 18, 2018

Contributor

what does this have to do with the prevector?

@bitcartel

bitcartel Apr 18, 2018

Contributor

Good catch, the change to this file should be dropped.

@ebfull

ebfull Apr 19, 2018

Contributor

There is no implementation of equality now after the other changes, so this change is necessary or it won't compile.

@arielgabizon

arielgabizon Apr 19, 2018

Contributor

I don't see an equality operator for CScript being erased in this commit or the previous one.
So does that mean some commits in this PR don't compile?

@arielgabizon

arielgabizon Apr 19, 2018

Contributor

I guess what's going on is that when there's a ToString method it can compare as strings, and that was erased in the prevector commit. So the third commit doesn't compile, and the fourth (d2fb34f) does.
I guess that's fine.

@str4d

str4d Apr 20, 2018

Contributor

It's not that it was erased, just that upstream didn't implement it. The change here is definitely a no-op, as comparing two hex strings should always be the same as comparing their decoded forms.

@daira

daira Apr 20, 2018

Contributor

Provided the case matches, and HexStr always produces lowercase so that should be fine.
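
To make the equivalence discussed in this thread concrete, here is a minimal sketch with a hypothetical `ToHex` helper standing in for `HexStr` (assumed, as noted above, to always emit lowercase). Since each byte maps to exactly two lowercase digits, two byte vectors are equal exactly when their hex strings are equal.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for HexStr: always emits lowercase hex digits.
inline std::string ToHex(const std::vector<unsigned char>& bytes) {
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (unsigned char b : bytes) {
        out.push_back(kDigits[b >> 4]);    // high nibble
        out.push_back(kDigits[b & 0x0f]);  // low nibble
    }
    return out;
}

// Comparing the hex forms gives the same answer as comparing the raw bytes.
inline bool SameAnswer(const std::vector<unsigned char>& a,
                       const std::vector<unsigned char>& b) {
    return (a == b) == (ToHex(a) == ToHex(b));
}
```

The expected string in a test must also be lowercase for the comparison to hold, which is the caveat raised above.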

@bitcartel

Contributor

bitcartel commented Apr 18, 2018

@zkbot retry

@zkbot

Contributor

zkbot commented Apr 18, 2018

⌛️ Trying commit c7d7198 with merge d408e23...

zkbot added a commit that referenced this pull request Apr 18, 2018

Auto merge of #3180 - str4d:transaction-serialization, r=<try>
@zkbot

Contributor

zkbot commented Apr 19, 2018

☀️ Test successful - pr-try
State: approved= try=True

@bitcartel

ACK once @arielgabizon comment addressed: #3180 (comment)

@ebfull

Contributor

ebfull commented Apr 19, 2018

@zkbot r+

@zkbot

Contributor

zkbot commented Apr 19, 2018

📌 Commit c7d7198 has been approved by ebfull

@zkbot

Contributor

zkbot commented Apr 19, 2018

⌛️ Testing commit c7d7198 with merge 0753a0e...

zkbot added a commit that referenced this pull request Apr 19, 2018

Auto merge of #3180 - str4d:transaction-serialization, r=ebfull
@zkbot

Contributor

zkbot commented Apr 19, 2018

☀️ Test successful - pr-merge
Approved by: ebfull
Pushing 0753a0e to master...

@zkbot zkbot merged commit c7d7198 into zcash:master Apr 19, 2018

1 check passed

homu Test successful
Details

Consensus Protocol Team automation moved this from In Review to Complete PRs (Ignore) Apr 19, 2018

@zkbot zkbot referenced this pull request Apr 19, 2018

Open

Sprout test vectors #3107

@@ -1146,4 +1020,16 @@ inline void SerReadWriteMany(Stream& s, int nType, int nVersion, CSerActionUnser
::UnserializeMany(s, nType, nVersion, args...);
}
template <typename T>
size_t GetSerializeSize(const T& t, int nType, int nVersion = 0)

@arielgabizon

arielgabizon Apr 19, 2018

Contributor

Is there a reason to have the default value of 0 for nVersion?
Seems worth thinking of especially with all our transaction versions (though I don't remember if we are using nVersion for those)

@arielgabizon

arielgabizon Apr 19, 2018

Contributor

A side benefit of reviewing this is that I finally understood what that weird ADD_SERIALIZE_METHODS macro is!

@str4d

str4d Apr 20, 2018

Contributor

GetSerializeSize is not just for transactions; it is used by all serialization operations in zcashd. The justification for leaving the default as 0 is that changing it could likely break a bunch of other parsers.

@arielgabizon

arielgabizon Apr 20, 2018

Contributor

Ahh, yeah now I see a default of 0 in all the old versions of the method.

@@ -380,7 +380,7 @@ class CDiskBlockIndex : public CBlockIndex
// Only read/write nSproutValue if the client version used to create
// this index was storing them.
if ((nType & SER_DISK) && (nVersion >= SPROUT_VALUE_VERSION)) {

@arielgabizon

arielgabizon Apr 19, 2018

Contributor

The previous commit doesn't compile: nType is not defined here.
This change might be moved to the previous commit.

@arielgabizon

arielgabizon Apr 19, 2018

Contributor

Ahh OK, this line wasn't in the bitcoin code, that's why it's still wrong here.

@str4d str4d referenced this pull request Apr 20, 2018

Open

Bitcoin Core 0.12.0 #2074

193 of 452 tasks complete

@str4d str4d deleted the str4d:transaction-serialization branch Apr 20, 2018

str4d added a commit to str4d/zcash that referenced this pull request Apr 20, 2018

Remove now-unshadowed serialization lines that do nothing
Previously we had both nVersion as a class parameter *and* a serialization
argument, and in several inherited serializers the latter was set to the former,
in order to pass the serialized object's version into underlying parsers. zcash#3180
pulled in the upstream changes to clean this up, and in doing so these lines
became no-ops - setting the class parameter to itself. Clang throws warnings on
this, which turn into errors on the MacOS builder.

We can just remove these, because upstream already had done so in earlier PRs,
indicating that they were not being relied on by underlying parsers.

zkbot added a commit that referenced this pull request Apr 20, 2018

Auto merge of #3195 - str4d:3180-clang-warnings, r=<try>
@arielgabizon

Contributor

arielgabizon commented Apr 20, 2018

postmerge ack

src/key.h Outdated
{
unsigned int len = ::ReadCompactSize(s);
unsigned char code[BIP32_EXTKEY_SIZE];
s.read((char *)&code[0], len);

@daira

daira Apr 20, 2018

Contributor

This differs from the near-duplicate code in pubkey.h by not checking len. Why is there near-duplicate code and why does it differ?

ssPriv >> privCheck;
BOOST_CHECK(pubCheck == pubkeyNew);
BOOST_CHECK(privCheck == keyNew);

@daira

daira Apr 20, 2018

Contributor

Add a test that reads an invalid serialization, e.g. with a wrong length.
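
A sketch of the kind of length check and negative test being asked for, under stated assumptions: `ToyReader`, `ReadExtKey`, and `RejectsInput` are hypothetical, `kExtKeySize` is assumed to be 74 to mirror `BIP32_EXTKEY_SIZE`, and a single length byte stands in for the compact-size prefix (sufficient for lengths below 253). Without the check, a claimed length larger than the buffer would be read straight into the fixed-size array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

// Assumed to mirror BIP32_EXTKEY_SIZE; the value is an assumption here.
constexpr std::size_t kExtKeySize = 74;

// Hypothetical bounds-checked reader over a byte vector.
class ToyReader {
    std::vector<unsigned char> data_;
    std::size_t pos_ = 0;

public:
    explicit ToyReader(std::vector<unsigned char> data) : data_(std::move(data)) {}
    void read(unsigned char* out, std::size_t n) {
        if (n > data_.size() - pos_) throw std::runtime_error("read past end of stream");
        std::memcpy(out, data_.data() + pos_, n);
        pos_ += n;
    }
};

// Reads a one-byte length prefix and rejects anything but the exact size
// before copying into the fixed-size buffer.
inline void ReadExtKey(ToyReader& s, unsigned char (&code)[kExtKeySize]) {
    unsigned char len = 0;
    s.read(&len, 1);
    if (len != kExtKeySize) throw std::runtime_error("invalid extended key size");
    s.read(code, len);
}

// Test helper: true if the input is rejected with an exception.
inline bool RejectsInput(std::vector<unsigned char> bytes) {
    ToyReader r(std::move(bytes));
    unsigned char code[kExtKeySize];
    try {
        ReadExtKey(r, code);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```
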

* - T* indirect: a pointer to an array of capacity elements of type T
* (only the first _size are initialized).
*
* The data type T must be movable by memmove/realloc(). Once we switch to C++,

@daira

daira Apr 20, 2018

Contributor

This probably meant to say C++11.

};
private:
size_type _size;

@daira

daira Apr 20, 2018

Contributor

Document how this encodes the size: 29a8ade#diff-99a4340a7ab38fb07a2365d0e8d030daR263

T* indirect = indirect_ptr(0);
T* src = indirect;
T* dst = direct_ptr(0);
memcpy(dst, src, size() * sizeof(T));

@daira

daira Apr 20, 2018

Contributor

This seems to have UB if new_capacity < size(). Is that intentional? I would add an assertion new_capacity >= size() at the top if that is supposed to be a precondition of this method. (I know it's private, but still.)

}
template<typename Stream, typename Arg, typename... Args>
void SerializeMany(Stream& s, int nType, int nVersion, Arg&& arg, Args&&... args)

@daira

daira Apr 20, 2018

Contributor

What's that klunking sound I hear? Is it the sound of language features being duplicated?

- if (!(nType & SER_GETHASH))
+ inline void SerializationOp(Stream& s, Operation ser_action) {
+ int nVersion = s.GetVersion();
+ if (!(s.GetType() & SER_GETHASH))

@daira

daira Apr 20, 2018

Contributor

Why is the version excluded when computing the hash? Don't we want the hash to change if the version does?

(I know this is not affected by this commit.)

@@ -380,7 +380,7 @@ class CDiskBlockIndex : public CBlockIndex
  // Only read/write nSproutValue if the client version used to create
  // this index was storing them.
- if ((nType & SER_DISK) && (nVersion >= SPROUT_VALUE_VERSION)) {
+ if ((s.GetType() & SER_DISK) && (nVersion >= SPROUT_VALUE_VERSION)) {

@daira

daira Apr 20, 2018

Contributor

nVersion here refers to s.GetVersion(), which may be different from the block header version this->nVersion. I suggest replacing nVersion with s.GetVersion() here and on line 346 to make it more explicit.

@daira

daira Apr 20, 2018

Contributor

That is, although the local doesn't technically shadow the member variable, there is a way to refer to the member variable (as just nVersion) that would usually be correct and here means something different. That's confusing.

template <typename Stream>
void Unserialize(Stream& s)
{
unsigned int len = ::ReadCompactSize(s);

@daira

daira Apr 20, 2018

Contributor

This differs from the near-duplicate code in pubkey.h by not checking len. Why is there near-duplicate code and why does it differ?

zkbot added a commit that referenced this pull request Apr 24, 2018

Auto merge of #3195 - str4d:3180-clang-warnings, r=str4d