Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upstream serialization improvements #3180

Merged
merged 20 commits into from Apr 19, 2018
Merged

Conversation

mruddy and others added 20 commits April 16, 2018 07:38
These changes decode valid SIGHASH types on signatures in assembly (asm) representations of scriptSig scripts.
This squashed commit incorporates substantial helpful feedback from jtimon, laanwj, and sipa.
CExtPubKey should be serializable like CPubKey
Zcash: Excludes changes to CBanEntry and CHDChain, which we don't have yet.
Also add a variadic CDataStream ctor for ease-of-use.
CDataStream and CAutoFile had a ReadVersion and WriteVersion method
that was never used. Remove them.
The stream implementations had two cascading layers (the upper one
with operator<< and operator>>, and a lower one with read and write).
The lower layer's functions are never cascaded (nor should they, as
they should only be used from the higher layer), so make them return
void instead.
Make the various stream implementations' nType and nVersion private
and const (except in CDataStream where we really need a setter).
Given that in default GetSerializeSize implementations created by
ADD_SERIALIZE_METHODS we're already using CSizeComputer(), get rid
of the specialized GetSerializeSize methods everywhere, and just use
CSizeComputer. This removes a lot of code which isn't actually used
anywhere.

For CCompactSize and CVarInt this actually removes a more efficient
size computing algorithm, which is brought back in a later commit.
Remove the nType and nVersion as parameters to all serialization methods
and functions. There is only one place where it's read and has an impact
(in CAddress), and even there it does not impact any of the recursively
invoked serializers.

Instead, the few places that need nType or nVersion are changed to read
it directly from the stream object, through GetType() and GetVersion()
methods which are added to all stream classes.
Suggested by Pavel Janik.
The CSerAction's ForRead() method does not depend on any runtime
data, so guarantee that requests to it can be optimized out by
making it constexpr.

Suggested by Cory Fields.
To get the advantages of faster GetSerializeSize() implementations
back that were removed in "Make GetSerializeSize a wrapper on top of
CSizeComputer", reintroduce them in the few places in the form of a
specialized Serialize() implementation. This actually gets us in a
better state than before, as these even get used when they're invoked
indirectly in the serialization of another object.
Dbwrapper used GetSerializeSize() to compute the size of the buffer
to preallocate. For some cases (specifically: CCoins) this requires
a costly compression call. Avoid this by just using fixed size
preallocations instead.
@str4d str4d added C-upstream-port Category: Changes that are ported from the Bitcoin Core codebase. C-cleanup Category: PRs that clean code up or issues documenting cleanup. labels Apr 17, 2018
@str4d str4d added this to the v1.1.1 milestone Apr 17, 2018
@str4d str4d requested review from daira, ebfull and Eirik0 April 17, 2018 12:38
@str4d str4d changed the title Transaction serialization Upstream serialization improvements Apr 17, 2018
@str4d
Copy link
Contributor Author

str4d commented Apr 17, 2018

@zkbot try

@zkbot
Copy link
Contributor

zkbot commented Apr 17, 2018

⌛ Trying commit c7d7198 with merge 3b68ab255fcbb233af50a76a684b7673738e3c0d...

zkbot added a commit that referenced this pull request Apr 18, 2018
Upstream serialization improvements

Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#5264
- bitcoin/bitcoin#6914
- bitcoin/bitcoin#6215
- bitcoin/bitcoin#8068
  - Only the `COMPACTSIZE` wrapper commit
- bitcoin/bitcoin#8658
- bitcoin/bitcoin#8708
  - Only the serializer variadics commit
- bitcoin/bitcoin#9039
- bitcoin/bitcoin#9125
  - Only the first two commits (the last two block on other upstream PRs)

Part of #2074.
@zkbot
Copy link
Contributor

zkbot commented Apr 19, 2018

☀️ Test successful - pr-try
State: approved= try=True

Copy link
Contributor

@bitcartel bitcartel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK once @arielgabizon comment addressed: #3180 (comment)

@ebfull
Copy link
Contributor

ebfull commented Apr 19, 2018

@zkbot r+

@zkbot
Copy link
Contributor

zkbot commented Apr 19, 2018

📌 Commit c7d7198 has been approved by ebfull

@zkbot
Copy link
Contributor

zkbot commented Apr 19, 2018

⌛ Testing commit c7d7198 with merge 0753a0e...

zkbot added a commit that referenced this pull request Apr 19, 2018
Upstream serialization improvements

Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#5264
- bitcoin/bitcoin#6914
- bitcoin/bitcoin#6215
- bitcoin/bitcoin#8068
  - Only the `COMPACTSIZE` wrapper commit
- bitcoin/bitcoin#8658
- bitcoin/bitcoin#8708
  - Only the serializer variadics commit
- bitcoin/bitcoin#9039
- bitcoin/bitcoin#9125
  - Only the first two commits (the last two block on other upstream PRs)

Part of #2074.
@zkbot
Copy link
Contributor

zkbot commented Apr 19, 2018

☀️ Test successful - pr-merge
Approved by: ebfull
Pushing 0753a0e to master...

@zkbot zkbot merged commit c7d7198 into zcash:master Apr 19, 2018
Protocol Team automation moved this from In Review to Complete PRs (Ignore) Apr 19, 2018
@zkbot zkbot mentioned this pull request Apr 19, 2018
src/serialize.h Outdated
@@ -1146,4 +1020,16 @@ inline void SerReadWriteMany(Stream& s, int nType, int nVersion, CSerActionUnser
::UnserializeMany(s, nType, nVersion, args...);
}

template <typename T>
size_t GetSerializeSize(const T& t, int nType, int nVersion = 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to have the default value of 0 for nVersion?
Seems worth thinking of especially with all our transaction versions (though I don't remember if we are using nVersion for those)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A side benefit of reviewing this is that I finally understood what that weird ADD_SERIALIZE_METHODS macro is!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetSerializeSize is not just for transactions; it is used by all serialization operations in zcashd. The justification for leaving the default as 0 is that changing it could likely break a bunch of other parsers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, yeah now I see a default of 0 in all the old versions of the method.

@@ -380,7 +380,7 @@ class CDiskBlockIndex : public CBlockIndex

// Only read/write nSproutValue if the client version used to create
// this index was storing them.
if ((nType & SER_DISK) && (nVersion >= SPROUT_VALUE_VERSION)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the previous commit has errors nType not defined here and doesn't compile.
This change might be moved to the previous commit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh OK, this line wasn't in the bitcoin code, that's why it's still wrong here.

@str4d str4d mentioned this pull request Apr 20, 2018
@str4d str4d deleted the transaction-serialization branch April 20, 2018 08:45
str4d added a commit to str4d/zcash that referenced this pull request Apr 20, 2018
Previously we had both nVersion as a class parameter *and* a serialization
argument, and in several inherited serializers the latter was set to the former,
in order to pass the serialized object's version into underlying parsers. zcash#3180
pulled in the upstream changes to clean this up, and in doing so these lines
became no-ops - setting the class parameter to itself. Clang throws warnings on
this, which turn into errors on the MacOS builder.

We can just remove these, because upstream already had done so in earlier PRs,
indicating that they were not being relied on by underlying parsers.
@arielgabizon
Copy link
Contributor

postmerge ack

src/key.h Outdated
{
unsigned int len = ::ReadCompactSize(s);
unsigned char code[BIP32_EXTKEY_SIZE];
s.read((char *)&code[0], len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This differs from the near-duplicate code in pubkey.h by not checking len. Why is there near-duplicate code and why does it differ?

ssPriv >> privCheck;

BOOST_CHECK(pubCheck == pubkeyNew);
BOOST_CHECK(privCheck == keyNew);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test that reads an invalid serialization, e.g. with a wrong length.

* - T* indirect: a pointer to an array of capacity elements of type T
* (only the first _size are initialized).
*
* The data type T must be movable by memmove/realloc(). Once we switch to C++,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably meant to say C++11.

};

private:
size_type _size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document how this encodes the size: 29a8ade#diff-99a4340a7ab38fb07a2365d0e8d030daR263

T* indirect = indirect_ptr(0);
T* src = indirect;
T* dst = direct_ptr(0);
memcpy(dst, src, size() * sizeof(T));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to have UB if new_capacity < size(). Is that intentional? I would add an assertion new_capacity >= size() at the top if that is supposed to be a precondition of this method. (I know it's private, but still.)

src/serialize.h Outdated
}

template<typename Stream, typename Arg, typename... Args>
void SerializeMany(Stream& s, int nType, int nVersion, Arg&& arg, Args&&... args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's that klunking sound I hear? Is it the sound of language features being duplicated?

if (!(nType & SER_GETHASH))
inline void SerializationOp(Stream& s, Operation ser_action) {
int nVersion = s.GetVersion();
if (!(s.GetType() & SER_GETHASH))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the version excluded when computing the hash? Don't we want the hash to change if the version does?

(I know this is not affected by this commit.)

if (!(nType & SER_GETHASH))
inline void SerializationOp(Stream& s, Operation ser_action) {
int nVersion = s.GetVersion();
if (!(s.GetType() & SER_GETHASH))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the version excluded when computing the hash? Don't we want the hash to change if the version does?

(I know this is not affected by this commit.)

@@ -380,7 +380,7 @@ class CDiskBlockIndex : public CBlockIndex

// Only read/write nSproutValue if the client version used to create
// this index was storing them.
if ((nType & SER_DISK) && (nVersion >= SPROUT_VALUE_VERSION)) {
if ((s.GetType() & SER_DISK) && (nVersion >= SPROUT_VALUE_VERSION)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nVersion here refers to s.GetVersion(), which may be different from the block header version this->nVersion. I suggest replacing nVersion with s.GetVersion() here and on line 346 to make it more explicit.

Copy link
Contributor

@daira daira Apr 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, although the local doesn't technically shadow the member variable, there is a way to refer to the member variable (as just nVersion) that would usually be correct and here means something different. That's confusing.

template <typename Stream>
void Unserialize(Stream& s)
{
unsigned int len = ::ReadCompactSize(s);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This differs from the near-duplicate code in pubkey.h by not checking len. Why is there near-duplicate code and why does it differ?

zkbot added a commit that referenced this pull request Apr 24, 2018
Remove now-unshadowed serialization lines that do nothing

Previously we had both nVersion as a class parameter *and* a serialization
argument, and in several inherited serializers the latter was set to the former,
in order to pass the serialized object's version into underlying parsers. #3180
pulled in the upstream changes to clean this up, and in doing so these lines
became no-ops - setting the class parameter to itself. Clang throws warnings on
this, which turn into errors on the MacOS builder.

We can just remove these, because upstream already had done so in earlier PRs,
indicating that they were not being relied on by underlying parsers.
str4d added a commit to str4d/zcash that referenced this pull request Jun 27, 2019
Originally excluded from zcash#3180 due to these PRs not having been
backported yet.
str4d added a commit to str4d/zcash that referenced this pull request Dec 4, 2019
Originally excluded from zcash#3180 due to these PRs not having been
backported yet.
zkbot added a commit that referenced this pull request Feb 15, 2021
Backport of bitcoin/bitcoin#8020

Cherry-picked from bitcoin/bitcoin#8020

- cherry-pick bitcoin/bitcoin@a68ec21f7e
  - conflict with bitcoin/bitcoin@957e5d2
- cherry-pick bitcoin/bitcoin@8cc9cfe160
- cherry-pick bitcoin/bitcoin@382c871d28
  - replace CCoinsKeyHasher with SaltedTxidHasher
- cherry-pick bitcoin/bitcoin@0b1295b066
  - self-conflict with previously-applied bitcoin/bitcoin@b8a657936 (#3180)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. C-upstream-port Category: Changes that are ported from the Bitcoin Core codebase.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet