Remove unused statements in serialization #8658
I'd suspect the binaries to be the same if any optimization flags are enabled. |
Tests/travis is OK. Can anyone test binaries, please? |
89d7305 to 64d9507
Rebased to match |
The only place where the implicitly-passed-around |
@sipa Exactly. And there is no such line like in the proposed change. |
utACK 64d9507 After this I think we can actually go further and completely remove the nType and nVersion arguments from all SerializationOp methods, and replace them with calls to s.GetType() and s.GetVersion(). |
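The follow-up proposed here can be illustrated with a minimal sketch. It assumes a stream that stores its own type and version and exposes them through `GetType()` and `GetVersion()`, so serialization methods no longer take `nType` and `nVersion` arguments. `SketchStream` and `SketchAddress` are hypothetical names for illustration only, not the project's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal stream that carries its own type and version,
// so callers do not have to pass them down as extra arguments.
class SketchStream {
public:
    SketchStream(int type, int version) : nType(type), nVersion(version) {}

    int GetType() const { return nType; }
    int GetVersion() const { return nVersion; }

    // Append raw bytes to the internal buffer.
    void write(const void* data, std::size_t len) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        buf.insert(buf.end(), p, p + len);
    }

    std::size_t size() const { return buf.size(); }

private:
    const int nType;
    const int nVersion;
    std::vector<unsigned char> buf;
};

// Hypothetical object whose wire format depends on the stream version.
struct SketchAddress {
    std::uint32_t nTime = 0;
    std::uint16_t port = 0;

    template <typename Stream>
    void Serialize(Stream& s) const {
        // The version is read from the stream, not from an nVersion parameter.
        if (s.GetVersion() >= 2) {
            s.write(&nTime, sizeof(nTime));
        }
        s.write(&port, sizeof(port));
    }
};
```

With this shape, the only way to influence the format is through the stream object itself, which makes it easier to see where the version actually comes from.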
@paveljanik No need to test them. Apparently I get the same binaries, which means this is indeed dead code. (Instead of deleting the lines, you can replace them with empty lines to get rid of the offsets in the objdump). Can you check this? |
bitcoind-matches-ACK 64d9507 (qt binaries do not match, though) |
Eh, isn't this intended to allow the serialised version to override the parameter? |
@luke-jr Yes, that was the intention. But I don't think it's very usable.
First of all, it requires that everything can be encoded inside a single
version number, something that is increasingly hard in a decentralized
environment. It also has never ever actually been used. It also is sort of
a layer violation, as you need a single namespace for versioning across all
of wallet code, network code, consensus-critical hash computation, database
storage, ...
I think there are more flexible mechanisms possible using wrapper objects
that introduce new serialization formats.
|
These are all implementation-specific objects, so decentralisation is less of an issue. But I guess it's fine so long as it's never been used before (but I'm not certain of that). |
Concept ACK.
It can always be added back for a specific serialization op in the off chance that it is needed. No need to keep dead code around 'just in case'. |
I think this is ready for more ACKs. I volunteer for the next steps pointed out by @sipa above. |
Cannot reproduce this, I detect differences in the following functions between 64d9507 and 6898213:
|
Make sure to replace the deleted lines by empty lines to get rid of the offset? |
BTW - this is not I think that we should compare binaries in some higher |
I think it's unlikely that this can result in identical binaries. It requires some cross-function reasoning across multiple modules to see this has no effect. We are changing the actual values of arguments passed down - they're just not used. |
So these are line numbers? Ok, that'd make sense, will try again and w/ disabled LINE.
Yes the changes in the constructors are more involved.
Increasing -O generally makes it worse, not better. Many optimization settings make the output extremely sensitive to small changes in the input, and cause a change in one function to move another (barely related) function, making per-function comparison useless. This is why I went out of my way to find specific flags that work for this in my build/compare script: https://github.com/laanwj/bitcoin-maintainer-tools/blob/master/build-for-compare.py#L33 . It'd be possible to add specific flags if they're known to be safe. |
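The line-number sensitivity discussed above can be shown with a small sketch. It assumes the differences come from `__LINE__` being compiled into the code as a constant, as `assert` does for its failure message. `SKETCH_LINE_TAG`, `TagA`, and `TagB` are hypothetical names for illustration.

```cpp
#include <cassert>

// Hypothetical assert-like macro: it embeds the current source line
// number as a constant in the generated code.
#define SKETCH_LINE_TAG() (__LINE__)

// Each function returns the line it was written on. Deleting or adding
// a source line above either function changes the constant it returns,
// and therefore the disassembly, even though the logic is untouched.
inline int TagA() { return SKETCH_LINE_TAG(); }

inline int TagB() { return SKETCH_LINE_TAG(); }
```

This is why replacing deleted lines with empty lines, or neutralizing `__LINE__` in the build, removes this class of differences from an objdump comparison.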
It is very unlikely to produce the same binary, but to be safe, we should fully understand the difference. Can As a side note: in the "gcc set" |
They had to do with In any case I've updated the above list, they no longer show up after removing line number sensitivity there. |
Right:
Thus silently passes nVersion to ::SerReadWrite. This function may or may not use the argument, but it is used. This does change my opinion on this change from "harmless" to "hard to verify for correctness".
Indeed, if you change the argument name, the name binding will change and functions called will suddenly receive At least that's what I would infer. I don't see why it would lead to a crashing segwit test though... This makes it kind of scary. |
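The name-binding hazard described in this comment can be sketched as follows. It assumes a macro that refers to `nVersion` by name without receiving it as an argument. `SketchReadWrite`, `SKETCH_READWRITE`, and `SketchBlock` are hypothetical stand-ins, not the project's real macro or classes.

```cpp
#include <cassert>

// Hypothetical stand-in for the function the macro forwards to:
// it simply reports which version value it was handed.
inline int SketchReadWrite(int nVersion) { return nVersion; }

// Hypothetical macro in the style discussed above: it names nVersion
// implicitly, so it uses whatever nVersion is in scope at the call site.
#define SKETCH_READWRITE() SketchReadWrite(nVersion)

struct SketchBlock {
    int nVersion = 4; // member, e.g. a block version field

    // The parameter shadows the member, so the macro forwards the
    // parameter (the serialization version).
    int WithShadowing(int nVersion) const { return SKETCH_READWRITE(); }

    // After renaming the parameter to silence a shadowing warning, the
    // macro silently binds to the member instead.
    int WithRenamedParam(int nVersionIn) const {
        (void)nVersionIn; // no longer reaches the macro
        return SKETCH_READWRITE();
    }
};
```

Both methods compile, but they forward different values, which is the kind of silent behaviour change that makes a pure rename risky here.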
OHH I get it, maybe.
expands to
So the first line will changes Do you really need that change for shadowing? I'd prefer not to do that, it looks to me that we're taking a huge risk just to avoid some compiler warnings. CBlock is as deep in consensus code as you can get. |
@laanwj IMHO, this reveals that |
IMO it'd have been better to just write those macros out. They apparently make the code shorter, but much more obfuscated. |
My suggestion was that after this PR we would proceed to get rid of the
nVersion and nType implicit parameters, and just replace them with getters
on the stream implementation.
That would make things much easier to reason about.
|
Guess I've been doing it wrong then.
git checkout bitcoin/master && \
git reset --hard HEAD && \
curl https://raw.githubusercontent.com/laanwj/bitcoin-maintainer-tools/6e4425587736144b067f67ad792d9ee904e74fd7/patches/stripbuildinfo.patch | patch -p 1 && \
make -j 2 && \
objdump -d -r -C --no-show-raw-insn src/bitcoind > /tmp/d_old && \
curl https://github.com/bitcoin/bitcoin/commit/64d9507ea5724634783cdaa290943292132086a9.diff | patch -p 1 && \
make -j 2 && \
objdump -d -r -C --no-show-raw-insn src/bitcoind > /tmp/d_new && \
diff /tmp/d_old /tmp/d_new | wc
0 0 0 |
@MarcoFalke the differences there would be:
I don't think the first would make executables suddenly match. I'll retry with "-O2" and see if I can get a match. |
Good news: using
as well as on bitcoin-qt
ACK 64d9507 |
@laanwj Thanks! |
This still has a [WIP] tag on the commit. However I'm going to merge nevertheless, as rebasing to change the commit message would have us all re-check executables again... |
64d9507 [WIP] Remove unused statement in serialization (Pavel Janík)
Upstream serialization improvements
Cherry-picked from the following upstream PRs:
- bitcoin/bitcoin#5264
- bitcoin/bitcoin#6914
- bitcoin/bitcoin#6215
- bitcoin/bitcoin#8068
  - Only the `COMPACTSIZE` wrapper commit
- bitcoin/bitcoin#8658
- bitcoin/bitcoin#8708
  - Only the serializer variadics commit
- bitcoin/bitcoin#9039
- bitcoin/bitcoin#9125
  - Only the first two commits (the last two block on other upstream PRs)
Part of #2074.
249cc9d Avoid -Wshadow errors (random-zebra)
8e1ec9e Use fixed preallocation instead of costly GetSerializeSize (random-zebra)
9b801d0 Add optimized CSizeComputer serializers (random-zebra)
0035a54 Make CSerAction's ForRead() constexpr (random-zebra)
9730a3f Get rid of nType and nVersion (random-zebra)
25ce2bb Make GetSerializeSize a wrapper on top of CSizeComputer (random-zebra)
1b479db Make nType and nVersion private and sometimes const (random-zebra)
35f1755 Make streams' read and write return void (random-zebra)
a395914 Remove unused ReadVersion and WriteVersion (random-zebra)
52e614c [WIP] Remove unused statement in serialization (random-zebra)
82a2021 Add COMPACTSIZE wrapper similar to VARINT for serialization (random-zebra)
13ad779 add bip32 pubkey serialization (random-zebra)
9e9b7b5 [QA] Update json files with sig hash type in ASM for bitcoin-util-test (random-zebra)
3383983 Resolve issue bitcoin#3166 (random-zebra)

Pull request description:

Based on top of
- [x] #1629

Backports the following serialization improvements from upstream and adds the required changes for the 2nd layer network and the legacy zerocoin code.

- bitcoin#5264
  > show scriptSig signature hash types. fixes bitcoin#3166
  >
  > The fix basically appends the scriptSig signature hash types, within parentheses, onto the end of the signature(s) in the various "asm" json outputs. That's just the first formatting idea that came to my mind.
  >
  > Added some tests for this too.
- bitcoin#6215
  > CExtPubKey should be serializable like CPubKey.
  > This would allow storing extended private and public key to support BIP32/HD wallets.
- bitcoin#8068 (only commit 5249dac)
  This adds COMPACTSIZE wrapper similar to VARINT for serialization
- bitcoin#8658
  > As the line
  > ```
  > nVersion = this->nVersion;
  > ```
  > seems to have no meaning in READ and also in WRITE serialization op, let's remove it and see what our tests/travis will tell us. See bitcoin#8468 for previous discussion.
- bitcoin#9039
  > The commits in this pull request implement a sequence of changes:
  >
  > - Simplifications:
  >   - **Remove unused ReadVersion and WriteVersion** CDataStream and CAutoFile had a ReadVersion and WriteVersion method that was never used. Remove them.
  >   - **Make nType and nVersion private and sometimes const** Make the various stream implementations' nType and nVersion private and const (except in CDataStream where we really need a setter).
  >   - **Make streams' read and write return void** The stream implementations have two layers (the upper one with operator<< and operator>>, and a lower one with read and write). The lower layer's return values are never used (nor should they, as they should only be used from the higher layer), so make them void.
  >   - **Make GetSerializeSize a wrapper on top of CSizeComputer** Given that in default GetSerializeSize implementations we're already using CSizeComputer(), get rid of the specialized GetSerializeSize methods everywhere, and just use CSizeComputer. This removes a lot of code which isn't actually used anywhere. In a few places, this removes an actually more efficient size computing algorithm, which we'll bring back in the "Add optimized CSizeComputer serializers" commit later.
  >   - **Get rid of nType and nVersion** The big change: remove the nType and nVersion as parameters to all serialization methods and functions. There is only one place where it's read and has an impact (in CAddress), and even there it does not impact any of the member objects' serializations. Instead, the few places that need nType or nVersion read it directly from the stream, through GetType and GetVersion calls which are added to all streams.
  >   - **Avoid -Wshadow errors** As suggested by @paveljanik, remove the few remaining cases of variable shadowing in the serialization code.
  > - Optimizations:
  >   - **Make CSerAction's ForRead() constexpr** The CSerAction's ForRead() method does not depend on any runtime data, so guarantee that requests to it can be optimized out by making it constexpr (suggested by @theuni in bitcoin#8580).
  >   - **Add optimized CSizeComputer serializers** To get the advantages of faster GetSerializeSize implementations back, reintroduce them in the few places where they actually make a difference, in the form of a specialized Serialize implementation. This actually gets us in a better state than before, as these even get used when they're nested inside the serialization of another object.
  >   - **Use fixed preallocation instead of costly GetSerializeSize** dbwrapper uses GetSerializeSize to compute the size of the buffer to preallocate. For some cases (specifically: CCoins) this requires a costly compression call. Avoid this by just using fixed size preallocations instead.
  >
  > This will make it easier to address @TheBlueMatt's comments in bitcoin#8580, resulting is a simpler and more efficient way to simultaneously deserialize+construct objects with const members from streams.

ACKs for top commit:
furszy: Long and nice PR 👌 , code review ACK 249cc9d .
Fuzzbawls: ACK 249cc9d
furszy: tested ACK 249cc9d and merging.

Tree-SHA512: 56b07634b1e18871e7c9a99d412282c83b85f77f1672ec56330a1131fc7c234cd1ba3a053bdd210cc29f1e636ee374477ff614fa9a930329a7f8f912c5006232
As the line
nVersion = this->nVersion;
seems to have no meaning in either the READ or the WRITE serialization op, let's remove it and see what our tests/travis will tell us. See #8468 for previous discussion.