-
Notifications
You must be signed in to change notification settings - Fork 36.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pre-allocated vector type and use it for CScript #6914
Conversation
tentative concept ACK, once-over utACK (not in depth) As discussed on IRC, the trade offs [currently] are API compatibility (how invasive the change is) and the complexity of the change to achieve the desired memory/performance characteristics. |
Concept ACK. |
@sipa Not sure why you'd consider this controversial. Seems like a big win to me. |
@jonasschnelli well, disabled should (eventually) be representative of initial sync impact... runtime impact is a bit harder to measure, I expect, because it will depend greatly on the signature cache. When the hitrate is high I expect it to be closer to these figures, and when it's low... the signature validation would dwarf this improvement. @btcdrak because it's hard to review, if it weren't such a big improvement in speed/memory it wouldn't be worth considering. |
This is a standard technique for vectors. I don't find it controversial. concept ACK |
@gmaxwell Understood, but I was meaning, if there is clearly such a benefit, it makes it uncontroversial to me. Anyway, Concept ACK, will try to review this weekend. |
@jgarzik Controversial to go change such a widely used consensus data
structure.
|
I like the speedup. I do think it introduces a lot of somewhat hard to review code, and by sake of being part of CScript it becomes part of the consensus code. At some point we should separate how scripts are allocated/stored, which is not consensus critical, from the code used to evaluate them, which is consensus critical, but could act on any slice of memory. |
@laanwj Yes, absolutely! That would also help building types that use arena
storage for scripts (like one single allocation arena for a CCoins, a
CTransaction, or even a whole CBlock).
I would of course like to see this change in before that time, but feel
free to demand better comments and/or tests if not considered sufficient.
|
@sipa Not sure if you were aware and/or based your work on this, but the "dynarray" was proposed for c++14, but didn't make it. http://en.cppreference.com/w/cpp/container/dynarray I only mention because using a reference implementation of that may ease the minds of reviewers. |
@cfields: dynarray is very different
|
@sipa ah sorry, I see. That's what i get for skimming. |
This is a huge speedup, and tests clean in valgrind. For reindex w/ libsecp256k1 this takes the time down from 3 hours 16 minutes to 2 hours 7 minutes. (With signature tests disabled completely this gets it down to 1h 17m.). Size of the UTXO set in memory is reduced from 5486MB to 4012MB. I intend to test this further, but I support this, and it looks generally okay to me. I'm surprised we got this much improvement just for cscript, I was expecting we'd have to make the entire cached transaction a single allocation to see this kind of benefit. |
Why isn't that the case OOI? I feel like CScript could just be 2 pointers, edit: Is |
Inside the coins cache the whole entry gets mutated, e.g. to delete outputs when they're spent... but never in a way which needs to increase their size. The cscripts themselves in these entries are never mutated at all. The mutation that decreases the entry size could be addressed by flagging their deletion, and a separate operation which compacts the coins cache (e.g. takes txouts which many deleted entries and packs then reallocs them to a smaller size)-- e.g. it could be run on flush (which scans for modified entries) as well as periodically when flusing is infrequent (e.g. db cache set to infinity), perhaps triggered by an overhead counter (which would be cheap to maintain). This would eliminate quite a few more malloc operations and further reduce fragmentation; but I suspect it would be a much bigger change than just changing the type of cscripts. |
Indeed, as discussed with @sipa originally, the biggest benefit about this change is how isolated it is, it just hot swaps an existing interface, while netting a huge performance increase. |
15ece3f
to
710442e
Compare
Added some comments. |
@dcousens CScript is hardly ever mutated. But CCoins which contains a vector of CTxouts, which contain a CScript is mutated often. The proposal is to make CCoins allocate its entire memory (including several CScripts) at once. |
Added a few more iterator tests (which caught (compile-time) errors), and removed the [WIP] marker. |
Do we want this for 0.12? |
I think so, it's a large performance improvement and memory usage reduction. |
ACK for 0.12 |
I'd prefer to merge a large, reasonably risky change like this after 0.12 branch, but if you are confident enough about this, I'm ok with it. |
i've done a light code review and have used this pull extensively. |
I have tested this extensively-- including operation in valgrind, perhaps more than I've tested the current tip without it. I have read the patch and think it looks fine but I feel anything doing this exceeds my knowledge of the subtle behaviors of C++ so I do not consider my code review to have much value on this pull. I think it's okay to merge, and considering our interactions if it causes trouble during the 0.12 pre-release cycle its also not hard to back out: absent it; we're likely to ignore some performance problems in 0.12 chalking it up to "well 6914 is a big fix". It will also see a lot more testing if its merged for 0.12 (as I will have multiple people working full time on testing the 0.12 and the next elements update that will be 0.12 based). |
I'll pull this on to my own nodes for extended testing, and will look over it once more. |
114b581 Prevector type (Pieter Wuille)
Upstream serialization improvements Cherry-picked from the following upstream PRs: - bitcoin/bitcoin#5264 - bitcoin/bitcoin#6914 - bitcoin/bitcoin#6215 - bitcoin/bitcoin#8068 - Only the `COMPACTSIZE` wrapper commit - bitcoin/bitcoin#8658 - bitcoin/bitcoin#8708 - Only the serializer variadics commit - bitcoin/bitcoin#9039 - bitcoin/bitcoin#9125 - Only the first two commits (the last two block on other upstream PRs) Part of #2074.
Upstream serialization improvements Cherry-picked from the following upstream PRs: - bitcoin/bitcoin#5264 - bitcoin/bitcoin#6914 - bitcoin/bitcoin#6215 - bitcoin/bitcoin#8068 - Only the `COMPACTSIZE` wrapper commit - bitcoin/bitcoin#8658 - bitcoin/bitcoin#8708 - Only the serializer variadics commit - bitcoin/bitcoin#9039 - bitcoin/bitcoin#9125 - Only the first two commits (the last two block on other upstream PRs) Part of #2074.
a8b965e Explicitly initialize prevector _union (random-zebra) bdd98e8 [Trivial] Add a comment on the use of prevector in script. (random-zebra) bbaa6e3 Clarify prevector::erase and avoid swap-to-clear (random-zebra) 28a8435 prevector: assert successful allocation (random-zebra) de41cb5 Only call clear on prevector if it isn't trivially destructible (random-zebra) 8784df3 Make CScript (and prevector) c++11 movable. (random-zebra) 83f9ac6 serialize: Deprecate `begin_ptr` / `end_ptr` (random-zebra) c3ecc12 prevector: add C++11-like data() method (random-zebra) 5490aa0 test prevector::swap (random-zebra) 035760e prevector::swap: fix (unreached) data corruption (random-zebra) 0e71400 prevector: destroy elements only via erase() (random-zebra) 9811a68 [Core] Prevector type (random-zebra) Pull request description: Based on: - [x] #1554 This introduces `prevector<N, T>`, a new basic data type which is a fully API compatible drop-in replacement for `std::vector<T>` and uses it for the script, to reduce the considerable memory overhead of vectors in cases where they normally contain a small number of small elements. Original tests in Bitcoin showed a reduction in dbcache memory usage by **23%**, and made an initial sync **13%** faster. Backported from the following upstream's PRs, with only additional edits in the 2nd layer scripts (`masternode-payments`, `masternode-budget`) and in the unit tests (to address the differences in `insecure_rand`): - bitcoin#6914 - bitcoin#7888 - bitcoin#8850 - bitcoin#9349 - bitcoin#9505 - bitcoin#9856 - bitcoin#10534 - bitcoin#11011 - bitcoin#14028 ~~NOTE: Updates to `memusage.h` needed as well, when #1531 is merged.~~ ACKs for top commit: furszy: Tested ACK a8b965e . Fuzzbawls: ACK a8b965e Tree-SHA512: 203b660dd8d9f98780be660dae0ac951766ba438836bd6f0ec921e7749c1a84143f3b920fe49f2cc399f4aa7e763098f78746d3b6ff845bcf913bb13de984bc1
This is likely a controversial change, and will need very careful review and testing.
It adds a new basic data type (
prevector<N, T>
) which is a fully API compatible drop-in replacement forstd::vector<T>
(except it does not support custom allocators, does not haveat()
, and misses a few comparison operators). It will allocate up to N elements inside the parent container directly, only switching over to heap-allocated storage if more is needed. The data structures for the N elements and the pointer/size metadata for the heap-allocated storage are shared, making it very efficient for small N.CScript is switched to use this new type, reducing the memory consumption of mempool and chainstate.
A benchmark (reindex with script validation disabled):
So for this reindex up to 223133, the chainstate needs 23% less memory, and is 13% faster.
There are likely several other places where this datatype can be used.