graphene: first draft #973

Conversation
Hi @bissias,

I'm eager and excited to test and review the code you submitted. Just want to thank you in advance for the work you have done here!

Hey Andrea, my pleasure to contribute! Thanks for taking a look. Just a word of warning, I'm new to the Bitcoin code and C++, so expect it to be a little rough ;-)

BTW, I'm on the BU slack channel (username "george"). Are you on as well?
Hi George,

> Just a word of warning, I'm new to the Bitcoin code and c++, so expect it to be a little rough ;-)

don't worry, I consider myself a C++ newbie.

> BTW, I'm on the BU slack channel (username "george"). Are you on as well?

yes, I'm on BU slack using the same nick I'm using here.
The travis integration test failures are due to source code formatting problems; they are not actual failures. See https://travis-ci.org/BitcoinUnlimited/BitcoinUnlimited/jobs/342353084#L698 for more details. To get your files automatically formatted, add this snippet to your .git/config file in your cloned repo:

```
[filter "bitcoin-clang-format"]
    clean = "contrib/devtools/clang-format.py format-stdout-if-wanted clang-format-3.8 %f"
    smudge = cat
```

and install clang-format-3.8 on your dev machine (I'm supposing you are using a relatively recent Ubuntu Linux distro for your dev env).

Ok, will do. Do you want me to just start a new pull request after making these formatting changes?
@bissias just fix the formatting problems, commit the changes, and push. GitHub will automatically update the PR for you without losing the comments and reviews already made. If you want, before pushing you can squash the commit containing the formatting fix into the previous one.
@bissias: Alright, awesome! Funny coincidence, but just a couple of hours ago I asked Gavin Andresen on the BU slack when and where (in which code fork) implementations of the Graphene protocol are to be expected. He told me to ask Brian Levine, and before I could do that, here we are :) I assume you work in his group?

I am working on preconsensus in the form of weak blocks, as described by Peter Rizun in his paper. One of the things that I want to do, but which would somewhat overlap with your work, is transmitting weak blocks as delta blocks. What I do right now is hijack @ptschip's thinblocks code. But yeah, in terms of actual transmission of my deltablocks I really still need to do something proper :)

I imagine that it should be very doable to also use the graphene protocol, or a variant that would point to a weak block as the basis and then just transmit the delta that is not in the weak block. If so, I'd like to take the opportunity to remark here that it would be great if graphene could provide such a hook and provisions to use it in a "deltablocks" scheme as well. That way, both could be combined in the longer run and even further reduce the burstiness of block transmission.

I guess I should also remark that I still consider the whole notion of weak blocks experimental, and even though I am optimistic that this will deliver benefits, it is in experimental stages still. The proof for that is in the pudding.

What do you think about a delta scheme for graphene?
@awemany, that is a funny coincidence! Glad to finally have something to submit for discussion. And yes, I'm working in Brian's group.

I'm only familiar with the basic idea of weak blocks, but my understanding is that you need data structures for both the weak and final blocks. The final block will point to a weak block and may additionally contain some new transactions. From what you wrote, it sounds like you're considering using something like graphene for the final block? It seems reasonable in that final and graphene blocks share the same objective of communicating a set of transactions that are probably already on the recipient's machine.

But what hooks did you have in mind exactly? The core logic for graphene that performs set reconciliation is pretty small; it's the small details of the overall protocol that constitute the majority of the code. So one way I could enable reuse of the graphene code would be to factor out that core logic into a class that could be used outside of the graphene protocol.
Kind of! Weak blocks is what Peter Rizun called subchains in his paper: https://www.bitcoinunlimited.info/resources/subchains.pdf

The idea of weak blocks is basically to publish partial solutions found while searching for a block at reduced difficulty (but while searching for a strong block, of course, so the miners essentially do the same thing as before!). As a weak block is found, it is sent around and kept as an increment to the state of the chain by all participating nodes ("deltablocks"). It is meant as a non-sybil-able preconsensus scheme for what goes into the next strong block. It is assumed that participating nodes will refer to the last weak block when building their next block, and will enjoy easier propagation of their solutions, as most of the block's entropy can be referred to by pointing to the weak block as a 'delta' (the incentive to use it). A side benefit is faster "fractional confirmations" on top of the 0-conf of transactions.

Now, the requirement is added that a node participating in weakblock propagation will build its next block (weak or strong) on top of the last weak block (or rather: the longest chain), in the following way (currently; this should of course be adapted for things like a preferred ordering etc.):

Assuming full propagation of such a weak block, this means that a node can then send out any found solution (strong or weak) by referring to the weak block it is based upon, plus just the set of transactions on top of that. In terms of graphene, if I understand the approach correctly, this would mean it looks like the mempool is reduced to just those txns that are not in the previous weak block, and a single extra hash would be necessary in a "deltagraphene" message that refers to this last propagated weak block.

And this is what I am wondering about: how to make it so that one can say "these transactions are for sure in the transmitted block, see hash so-and-so here", and thus further reduce the number of transactions that the graphene set reconciliation has to deal with: by a factor of weakblockrate / strongblockrate, assuming no orphaning problems. Basically, what I'd like to have is a way to say "transmit this CBlock using graphene, but tell graphene that these transactions from that other block (except CB) are for sure in this block, and in the same order".

The advantage of weak blocks over just reducing the block time interval is that this is purely an opt-in scheme that can be gradually implemented by participating miners and does not need a chain fork, nor a change of this fundamental variable and all the assumptions that have been built on top of it. All other code stays the same, and code that is ignorant of weak blocks will see the chain progress just like it did before.
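A rough sketch of that framing; every name below (make_delta_graphene, weak_ref, encode_graphene) is hypothetical and invented for illustration — nothing here is part of this PR:

```python
# Hypothetical "deltagraphene" construction: graphene only has to reconcile
# the transactions sitting on top of the referenced weak block.
def make_delta_graphene(block_txids, weak_block_hash, weak_block_txids,
                        encode_graphene):
    delta = block_txids - weak_block_txids    # txs not in the weak block
    return {
        "weak_ref": weak_block_hash,          # the single extra 32-byte hash
        "payload": encode_graphene(delta),    # BF + IBLT over the delta only
    }
```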
qa/rpc-tests/grapheneblocks.py
Outdated
self.nodes[0].generate(100)
self.sync_all()
self.nodes[0].generate(5)
self.sync_all()
why not just
self.nodes[0].generate(105)
self.sync_all()
Done.
qa/rpc-tests/grapheneblocks.py
Outdated
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
# Copyright (c) 2014-2015 The Bitcoin Core developers
# Copyright (c) 2015-2016 The Bitcoin Unlimited developers
nit. copyright header should be
Copyright (c) 2018 The Bitcoin Unlimited developers
since this is a new file.
Done.
src/graphene.cpp
Outdated
@@ -0,0 +1,1514 @@
// Copyright (c) 2016-2017 The Bitcoin Unlimited developers
nit: copyright header (same as qa/rpc-tests/grapheneblocks.py)
Done.
src/graphene.cpp
Outdated
{
    LOCK(cs_orphancache);
    LOCK(cs_xval);
    if (!ReconstructBlock(pfrom, fXVal, missingCount, unnecessaryCount))
nit, why not LOCK2(cs_orphancache, cs_xval)?
Done. I was copying xthin here :-D. But it compiles as you recommended so I changed it. In general, I would appreciate going over all the locks before merging with dev to make sure I got them right.
src/graphene.cpp
Outdated
}
else
{
    BOOST_FOREACH(const CTransaction &tx, block.vtx)
I know you inherited this from thinblock.cpp, but we are trying to use C++11 range-based for loops when introducing new code, e.g. something like:
- BOOST_FOREACH(const CTransaction &tx, block.vtx)
+ for (const CTransaction &tx : block.vtx)
Done.
src/graphene.h
Outdated
@@ -0,0 +1,254 @@
// Copyright (c) 2016-2017 The Bitcoin Unlimited developers
nit 2018
Done.
@@ -0,0 +1,266 @@
#include "iblt.h"
missing copyright header
I have a question here. I lifted this code from Gavin's repo. It is distributed with a MIT license. I did make some edits though. How should I copyright this while also crediting the original author?
you should use the original copyright notice (the one present in Gavin's repo, and any other copyright that Gavin included, if any), then add yours and BU's if you want to (IANAL).
There is no copyright in any single file, but the top level of the repository contains the following file. So I guess I should paste the contents into each iblt file and then include the BU copyright just above it?
https://github.com/gavinandresen/IBLT_Cplusplus/blob/master/LICENSE
Again, IANAL, but this is what I would do 😄
Ok fair enough! I pushed license information for the IBLT code.
What do you make of the wallet.py test failing due to a KeyboardInterrupt in CI? This happened just after I committed the license changes to the IBLT code; it doesn't seem like it could be related. For me all unit tests pass in my dev env, at least when I run make check. Is there another test suite I should be running as well?
this is a spurious failure due to a travis problem. Going to re-kick the failing job.
Ah I see, intermittent failure. Looks green now. Thanks.
@@ -0,0 +1,100 @@
#ifndef CIblt_H
missing copyright header
Same question as with iblt.cpp.
src/net.h
Outdated
}

// BUIP055:
bool BitcoinCashCapable()
this method is not present anymore in the dev branch
Removed.
src/protocol.h
Outdated
// NODE_GRAPHENE means the node supports Graphene blocks
// If this is turned off then the node will not service graphene requests nor
// make graphene requests
NODE_GRAPHENE = (1 << 6),
in case we are going to port graphene to the BU legacy code base (BitcoinCore branch), I guess we need to check whether some other implementation is already using this bit to signal other services.
I've made a note about this. I will be sure to double-check before merging.
great
@awemany thanks for the link, that paper didn't immediately surface when I was looking up weak blocks. I'll be sure to take a close look. Anyhow, thanks also for the description. I think I'm getting the overall picture now (but tell me if not). Here are my thoughts on weak block integration.

In graphene, the sender creates both a Bloom filter (BF) and an IBLT containing the txs in the block. The receiver reconstructs the block by passing every tx it knows about through the BF and placing all positives in its own IBLT. The difference between the sender and receiver IBLTs then allows the receiver to determine the final tx set (see the sketch below). With a reference to a weak block, the receiver simply has a new source of transactions to pass through the BF on the way to constructing its IBLT. So I believe weak blocks and graphene can complement each other nicely in this way. Some practical considerations:
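A minimal, self-contained sketch of that flow, with toy stand-ins for the PR's CBloomFilter and CIblt classes — the sizes, hash construction, and all names here are illustrative only, not the actual implementation:

```python
import hashlib

def h(data, seed):
    # Deterministic per-seed hash -> 64-bit int.
    return int.from_bytes(hashlib.sha256(bytes([seed]) + data).digest()[:8], "big")

class ToyBloom:
    def __init__(self, nbits=2048, nhash=4):
        self.bits, self.nhash = [False] * nbits, nhash

    def insert(self, item):
        for s in range(self.nhash):
            self.bits[h(item, s) % len(self.bits)] = True

    def contains(self, item):
        return all(self.bits[h(item, s) % len(self.bits)] for s in range(self.nhash))

class ToyIblt:
    def __init__(self, ncells=60, nhash=3):
        self.cells = [[0, 0, 0] for _ in range(ncells)]  # [count, keySum, checkSum]
        self.nhash = nhash

    def _slots(self, key):
        return [h(key, 100 + s) % len(self.cells) for s in range(self.nhash)]

    def insert(self, key, sign=1):
        for i in self._slots(key):
            cell = self.cells[i]
            cell[0] += sign
            cell[1] ^= int.from_bytes(key, "big")
            cell[2] ^= h(key, 255)

    def subtract(self, other):
        diff = ToyIblt(len(self.cells), self.nhash)
        for i, (a, b) in enumerate(zip(self.cells, other.cells)):
            diff.cells[i] = [a[0] - b[0], a[1] ^ b[1], a[2] ^ b[2]]
        return diff

    def decode(self):
        # Peel "pure" cells (holding exactly one key) until no progress remains.
        only_self, only_other = set(), set()
        progress = True
        while progress:
            progress = False
            for cell in self.cells:
                count, key_sum, check_sum = cell
                key = key_sum.to_bytes(32, "big")
                if count in (1, -1) and h(key, 255) == check_sum:
                    (only_self if count == 1 else only_other).add(key)
                    self.insert(key, -count)  # remove the key from all its cells
                    progress = True
        return only_self, only_other

# Sender: the block's txids go into both a Bloom filter and an IBLT.
block = {hashlib.sha256(bytes([i])).digest() for i in range(20)}
bloom, sender_iblt = ToyBloom(), ToyIblt()
for txid in block:
    bloom.insert(txid)
    sender_iblt.insert(txid)

# Receiver: it lacks one block tx and holds one unrelated tx; any source of
# candidate txs (mempool, orphans, a weak block) is filtered the same way.
mempool = set(sorted(block)[1:]) | {hashlib.sha256(b"not in block").digest()}
receiver_iblt = ToyIblt()
for txid in mempool:
    if bloom.contains(txid):
        receiver_iblt.insert(txid)

missing, extra = sender_iblt.subtract(receiver_iblt).decode()
assert missing == {sorted(block)[0]}  # the one tx the receiver must request
```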
I just updated the single commit for this pull request with changes requested by @sickpig. Some details about the updates are below.

Completed tasks:

Some tasks noted but not yet completed:
src/net.cpp
Outdated
if (strCommand == NetMsgType::GRAPHENETX)
    LOG(GRAPHENE, "ReceiveMsgBytes GRAPHENETX\n");
if (strCommand == NetMsgType::GET_GRAPHENETX)
    LOG(GRAPHENE, "ReceiveMsgBytes GET_GRAPHENETX\n");
this is a little bit inefficient... better to use "else if" statements so we don't have to potentially evaluate every if statement, or do just one if statement with || connecting the conditions and then a single generic LOG statement.
Sounds good. I went with the "else if" option.
src/requestManager.cpp
Outdated
     mi != mapOrphanTransactions.end(); ++mi)
    vOrphanHashes.push_back((*mi).first);
}
BuildSeededBloomFilter(filterMemPool, vOrphanHashes, inv2.hash, pfrom);
Why create a seeded bloom filter here? This is what we need for an Xthin. It looks like you're possibly not creating a proper graphene block here?
Whoops, that's a copy-paste error. I've removed that call to BuildSeededBloomFilter and added a line like the one at 461.
Looks like nice work, and not that much code really. But I haven't taken a deep dive yet. One question: I haven't taken it out for a test drive, but if I set up two nodes will this actually work as-is?
@ptschip thanks for taking a look and for the comments. I've committed the changes you mentioned. The code should work as-is. However, the only way I have actually run it is with the QA test qa/rpc-tests/grapheneblocks.py. It's a close copy of the test thinblocks.py. There's enough log output to see the blocks being requested, created, sent, and received. The next step for me is to actually deploy some nodes on the testnet.
@bissias btw, when you make requested changes, don't squash your commits; it makes it hard to know what you changed when we come back to review the changes. Squashing can be done at the end of the cycle, or intermittently if there are just too many commits. Thanks anyway, and I look forward to seeing it in action.
@ptschip sorry about that. Makes sense to squash only after the commits get unruly. Will do in the future.
Pushed a small update to fix reporting of in-memory graphene bytes.
@bissias: Right now, I send weakblocks through my own logic but hijack some of the thinblocks code to implement deltablocks. So I think your second approach (a library to transmit using graphene) would be preferred by me :-)

More specifically, my main point is that it would be great if it is not closely tied to the mempool code: my weakblocks currently live outside the mempool. It is all in a state of flux and I might try integrating weakblocks with the mempool with some kind of subset or flagging system. It would be great if the sets to be reconciled and the pools available on each side could be set explicitly, instead of graphene figuring that out by itself (and thus tying itself more closely to the rest of the system). I am thinking of something along the lines of this on the sending side (just as a general picture):

And vice versa on the receiving side. And then it would be great if it would be easy for me to add a single 32-byte hash to a sent graphene block (and maybe a corresponding boolean flag as well), so that when a graphene block is received, I can basically say 'Oh, ok, this is a graphene block that refers to this weak block, so let's reconstruct it from that one'. I don't care about the exact mechanism, whether I have to subclass the graphene block representation classes or need to set a couple of configuration variables. Does that make sense? I am by no means in the position to make this an official request, but maybe you agree that this would also be great in terms of a decoupled architecture for the graphene logic :)
@awemany I think I see what you mean about making graphene agnostic to tx sources. I guess I was imagining a slightly different abstraction that would fall entirely outside the protocol layer. The core logic would encapsulate the set reconciliation code in a data structure like a

On the sender side:

And then on the receiver side:

The

With all this being said, I also see a (separate?) opportunity to refactor and abstract the block propagation protocol logic. With graphene and weak blocks we will have five different block types: regular, thin, xthin, graphene, and weak, which share a lot of boilerplate logic. I don't have a strong opinion about whether or not such a refactoring should be performed (it's fairly high risk since it would touch existing block protocols), but my sense is that any abstraction of the protocol logic should remain separate from abstraction of the set reconciliation logic. Do you see what I mean here? Are you more advocating for abstraction of the set reconciliation or the protocol logic (or both)? If you're thinking about abstracting protocol logic, there might be an opportunity to create a general library with an interface compatible with existing block types as well as our own. A safe way forward would be to implement our new block types with this common interface and leave refactoring the old block types for future work.
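A sketch of what such a factored-out core might look like. The class name, methods, and the plain-set stand-in for the BF/IBLT machinery are all hypothetical, purely to illustrate the decoupling being discussed:

```python
class SetReconciliation:
    """Hypothetical core-logic object: callers hand it an explicit set of
    items (txids from a mempool, a weak block, anywhere); it never reaches
    into the rest of the system on its own."""

    def __init__(self, local_items):
        self.local = set(local_items)

    def sketch(self):
        # Stand-in for serializing a Bloom filter + IBLT over self.local.
        return set(self.local)

    def reconcile(self, remote_sketch):
        # Returns (items the remote has that we lack, items only we have).
        return remote_sketch - self.local, self.local - remote_sketch

# Sender: reconcile the block's tx set, regardless of carrying protocol.
payload = SetReconciliation({"tx_a", "tx_b", "tx_c"}).sketch()

# Receiver: chooses its own tx sources (mempool, orphans, a weak block).
missing, extra = SetReconciliation({"tx_b", "tx_c", "tx_d"}).reconcile(payload)
assert missing == {"tx_a"} and extra == {"tx_d"}
```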
@bissias: Yes, that kind of interface makes sense to me, and it seems easy enough to integrate into the TBD weakblocks protocol messages. Great, thank you! I guess I'll eventually rebase onto your patch then, as soon as I am at the stage where I want to clean up my currently messy & hacky abuse of the thinblocks code.

And there we get to your second point: my own personal two satoshis on this is that such a refactoring would be awesome to do, and much riskier things have been done to the code. This might be getting a bit OT now, but there are a couple of places in the current block transmission code that are fairly closely tied to block validation logic. Because weak blocks take different paths at some points in the validation logic, anyone who wants to do such a refactoring should IMO also make sure to decouple that properly.
As they are changed during peeling. This is a bare patch from Andrew made into a commit with an extra comment added each by awemany.
And force it to be deserialized to zero at the moment.
And create an easy method to force the number of checksum bits to a lower value for easier testing of IBLT failures giving wrong results.
// Check for Misbehaving and DOS
// If they make more than 20 requests in 10 minutes then disconnect them
{
    LOCK(cs_vNodes);
I think this lock is unnecessary. pfrom should have a reference added by the caller so no chance that it will be deleted
The problem is this data is not locked by any other lock. I remember using this in xthins as well, before the time we had atomics. These can all be changed to atomics and I was planning on doing that after the graphene merge. There are many other things that we can convert to atomics, such as all the tweaks along with these values.
uint64_t nLocalGrapheneBlockBytes; // the bytes used in creating this graphene block, updated dynamically
int nSizeGrapheneBlock; // Original on-wire size of the block. Just used for reporting
int grapheneBlockWaitingForTxns; // if -1 then not currently waiting
CCriticalSection cs_grapheneadditionaltxs; // lock grapheneAdditionalTxs
why do these 2 structures need critical sections but not grapheneBlock, grapheneBlockHashes, grapheneMapHashOrderIndex, mapGrapheneMissingTx, ...
I don't think cs_grapheneadditionaltxs is necessary here; we can only process one graphene block from a peer at a time, same for xthins. These can however be changed to atomics when we refactor and allow multiple graphene and xthin blocks from one peer.
src/iblt.h
Outdated
CIblt();
CIblt(size_t _expectedNumEntries);
CIblt(const CIblt &other);
virtual ~CIblt();
Any reason to have a virtual destructor?
TBH this code was originally copied from Gavin's repo, so I didn't actively choose virtual. Do you prefer that it be dropped?
yes, drop it
…lace double lookups in std containers with a single insert, don't calculate cheaphashes twice
…tcoinUnlimited into gandrewstone-graphene_review2
self.stats['mempool_info_size'] = MEMPOOL_INFO_SIZE
self.stats['rank'] = self.extract_bytes(self.extract_stats(self.nodes[0])['rank'])
self.stats['filter'] = self.extract_bytes(self.extract_stats(self.nodes[0])['filter'])
self.stats['iblt'] = self.extract_bytes(self.extract_stats(self.nodes[0])['iblt'])
This seems a bit repetitive. Wouldn't it be nicer on the eyes to bind the result of self.extract_stats(self.nodes[0]) to some shorthand first and then extract the individual stats? Also, this would avoid repeated RPC calls, which might create confusion down the road in unforeseen corner cases where a stats update is in progress.
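A sketch of that suggestion, reusing the test's existing helpers (extract_stats and extract_bytes) unchanged:

```python
# One RPC call instead of three; the snapshot is also internally consistent.
node_stats = self.extract_stats(self.nodes[0])
self.stats['mempool_info_size'] = MEMPOOL_INFO_SIZE
self.stats['rank'] = self.extract_bytes(node_stats['rank'])
self.stats['filter'] = self.extract_bytes(node_stats['filter'])
self.stats['iblt'] = self.extract_bytes(node_stats['iblt'])
```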
    nFlags = nFlagsIn;
}

void CBloomFilter::setupGuaranteeFPR(unsigned int nElements,
Isn't this, except for the ceil, an almost-copy of the other constructor? Can this code duplication be avoided? Also, I noticed that in your new constructor, nDesiredSize is not constrained when fSizeConstrained is specified as true. Is that intentional or a copy-and-paste error?
Actually, I intentionally duplicated the setup method (modulo edits) in order to avoid having to make subtle changes to the existing CBloomFilter implementation. Originally @ptschip suggested subclassing CBloomFilter, but I needed access to private members, so we settled on this approach.

The fact that nDesiredSize can be arbitrarily large is not an accident. I need the Bloom filter to honor the requested false positive rate. I don't think that there is potential for data blow-up here because the entire vData vector is deserialized, so any filter that would be huge in memory would already be huge on the wire. With that being said, do we already avoid deserializing really huge messages into memory?
Except for the almost-duplication, the issue I have regarding fSizeConstrained is that you take the flag in the constructor, but only seem to honor it partially: nDesiredSize is unlimited, but nHashFuncs is still being limited. Is that really intentional? Assuming you want an exact FPR, isn't the whole notion of having fSizeConstrained kind of counterproductive?
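For context, the standard Bloom filter sizing math that any guaranteed-FPR setup has to follow (a sketch of the textbook formulas, not the PR's setupGuaranteeFPR): the bit count m grows without bound as the target rate shrinks, so capping m necessarily breaks the guarantee.

```python
import math

def bloom_params(n_elements, fpr):
    # Optimal sizing for n elements at false positive rate p:
    #   m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hash functions
    m = math.ceil(-n_elements * math.log(fpr) / (math.log(2) ** 2))
    k = max(1, round((m / n_elements) * math.log(2)))
    return m, k

# e.g. 2000 txs at a 0.1% FPR -> 28756 bits (~3.5 kB) and 10 hash functions
print(bloom_params(2000, 0.001))
```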
@@ -89,6 +96,15 @@ class CBloomFilter
    unsigned int nTweak,
    unsigned char nFlagsIn,
    uint32_t nMaxFilterSize = SMALLEST_MAX_BLOOM_FILTER_SIZE);
/**
 * Add the option to force the bloom filter setup to guarantee the false positive rate does not exceed nFPRate.
I think documentation of the sort of "Add the option" is more like a git commit message. I don't think it makes much sense to note in the code that this functionality has been added - just add it and document its proper use :)
Ok but the new setup function already made you wonder why the code duplication was necessary ;-), so maybe it's worth leaving it in?
Or maybe your issue is just with the comment being too casual? I'm happy to fix that also.
Yeah, this was just a nit about the comment.
@@ -461,7 +461,8 @@ static void addDebuggingOptions(AllowedArgs &allowedArgs, HelpMessageMode mode)
{
    std::string debugCategories = "addrman, bench, blk, bloom, coindb, db, estimatefee, evict, http, lck, "
        "libevent, mempool, mempoolrej, miner, net, parallel, partitioncheck, "
-       "proxy, prune, rand, reindex, req, rpc, selectcoins, thin, tor, wallet, zmq";
+       "proxy, prune, rand, reindex, req, rpc, selectcoins, thin, tor, wallet, zmq"
This looks like it is missing a comma and a space.
yup, should be:
"proxy, prune, rand, reindex, req, rpc, selectcoins, thin, tor, wallet, zmq, "
"graphene"
It would seem the debug category at the end is now "zmqgraphene"
@@ -531,7 +569,15 @@ class CNode
    return false;
}

void AddAddressKnown(const CAddress &_addr) { addrKnown.insert(_addr.GetKey()); }
There's a regression here: This undoes the recent shadowing fix by @sickpig.
Weird. I'm pretty sure that this was removed during a rebase with dev. I'll add it back in.
Actually, it looks like this has already been added back in the merged version.
I'm looking at dev head right now and it looks like that change was already merged in. Can you pull the latest and check? Thanks.
@bissias I'm playing around with a scheme for encoding transaction order efficiently. The rough idea is to start with a transaction list that's sorted by feerate, compare that to the block's actual transaction order, and move transactions around to make the two orders equal. It uses some quick heuristics to decide which transactions to move in order to get the majority of transactions to have offsets of 0, allowing for efficient encoding. I've still got some bugs to fix, but the python draft I have right now gets the encoding of the ordering of a 910 tx block template from the BTC network down to about 186 bytes, and 500 bytes for a 2000 tx template. There does appear to be some non-linearity, in which larger blocks take more bits per transaction to encode, but the efficiency might still be good enough for now. Having a canonical in-block transaction ordering would obviously be more efficient, but that's a fairly substantial hard fork change.
@jtoomim just wanted to mention that the bitcoin cash network will introduce a canonical txn ordering into the consensus rule set in the next upgrade (Nov 2018). This means that we won't need to communicate the txn order in a block to get it propagated.
@jtoomim that's a very cool idea, thanks for posting. It looks like your numbers are well below the n·log(n) theoretical limit for arbitrary orderings. Do you have a writeup anywhere that I can take a look at? As @sickpig pointed out, a canonical ordering for BCH will likely obviate the need for passing ordering information in this implementation. However, I know that Dash is also working on a graphene implementation that could potentially benefit from your approach. In any case, I'd be interested to learn more.
@bissias I was having some issues with getting the decoding to work properly with my older (and more ambitious) encoding scheme, so I made a version that uses a much simpler encoding scheme and uploaded that: https://github.com/jtoomim/orderencode

The new version seems to make encodings that are about 25% bigger, or roundabout 600 bytes for a 2000 tx block, but it's still not too bad. If there is interest in greater efficiency, I can work on fixing the bugs and writing an explanation for the older version.

The new encoding works like this:

First, generate a list of offsets. If the transaction at index 5 in the block is at index 20 in the list sorted by feerate, then store a value of 15 into position 5 in the offsets list.

Second, identify repeating offsets and encode them as (count, offset) pairs. If your offset list is [0, 0, 0, 4, 1, 1, 1, -4], you can summarize that as [(3,0), (1,4), (3,1), (1,-4)] since you have three 0s, followed by one 4, etc.

Third, since most of the counts will have a value of 1 (i.e. a single transaction out of place, followed by one or more transactions from a different place), we can re-encode the counts by generating a bitmap of which counts are equal to 1. Any count that is not equal to 1 is encoded separately in a list of residual counts.

Last, encode all of the residuals and the offset values as signed varints and serialize them. I didn't bother with this step in the python proof of concept.
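A literal transcription of the second and third steps in Python; the offset generation and the final varint serialization are omitted, since those depend on details in the linked repo:

```python
def rle_offsets(offsets):
    # Step two: collapse runs of equal offsets into (count, offset) pairs.
    pairs = []
    for off in offsets:
        if pairs and pairs[-1][1] == off:
            pairs[-1][0] += 1
        else:
            pairs.append([1, off])
    return pairs

def split_counts(pairs):
    # Step three: a bitmap marks which counts equal 1; the rest become residuals.
    bitmap = [count == 1 for count, _ in pairs]
    residuals = [count for count, _ in pairs if count != 1]
    return bitmap, residuals

pairs = rle_offsets([0, 0, 0, 4, 1, 1, 1, -4])  # the example from above
print(pairs)                # [[3, 0], [1, 4], [3, 1], [1, -4]]
print(split_counts(pairs))  # ([False, True, False, True], [3, 3])
```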
Thanks for putting that repository together @jtoomim. I've been talking with Darren Tapp from Dash research, and I think this might be useful for their implementation. I will definitely let you know if they head down that path.
This commit provides a functional implementation of graphene blocks. All unit tests pass and the QA test grapheneblocks.py also passes. From log output it is evident that graphene blocks are requested, created, serialized, deserialized, and reconstructed. However there is still work to be done. My general approach has been to largely replicate the workflow of xthin blocks. Thus I have replicated large portions of the xthin code. It's not clear that this is preferable to sharing code between the two block types. There also remains the question of transaction ordering for graphene blocks. If it is possible to commit a canonical ordering before graphene, then we will want to change this patch accordingly.
The code will also require further optimization (primarily block size optimization) before being deployed to production. My aim is to initiate a review for the basic workflow now and continue to work on optimization while graphene is running on the testnet. The following optimization tasks remain.