Use cp_height for checkpoints #5147
base: master
Conversation
Whoops, looks like I broke 4 of the unit tests... I'll get those fixed ASAP.
Force-pushed from c3b40c4 to b682a59.
Fixed the unit tests.
As best I can tell, the Travis failure for Android is unrelated to this PR.
Note that this breaks
Ah, good point, I hadn't realized that.
Yes, this makes sense as a good solution. Might as well also include the timestamp of the beginning of the current difficulty period, so that I can drop the stupid workaround of requesting an older header in order to calculate the next retarget after the checkpointed height. I'll look into implementing this.
Updated PR; this also adds a unit test for get_chainwork on mainnet so that it won't get accidentally broken again in that way.
Oops, seems I messed up the new unit test; I'll check why that's broken and fix it ASAP.
I was going to say that you should avoid taking my networking changes wrt actual synchronisation, as I am pretty sure they introduced bugs and unnecessarily increased network traffic (exacerbated by the broken forking code which gets triggered for BCH and BSV). But it looks like you've just excluded that part of it, so all good.
Fixed the unit test, feel free to continue reviewing this PR.
I don't like it that so many different things need to be provided as part of the checkpoint that all need to be kept "in sync".
I think it would be better to do away with the timestamp, and the last bits; and instead download the chunk immediately prior to the checkpoint. Out of the scope of this PR, it bothers me that we do not enforce the median past time rule for the header timestamps. I will probably at some point implement that, and that would make this list of cp params even longer... So I really think the way to go is to download the previous chunk. Specifically, given the checkpoint height is no longer restricted to be at a chunk boundary, we could download the chunk in which the checkpoint block is, and also the chunk before that. I realise that this is basically what your first implementation did, and that you changed that to accommodate my previous remarks. Sorry about that.
@SomberNight I guess I don't fully follow what the problem is with that. The intent is that the checkpoint will be set based on the result of the console command listed in the comment, so it's not really problematic AFAICT that the checkpoint consists of more than 1 value. I guess it would be better if that console command's JSON object were directly copied into a single checkpoint variable rather than manually reformatting it into 5 variables (and I'd be happy to fix that if desired), but besides that I don't see any issue. Downloading the full chunk containing the checkpoint height + the full chunk previous to it means that anywhere from 2016 to 4032 extra headers will be downloaded that could be avoided; this is somewhat problematic for Bitcoin over clearnet and becomes an even bigger issue when downloading over Tor or using a blockchain that has larger headers (e.g. merge-mined chains). So I'd really prefer that we keep the optimization that I introduced, unless there's some concern with it that I'm not seeing. Could you clarify what exactly the concern is, and whether your concern would be addressed if I replaced those 5 variables with 1 variable that contains a verbatim JSON object from the console command that generates the checkpoint?
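For a rough sense of scale for the "2016 to 4032 extra headers" figure above, assuming plain 80-byte Bitcoin headers (AuxPoW/merge-mined headers are larger, so their cost is higher):

```python
HEADER_SIZE = 80   # bytes per plain Bitcoin block header
CHUNK_LEN = 2016   # headers per chunk (one retargeting period)

# "2016 to 4032 extra headers" translates to one or two full chunks:
min_extra_bytes = CHUNK_LEN * HEADER_SIZE       # 161,280 bytes (~158 KiB)
max_extra_bytes = 2 * CHUNK_LEN * HEADER_SIZE   # 322,560 bytes (~315 KiB)
print(min_extra_bytes, max_extra_bytes)
```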
(Triggering a rebuild, don't mind me....)
Hi @SomberNight , sorry to pester you (I'm sure you guys are busy dealing with the phisher, which is higher priority than whatever I'm asking for), but when you have a few minutes it would be great if you could provide clarification regarding #5147 (comment). I'm happy to tweak the PR as needed, but having a better understanding of the requirements at play would be very helpful for me. Cheers!
Force-pushed from eb20d43 to 3766034.
Force-pushed from 7f997ee to e0ef357.
Rebased against recent master branch, also did a bunch of cleanup/refactoring. I'm not 100% sure whether regtest is broken, I'll try to test with an empty checkpoint shortly (which should simulate regtest mode as far as the checkpoint code is concerned). Beyond that, it's ready for review. EDIT: Empty checkpoint works fine for me on testnet. So it should work fine on regtest. Feel free to review.
chain_u.get_chainwork(constants.net.max_checkpoint())
with self.assertRaises(MissingHeader):
    chain_u.get_chainwork(constants.net.max_checkpoint() + 4032)
What is this supposed to test? (the ~10 lines above)
Just whether get_chainwork raises or not?
Is it intentional that you are changing constants.net.CHECKPOINTS['height']? And that height_offset is not used?
Yeah, basically this test is supposed to ensure that get_chainwork's exception-raising behavior is correct regardless of what the checkpoint height mod the retargeting period is equal to. Early drafts of this PR were broken for a subset of possible moduli, so I figured an exhaustive test of that space would avoid any accidental future regressions in that area. height_offset's value isn't used for anything; I just needed a variable name for the for loop, so I picked one that was hopefully self-documenting of what the range was.
electrum/blockchain.py (outdated)
prev_header_hash = this_header_hash

# Copied from electrumx
def root_from_proof(hash, branch, index):
note: this looks really similar to hash_merkle_root (Line 140 in cb204dd):
def hash_merkle_root(cls, merkle_branch: Sequence[str], tx_hash: str, leaf_pos_in_tree: int):
but I guess because of _raise_if_valid_tx, it might need to be duplicated
Hmm... do you think it would be useful to refactor hash_merkle_root so that the raise is disableable? On a related note, do you think it would be useful to validate the length of each member of branch? The length isn't currently validated in either root_from_proof or hash_merkle_root, and I feel a bit nervous about the potential for attacks possibly being enabled by sending a Merkle branch whose members aren't 32 bytes long. (For example, what happens if a branch member is the empty string? Would that result in a Merkle root that's equal to its child? Can we be confident that this doesn't enable any attacks?)
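For illustration, a hedged sketch of what length-validated proof verification could look like. This is not Electrum's actual root_from_proof; the function name, the simple big-endian-free concatenation order, and the index convention here are all assumptions for the sake of the example:

```python
import hashlib

def sha256d(b: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin header and tx hashes."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def root_from_proof_checked(leaf_hash: bytes, branch: list, index: int) -> bytes:
    """Recompute a Merkle root from a leaf and its branch, rejecting any
    branch member that is not exactly 32 bytes.  Without this check, an
    empty-string member would make the parent hash depend on the child
    alone, which is the concern raised above."""
    if len(leaf_hash) != 32:
        raise ValueError("leaf hash must be 32 bytes")
    h = leaf_hash
    for sibling in branch:
        if len(sibling) != 32:
            raise ValueError("branch member must be 32 bytes")
        if index & 1:
            h = sha256d(sibling + h)   # our node is the right child
        else:
            h = sha256d(h + sibling)   # our node is the left child
        index >>= 1
    if index != 0:
        raise ValueError("index out of range for branch length")
    return h
```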
Sorry, I must have missed this message before.
Good point on extra validation; it does not seem trivial to come up with an attack, but there's no point in risking it.
See 9b82321
I think it would be nice not to duplicate this function; so yes, the raise could be made optional based on a parameter.
Fixed the duplicated function in latest commit.
Force-pushed from e0ef357 to 6a3f457.
@SomberNight Feedback addressed. The Travis failure appears to be unrelated to this PR (maybe an intermittent network issue on Travis's end?). Anything else I need to do to get this merge-worthy?
Triggering a rebuild to see if Travis is happier now....
Force-pushed from fbe6cd4 to 723e9a9.
Fixed merge conflict.
This is a rough port of abf12b6 by rt121212121 from Electron-Cash.
This makes it easier for Electrum forks to support alternate header formats such as AuxPoW.
Move hash_merkle_root to blockchain.py to avoid circular imports.
Force-pushed from 723e9a9 to 85f872a.
Fixed merge conflict.
Sorry to answer so late; I did not have much time to look at this until now (we were focusing on lightning and on the phishing attack). I am not convinced that we should merge this PR. I understand that it might be something interesting for Namecoin, but it is not our policy to adapt Electrum to other coins. I see the following issues: 1 - the bandwidth gain is not obvious. If you have lots of transactions in the same chunk, downloading Merkle proofs for all the block headers might actually be worse than downloading the chunk.
Closing this. Feel free to propose changes that make your fork easier to rebase, though.
@ecdsa In the specific case of EPS, I'm not convinced that EPS would need to implement this feature. AFAIK EPS is generally used in situations where the client already trusts the server to not lie about the state of the blockchain (e.g. Electrum is usually used in a mode that ignores transaction Merkle proof verification errors from EPS). Would your specific concern about EPS compatibility be sufficiently addressed by allowing Electrum to optionally ignore header Merkle proof verification errors? (I'll ponder your other concerns and reply about those later; just wanted to ask about this one first since it seems to be one of the more significant concerns.)
@ecdsa Regarding these concerns, would it be helpful if I did some benchmarking with real-world Bitcoin wallets to evaluate what the performance improvements would be?
For the case of electrs, would this concern be addressed if electrs merged a patch that added
Independently of the above, would a PR that adds the
I am not sure what you mean by 'helpful'. Of course benchmarking is always helpful, but I feel like answering 'yes' here would imply that I promise to merge it if you show an improvement. We are working in many other different directions in order to improve Electrum, and this pull request takes a lot of time and effort to review. In addition, it makes the code unnecessarily complex, so I do not want it. I am sorry that I did not comment earlier on this, and I realize you might have put a lot of effort into it. However, that does not mean I have to accept it. It is not going in the direction I want for this project.
Yes, if it helps make the code more modular.
Note regarding the "DoS vector" convo (though I can't find it now; was that on IRC?): given this data, I estimate that doing the required hashing to serve a single request takes around 1 msec of CPU time.
Further re the DoS vector: @JeremyRand asked on IRC:
There is an LRU cache there, which stores data for 1000 blocks. However, for DoS purposes we should assume the cache is not hit. It seems it already starts from tx hashes there as leaves, so with n leaves I suppose it's n double-sha256 hashes to build the tree and get the proof. A block might have e.g. 3k transactions, so 3k hashes. One somewhat interesting point is that maybe that LRU cache could be extended to cover the whole chain.
Napkin math says that would take 1 GB of memory. So it seems to me the current worst-case cost of the tx Merkle proof protocol request is higher than that of this header request.
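As a quick sanity check of the "roughly n double-SHA256 hashes for n leaves" estimate, here is a standalone sketch (not Electrum's or electrumx's actual code) that rebuilds a Bitcoin-style Merkle tree from scratch and counts hash invocations:

```python
import hashlib

hash_count = 0  # counts double-SHA256 invocations

def sha256d(b: bytes) -> bytes:
    global hash_count
    hash_count += 1
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(leaves):
    """Bitcoin-style Merkle root: odd-length levels duplicate their
    last element before pairing."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# A block with ~3000 transactions costs roughly 3000 double-SHA256
# operations to rebuild the full tree from its tx-hash leaves:
leaves = [hashlib.sha256(str(i).encode()).digest() for i in range(3000)]
root = merkle_root(leaves)
print(hash_count)  # ~3000 for 3000 leaves
```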
Yes, that was brought up on IRC. A couple of other random things from the IRC discussion: The question came up of whether downloading individual headers would use more bandwidth than downloading full chunks for certain high-volume wallets. I did the following Electrum console commands:
So that means that downloading a full chunk becomes cheaper bandwidth-wise if at least 216 of the 2016 blocks in the chunk are relevant to the wallet. 216/2016 is 10.7%; i.e. only wallets that contain transactions covering at least 10.7% of the blocks will be better off downloading full chunks (which is the status quo). I'm relatively confident that the median Electrum user will therefore save bandwidth by downloading individual headers, though I'm not particularly confident about high-volume merchants. Also, I checked the electrs source code, and they do appear to already support
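For reference, the break-even arithmetic implied by those numbers (the per-header byte figure below is derived from the quoted 216-block break-even, not independently measured):

```python
HEADER_SIZE = 80    # bytes per Bitcoin block header
CHUNK_LEN = 2016    # headers per chunk (one retargeting period)
BREAK_EVEN = 216    # blocks at which a full chunk becomes cheaper (quoted above)

chunk_bytes = HEADER_SIZE * CHUNK_LEN        # 161,280 bytes for a full chunk
per_header_cost = chunk_bytes / BREAK_EVEN   # implied cost of one header + proof
fraction = BREAK_EVEN / CHUNK_LEN            # share of blocks needed to break even
print(chunk_bytes, round(per_header_cost), round(fraction * 100, 1))
```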
OK, I am reconsidering this in light of the recent comments.
This is a rough port of abf12b6 by @rt121212121 from Electron-Cash. Fixes #4900
I've tried to keep the diff against Electrum master branch as minimal as possible to facilitate easier review. So for now, this PR approximates the existing behavior of downloading a full chunk when verifying a transaction that's covered by a checkpoint. A future PR could easily modify this behavior to only download a single header in that circumstance (and I intend to submit such a PR in the near future).
Warning: The checkpoint Merkle roots (for both mainnet and testnet) in this PR were extracted from a randomly selected ElectrumX server, and shouldn't be trusted in any way. Prior to merging, someone with access to a known-good ElectrumX server should double-check that these checkpoints are correct.