Use cp_height for checkpoints #5147

Open: JeremyRand wants to merge 10 commits into master from cp-height-checkpoints

Conversation

JeremyRand
Contributor

This is a rough port of abf12b6 by @rt121212121 from Electron-Cash. Fixes #4900

I've tried to keep the diff against Electrum master branch as minimal as possible to facilitate easier review. So for now, this PR approximates the existing behavior of downloading a full chunk when verifying a transaction that's covered by a checkpoint. A future PR could easily modify this behavior to only download a single header in that circumstance (and I intend to submit such a PR in the near future).

Warning: The checkpoint Merkle roots (for both mainnet and testnet) in this PR were extracted from a randomly selected ElectrumX server, and shouldn't be trusted in any way. Prior to merging, someone with access to a known-good ElectrumX server should double-check that these checkpoints are correct.

@JeremyRand
Contributor Author

Whoops, looks like I broke 4 of the unit tests... I'll get those fixed ASAP.

@JeremyRand
Contributor Author

Fixed the unit tests.

@JeremyRand
Contributor Author

As best I can tell, the Travis failure for Android is unrelated to this PR.

@SomberNight
Member

Note that this breaks blockchain.get_chainwork
To make that work, maybe we could include the total chainwork up to the checkpoint block,
and also the target in that chunk.

@SomberNight SomberNight added the topic-blockchain ⛓ related to logic in blockchain.py label Feb 25, 2019
@JeremyRand
Contributor Author

Note that this breaks blockchain.get_chainwork

Ah, good point, I hadn't realized that.

To make that work, maybe we could include the total chainwork up to the checkpoint block,
and also the target in that chunk.

Yes, this makes sense as a good solution. Might as well also include the timestamp of the beginning of the current difficulty period, so that I can drop the stupid workaround of requesting an older header in order to calculate the next retarget after the checkpointed height. I'll look into implementing this.
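(For illustration, a minimal sketch of the quantities involved, assuming the cumulative chainwork and the compact "bits" target are shipped with the checkpoint; the helper names below are hypothetical, not this PR's code. Each header's work follows from its target, so the client can continue the running total without the pre-checkpoint headers.)

```python
# Napkin sketch only; not the PR's actual code.
def bits_to_target(bits: int) -> int:
    # Expand a compact "bits" field into the full target
    # (same idea as electrum.blockchain.bits_to_target).
    exponent = bits >> 24
    mantissa = bits & 0x00ffffff
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def work_from_bits(bits: int) -> int:
    # Work contributed by one header: the usual 2**256 // (target + 1).
    return 2 ** 256 // (bits_to_target(bits) + 1)

def chainwork_at_tip(checkpoint_chainwork: int, bits_of_later_headers) -> int:
    # Cumulative chainwork = value shipped with the checkpoint
    # + work of each header actually downloaded after it.
    return checkpoint_chainwork + sum(work_from_bits(b) for b in bits_of_later_headers)
```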

@JeremyRand
Contributor Author

Updated PR; this also adds a unit test for get_chainwork on mainnet so that it won't get accidentally broken again in that way.

@JeremyRand
Contributor Author

Oops, seems I messed up the new unit test; I'll check why that's broken and fix it ASAP.

@rt121212121

I was going to say that you should avoid taking my networking changes with respect to actual synchronisation, as I am pretty sure they introduced bugs and unnecessarily increased network traffic (exacerbated by the broken forking code, which gets triggered for BCH and BSV). But it looks like you've just excluded that part of it, so all good.

@JeremyRand
Contributor Author

Fixed the unit test, feel free to continue reviewing this PR.

@SomberNight
Member

I don't like it that so many different things need to be provided as part of the checkpoint that all need to be kept "in sync".

VERIFICATION_BLOCK_MERKLE_ROOT = "4c6b8a34ed0ab02ef9486d0f57de24ee367e4c1735c20f7b3a79927b62310fc3"
VERIFICATION_BLOCK_HEIGHT = 558431
VERIFICATION_BLOCK_FIRST_TIMESTAMP = 1546276809
VERIFICATION_BLOCK_LAST_CHAINWORK = 1458314160032177082968012064
VERIFICATION_BLOCK_LAST_BITS = 389010995

I think it would be better to do away with the timestamp and the last bits, and instead download the chunk immediately prior to the checkpoint.

Out of scope for this PR: it bothers me that we do not enforce the median-time-past rule for the header timestamps. I will probably implement that at some point, and that would make this list of checkpoint params even longer... So I really think the way to go is to download the previous chunk.

Specifically, given the checkpoint height is no longer restricted to be at a chunk boundary, we could download the chunk in which the checkpoint block is, and also the chunk before that.

I realise that this is basically what your first implementation did, and that you changed it to accommodate my previous remarks. Sorry about that.

@JeremyRand
Contributor Author

I don't like it that so many different things need to be provided as part of the checkpoint that all need to be kept "in sync".

@SomberNight I guess I don't fully follow what the problem is with that. The intent is that the checkpoint will be set based on the result of the console command listed in the comment, so it's not really problematic AFAICT that the checkpoint consists of more than 1 value. I guess it would be better if that console command's JSON object were directly copied into a single checkpoint variable rather than manually reformatting it into 5 variables (and I'd be happy to fix that if desired), but besides that I don't see any issue.

Downloading the full chunk containing the checkpoint height + the full chunk previous to it means that anywhere from 2016 to 4032 extra headers will be downloaded that could be avoided; this is somewhat problematic for Bitcoin over clearnet and becomes an even bigger issue when downloading over Tor or using a blockchain that has larger headers (e.g. merge-mined chains). So I'd really prefer that we keep the optimization that I introduced, unless there's some concern with it that I'm not seeing.

Could you clarify what exactly the concern is, and whether your concern would be addressed if I replaced those 5 variables with 1 variable that contains a verbatim JSON object from the console command that generates the checkpoint?
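(To make the suggestion concrete, here is a hedged sketch of what a single consolidated checkpoint constant might look like; the key names are hypothetical, while the values are the mainnet ones quoted earlier in this thread.)

```python
# Illustrative only: one constant holding the console command's output verbatim,
# instead of five separate VERIFICATION_BLOCK_* variables. Key names are hypothetical.
CHECKPOINT = {
    "merkle_root": "4c6b8a34ed0ab02ef9486d0f57de24ee367e4c1735c20f7b3a79927b62310fc3",
    "height": 558431,
    "first_timestamp": 1546276809,
    "last_chainwork": 1458314160032177082968012064,
    "last_bits": 389010995,
}
```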

@JeremyRand
Contributor Author

(Triggering a rebuild, don't mind me....)

@JeremyRand JeremyRand closed this Mar 21, 2019
@JeremyRand JeremyRand reopened this Mar 21, 2019
@JeremyRand
Contributor Author

Hi @SomberNight , sorry to pester you (I'm sure you guys are busy dealing with the phisher, which is higher priority than whatever I'm asking for), but when you have a few minutes it would be great if you could provide clarification regarding #5147 (comment) . I'm happy to tweak the PR as needed, but having a better understanding of the requirements at play would be very helpful for me. Cheers!

@JeremyRand JeremyRand changed the title from "Use cp_height for checkpoints" to "(WIP) Use cp_height for checkpoints" on Jun 7, 2019
@JeremyRand JeremyRand force-pushed the cp-height-checkpoints branch 4 times, most recently from 7f997ee to e0ef357 on June 7, 2019 02:24
@JeremyRand JeremyRand changed the title from "(WIP) Use cp_height for checkpoints" to "Use cp_height for checkpoints" on Jun 7, 2019
@JeremyRand
Contributor Author

JeremyRand commented Jun 7, 2019

Rebased against recent master branch and did a bunch of cleanup/refactoring. I'm not 100% sure whether regtest is broken; I'll try to test with an empty checkpoint shortly (which should simulate regtest mode as far as the checkpoint code is concerned). Beyond that, it's ready for review.

EDIT: Empty checkpoint works fine for me on testnet. So it should work fine on regtest. Feel free to review.


chain_u.get_chainwork(constants.net.max_checkpoint())
with self.assertRaises(MissingHeader):
    chain_u.get_chainwork(constants.net.max_checkpoint() + 4032)
Member

What is this supposed to test? (the ~10 lines above)
Just whether get_chainwork raises or not?
Is it intentional that you are changing constants.net.CHECKPOINTS['height']? And that height_offset is not used?

Contributor Author

Yeah, basically this test is supposed to ensure that get_chainwork's exception-raising behavior is correct regardless of the checkpoint height modulo the retargeting period. Early drafts of this PR were broken for a subset of possible moduli, so I figured an exhaustive test of that space would avoid accidental future regressions in that area. height_offset's value isn't used for anything; I just needed a variable name for the for loop, so I picked one that hopefully self-documents what the range is.
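(For readers following along, a loose sketch of the exhaustive-moduli idea described above; the names below are stand-ins supplied by the caller, not the PR's actual test code.)

```python
# Hypothetical sketch; the real test in this PR is structured differently.
RETARGET_PERIOD = 2016

def check_chainwork_exceptions(make_chain, base_height, assert_raises, missing_header_exc):
    for height_offset in range(RETARGET_PERIOD):
        # cover every value of (checkpoint height mod retarget period)
        chain = make_chain(checkpoint_height=base_height + height_offset)
        # at the checkpoint itself, chainwork must be computable from the stored values
        chain.get_chainwork(chain.max_checkpoint())
        # two chunks past the checkpoint there are no headers yet, so it must raise
        with assert_raises(missing_header_exc):
            chain.get_chainwork(chain.max_checkpoint() + 2 * RETARGET_PERIOD)
```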

prev_header_hash = this_header_hash

# Copied from electrumx
def root_from_proof(hash, branch, index):
Member

note: this looks really similar to

def hash_merkle_root(cls, merkle_branch: Sequence[str], tx_hash: str, leaf_pos_in_tree: int):

but I guess because of _raise_if_valid_tx, it might need to be duplicated

Contributor Author

Hmm... do you think it would be useful to refactor hash_merkle_root so that the raise can be disabled?

On a related note, do you think it would be useful to validate the length of each member of branch? The length isn't currently validated in either root_from_proof or hash_merkle_root, and I feel a bit nervous about the potential for attacks possibly being enabled by sending a Merkle branch whose members aren't 32 bytes long. (For example, what happens if a branch member is the empty string? Would that result in a Merkle root that's equal to its child? Can we be confident that this doesn't enable any attacks?)

Member

Sorry, I must have missed this message before.
Good point on extra validation; does not seem trivial to come up with an attack but no point in risking it.
See 9b82321

I think it would be nice not to duplicate this function; so yes, the raise could be made optional based on a parameter.
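(As a rough illustration of the shape being discussed, a single branch-walking helper with the 32-byte check and the tx-specific raise made optional, here is a hedged sketch; it is not the code from 9b82321 or from this PR.)

```python
from hashlib import sha256
from typing import Callable, Optional, Sequence

def sha256d(x: bytes) -> bytes:
    return sha256(sha256(x).digest()).digest()

def root_from_proof(leaf: bytes, branch: Sequence[bytes], index: int,
                    inner_node_guard: Optional[Callable[[bytes], None]] = None) -> bytes:
    """Walk a Merkle branch up to the root (sketch only).

    Every element must be exactly 32 bytes (the validation discussed above), and the
    tx-specific check is injected via `inner_node_guard` so one helper could serve
    both tx proofs and header proofs.
    """
    if len(leaf) != 32 or any(len(elt) != 32 for elt in branch):
        raise ValueError("all Merkle proof elements must be exactly 32 bytes")
    h = leaf
    for elt in branch:
        inner = (elt + h) if index & 1 else (h + elt)  # low bit of index picks left/right child
        if inner_node_guard is not None:
            inner_node_guard(inner)  # e.g. raise if this 64-byte node parses as a valid tx
        h = sha256d(inner)
        index >>= 1
    if index:
        raise ValueError("index exceeds the proof's branch length")
    return h
```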

Contributor Author

Fixed the duplicated function in latest commit.

@JeremyRand
Contributor Author

@SomberNight Feedback addressed. The Travis fail appears to be unrelated to this PR (maybe an intermittent network issue on Travis's end?). Anything else I need to do to get this merge-worthy?

@JeremyRand
Contributor Author

Triggering a rebuild to see if Travis is happier now....

@JeremyRand
Contributor Author

Fixed merge conflict.

@JeremyRand
Contributor Author

Fixed merge conflict.

@ecdsa
Member

ecdsa commented Sep 23, 2019

sorry to answer so late; I did not have much time to look at this until now (we were focusing on lightning and on the phishing attack).

I am not convinced that we should merge this PR. I understand that it might be something interesting for Namecoin, but it is not our policy to adapt Electrum to other coins. I see the following issues:

1 - the bandwidth gain is not obvious. If you have lots of transactions in the same chunk, downloading Merkle proofs for all the block headers might actually be worse than downloading the chunk.
2 - it makes our code more difficult to understand, because it adds a layer of complexity. This is especially true if we include strategies such as #5653 to deal with my first point. I am not in favor of adding complexity to the code unless the benefits are obvious.
3 - it will force other Electrum server implementations (electrs, EPS) to follow ElectrumX. This is actually the most important issue I have with this PR, because we do not want to rely too much on a single implementation.

@ecdsa
Member

ecdsa commented Sep 23, 2019

Closing this. Feel free to propose changes that make your fork easier to rebase, though.

@ecdsa ecdsa closed this Sep 23, 2019
@JeremyRand
Contributor Author

3 - it will force other Electrum server implementations (electrs, EPS) to follow ElectrumX. This is actually the most important issue I have with this PR, because we do not want to rely too much on a single implementation.

@ecdsa In the specific case of EPS, I'm not convinced that EPS would need to implement this feature. AFAIK EPS is generally used in situations where the client already trusts the server to not lie about the state of the blockchain (e.g. Electrum is usually used in a mode that ignores transaction Merkle proof verification errors from EPS). Would your specific concern about EPS compatibility be sufficiently addressed by allowing Electrum to optionally ignore header Merkle proof verification errors?

(I'll ponder your other concerns and reply about those later; just wanted to ask about this one first since it seems to be one of the more significant concerns.)

@JeremyRand
Contributor Author

1 - the bandwidth gain is not obvious. If you have lots of transactions in the same chunk, downloading Merkle proofs for all the block headers might actually be worse than downloading the chunk.
2 - it makes our code more difficult to understand, because it adds a layer of complexity. This is especially true if we include strategies such as #5653 to deal with my first point. I am not in favor of adding complexity to the code unless the benefits are obvious.

@ecdsa Regarding these concerns, would it be helpful if I did some benchmarking with real-world Bitcoin wallets to evaluate what the performance improvements would be?

3 - it will force other Electrum server implementations (electrs, EPS) to follow ElectrumX. This is actually the most important issue I have with this PR, because we do not want to rely too much on a single implementation.

For the case of electrs, would this concern be addressed if electrs merged a patch that added cp_height support? I wouldn't be opposed to submitting a patch to them. Are there any Electrum server implementations besides ElectrumX, EPS, and electrs that are significant for the purpose of evaluating this PR?

Feel free to propose changes that make your fork easier to rebase, though.

Independently of the above, would a PR that adds the HeaderChunk class without adding the cp_height checkpoint code be a good starting point?

@ecdsa
Member

ecdsa commented Sep 24, 2019

@ecdsa Regarding these concerns, would it be helpful if I did some benchmarking with real-world Bitcoin wallets to evaluate what the performance improvements would be?

I am not sure what you mean by 'helpful'. Of course benchmarking is always helpful, but I feel like answering 'yes' here would imply that I promise to merge it if you show an improvement.

We are working in many other different directions to improve Electrum, and this pull request takes a lot of time and effort to review. In addition, it makes the code unnecessarily complex, so I do not want it.

I am sorry that I did not comment earlier on this, and I realize you might have put a lot of effort in it. However, that does not mean I have to accept it. It is not going in the direction I want for this project.

@ecdsa
Member

ecdsa commented Sep 24, 2019

Independently of the above, would a PR that adds the HeaderChunk class without adding the cp_height checkpoint code be a good starting point?

Yes, if it helps make the code more modular.

@SomberNight
Member

Note regarding "dos vector" convo (though I can't find it now; was that on IRC?):
You need to do around 2*sqrt(2*n) double-sha256 hashes to calculate a proof for a header, where n is the number of headers in the blockchain.
This means around 2.2 Khashes for the current chain.
According to this link, an Intel i7-920 can do around 19.2 Mhashes/sec when working fully parallelised on 8 threads. With how things are currently implemented in ElectrumX, I believe this hashing happens on the event loop thread, so it would not be parallelised.

Given this data, I estimate that doing the required hashing to serve a single request takes around 1 msec of CPU time.
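(A quick back-of-the-envelope check of those figures, under the same assumptions as above: ~600k headers, 2*sqrt(2*n) hashes per proof, an i7-920 doing ~19.2 Mhashes/sec across 8 threads.)

```python
import math

n_headers = 600_000
hashes_per_proof = 2 * math.sqrt(2 * n_headers)
print(hashes_per_proof)                               # ~2191, i.e. the "around 2.2 Khashes"

single_thread_rate = 19.2e6 / 8                       # not parallelised on the event loop thread
print(hashes_per_proof / single_thread_rate * 1000)   # ~0.9 msec of CPU per request
```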

@SomberNight
Member

Further re dos vector:

@JeremyRand asked on IRC:

How does this compare to the API method that gets a Merkle inclusion proof that ties a transaction to a block header? Is it actually any worse?

There is an LRU cache there, which stores data for 1000 blocks. However, for DoS purposes we should assume the cache is not hit. It seems it already starts from the tx hashes there as leaves, so with n leaves I suppose it's n double-sha256 hashes to build the tree and get the proof. A block might have e.g. 3k transactions, so 3k hashes.

One somewhat interesting point is that maybe that LRU cache could be extended to cover the whole chain.

;;calc 600000*sqrt(3000)*32
1051627310.41
;;calc 600000*sqrt(3000)*32/1024**3
0.979404254267

Napkin math says that would take 1 GB of memory.
With the 2*sqrt(2*n) formula from above, if it's all cached, it is around 160 hashes to calc the proof.

So it seems to me the current worst-case cost of the tx merkle proof protocol request is higher than this header request.
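(The same napkin math as the ;;calc lines above, reproduced in Python under the stated assumptions: ~600k blocks, ~3k txs per block, 32-byte hashes, a sqrt-sized cached level per block.)

```python
import math

cache_bytes = 600_000 * math.sqrt(3_000) * 32
print(cache_bytes / 1024**3)       # ~0.98 GiB, the "1 GB of memory" figure

hashes_if_cached = 2 * math.sqrt(2 * 3_000)
print(hashes_if_cached)            # ~155, the "around 160 hashes" figure
```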

@JeremyRand
Contributor Author

Note regarding "dos vector" convo (though I can't find it now; was that on IRC?):

Yes, that was brought up on IRC.

A couple of other random things from the IRC discussion:

The question came up of whether downloading individual headers would use more bandwidth than downloading full chunks for certain high-volume wallets. I ran the following Electrum console commands:

>> electrum.verifier.is_chunk_cheaper(215)
False
>> electrum.verifier.is_chunk_cheaper(216)
True

So that means that downloading a full chunk becomes cheaper bandwidth-wise if at least 216 of the 2016 blocks in the chunk are relevant to the wallet. 216/2016 is 10.7%; i.e. only wallets that contain transactions covering at least 10.7% of the blocks will be better off downloading full chunks (which is the status quo). I'm relatively confident that the median Electrum user will therefore save bandwidth by downloading individual headers, though I'm not particularly confident about high-volume merchants.

Also, I checked the electrs source code, and they do appear to already support cp_height checkpoint proofs. Grepping their Git repo for cp_height yields the relevant pieces of code.
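(For anyone wanting to reproduce the break-even figure without a wallet handy, here is rough napkin math; the sizes below are assumptions: 80-byte headers, ~20 branch hashes per cp_height proof, hex encoding over the protocol. The console result above, 216, is the measured number.)

```python
import math

CHUNK = 2016                                  # headers per retarget period
HEADER_HEX = 80 * 2                           # 80-byte header, hex-encoded
BRANCH_LEN = math.ceil(math.log2(600_000))    # ~20 hashes in a cp_height proof branch (assumed)
PROOF_HEX = BRANCH_LEN * 32 * 2               # 32-byte hashes, hex-encoded

chunk_cost = CHUNK * HEADER_HEX
per_header_cost = HEADER_HEX + PROOF_HEX
print(chunk_cost / per_header_cost)           # ~224: same ballpark as the measured 216
```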

@ecdsa
Member

ecdsa commented Sep 26, 2019

ok, I am reconsidering this in light of the recent comments

@ecdsa ecdsa reopened this Sep 26, 2019
@spesmilo spesmilo deleted a comment from chriswebstin Dec 1, 2019
Labels
topic-blockchain ⛓ related to logic in blockchain.py
Development
Successfully merging this pull request may close these issues: cp_height protocol support. block header checkpoints