
HeaderSync optimization (#1372) #1400

Merged: 21 commits merged into mimblewimble:master on Aug 30, 2018

Conversation

@garyyu (Contributor) commented Aug 21, 2018

With this optimization for #1372, the HeaderSync procedure runs about twice as fast as before on my MacBook Air (early 2015), i.e. 3 minutes compared to 6 minutes before:

Aug 21 19:02:20.615 DEBG sync_state: sync_status: Initial -> HeaderSync { current_height: 0, highest_height: 61131 }
Aug 21 19:05:19.367 DEBG sync_state: sync_status: HeaderSync { current_height: 60698, highest_height: 61134 } -> TxHashsetDownload

@ignopeverell (Contributor) commented:

This is a great performance improvement! However, I'd like to propose a couple of changes:

  1. It seems most of the new code is just about buffering reads instead of consuming straight from the TCP connection. Couldn't we get the same gain by using BufStream, as described in #1402 (wrap TcpStream in BufStream for buffered reads and writes)? (A sketch of this idea follows the list.)
  2. If this is indeed the source of the gain, we should do it for every message, not only headers.
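A minimal sketch of the BufStream idea, assuming the third-party bufstream crate; the address and message here are placeholders, not grin's actual p2p wiring:

use bufstream::BufStream;
use std::io::{BufRead, Write};
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // Wrap the raw socket so small reads and writes hit an in-memory
    // buffer instead of issuing one syscall each.
    let stream = TcpStream::connect("127.0.0.1:13414")?;
    let mut buffered = BufStream::new(stream);

    buffered.write_all(b"ping\n")?;
    buffered.flush()?; // BufStream only sends on flush or when the buffer fills

    let mut line = String::new();
    buffered.read_line(&mut line)?;
    println!("got: {}", line);
    Ok(())
}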

@garyyu (Contributor, Author) commented Aug 22, 2018

Please see my comment in #1402; BufStream won't help here.

Regarding:

It seems most of the new code is just about buffering reads instead of consuming straight from the TCP connection

Why do you have that impression? The new code has nothing to do with buffered reads.
It actually makes two changes:

  1. Instead of receiving all 511 headers and then processing them in one call, I split them into pieces of 8 headers each, process each piece immediately, and then download the next 8 headers. I call this new procedure "streaming" (see the sketch after this list). It avoids a long (about 1 second on my PC) processing time for 511 headers in a single call.
  2. The second change is in get_locator(), which you commented on and which I need to rework.
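A minimal sketch of the streaming idea under stated assumptions: read_headers_chunk, process_headers, and Header are hypothetical stand-ins for the real p2p types, and the chunk size of 8 matches the description above:

const CHUNK_SIZE: usize = 8;

struct Header; // placeholder for core::BlockHeader

// Stand-in for reading up to CHUNK_SIZE headers off the connection.
fn read_headers_chunk(remaining: &mut usize) -> Vec<Header> {
    let n = (*remaining).min(CHUNK_SIZE);
    *remaining -= n;
    (0..n).map(|_| Header).collect()
}

// Stand-in for handing a small batch to the chain adapter immediately,
// overlapping network reads with header processing.
fn process_headers(headers: &[Header]) {
    let _ = headers.len();
}

fn main() {
    let mut remaining = 511; // total headers announced in the message
    while remaining > 0 {
        let chunk = read_headers_chunk(&mut remaining);
        process_headers(&chunk);
    }
}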

@garyyu (Contributor, Author) commented Aug 22, 2018

I just realized it's better to split the get_locator() optimization out of this PR, following the good rule that one PR should solve only one problem.

Another reason is that get_locator() is security-related, so it's better to give it a dedicated code review.

I will remove it from this PR and submit a new PR for it.

@ignopeverell (Contributor) commented:

Sounds good, thank you for removing the get_locator() changes! To summarize a gitter discussion:

  1. Add a simpler method on Message that just returns the &mut TcpStream so it can be read from protocol.rs.
  2. Move your code to a simple utility function in protocol.rs.
  3. Potentially simplify, given that all headers have equal size.
  4. Remove the rustfmt changes from sync.rs, as you're not touching it.

And thank you for the great work!

@garyyu (Contributor, Author) commented Aug 22, 2018

Thanks for the good review!
Refactoring is finished according to the above comments.

@ignopeverell (Contributor) commented Aug 23, 2018

I just meant to amend my earlier comments, as I realized I made a mistake. Point 3 is actually incorrect, since we now allow Cuckoo sizes greater than 30. So the proof-of-work part of the header could be 30*42 bits, 31*42 bits, 32*42 bits, etc. Sorry for that.
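For concreteness, a rough calculation of those proof-of-work sizes, assuming the proof is bit-packed as 42 nonces of sizeshift bits each (the byte rounding here is illustrative, not the actual serialization layout):

fn proof_size_bytes(sizeshift: u32) -> u32 {
    (sizeshift * 42 + 7) / 8 // ceil(bits / 8)
}

fn main() {
    for s in 30u32..=32 {
        // e.g. Cuckoo30: 1260 bits, about 158 bytes
        println!("Cuckoo{}: {} bits, {} bytes", s, s * 42, proof_size_bytes(s));
    }
}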

@@ -39,6 +41,60 @@ impl Protocol {
pub fn new(adapter: Arc<NetAdapter>, addr: SocketAddr) -> Protocol {
Protocol { adapter, addr }
}

/// Read the Headers Vec size from the underlying connection, and calculate the header_size of one Header
pub fn headers_header_size(conn: &mut TcpStream, msg_len: u64) -> Result<u64, Error> {


@garyyu (Contributor, Author) commented Aug 23, 2018

Done: I addressed the comments by moving the function out of impl Protocol and removing pub.

@ignopeverell (Contributor) commented:

Have you seen my previous comment? It looks like the header size calculation will fail in the presence of headers with different Cuckoo sizes.

@garyyu (Contributor, Author) commented Aug 23, 2018

I just meant to amend my earlier comments, as I realized I made a mistake. Point 3 is actually incorrect, since we now allow Cuckoo sizes greater than 30. So the proof-of-work part of the header could be 30*42 bits, 31*42 bits, 32*42 bits, etc. Sorry for that.

Sorry, I don't know why I thought that didn't have an impact :) (perhaps because you first said "simplify" and then said you'd made a mistake, so I took it as -1 * -1 = +1). But indeed it does!

A Headers vector whose elements have different sizes is handled by neither the new code nor the old code; I can't see how it would be possible to deserialize a vector whose elements each have a different size. I will read more code to confirm. If it's an existing issue, that would be a major bug, better handled in an independent new PR.

@garyyu (Contributor, Author) commented Aug 23, 2018

The good news: after reading the Headers deserialization code, I confirm there's no problem in the old code, since Proof has its own deserialization code which can handle different Cuckoo sizes.

Now the bad news: a very upsetting finding. We don't have a size field in the BlockHeader structure! So, without deserializing a BlockHeader, we don't know its exact size. But to split the Headers message, I need to know the size of each BlockHeader before deserialization!

This could make the optimization infeasible.

Any suggestions? Perhaps it's too late to add a size field to BlockHeader, since that would be consensus-breaking?

@ignopeverell (Contributor) commented:

It's too late, and in theory it's not needed, as each header contains enough information to be deserialized (the Cuckoo size is in it).

One idea would be to read enough bytes for, say, Cuckoo34 headers. You'll overshoot, but you can then reuse the unused bytes in the next iteration, placing them first in the buffer.
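A minimal sketch of this overshoot-and-carry idea, under stated assumptions: deserialize_header and read_more are hypothetical stand-ins, and MAX_HEADER_SIZE is an illustrative upper bound, not the real Cuckoo34 header size:

const MAX_HEADER_SIZE: usize = 400; // assumed upper bound for one header

// Stand-in for deserializing one header from the front of the buffer,
// returning how many bytes it actually consumed.
fn deserialize_header(buf: &[u8]) -> ((), usize) {
    ((), buf.len().min(300)) // pretend this header occupied 300 bytes
}

// Stand-in for reading `want` more bytes from the TcpStream.
fn read_more(buf: &mut Vec<u8>, want: usize) {
    buf.extend(std::iter::repeat(0u8).take(want));
}

fn main() {
    let mut buf: Vec<u8> = Vec::new();
    for _ in 0..3 {
        // Top up the buffer so at least one maximum-size header fits.
        if buf.len() < MAX_HEADER_SIZE {
            let want = MAX_HEADER_SIZE - buf.len();
            read_more(&mut buf, want);
        }
        let (_header, used) = deserialize_header(&buf);
        // Carry the overshoot to the front for the next iteration.
        buf.drain(..used);
    }
}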

@garyyu (Contributor, Author) commented Aug 24, 2018

One idea would be to read enough bytes for, say, Cuckoo34 headers. You'll overshoot, but you can then reuse the unused bytes in the next iteration, placing them first in the buffer.

That sounds like a feasible solution, but :) it's a little bit ugly, and it requires rewriting a deserialize function.

Even if I agree that it may be too late to add a size field to BlockHeader, I would still strongly propose a hard fork on Testnet3, because it's not only for this minor optimization but also for another, maybe more important, use case: block receiving.

When a Grin node receives a block, we currently complete the whole block download from the network and then start block validation; if we find that this block was already received before (or its header validation fails), we throw the block away. That could be a hole for attacks. A better solution is to receive the header first and validate it; if the block isn't needed, we don't have to receive its bigger body. A sketch of this flow follows.
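A minimal sketch of that header-first receive flow, under stated assumptions; all the names here are hypothetical stand-ins, not grin's actual p2p API:

struct Header { already_known: bool, valid: bool }

// Stand-in for reading just the header part of an incoming block.
fn receive_header() -> Header {
    Header { already_known: false, valid: true }
}

// Stand-in for downloading the (much bigger) block body.
fn receive_body() {}

fn main() {
    let h = receive_header();
    if h.already_known || !h.valid {
        return; // drop early, before downloading the full block
    }
    receive_body();
}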

Does this make it worth a hard fork?

And remember we have a security bug in #1358, which also needs a hard fork.

Anyway, it's up to your decision :)

@ignopeverell (Contributor) commented:

You're arguing for header-first propagation. We already have compact blocks, which are pretty much the same, maybe better. By the way, none of these are strictly hard forks (nothing forks); they're just breaking protocol changes that can lead to network partitioning if not properly handled. Also, we only receive a single block or header in those cases, so the total size is the same as the message size.

I may be mistaken, but it doesn't seem that the lack of a header size requires rewriting the deserialization. You can just use the same deserialization; it'll do the right thing from a Reader as long as you feed it enough bytes.

@garyyu (Contributor, Author) commented Aug 25, 2018

Completed the enhancement to support variable BlockHeader sizes in one Headers message, from Cuckoo30 to Cuckoo36.

@@ -265,6 +299,20 @@ impl BlockHeader {
pub fn total_kernel_offset(&self) -> BlindingFactor {
self.total_kernel_offset
}

/// Ser size of this header
pub fn size_of_ser(&self) -> usize {

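A hedged sketch of what such a size_of_ser() might compute: fixed-size fields plus a proof-of-work part whose size depends on the Cuckoo size. The field widths and bit-packing here are illustrative assumptions, not grin's actual serialization layout:

struct Proof {
    cuckoo_sizeshift: u8,
    nonces: Vec<u64>,
}

impl Proof {
    // Assumed bit-packed nonces, rounded up to whole bytes.
    fn size_of_ser(&self) -> usize {
        (self.cuckoo_sizeshift as usize * self.nonces.len() + 7) / 8
    }
}

fn main() {
    let proof = Proof { cuckoo_sizeshift: 30, nonces: vec![0; 42] };
    // A header's total serialized size would be some fixed part plus this.
    println!("proof part: {} bytes", proof.size_of_ser());
}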

@ignopeverell (Contributor) left a review comment:

Minor comment but looking good otherwise! Restarted the failing test to see if it's just a transient issue.

@garyyu (Contributor, Author) commented Aug 26, 2018

Thanks for approving!

And the function name has been changed.

The travis-ci panic at grin_chain::txhashset::extending could be a real failure, but I can't reproduce it on Mac; it may only happen on Linux. I have to wait until tomorrow for a Linux test, since I only have my laptop with me while on a trip.

@ignopeverell (Contributor) commented:

I think you're running into an issue with the fast sync test that I fixed a few days ago. Can you merge master back to double-check?

@garyyu (Contributor, Author) commented Aug 27, 2018

The Linux test also passes locally.
And I merged master into this PR; it still fails in travis-ci. Checking the log, I find:

Aug 27 01:18:06.852 DEBG (Server ID: Port 40000) Mining Cuckoo10 for max 60s on 86 @ 60 [ea49bda7].
Aug 27 01:18:06.853 INFO (Server ID: Port 40000) Found valid proof of work, adding block b6f5472f.
Aug 27 01:18:06.853 DEBG pipe: process_block b6f5472f at 60 with 0 inputs, 1 outputs, 1 kernels
Aug 27 01:18:06.885 INFO Rejected block b6f5472f at 60: Error { inner: stack backtrace:

Note the "Mining Cuckoo10" in the log above.

I then checked inner_mining_loop() and found:

/// The minimum acceptable sizeshift
pub fn min_sizeshift() -> u8 {
	let param_ref = CHAIN_TYPE.read().unwrap();
	match *param_ref {
		ChainTypes::AutomatedTesting => AUTOMATED_TESTING_MIN_SIZESHIFT, // i.e. 10
		ChainTypes::UserTesting => USER_TESTING_MIN_SIZESHIFT,           // i.e. 16
		ChainTypes::Testnet1 => USER_TESTING_MIN_SIZESHIFT,              // i.e. 16
		_ => DEFAULT_MIN_SIZESHIFT,                                      // i.e. 30
	}
}

Please let me think up a solution that is compatible with this.

Commit: "…koo10 will be used for AutomatedTesting chain"
@ignopeverell (Contributor) commented:

I'm starting to lose track of Travis failures between this and other PRs. Are there still test regressions caused by this PR? I've restarted the tests many times, but they always fail (while master mostly passes).

@hashmap (Contributor) commented Aug 28, 2018

@ignopeverell on my machine I see a PR-specific issue:

Aug 27 14:16:28.767 DEBG Client 127.0.0.1:12001 connection lost: Connection(Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" })
Aug 27 14:16:28.771 DEBG Client 127.0.0.1:12000 connection lost: Connection(Custom { kind: InvalidData, error: StringError("headers_streaming_body") })
Aug 27 14:16:28.771 WARN sync: no peers available, disabling sync
Aug 27 14:16:28.771 DEBG sync_state: sync_status: HeaderSync { current_height: 0, highest_height: 30 } -> NoSync

@garyyu (Contributor, Author) commented Aug 28, 2018

@ignopeverell please let me finish #1434 first, to fix the annoying false alarms from travis-ci; then I will come back and check this PR again.

@garyyu (Contributor, Author) commented Aug 30, 2018

Finally! The last fix works and passes the travis-ci test.
The mistake was in calculating the serialized size of a block: I was using a hardcoded 42 instead of global::proofsize().

Now this PR is ready to merge.

BTW, there is a remaining issue with the test: that mistake should always have made the test fail, so it's strange that the local test on my Mac always passed! Something must be wrong in the test; I will track it down in another PR.
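An illustrative sketch of the kind of fix described above, under stated assumptions: proofsize() stands in for global::proofsize(), its return value is an assumed testing-chain setting, and the bit-packing formula is for illustration only:

// Stand-in for global::proofsize(): the configured number of proof
// nonces, which is smaller than 42 on the testing chain types.
fn proofsize() -> usize {
    4 // assumed value for an automated-testing chain
}

fn proof_ser_bytes(sizeshift: usize) -> usize {
    // Before the fix: (sizeshift * 42 + 7) / 8, with 42 hardcoded,
    // which overstates the size whenever proofsize() != 42.
    (sizeshift * proofsize() + 7) / 8
}

fn main() {
    println!("Cuckoo10 proof: {} bytes", proof_ser_bytes(10));
}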

@ignopeverell (Contributor) commented:

Indeed finally! Sorry it was so difficult, but thanks for another great PR!

@ignopeverell merged commit d719493 into mimblewimble:master on Aug 30, 2018
@garyyu mentioned this pull request on Aug 31, 2018
@garyyu deleted the HeaderSync branch on September 16, 2018