
[GOAL2-777] Catchup performance : avoid retrying unless previous block was retrieved #31

Merged · 1 commit · Jun 17, 2019

Conversation

@tsachiherman (Contributor)

Existing code would retry retrieving a block regardless of whether the previous block was retrieved successfully. While the extra attempt is functionally harmless, it generates excessive network traffic when the first block's retrieval is delayed.

We want to keep the happy path retrieving blocks as fast as possible, while slowing down once we run into network failures.
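As a sketch of the idea (hypothetical helper names; the real go-algorand fetchAndWrite differs), the retry can be gated on the previous round having landed, so the first attempt stays fast while retries are paced by actual progress:

```go
package main

import (
	"errors"
	"fmt"
)

// waitForRound returns a channel that is closed once round r is in the
// ledger; stubbed here so the example runs standalone.
func waitForRound(r uint64) <-chan struct{} {
	ch := make(chan struct{})
	close(ch) // pretend round r is already committed
	return ch
}

// fetchAndWrite tries to fetch round r. The first attempt goes out
// immediately (fast happy path); each retry first waits for round r-1,
// so a stalled network slows us down instead of multiplying traffic.
func fetchAndWrite(r uint64, fetch func(uint64) error, maxRetries int) error {
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if attempt > 0 {
			<-waitForRound(r - 1)
		}
		if err := fetch(r); err == nil {
			return nil
		}
	}
	return errors.New("exhausted retries")
}

func main() {
	attempts := 0
	fetch := func(r uint64) error {
		attempts++
		if attempts < 2 {
			return errors.New("transient failure")
		}
		return nil
	}
	err := fetchAndWrite(100, fetch, 3)
	fmt.Println("err:", err, "after", attempts, "attempts")
}
```

The key property: a healthy network never pays the gate (the first attempt fires immediately), while a failing fetch for round r cannot spin retries faster than blocks are actually being committed.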

@tsachiherman (Contributor, Author)

Just a small note: these changes are only an optimization. I have no reason to believe that the existing network would not be able to sustain the expected load (by an order of magnitude).

@Vervious (Contributor) commented Jun 14, 2019

I am worried that this might adversely affect parallelism: consider the following scenario:

We call fetchAndWrite on the next 50 blocks; all the fetchers (in parallel) hit the same peer X for blocks r + 1, r + 2, ... r + 50; peer X happens to be misconfigured, so all of these requests fail.

But now all fifty requests are serialized, waiting for seed lookbacks: round r + 3 is waiting for r + 1, r + 5 is waiting for r + 3, ..., r + 50 is waiting for r + 48. Because of one faulty peer, catchup is no longer parallelized until all 50 blocks are fetched.

Is this a problem? Let me think about it some more.
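A toy model of this scenario (all names illustrative, not the actual go-algorand code) shows the head-of-line blocking: each retry gates on the previous round, so the fifty retries complete strictly in order rather than in parallel:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const n = 50
	base := uint64(1000)

	// committed[r] is closed once round r has been written to the ledger.
	committed := make(map[uint64]chan struct{}, n+1)
	for r := base; r <= base+n; r++ {
		committed[r] = make(chan struct{})
	}
	close(committed[base]) // round r itself is already in the ledger

	var wg sync.WaitGroup
	order := make(chan uint64, n)
	for i := 1; i <= n; i++ {
		r := base + uint64(i)
		wg.Add(1)
		go func() {
			defer wg.Done()
			// First attempt: the misconfigured peer fails all 50 requests.
			// Retry: gated on the previous round, so the chain serializes.
			<-committed[r-1]
			order <- r // this retry succeeds
			close(committed[r])
		}()
	}
	wg.Wait()
	close(order)
	for r := range order {
		fmt.Println("fetched round", r) // prints strictly in ascending order
	}
}
```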

@zeldovich (Contributor)

> I am worried that this might adversely affect parallelism: consider the following scenario:
>
> We call fetchAndWrite on the next 50 blocks; all the fetchers (in parallel) hit the same peer X for blocks r + 1, r + 2, ..., r + 50; X happens to be misconfigured, so all of these requests fail.
>
> But now all fifty requests are serialized, waiting for seed lookbacks, because of one faulty peer, and catchup is no longer parallelized until all 50 blocks are fetched.
>
> Is this a problem? Let me think about it some more.

In this scenario, peer X will be taken out of the candidate peer list, and subsequent block fetches will be pipelined properly.
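A minimal sketch of that mitigation, assuming a failed fetch simply drops the peer from the candidate list (hypothetical types; not the actual go-algorand peer-selection code):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type peerList struct {
	mu    sync.Mutex
	peers []string
}

// drop removes a misbehaving peer from the candidates.
func (pl *peerList) drop(bad string) {
	pl.mu.Lock()
	defer pl.mu.Unlock()
	for i, p := range pl.peers {
		if p == bad {
			pl.peers = append(pl.peers[:i], pl.peers[i+1:]...)
			return
		}
	}
}

// fetchFrom tries peers until one succeeds, dropping each failure.
func (pl *peerList) fetchFrom(fetch func(peer string) error) error {
	for {
		pl.mu.Lock()
		if len(pl.peers) == 0 {
			pl.mu.Unlock()
			return errors.New("no candidate peers left")
		}
		p := pl.peers[0]
		pl.mu.Unlock()

		if err := fetch(p); err != nil {
			pl.drop(p) // peer X never gets asked again
			continue
		}
		return nil
	}
}

func main() {
	pl := &peerList{peers: []string{"peerX", "peerY"}}
	err := pl.fetchFrom(func(p string) error {
		if p == "peerX" {
			return errors.New("misconfigured")
		}
		fmt.Println("fetched block from", p)
		return nil
	})
	fmt.Println("err:", err)
}
```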

@Vervious (Contributor) commented Jun 14, 2019

> In this scenario, peer X will be taken out of the candidate peer list, and subsequent block fetches will be pipelined properly.

The peer list is checked before we call client.GetBlockBytes, and it seems possible that we could hit the same peer in parallel until a request errors.

That being said, I looked at the code again: we select peers randomly and take into account how many active fetches each has. So it's probably okay.
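A rough sketch of that selection behavior, assuming peers are picked at random from those with the fewest fetches in flight (hypothetical; the real heuristic may differ):

```go
package main

import (
	"fmt"
	"math/rand"
)

type peer struct {
	addr     string
	inFlight int
}

// pick chooses randomly among the peers with the fewest active fetches,
// so parallel fetchers spread across peers instead of piling onto one.
func pick(peers []peer) *peer {
	min := peers[0].inFlight
	for _, p := range peers[1:] {
		if p.inFlight < min {
			min = p.inFlight
		}
	}
	var least []*peer
	for i := range peers {
		if peers[i].inFlight == min {
			least = append(least, &peers[i])
		}
	}
	return least[rand.Intn(len(least))]
}

func main() {
	peers := []peer{{"peerX", 0}, {"peerY", 0}, {"peerZ", 0}}
	for i := 0; i < 6; i++ {
		p := pick(peers)
		p.inFlight++
		fmt.Println("assign fetch", i, "to", p.addr)
	}
}
```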

derbear merged commit 7b2547b into algorand:master on Jun 17, 2019.
pzbitskiy pushed a commit to pzbitskiy/go-algorand that referenced this pull request on Apr 6, 2020.