Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Doomslug endorsement delay starts from the last endorsement, no… #3164

Merged
merged 1 commit into from
Aug 14, 2020

Conversation

SkidanovAlex
Copy link
Collaborator

@SkidanovAlex SkidanovAlex commented Aug 14, 2020

…t the head update, and test fixes

Making doomslug wait for 600ms from the last time the node sent an endorsement, not from the head update. Under load this will remove unnecessary idleness.
This change breaks catchup tests that do not use doomslug, because they rely on blocks being produced without skips and forks, and without doomslug and the wait not starting from the head update the timing is less predictable.
I address it by making the tests actually use doomslug (and upping the block prod times where necessary). Since they rely on having no skips and forks, there's no reason not to use doomslug.
For the only test that had two separate versions with and without doomslug, I removed the version without.

Separately and unrelatedly to the above, removed backtrace from the formatted client errors. On my (very beefy) machine fetching the backtrace takes 800ms, and on nayduck runners it was more than 2s. Since we format errors in many places (including responces to RPC, though that will change), such delays are not acceptable.
This fixes state_sync tests.

Fixes: #3139, #3129

Test plan:

For #3139: http://nayduck.eastus.cloudapp.azure.com:3000/#/run/88
For #3129: http://nayduck.eastus.cloudapp.azure.com:3000/#/run/90
Catchup tests are split accross many runs, but I ensured they are not flaky after this change.

…t the head update, and test fixes

Making doomslug wait for 600ms from the last time the node sent an endorsement, not from the head update. Under load this will remove unnecessary idleness.
This change breaks catchup tests that do not use doomslug, because they rely on blocks being produced without skips and forks, and without doomslug and the wait not starting from the head update the timing is less predictable.
I address it by making the tests actually use doomslug (and upping the block prod times where necessary). Since they rely on having no skips and forks, there's no reason not to use doomslug.
For the only test that had two separate versions with and without doomslug, I removed the version without.

Separately and unrelatedly to the above, removed backtrace from the formatted client errors. On my (very beefy) machine fetching the backtrace takes 800ms, and on nayduck runners it was more than 2s. Since we format errors in many places (including responces to RPC, though that will change), such delays are not acceptable.
This fixes state_sync tests.

Fixes: #3139, #3129

Test plan:
----------
For #3139: http://nayduck.eastus.cloudapp.azure.com:3000/#/run/88
For #3129: (still running, will update the description)
Catchup tests are split accross many runs, but I ensured they are not flaky after this change.
@gitpod-io
Copy link

gitpod-io bot commented Aug 14, 2020

@codecov
Copy link

codecov bot commented Aug 14, 2020

Codecov Report

Merging #3164 into master will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3164      +/-   ##
==========================================
+ Coverage   87.42%   87.46%   +0.04%     
==========================================
  Files         212      212              
  Lines       41556    41552       -4     
==========================================
+ Hits        36330    36345      +15     
+ Misses       5226     5207      -19     
Impacted Files Coverage Δ
chain/chain/src/doomslug.rs 99.22% <100.00%> (+<0.01%) ⬆️
chain/chain/src/error.rs 50.40% <100.00%> (-1.51%) ⬇️
neard/tests/stake_nodes.rs 85.66% <0.00%> (-11.79%) ⬇️
neard/tests/rpc_nodes.rs 96.69% <0.00%> (-0.24%) ⬇️
chain/client/tests/process_blocks.rs 90.72% <0.00%> (-0.22%) ⬇️
chain/chain/src/store.rs 86.50% <0.00%> (-0.06%) ⬇️
chain/chain/src/chain.rs 89.05% <0.00%> (+0.10%) ⬆️
neard/src/runtime.rs 86.54% <0.00%> (+0.12%) ⬆️
chain/client/src/view_client.rs 68.48% <0.00%> (+0.18%) ⬆️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c9d37cf...20aea6b. Read the comment docs.

@bowenwang1996 bowenwang1996 merged commit 7028068 into master Aug 14, 2020
@bowenwang1996 bowenwang1996 deleted the doomslug_proper_wait branch August 14, 2020 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test_catchup_sanity_blocks_produced is flaky
2 participants