performance: cache the ID for Header entities #1279
Conversation
Codecov Report

@@            Coverage Diff             @@
##           master    #1279      +/-   ##
==========================================
+ Coverage   54.96%   55.01%   +0.04%
==========================================
  Files         501      501
  Lines       31713    31739      +26
==========================================
+ Hits        17432    17461      +29
+ Misses      11925    11921       -4
- Partials     2356     2357       +1
This has fixes for unit tests that were failing due to race conditions that became apparent when the cache sped up the MakeID() call.
Force-pushed from 2b049fb to 41b3fc4 (compare)
Thanks for the suggestion. Overall, I have strong concerns regarding the thread safety of this approach (see comments below). For merging this into master, I would need a rigorous argument for why this is thread safe on all hardware platforms (based on the Go memory model and potentially additional statements in the Go developer forums).
Maybe I am missing something.
Furthermore, I feel we might be able to do much better if we cache the IDs for all headers that are stored in memory, not just the one we most recently computed the ID for. There are many cases where we traverse the chain of recent blocks each time we receive a new one. Caching only the last header's ID will not help much, because we don't hit the same header repeatedly.
engine/execution/computation/computer/uploader/uploader_test.go
According to the testing, we were accessing the same header repeatedly, so I'm only checking the previous one. I suspect that looking further back in history would cause a performance issue, because it's a linear search, and each compare is roughly a 1000-byte binary comparison. How about this idea: what if, with each MakeID call, we also create a Murmur3 hash of the result from Rlp.encodeBytes(), and store that in a deque-based cache of limited length for ID searching?
Force-pushed from 99186f0 to 82358e7 (compare)
I'm a Go noob, but I think this doesn't solve the race conditions. I think what you need is more like this:
Yes and no. The only place I'm seeing reentrancy is during the unit tests. There's no reason to make a deep copy of h, because it is on the stack. For Header entity types, it appears from looking at the logs that the MakeID calls are not made across threads, and that they are sequential. What you're suggesting is to make a deep copy of the ID and a Header object. I can avoid the deep copy by using the mutex around the entire function. For other entity types, like Transaction, I would agree with you. If the functions involved with MakeID for Header were more intensely concurrent, I would also agree with you.
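A minimal sketch of the "mutex around the entire function" approach under discussion, with hypothetical names (`makeIDCached`, `lastEnc`) and `crypto/sha256` standing in for the real hashing. Holding one lock across both the compare and the store means the cached encoding and ID are always read and written together, so there is no torn read:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"sync"
)

// Single-entry cache for the most recently computed ID, guarded by one
// mutex for the whole call. Names and hash choice are illustrative, not
// the project's actual API.
var (
	mu      sync.Mutex
	lastEnc []byte   // encoding of the most recently hashed header
	lastID  [32]byte // its cached ID
)

func makeIDCached(enc []byte) [32]byte {
	mu.Lock()
	defer mu.Unlock() // lock spans compare + compute + store
	if lastEnc != nil && bytes.Equal(enc, lastEnc) {
		return lastID // same encoding as last time: reuse the cached ID
	}
	lastID = sha256.Sum256(enc)
	lastEnc = append([]byte(nil), enc...) // defensive copy of the caller's slice
	return lastID
}
```

The trade-off, raised later in this thread, is that the single mutex serializes all callers, even ones hashing different headers.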
These requested changes are addressed with the most recent commit.
Thanks for the answer. One more question: why not make […]? It states that the fingerprint is calculated on the immutable part of the header, so technically this would allow caching of all header IDs.
Thanks for the revisions. Some minor suggestions regarding the coding style.
Regarding the question whether to cache the ID for one or multiple headers:
- In my mind, the approach taken in this PR is a very lightweight way to address the most severe performance bottleneck. I think we were viewing it as a temporary fix anyway, right?

> How about this idea: what if, with each MakeID call, we also create a Murmur3 hash of the result from Rlp.encodeBytes(), and store that in a deque-based cache of limited length for ID searching?

Overall, I think your existing fix is totally fine as a temporary solution. Let's stick with that. Utilizing a cache seems to me a long-term solution.
Yes; from my perspective, this is the long-term goal. But it is a non-trivial task, because nodes are operating in a malicious environment. We can't just send the ID of a block (or header) over the wire and have the recipient trust that this value is correct. Instead, the recipient must compute the ID value when receiving the block. Essentially we would need a transformation step from "untrusted data structure" to "validated data structure". Only the validated data structure should contain a pre-computed ID. A node can locally cache the validated data structure (including the pre-computed ID). To make the situation even more complex, it is established best practice (for blockchain systems) to only store the "untrusted data structure" in the node's databases. When retrieving the block from the DB, the node would then re-compute the ID, which guarantees that the data has not been altered.
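The untrusted-to-validated transformation described above can be sketched as a pair of types, where only the locally constructed wrapper carries a pre-computed ID. All names here (`UntrustedHeader`, `ValidatedHeader`, `Validate`) are hypothetical illustrations, not the project's types:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
)

// UntrustedHeader is what arrives over the wire or is read from the DB.
// Any ID that accompanied it is ignored: a malicious peer could have lied.
type UntrustedHeader struct {
	Height  uint64
	Payload []byte
}

// ValidatedHeader can only be built via Validate, so its cached ID is
// trusted because this node computed it itself.
type ValidatedHeader struct {
	UntrustedHeader
	id [32]byte
}

// computeID stands in for the real (expensive) fingerprint/hash of the header.
func computeID(h UntrustedHeader) [32]byte {
	buf := make([]byte, 8, 8+len(h.Payload))
	binary.BigEndian.PutUint64(buf, h.Height)
	return sha256.Sum256(append(buf, h.Payload...))
}

// Validate is the transformation step: re-compute the ID on receipt,
// then cache it in the validated structure.
func Validate(h UntrustedHeader) ValidatedHeader {
	return ValidatedHeader{UntrustedHeader: h, id: computeID(h)}
}

// ID is free after validation -- no re-hashing needed.
func (v ValidatedHeader) ID() [32]byte { return v.id }
```

Under this scheme, only `UntrustedHeader` would be persisted, so a DB read always passes through `Validate` again, which is exactly the integrity guarantee the comment describes.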
Concept ACK.
Idle musings:
How often do we call this in contexts requiring deep equality, I wonder? Is it possible that we ferry around a header by reference far enough that we forget how to get its ID, rather than actually comparing blocks from different sources? In other terms, could we try pointer equality optimistically, and degrade to the full comparison when the pointers don't match?
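The optimistic pointer-equality idea can be sketched like this; `Header`, `headerID`, and `headerEqual` are hypothetical stand-ins for the real types, and the cheap hash stands in for the expensive MakeID call:

```go
package main

// Header is a toy stand-in for the real block header.
type Header struct {
	Height uint64
	Parent [32]byte
}

// headerID stands in for the expensive ID computation (the real one
// hashes ~1000 bytes of encoded header).
func headerID(h *Header) [32]byte {
	var id [32]byte
	id[0] = byte(h.Height)
	copy(id[1:], h.Parent[:])
	return id
}

// headerEqual tries pointer equality first and falls back to the full
// ID comparison only when the pointers differ -- the "degrade to the
// full comparison" path from the comment above.
func headerEqual(a, b *Header) bool {
	if a == b {
		return true // same object: no hashing needed
	}
	return headerID(a) == headerID(b) // different objects: compare IDs
}
```

The fast path only pays off if callers actually pass the same pointer around; if headers are routinely re-decoded from the wire or the DB, every comparison still takes the slow path.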
@huitseeker @ramtinms I view this as a temporary solution. Hence, I am not concerned if we are not achieving max performance with this patch. I created an issue describing what I think a full solution should look like: https://github.com/dapperlabs/flow-go/issues/5853. It is going to be a bit work-intensive, but hopefully not too bad 😅
…low/flow-go into jwinkler2083233/5841-cache-header-id
@@ -95,6 +95,8 @@ func Test_AsyncUploader(t *testing.T) {
 		require.Equal(t, 3, callCount)
 	})

+	time.Sleep(1 * time.Second)
Is this added by accident?
	mutexHeader.Lock()

	// unlock at the return
	defer mutexHeader.Unlock()
I don't quite understand this change. Wouldn't this lock block two different threads computing IDs of two different headers?
If engine A is calling header1.ID() and engine B is calling header2.ID(), before this change they could run concurrently, but with this change one engine has to block and wait until the other finishes its call, no?
2704: Remove header caching r=SaveTheRbtz a=SaveTheRbtz

This 1-item caching introduces a global lock. The code is also quite brittle and can cause bugs if the `Header` struct changes.

This diff is a no-op on Skylake:
```
name                              old us/tx   new us/tx   delta
ComputeBlock/1/cols/16/txes-16    1.44k ± 0%  1.46k ± 3%    ~     (p=0.099 n=9+10)
ComputeBlock/1/cols/32/txes-16    1.37k ± 1%  1.38k ± 1%  +0.60%  (p=0.007 n=9+10)
ComputeBlock/1/cols/64/txes-16    1.36k ± 1%  1.35k ± 0%  -0.51%  (p=0.014 n=10+8)
ComputeBlock/1/cols/128/txes-16   1.34k ± 1%  1.34k ± 0%    ~     (p=0.673 n=10+9)
ComputeBlock/4/cols/16/txes-16    1.36k ± 2%  1.36k ± 1%    ~     (p=0.237 n=10+10)
ComputeBlock/4/cols/32/txes-16    1.34k ± 1%  1.34k ± 1%    ~     (p=0.329 n=10+10)
ComputeBlock/4/cols/64/txes-16    1.35k ± 1%  1.34k ± 2%    ~     (p=0.117 n=9+10)
ComputeBlock/4/cols/128/txes-16   1.36k ± 2%  1.35k ± 2%    ~     (p=0.287 n=9+10)
ComputeBlock/16/cols/16/txes-16   1.33k ± 1%  1.32k ± 1%    ~     (p=0.643 n=10+10)
ComputeBlock/16/cols/32/txes-16   1.32k ± 1%  1.32k ± 1%    ~     (p=0.202 n=9+10)
ComputeBlock/16/cols/64/txes-16   1.35k ± 1%  1.35k ± 1%    ~     (p=0.957 n=10+10)
ComputeBlock/16/cols/128/txes-16  1.32k ± 1%  1.32k ± 2%    ~     (p=0.643 n=10+10)
```

And a 3% improvement on M1 Pro:
```
$ go test -tags relic -bench='BenchmarkComputeBlock' -count 10 -run='^$' ./engine/execution/computation/.
...
$ benchstat old new
name                             old us/tx  new us/tx  delta
ComputeBlock/1/cols/16/txes-8    590 ± 3%   803 ±62%     ~     (p=0.404 n=10+10)
ComputeBlock/1/cols/32/txes-8    554 ± 2%   559 ± 2%   +0.93%  (p=0.043 n=9+8)
ComputeBlock/1/cols/64/txes-8    557 ± 4%   538 ± 2%   -3.46%  (p=0.000 n=10+10)
ComputeBlock/1/cols/128/txes-8   560 ± 1%   538 ± 2%   -3.92%  (p=0.000 n=7+9)
ComputeBlock/4/cols/16/txes-8    558 ± 3%   539 ± 1%   -3.51%  (p=0.000 n=10+10)
ComputeBlock/4/cols/32/txes-8    555 ± 1%   540 ± 1%   -2.81%  (p=0.000 n=10+10)
ComputeBlock/4/cols/64/txes-8    556 ± 2%   538 ± 2%   -3.13%  (p=0.000 n=10+10)
ComputeBlock/4/cols/128/txes-8   546 ± 1%   527 ± 2%   -3.50%  (p=0.000 n=10+10)
ComputeBlock/16/cols/16/txes-8   552 ± 3%   533 ± 2%   -3.44%  (p=0.000 n=10+10)
ComputeBlock/16/cols/32/txes-8   548 ± 1%   527 ± 2%   -3.68%  (p=0.000 n=9+10)
ComputeBlock/16/cols/64/txes-8   548 ± 1%   530 ± 1%   -3.34%  (p=0.000 n=10+10)
ComputeBlock/16/cols/128/txes-8  522 ± 1%   519 ± 2%     ~     (p=0.327 n=9+9)
```

cc: `@zhangchiqing` since he was also looking at it here #1279 (comment)

Co-authored-by: Alexey Ivanov <rbtz@dapperlabs.com>