Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

account.Address: use RWLocker #8774

Merged
merged 6 commits into from
Mar 16, 2021
Merged

account.Address: use RWLocker #8774

merged 6 commits into from
Mar 16, 2021

Conversation

robert-zaremba
Copy link
Collaborator

Description

Use sync.RWMutex instead of sync.Mutex in account cache as suggested in #8767 (comment)


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer
  • Review Codecov Report in the comment section below once CI passes

@odeke-em
Copy link
Collaborator

odeke-em commented Mar 3, 2021 via email

@codecov
Copy link

codecov bot commented Mar 3, 2021

Codecov Report

Merging #8774 (dd84aa1) into master (42919c8) will increase coverage by 0.17%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8774      +/-   ##
==========================================
+ Coverage   59.17%   59.34%   +0.17%     
==========================================
  Files         571      563       -8     
  Lines       31800    31256     -544     
==========================================
- Hits        18817    18550     -267     
+ Misses      10780    10551     -229     
+ Partials     2203     2155      -48     
Impacted Files Coverage Δ
types/address.go 66.66% <100.00%> (+0.65%) ⬆️
x/feegrant/simulation/decoder.go 0.00% <0.00%> (-100.00%) ⬇️
x/bank/types/msgs.go 53.91% <0.00%> (-42.97%) ⬇️
x/staking/types/msg.go 23.92% <0.00%> (-31.01%) ⬇️
x/gov/client/utils/utils.go 47.05% <0.00%> (-26.48%) ⬇️
x/distribution/types/msg.go 31.86% <0.00%> (-15.68%) ⬇️
x/gov/client/rest/query.go 19.31% <0.00%> (-6.82%) ⬇️
x/upgrade/types/plan.go 86.95% <0.00%> (-5.91%) ⬇️
x/auth/tx/service.go 69.35% <0.00%> (-3.51%) ⬇️
x/gov/client/cli/query.go 69.56% <0.00%> (-1.15%) ⬇️
... and 38 more

@robert-zaremba
Copy link
Collaborator Author

Thanks for insights @odeke-em . So on direct benchmark, sync.Mutex is 50% faster than syncRWMutex?
I think here, after some time we will have mostly reads. (especially with regards to consensus addresses). So probably RWMutex will still make more sense because we will rarely do a normal lock.

@odeke-em
Copy link
Collaborator

odeke-em commented Mar 4, 2021

This is why benchmarking code will answer those questions and quell assumptions that might not hold in practice. As I showed in that surprising benchmark, multiple readers and a single writer, yet memory and allocations were still high, an objective measure would be to check for performance before and after having written a carefully close to reality scenario for benchmarking.

@robert-zaremba
Copy link
Collaborator Author

My assumption for a real-world use case, that after the warm up we will have mostly (95+% ?) reads. @marbar3778 , @alessio - what do you think?

@tac0turtle
Copy link
Member

yea agree. Might as well do some quick benchmarks, doesnt hurt

@alessio
Copy link
Contributor

alessio commented Mar 4, 2021

Merging master in as tests are failing.

@alessio
Copy link
Contributor

alessio commented Mar 4, 2021

The race detector is failing

@orijbot
Copy link

orijbot commented Mar 14, 2021

@orijbot
Copy link

orijbot commented Mar 14, 2021

@@ -78,11 +78,11 @@ const (
var (
// AccAddress.String() is expensive and if unoptimized dominantly showed up in profiles,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unforunate we have to rely on singletons for the cache and mutexes :-/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Changing it would involve a big refactoring -> reducing package public functions and making everything managed by objects.

@robert-zaremba
Copy link
Collaborator Author

I made a benchmark based on the code Emanuel shared (https://gist.github.com/odeke-em/234bc88fbaf9e439b70a618ce6674659#file-aresults-md). I adjusted it to my CPU (4 cores) and here are the result:

cpu: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz
BenchmarkRWMutex-4       3968	    264872 ns/op    1393 B/op	      30 allocs/op
BenchmarkMutex-4         2596	    490315 ns/op    1163 B/op	      27 allocs/op

Also, we can expect that there will be way more reads than writes after a warmup. Decreasing writes will only bring more points to the RWMutex in the benchmark. So it make even more sense to use RWMutex.

Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK.

Thank you for doing the benchmarking. Do you want to add the code to the repo?

@odeke-em
Copy link
Collaborator

@robert-zaremba thanks for letting me know.
For proper benchmarking:

  • please run benchmarks between 5 to even 20 times and save each to a different file e.g. go test -run=^$ -bench=<SPECIFIC NAME or .> -count=10 > target
  • always use benchstat to compare deltas, otherwise folks have to mentally do Maths

The above steps help remove noise, and give a direct comparison and summary.

Also the benchmarks I provided were specific for a comparison. Did you adapt to code that uses the cosmos-sdk and this new code in specific usage?

@robert-zaremba robert-zaremba added A:automerge Automatically merge PR once all prerequisites pass. and removed A:automerge Automatically merge PR once all prerequisites pass. labels Mar 16, 2021
@robert-zaremba
Copy link
Collaborator Author

Here are aggregated stats, with multiple run.

  • old = normal mutex
  • new = rw mutex
name     old time/op    new time/op    delta
Mutex-4     414µs ±20%     252µs ± 2%  -39.25%  (p=0.000 n=10+9)

name     old alloc/op   new alloc/op   delta
Mutex-4    1.31kB ± 9%    1.28kB ± 3%     ~     (p=0.460 n=10+9)

name     old allocs/op  new allocs/op  delta
Mutex-4      28.8 ± 8%      28.6 ± 2%     ~     (p=0.665 n=10+9)

@robert-zaremba
Copy link
Collaborator Author

merging, the rw mutex is appropriate for this use case.

@robert-zaremba robert-zaremba added the A:automerge Automatically merge PR once all prerequisites pass. label Mar 16, 2021
@orijbot
Copy link

orijbot commented Mar 16, 2021

@mergify mergify bot merged commit 05d4e43 into master Mar 16, 2021
@mergify mergify bot deleted the robert/addr-rwmutex branch March 16, 2021 17:48
@orijbot
Copy link

orijbot commented Mar 16, 2021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A:automerge Automatically merge PR once all prerequisites pass. T: Performance Performance improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants