Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/slashing performance #4977

Closed
4 tasks
alexanderbez opened this issue Sep 1, 2019 · 8 comments
Closed
4 tasks

x/slashing performance #4977

alexanderbez opened this issue Sep 1, 2019 · 8 comments
Labels
C:x/slashing T: Performance Performance improvements

Comments

@alexanderbez
Copy link
Contributor

Summary

After performing some benchmarks with and without #4748 on the Cosmos Hub mainnet for blocks 50k-100k using a modified version of v0.34.7, certain execution paths offer glaringly obvious room for performance improvement.

Taking a look at the benchmark, we can see inter-block caching provides significant improvement during BeginBlock. However, we can see the following areas take a significant amount of computation time:

  • handleValidatorSignature
  • AllocateTokensToValidator

handleValidatorSignature

Computation time could improve if we modify how GetValidatorMissedBlockBitArray works. The overhead of storing the key at varying heights is probably high. We instead can store the arrays at fixed lengths (e.g. 100), where they would be kept in cache except every 100 blocks (credit @cwgoes).

AllocateTokensToValidator

I don't think there's too much we can tweak here as a bulk of the time is spent in amino and mutexes.

/cc @jackzampolin @tnachen


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@alexanderbez alexanderbez added C:x/slashing T: Performance Performance improvements labels Sep 1, 2019
@jackzampolin
Copy link
Member

Sounds like the handleValidatorSignature changes are worth while and low hanging fruit.

@ValarDragon
Copy link
Contributor

Half of this time is coming from database stuff. I think something like: #2562, where we use our database more efficiently / significantly lower the tree depth should have some additional significant impact. (The cost for re-hashing is accounted for here, but is definitely a thing)

Additionally to structure this optimally, there is a 100x reduction in I/O by making one key for each bit-chunk for all validators, as that has to be iterated across for all validiators.

@alexanderbez alexanderbez self-assigned this Sep 19, 2019
@alexanderbez alexanderbez added this to the v0.38.0 milestone Oct 12, 2019
@alexanderbez alexanderbez mentioned this issue Oct 12, 2019
4 tasks
@alexanderbez alexanderbez removed this from the v0.38.0 milestone Nov 30, 2019
@dshulyak
Copy link
Contributor

I was curious to trace x/slashing and made a synthetic benchmark. Benchmark stubs slashing keeper with all required information for performing BeginBlocker and runs it for configurable number of validators. My results are a bit different from what was reported earlier, for example GetValidatorMissedBlockBitArray is barely noticeable, and i am sure that it is executed in every iteration of the BeginBlocker.

I will report my findings a bit later (reduced time of the BeginBlocker by >50% and allocations >40%) with fixing some low hanging issues. But I want to sanity check my benchmark with some other data, @alexanderbez do you have any suggestions?

@alexanderbez
Copy link
Contributor Author

Benchmarks were done against mainnet sync -- so it's real-world data. So there is probably something missing in your benchmarks.

If you'd like to help out with this issue, I'd recommend tweaking your benchmark to have it actually emulate real-world bottlenecks (i.e. be slow) and attempt to implement a solution such as the one outlined above. If you see improvements, we can go from there and review a PR 👍

@dshulyak
Copy link
Contributor

Yeah, you are right. I didn't notice previously that in my case GetValidatorMissedBlockBitArray aren't doing leveldb disk reads, whole iavl is in memory. Not so easy to go through those storage layers :). Will work on that at some point.

There are also some other improvements that worth doing, sliding window is unmarshalled from json several times for single validator. And it can be reused for all validators. Also visible in your profile.

Some improvements can be made in encoding, like no reason to encode boolean with a prefix. Same for public keys, but thats a bit more complicated.

@alexanderbez
Copy link
Contributor Author

What you can easily do the ensure proper benchmark is the following:

  • Take a mainnet export (from cosmoshub-2) and remove all but the slashing state (I can give you a file).
  • Have the benchmark read this file and call x/slashing InitGenesis on it. This will be the starting state of the benchmark.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 5, 2020
@tac0turtle tac0turtle added pinned and removed stale labels Jul 6, 2020
@alexanderbez alexanderbez removed their assignment Mar 21, 2022
@tac0turtle tac0turtle removed the pinned label May 9, 2022
@tac0turtle
Copy link
Member

closed via #15580

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C:x/slashing T: Performance Performance improvements
Projects
None yet
Development

No branches or pull requests

5 participants