ListValidatorBalances RPC is very slow for historic epochs (new state mgmt) #5851

Closed
peterbitfly opened this issue May 14, 2020 · 12 comments · Fixed by #5858
Labels
API Api related tasks Enhancement New feature or request

Comments

@peterbitfly
Contributor

peterbitfly commented May 14, 2020

Version: latest master

Using the --enable-new-state-mgmt flag the node does not respond to the ListValidatorBalances RPC call for historic epochs.

For example, querying the balances of all validators for epoch 5877 takes nearly 2 minutes on our production system:

INFO[0000] retrieving balances for epoch 5877            module=rpc
INFO[0120] retrieved data for 29220 validator balances for epoch 5877  module=rpc
@nisdas nisdas added the API Api related tasks label May 14, 2020
@nisdas nisdas added the Enhancement New feature or request label May 14, 2020
@terencechain
Member

@ppratscher does it not respond at all, or does it respond but slowly (around 2 minutes)?

2 minutes is what we see as well. Unfortunately, that is how long it takes to replay all the blocks and regenerate the missing state. We'll need to optimize it.

@peterbitfly
Contributor Author

Hm, it varies between 2 and 5 minutes, which makes it not really usable for a block explorer.

We have now wiped the database and re-synced from scratch. After that, the call responded within a few seconds.

@terencechain
Member

I see. I observed a bug which causes archived state to be incorrectly saved at every period. I will fix that today and keep you updated.

@peterbitfly
Contributor Author

Just tried a full resync again on latest master with --enable-new-state-mgmt, and exporting historic data continues to be very slow. Because of that, the new state management is currently not really an option for our explorer.

@nisdas nisdas reopened this May 25, 2020
@peterbitfly
Contributor Author

peterbitfly commented May 26, 2020

Did a new resync today with the latest master and --enable-new-state-mgmt. Most RPC calls seem to perform well for historic data, except the ListValidatorBalances call:

time curl -s "localhost:9091/eth/v1alpha1/validators/balances?epoch=1550&pageSize=500&pageToken=1"
real	0m20.652s
user	0m0.011s
sys	0m0.004s

@peterbitfly
Contributor Author

We have now done a full test run of all the RPC APIs we use. The results are as follows:

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/beacon/chainhead"
real	0m0.021s
user	0m0.015s
sys	0m0.001s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/validators/queue"
real	0m0.028s
user	0m0.007s
sys	0m0.004s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/beacon/attestations/pool"
real	0m0.016s
user	0m0.011s
sys	0m0.001s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/validators/balances?epoch=1550&pageSize=500"
real	0m8.205s
user	0m0.005s
sys	0m0.006s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/validators?epoch=1550&pageSize=500"
real	0m0.041s
user	0m0.005s
sys	0m0.007s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/beacon/committees?epoch=1550&pageSize=500"
real	0m8.022s
user	0m0.009s
sys	0m0.005s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/beacon/blocks?epoch=1550&pageSize=500"
real	0m0.070s
user	0m0.013s
sys	0m0.001s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/validators/participation?epoch=1550&pageSize=500"
real	0m8.072s
user	0m0.005s
sys	0m0.010s

user@prysm-1:~$ time curl -s -o /dev/null "http://localhost:9091/eth/v1alpha1/validators/assignments?epoch=1550&pageSize=500"
real	0m8.081s
user	0m0.010s
sys	0m0.004s

From what we can see, validators/assignments, validators/participation, beacon/committees and validators/balances are slow.
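The manual curl runs above could be repeated with a small timing harness. The following is only a sketch: the base URL and endpoint paths are copied from the commands in this comment, while the helper names (`http_get`, `time_endpoints`, `slow_endpoints`) and the 1-second threshold are made up for illustration and are not part of Prysm.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, List

# Gateway address and paths as used in the curl commands above.
BASE_URL = "http://localhost:9091/eth/v1alpha1"
ENDPOINTS = [
    "beacon/chainhead",
    "validators/queue",
    "beacon/attestations/pool",
    "validators/balances?epoch=1550&pageSize=500",
    "validators?epoch=1550&pageSize=500",
    "beacon/committees?epoch=1550&pageSize=500",
    "beacon/blocks?epoch=1550&pageSize=500",
    "validators/participation?epoch=1550&pageSize=500",
    "validators/assignments?epoch=1550&pageSize=500",
]


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def time_endpoints(
    paths: Iterable[str],
    fetch: Callable[[str], bytes] = http_get,
    base_url: str = BASE_URL,
) -> Dict[str, float]:
    """Return wall-clock seconds per path (comparable to `real` in `time curl`)."""
    timings: Dict[str, float] = {}
    for path in paths:
        start = time.perf_counter()
        fetch(f"{base_url}/{path}")
        timings[path] = time.perf_counter() - start
    return timings


def slow_endpoints(timings: Dict[str, float], threshold_s: float = 1.0) -> List[str]:
    """List the paths whose request took at least threshold_s seconds (hypothetical cutoff)."""
    return sorted(path for path, secs in timings.items() if secs >= threshold_s)
```

Calling `slow_endpoints(time_endpoints(ENDPOINTS))` against a running node would list the calls that exceed the threshold. The `fetch` parameter exists so the timing logic can be exercised without a node.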

@terencechain
Member

@ppratscher this is expected. The default slots per archived point is 2048, which is not suited to beacon block explorer use cases. I recommend setting --slots-per-archived-point to 32.
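To illustrate why the interval matters, here is a rough sketch of the replay distance for a historic request. It rests on two assumptions that are not confirmed in this thread: 32 slots per epoch (mainnet-style config), and a simplified model in which archived states sit at exact multiples of the interval and a missing state is regenerated by replaying from the nearest archived point at or before the requested slot. The function names are illustrative, not Prysm code.

```python
SLOTS_PER_EPOCH = 32  # assumption: mainnet-style configuration


def epoch_start_slot(epoch: int) -> int:
    """First slot of the given epoch."""
    return epoch * SLOTS_PER_EPOCH


def slots_to_replay(slot: int, slots_per_archived_point: int) -> int:
    """Slots between the requested slot and the nearest archived point at or before it.

    Simplified model: archived points are assumed to sit at exact multiples
    of slots_per_archived_point.
    """
    if slots_per_archived_point <= 0:
        raise ValueError("slots_per_archived_point must be positive")
    return slot % slots_per_archived_point


def worst_case_replay(slots_per_archived_point: int) -> int:
    """Largest possible replay distance for a given interval in this model."""
    return slots_per_archived_point - 1


if __name__ == "__main__":
    slot = epoch_start_slot(1550)
    for interval in (2048, 128, 64, 32):
        print(
            f"interval={interval}: replay {slots_to_replay(slot, interval)} slots "
            f"(worst case {worst_case_replay(interval)})"
        )
```

Under these assumptions, epoch 1550 starts at slot 49600, which is 448 slots past the nearest 2048-slot archived point and exactly on a 32-slot one. The trade-off is that shorter intervals store more states on disk.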

@prestonvanloon
Member

@terencechain can you comment on why the archived point is 2048?
Which use case would benefit from that checkpoint threshold?
If 2048 slot checkpoints don't satisfy the needs of typical archival data access, what is the benefit of having these checkpoints on disk?

@terencechain
Member

terencechain commented Jun 1, 2020

@prestonvanloon
2048 was never meant to satisfy archival usage; it was only aimed at validator usage. It was a preliminary choice of default, following Lighthouse. The value was proposed to the team and many others, and no one objected to it:
https://lighthouse-book.sigmaprime.io/advanced_database.html

There are still optimizations and improvements to be made, but we definitely do not recommend 2048 for serving explorers or for archival usage. Even for Lighthouse, it takes 6s to retrieve a state:
[Image: Screen Shot 2020-06-01 at 12 40 03 PM]

Block explorers are advised to use shorter intervals such as 32, 64 or 128.

@prestonvanloon
Member

OK thanks for the insight, maybe we can capture this in the docs page for recommended state checkpoint intervals? What do you think?

@terencechain
Member

> OK thanks for the insight, maybe we can capture this in the docs page for recommended state checkpoint intervals? What do you think?

I agree with this. Will open an issue to track it.

@terencechain
Member

Closing this and will track documentation:
prysmaticlabs/documentation#147


4 participants