-
Notifications
You must be signed in to change notification settings - Fork 743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Slow performance of validator client PATCH API with hundreds of keys #3968
Comments
Summarising the other comments from that issue:
A few observations from me: The main work happens in lighthouse/validator_client/src/http_api/mod.rs Lines 638 to 643 in 38514c0
The main culprits which could be slow are
And writing the updated definitions to disk here: lighthouse/validator_client/src/initialized_validators.rs Lines 728 to 730 in 38514c0
Due to the increased CPU (rather than I/O or memory) I suspect Debug logs from your VC would also help @ricardolyn. They'd allow us to quickly check which branches are being taken, e.g. do we hit the efficient case here?
It's possible that @paulhauner has already done some optimisations of these routines for the validator manager (#3502) and we could merge that to improve performance. |
yes, we are.
I believe yes. However, Here are the logs when the validators were updated by the cronjob (the one that runs each 2 minutes) and the service being stopped (
it seems that while is doing that code branch on each PATCH request (aka |
As you may know, Teku have implemented a standard way of loading this information in bulk from a file or a remote URL, using the same JSON schema. The schema used is defined here: https://docs.teku.consensys.net/HowTo/Configure/Proposer-Configuration. In addition, the MEV-Boost project is also planning to use this approach to load relays per validator from a file or remote URL, as outlined in this Pull Request: flashbots/mev-boost#456. Currently, Lighthouse requires performing an HTTP request for each validator that needs updating, which may not scale well. Additionally, there is no GET endpoint to validate if the builder_proposals is enabled or not per validator. This setup complexity could be reduced by adopting the Teku and MEV Boost approach, which would allow Lighthouse to optimize the logic of handling bulk updates more efficiently and optimise the solution for the slow performance raised on this Github issue. Therefore, I recommend that Lighthouse consider adopting this standard approach for syncing fee recipient and builder proposal configuration. @michaelsproul could LH team consider this improvement? |
@paulhauner would make sense to implement something like in the comment above for the Validator Manager? thanks |
@paulhauner @michaelsproul - I think we at Consensys Staking have the capacity to fix this problem ourselves. Are you and your team open to us doing that? |
@beetrootkid That would be great! My preference would be to not add any more types of config file. So either improving the perf of the HTTP API, or adding re-loading for the existing validator definitions YAML |
…ary (#4126) ## Title Optimise `update_validators` by decrypting key cache only when necessary ## Issue Addressed Resolves [#3968: Slow performance of validator client PATCH API with hundreds of keys](#3968) ## Proposed Changes 1. Add a check to determine if there is at least one local definition before decrypting the key cache. 2. Assign an empty `KeyCache` when all definitions are of the `Web3Signer` type. 3. Perform cache-related operations (e.g., saving the modified key cache) only if there are local definitions. ## Additional Info This PR addresses the excessive CPU usage and slow performance experienced when using the `PATCH lighthouse/validators/{pubkey}` request with a large number of keys. The issue was caused by the key cache using cryptography to decipher and cipher the cache entities every time the request was made. This operation called `scrypt`, which was very slow and required a lot of memory when there were many concurrent requests. These changes have no impact on the overall functionality but can lead to significant performance improvements when working with remote signers. Importantly, the key cache is never used when there are only `Web3Signer` definitions, avoiding the expensive operation of decrypting the key cache in such cases. Co-authored-by: Maksim Shcherbo <max.shcherbo@consensys.net>
Closed by #4126! Please open a new issue if there are performance issues after the most recent change |
…ary (sigp#4126) ## Title Optimise `update_validators` by decrypting key cache only when necessary ## Issue Addressed Resolves [sigp#3968: Slow performance of validator client PATCH API with hundreds of keys](sigp#3968) ## Proposed Changes 1. Add a check to determine if there is at least one local definition before decrypting the key cache. 2. Assign an empty `KeyCache` when all definitions are of the `Web3Signer` type. 3. Perform cache-related operations (e.g., saving the modified key cache) only if there are local definitions. ## Additional Info This PR addresses the excessive CPU usage and slow performance experienced when using the `PATCH lighthouse/validators/{pubkey}` request with a large number of keys. The issue was caused by the key cache using cryptography to decipher and cipher the cache entities every time the request was made. This operation called `scrypt`, which was very slow and required a lot of memory when there were many concurrent requests. These changes have no impact on the overall functionality but can lead to significant performance improvements when working with remote signers. Importantly, the key cache is never used when there are only `Web3Signer` definitions, avoiding the expensive operation of decrypting the key cache in such cases. Co-authored-by: Maksim Shcherbo <max.shcherbo@consensys.net>
…ary (sigp#4126) ## Title Optimise `update_validators` by decrypting key cache only when necessary ## Issue Addressed Resolves [sigp#3968: Slow performance of validator client PATCH API with hundreds of keys](sigp#3968) ## Proposed Changes 1. Add a check to determine if there is at least one local definition before decrypting the key cache. 2. Assign an empty `KeyCache` when all definitions are of the `Web3Signer` type. 3. Perform cache-related operations (e.g., saving the modified key cache) only if there are local definitions. ## Additional Info This PR addresses the excessive CPU usage and slow performance experienced when using the `PATCH lighthouse/validators/{pubkey}` request with a large number of keys. The issue was caused by the key cache using cryptography to decipher and cipher the cache entities every time the request was made. This operation called `scrypt`, which was very slow and required a lot of memory when there were many concurrent requests. These changes have no impact on the overall functionality but can lead to significant performance improvements when working with remote signers. Importantly, the key cache is never used when there are only `Web3Signer` definitions, avoiding the expensive operation of decrypting the key cache in such cases. Co-authored-by: Maksim Shcherbo <max.shcherbo@consensys.net>
Originally posted by @ricardolyn in #3795 (comment)
The text was updated successfully, but these errors were encountered: