Parallelize loading from S3 for better performance #4
Conversation
This PR improves loading speed (near/near-lake-framework-js#4). It also allows building the code when installing the npm package from source.
@vgrichina Thanks for looking into it! Let's address the comments and merge it
src/s3fetchers.ts
Outdated
@@ -21,12 +21,13 @@ import { normalizeBlockHeight, parseBody } from "./utils";
 export async function listBlocks(
     client: S3Client,
     bucketName: string,
-    startAfter: BlockHeight
+    startAfter: BlockHeight,
+    limit = 10
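For context, the new `limit` parameter presumably caps how many block folders one S3 listing call returns. The sketch below illustrates the idea with a stand-in client interface; the `S3Like` type, the `FakeS3Client`-style mock, and the zero-padded key layout are illustrative assumptions, not the project's actual code (the real version uses `S3Client` and `ListObjectsV2Command` from `@aws-sdk/client-s3`):

```typescript
// Minimal stand-in types so the sketch is self-contained.
type BlockHeight = number;

interface ListRequest {
  Bucket: string;
  MaxKeys: number; // the `limit` parameter maps onto MaxKeys
  StartAfter: string;
  Delimiter: string;
}

interface S3Like {
  listObjects(
    req: ListRequest
  ): Promise<{ CommonPrefixes?: { Prefix: string }[] }>;
}

// Assumption: block heights are stored as zero-padded folder prefixes.
const normalizeBlockHeight = (h: BlockHeight) =>
  h.toString().padStart(12, "0");

export async function listBlocks(
  client: S3Like,
  bucketName: string,
  startAfter: BlockHeight,
  limit = 10
): Promise<BlockHeight[]> {
  const { CommonPrefixes = [] } = await client.listObjects({
    Bucket: bucketName,
    MaxKeys: limit, // fetch at most `limit` block folders per call
    StartAfter: normalizeBlockHeight(startAfter),
    Delimiter: "/",
  });
  // Parse the numeric height back out of each zero-padded prefix.
  return CommonPrefixes.map((p) => parseInt(p.Prefix));
}
```

A larger `limit` means fewer round-trips per listed block, which is what the 10-vs-200 discussion below is about.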
Let's default to 200, since it makes fewer requests and boosts the throughput quite a bit.
@khorolets Let's make this parameter configurable on the Rust side as well and update the default to 200 (somehow we used 100 there, but chose 10 in the JS version).
Looks like 200 might be a bit suboptimal when dealing with meatier blocks; users may need to experiment on their side to tune it. E.g., at block #46661963 it seems that 100 works better.
Maybe it just needs another change to avoid blocking until all of them are loaded.
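One way to avoid blocking on a whole batch is a bounded worker pool that keeps a fixed number of downloads in flight and starts a new one as soon as any slot frees up. This is a sketch of that alternative, not the PR's code; `loadBlock` is a hypothetical placeholder for the real S3 fetch:

```typescript
// Sketch: keep up to `concurrency` block downloads in flight at once.
// A new download starts as soon as any worker finishes its current one,
// so one slow block no longer stalls an entire batch.
export async function loadAllBlocks<T>(
  heights: number[],
  loadBlock: (height: number) => Promise<T>,
  concurrency = 10
): Promise<T[]> {
  const results: T[] = new Array(heights.length);
  let next = 0; // index of the next height to schedule

  // Each worker repeatedly claims the next unclaimed height.
  // JS is single-threaded, so reading and incrementing `next` with no
  // intervening await is safe.
  async function worker(): Promise<void> {
    while (next < heights.length) {
      const i = next++;
      results[i] = await loadBlock(heights[i]);
    }
  }

  const workers = Array.from(
    { length: Math.min(concurrency, heights.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

With this shape, the `limit`/batch-size debate above becomes a question of how many downloads to keep in flight, rather than how long to block between batches.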
Many thanks!
P.S.: I've addressed the review suggestion from @frol
Note that this is a relatively naive method which unnecessarily waits for one batch to complete before starting to load the next batch.
However, it still improves performance significantly: from about 2 blocks/second to 12+ blocks/second on my machine.
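The batched approach described above can be sketched roughly as follows (this is an illustration of the pattern, not the PR's literal code, and `loadBlock` is a hypothetical stand-in for the real S3 fetch):

```typescript
// Sketch of the batched approach: fetch `batchSize` blocks in parallel
// with Promise.all, but wait for the whole batch before starting the
// next one -- the "naive" behavior noted above.
export async function loadBlocksInBatches<T>(
  heights: number[],
  loadBlock: (height: number) => Promise<T>,
  batchSize = 10
): Promise<T[]> {
  const results: T[] = [];
  for (let i = 0; i < heights.length; i += batchSize) {
    const batch = heights.slice(i, i + batchSize);
    // All blocks within the batch are fetched concurrently...
    const loaded = await Promise.all(batch.map(loadBlock));
    // ...but the next batch only starts after the slowest one finishes.
    results.push(...loaded);
  }
  return results;
}
```

Even with that per-batch stall, running `batchSize` fetches concurrently instead of one at a time is enough to explain the roughly 6x speedup reported above.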
When measured on the Hetzner box I use: