
Parallelize loading from S3 for better performance #4

Merged (6 commits) on May 17, 2022

Conversation

@vgrichina (Contributor) commented on May 13, 2022:

Note that this is a relatively naive method which unnecessarily waits for one batch to complete before starting to load the next batch.

However, it still seems to improve performance significantly: from about 2 blocks/second to 12+ blocks/second on my machine.

When measured on the Hetzner box I use (a sketch of this batching pattern follows the numbers below):

  • Batch size 1 ≈ 5 blocks/second
  • Batch size 10 ≈ 36 blocks/second
  • Batch size 20 ≈ 66 blocks/second
  • Batch size 50 ≈ 145 blocks/second
  • Batch size 100 ≈ 195 blocks/second
  • Batch size 200 ≈ 234 blocks/second
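
A minimal sketch of the naive batching pattern described above (not this PR's actual code); `listHeights` and `loadBlock` are placeholder callbacks standing in for the listing and fetching code:

```ts
// Naive batch parallelism: load `batchSize` blocks concurrently, but wait for the
// whole batch to finish before listing and starting the next one. Throughput grows
// roughly with the batch size, at the cost of idle time between batches.
export async function* streamBlocksBatched<T>(
    listHeights: (startAfter: number, limit: number) => Promise<number[]>, // e.g. a listBlocks call against S3
    loadBlock: (height: number) => Promise<T>,                             // fetches one block's data
    startBlockHeight: number,
    batchSize = 10
): AsyncGenerator<T> {
    let startAfter = startBlockHeight;
    while (true) {
        const heights = await listHeights(startAfter, batchSize);
        if (heights.length === 0) break; // caught up; a real streamer would poll for new blocks
        // Start the whole batch at once, then block until every load in it completes.
        const blocks = await Promise.all(heights.map(loadBlock));
        yield* blocks;
        startAfter = heights[heights.length - 1];
    }
}
```

Because each batch is a single `Promise.all`, one slow object delays the next listing call; that is the limitation the numbers above measure and the review comments below discuss.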

@vgrichina requested a review from khorolets on May 13, 2022 at 01:08
vgrichina added a commit to fastnear/fast-near that referenced this pull request on May 13, 2022: "It allows to build code when installing npm package from source"
@frol (Contributor) left a comment:

@vgrichina Thanks for looking into it! Let's address the comments and merge it

src/types.ts (review thread, outdated, resolved)
@@ -21,12 +21,13 @@ import { normalizeBlockHeight, parseBody } from "./utils";
 export async function listBlocks(
     client: S3Client,
     bucketName: string,
-    startAfter: BlockHeight
+    startAfter: BlockHeight,
+    limit = 10
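
For reference, a minimal sketch of what a `listBlocks` with this signature can look like, assuming the bucket keeps one common prefix per zero-padded block height and is requester-pays; the 12-digit padding and the `RequestPayer` setting are assumptions here, not taken from this diff:

```ts
import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

type BlockHeight = number;

// Assumed key layout: "000046661963/block.json", "000046661963/shard_0.json", ...
const normalizeBlockHeight = (height: BlockHeight) => height.toString().padStart(12, "0");

export async function listBlocks(
    client: S3Client,
    bucketName: string,
    startAfter: BlockHeight,
    limit = 10
): Promise<BlockHeight[]> {
    const response = await client.send(new ListObjectsV2Command({
        Bucket: bucketName,
        MaxKeys: limit,            // one listing request now covers a whole batch of heights
        Delimiter: "/",            // collapse each block's files into one common prefix
        StartAfter: normalizeBlockHeight(startAfter),
        RequestPayer: "requester", // assumption: the data bucket is requester-pays
    }));
    return (response.CommonPrefixes || [])
        .map(({ Prefix }) => parseInt((Prefix || "").split("/")[0], 10))
        .filter((height) => !Number.isNaN(height));
}
```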
Contributor:
Let's default to 200 since it will make fewer requests and boost the throughput quite a bit.

@khorolets Let's make this parameter configurable on the Rust side as well and update the default to 200 (somehow we used 100 there, but chose to use 10 in the JS version).

@vgrichina (Author):

Looks like 200 might be a bit suboptimal when dealing with meatier blocks, so it might need experiments on the user's side to tune – e.g. at block #46661963 it seems that 100 works better.

Maybe it just needs another change to avoid blocking until all of them are loaded (see the sketch below).
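
A minimal sketch of one possible follow-up along those lines (an assumption, not code from this PR): keep up to `concurrency` loads in flight and yield the oldest block as soon as it is ready, while later loads keep running. `listHeights` and `loadBlock` are the same placeholder callbacks as in the earlier sketch:

```ts
export async function* streamBlocksPipelined<T>(
    listHeights: (startAfter: number, limit: number) => Promise<number[]>,
    loadBlock: (height: number) => Promise<T>,
    startBlockHeight: number,
    concurrency = 100
): AsyncGenerator<T> {
    let startAfter = startBlockHeight;
    let pendingHeights: number[] = [];
    const inFlight: Promise<T>[] = [];

    while (true) {
        // Top up the window: list more heights when needed and start loads eagerly,
        // without waiting for previously started loads to finish.
        while (inFlight.length < concurrency) {
            if (pendingHeights.length === 0) {
                pendingHeights = await listHeights(startAfter, concurrency);
                if (pendingHeights.length === 0) break; // caught up; a real streamer would poll here
                startAfter = pendingHeights[pendingHeights.length - 1];
            }
            inFlight.push(loadBlock(pendingHeights.shift()!));
        }
        if (inFlight.length === 0) break;
        // Blocks are still yielded in height order, but only the oldest outstanding load
        // is awaited; everything else in the window keeps downloading in the background.
        yield await inFlight.shift()!;
    }
}
```

With a window instead of a per-batch barrier, one slow object only delays the blocks queued behind it rather than the start of the next whole batch.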

src/streamer.ts (review thread, resolved)
@khorolets (Member) left a comment:

Many thanks!

P.S.: I've addressed the review suggestion from @frol
