New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make ETL file size configurable #6927
Make ETL file size configurable #6927
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, pending @joshieDo and conflicts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great! HeaderStage
also uses ETL, so it would require it as well.
Personally, I'd rather have a config shared by all stages, rather than per stage as it seems. wdyt @shekhirin
that makes sense, I agree. ETL doesn't depend on the stage workload type, and the config only depends on the available resources. |
Changes from @shekhirin Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
@joshieDo Will do, but note that the current default size for these temporary folders is 100Mb so my change will increase it from 200 Mb total to 1Gi total. reth/crates/stages/src/stages/headers.rs Line 84 in b66f54d
|
…hared by all stages.
Made a refactoring, and created a dedicated config |
crates/stages/src/stages/headers.rs
Outdated
hash_collector: Collector::new(tempdir.clone(), etl_file_size), | ||
header_collector: Collector::new(tempdir, etl_file_size), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshieDo what do you think if we do etl_file_size / 2
here, so the ETL file size from config means how much max disk space to allocate for ETL files per stage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only documentation nits, LGTM otherwise!
book/run/config.md
Outdated
|
||
```toml | ||
[stages.etl] | ||
# The size of temporary file in bytes for ETL data collector. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we do / 2
now, it's about the total size of ETL files that the node can allocate, and not just one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's about the total size of ETL files
it's not though, it's the threshold size before creating a new ETL file, important so memory isnt blown
Okay, I tried to rebase and hell broke lose :) |
high prio, so i'll be taking over the rebase and push it. thanks! |
thank you @joshieDo ! |
Thanks for taking this on @SozinM !! |
closes #6696