Perf/compaction readahead #5875
Conversation
"At expense of some RAM": we want to target 16 GB machines with the default configs. Is it worth doing by default? I think Nethermind should work well with 16 GB of RAM, and the user should be able to do full pruning with 16 GB of RAM. Is the expense negligible? Is it still worth merging the PR in this form?
Ok, what do you have in mind? The reduction in IOPS is significant.
With the default configs we want Nethermind to work well with 16 GB of RAM (including the CL on the machine). What's more, I want 16 GB users to be able to run full pruning within some memory budget. So I'm just asking whether this PR is a step in the right direction. It depends on how big the additional RAM expense is and whether it is worth merging in this form.
Ok, I'll just default it back to 0.
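For context, the knob being discussed corresponds to RocksDB's `compaction_readahead_size` DB option (0 disables the extra readahead). In an OPTIONS-file it would look roughly like this; the 2 MB value is illustrative, not necessarily this PR's setting:

```
[DBOptions]
  compaction_readahead_size=2097152
```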
An open question (not in the scope of this PR):
We should have detailed metrics for changes like this one, not from a graphs perspective but in terms of how they impact users.
We need to fine-tune, and I believe we have always had a target to work well on a 16 GB setup and on ARMs, so keep it in mind for sure.
@MarekM25 @asdacap Make it even a separate tool: let it select which chain you plan to run a node on, scan your VM/local PC, and "generate" a config like "mainnet_optimized.cfg", where we would just pick mainnet.cfg and apply a few more params (like an adjusted MemoryHint, RocksDB optimization; if NetworkBandwidth is good enough, allow more threads for networking, select more peers instead of the max of 100, etc.). Then just add it to the docs; if someone asks on support how to optimize, send it and ask them to generate the config and copy it, and that is all.
Could we do something like threads and CPU count, where the default picks based on available RAM, and then you can override by hand to whatever setting?
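A minimal sketch of that idea, assuming a Linux host; the `MEMORY_HINT_OVERRIDE` variable, the 1/4 fraction, and the output name are illustrative, not real Nethermind settings:

```shell
#!/bin/sh
# Sketch: derive a default memory budget from total RAM, with a manual
# override. MEMORY_HINT_OVERRIDE and the quarter-of-RAM heuristic are
# illustrative, not actual Nethermind defaults.
if [ -n "$MEMORY_HINT_OVERRIDE" ]; then
    memory_hint="$MEMORY_HINT_OVERRIDE"
else
    # Total RAM in kB from /proc/meminfo (Linux only).
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    # Use roughly a quarter of RAM, leaving headroom for the CL client.
    memory_hint=$(( total_kb * 1024 / 4 ))
fi
echo "MemoryHint=$memory_hint bytes"
```

The same shape works for any knob: a heuristic default derived from the machine, with an explicit user setting always winning.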
@kamilchodola @benaadams I think we're all on the same page that it would be nice to do something :) Let's discuss it next week!
The reason it took > 16 GB is because it could. Could you test these changes on a 16 GB machine (with the CL on it)? Maybe internal memory handling will still make it better.
Ah.. my cgroup setup is broken right now. Can't seem to limit memory..
Tested with a 16 GB cgroup. Does not sync reliably, unfortunately.
Can you elaborate? Does it not sync at all? Is performance varying?
A cgroup configured with 16 GB for memory.max and 15 GB for memory.high (it slows down a lot at 15 GB and crashes at 16 GB), with Lighthouse included in the cgroup. Around 70% of snap sync the GC rate increases significantly and it tends to hang. It might continue, but a lot of peers dropped due to the hangs, likely caused by GC. However, it is likely to end up with DB corruption, which is a known issue with XFS under a low memory limit. So I can now reproduce that again... which is nice? Seems to be the case even when disabling
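For reproducibility, the cgroup setup described here can be expressed roughly as follows (cgroup v2). The group name "elcl" is illustrative, and since the real writes need root, this sketch only prints the commands:

```shell
#!/bin/sh
# Sketch of the cgroup v2 limits described above: hard cap at 16 GB
# (memory.max), throttling from 15 GB (memory.high), with both the EL
# and the CL placed in the same group. The group name "elcl" and the
# <pid> placeholders are illustrative. Commands are printed rather than
# executed so this runs without root; change run() to `"$@"` to apply.
CG=/sys/fs/cgroup/elcl
run() { echo "+ $*"; }

run "mkdir -p $CG"
run "echo $((16 * 1024 * 1024 * 1024)) > $CG/memory.max"
run "echo $((15 * 1024 * 1024 * 1024)) > $CG/memory.high"
# Start Nethermind and Lighthouse, then move their PIDs into the group:
run "echo <nethermind-pid> > $CG/cgroup.procs"
run "echo <lighthouse-pid> > $CG/cgroup.procs"
```

With memory.high below memory.max, the kernel reclaims aggressively (the "slows down a lot" phase) before the hard limit triggers the OOM killer.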
Can we try optimizing snap sync GC with either:
Seems like it can sync fine without Lighthouse in the cgroup. Lighthouse seems to take quite a significant chunk of RAM.
Most of the heavy work in snap sync is in PatriciaTrie/TrieStore, so optimizing snap sync for memory means optimizing PatriciaTrie/TrieStore.
Changes
Types of changes
What types of changes does your code introduce?
Testing
Requires testing
If yes, did you write tests?
Notes on testing
Requires documentation update
Requires explanation in Release Notes
User impact: Reduces the IOPS requirement during snap sync.