
Perf/compaction readahead #5875

Merged
merged 4 commits into master from perf/compaction-readahead on Jul 31, 2023
Conversation

asdacap
Contributor

@asdacap asdacap commented Jun 30, 2023

  • It seems that the 4 kB block size increases the read IOPS requirement during snap sync.
  • This specifies a readahead size of 32 kB for compaction, which reduces read IOPS during sync by about 4x at the expense of some RAM.
  • The graph alternates the baseline with each setting: before, 32 kB readahead, before, 16 kB readahead, before, 64 kB readahead, before.

(Screenshot from 2023-06-30 21-40-40: read IOPS graph for the comparison described above)

Changes

  • Add an option to specify compaction readahead (see the sketch after this list).
  • Specify a 32 kB readahead for the state DB.
  • Clean up DbConfig a little.
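
For illustration only, a minimal sketch of what setting a compaction readahead on RocksDB options can look like from C# via RocksDbSharp. The `SetCompactionReadaheadSize` binding (wrapping RocksDB's `compaction_readahead_size` option) and the `StateDbOptionsSketch`/`compactionReadaheadBytes` names are assumptions for this sketch, not the actual code added in this PR:

```csharp
using RocksDbSharp;

static class StateDbOptionsSketch
{
    // Hypothetical helper; the real Nethermind DbConfig wiring differs.
    public static DbOptions Build(ulong compactionReadaheadBytes = 32 * 1024)
    {
        DbOptions options = new DbOptions()
            .SetCreateIfMissing(true);

        if (compactionReadaheadBytes > 0)
        {
            // Assumed binding for rocksdb_options_set_compaction_readahead_size:
            // larger sequential reads during compaction trade a little RAM
            // for fewer read IOPS, which is what the graph above measures.
            options.SetCompactionReadaheadSize(compactionReadaheadBytes);
        }

        return options;
    }
}
```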

Types of changes

What types of changes does your code introduce?

  • Optimization
  • Refactoring

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

  • Impact measured via collectd.

Requires documentation update

  • Yes
  • No

Requires explanation in Release Notes

  • Maybe?

User impact: Reduces the IOPS requirement during snap sync.

@MarekM25
Copy link
Contributor

MarekM25 commented Jun 30, 2023

"at expense of some RAM"

We want to target 16 GB machines with the default configs. Is it worth doing by default? I think Nethermind should work well with 16 GB RAM, and the user should be able to do full pruning with 16 GB RAM. Is the expense negligible? Is it still worth merging the PR in this form?

@asdacap
Contributor Author

asdacap commented Jun 30, 2023

Ok, what do you have in mind? The reduction in IOPS is significant.

@MarekM25
Contributor

Ok, what do you have in mind? The reduction in IOPS is significant.

With the default configs we want Nethermind to work well with 16 GB RAM (including a CL on the machine). What's more, I want 16 GB users to be able to run full pruning with some memory budget. So I'm just asking whether this PR is a step in the right direction. It depends on how big the additional RAM expense is and whether it is worth merging in this form.

@asdacap
Contributor Author

asdacap commented Jun 30, 2023

Ok, I'll just default it back to 0.

@MarekM25
Contributor

Ok, I'll just default it back to 0.

An open question (not in the scope of this PR):
Should we have a Nethermind config that works better on 32 GB RAM machines and support 16 GB by default?

@kamilchodola
Contributor

We should have detailed metrics for a change like this, not from a graphs perspective but in terms of how it impacts users:

  1. Sync time (where it can be improved, how much it improved, what spec was used)
  2. Block processing: if RAM usage is increased, how will it work on 16 GB RAM, 32 GB RAM, etc.?
  3. Is there any change in DB size?
  4. Any improvement/degradation of JSON-RPC because of this (a bigger block may mean longer access time? Just guessing).

We need to fine-tune, and I believe we have always had a target of working well on a 16 GB setup and on ARM, so keep that in mind for sure.

@kamilchodola
Contributor

@MarekM25 @asdacap
I think we should start thinking about whether this shouldn't be tackled from the other side.
We can't make the "default" work best on every setup, because we have plenty of parameters which, if not changed for some specific setups, will break performance or, worse, prevent syncing at all.
We should think about a tool that auto-chooses values for just a few (at the beginning) basic parameters which we can easily manage.

It could even be a separate tool: let it select which chain you plan to run a node on, scan your VM/local PC, and "generate" a config like "mainnet_optimized.cfg", where we just pick mainnet.cfg and apply a few more params (like an adjusted MemoryHint, RocksDB optimization, allowing more threads for networking if the network bandwidth is good enough, selecting more peers instead of a max of 100, etc.).

Then just add it to the docs; if someone asks support how to optimize, send it to them, have them generate the config and copy it, and that is all.

@benaadams
Member

We want to target 16 GB machines with the default configs.

Could we do something like we do for threads and CPU count, where the default is picked based on available RAM, and then you can override it by hand to whatever setting?
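
A minimal sketch of that idea in .NET, assuming `GC.GetGCMemoryInfo().TotalAvailableMemoryBytes` is a good enough proxy for machine RAM; the helper name and the 32 GB threshold are invented for illustration and are not part of this PR:

```csharp
using System;

static class ReadaheadDefaultsSketch
{
    // All names and thresholds here are hypothetical.
    public static ulong PickCompactionReadahead(ulong? overrideBytes = null)
    {
        if (overrideBytes is not null)
            return overrideBytes.Value; // an explicit user setting always wins

        // TotalAvailableMemoryBytes also respects container/cgroup limits.
        long totalRam = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes;
        const long ThirtyTwoGiB = 32L * 1024 * 1024 * 1024;

        // Keep the 16 GB default conservative (disabled); enable the 32 kB
        // readahead only when the machine clearly has RAM to spare.
        return totalRam >= ThirtyTwoGiB ? 32UL * 1024 : 0UL;
    }
}
```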

@MarekM25
Contributor

@kamilchodola @benaadams I think we're all on the same page that it would be nice to do something :) Let's discuss it next week!

@LukaszRozmej
Member

LukaszRozmej commented Jul 3, 2023

The reason it took > 16 GB is because it could. Could you test these changes on a 16 GB machine (with a CL on it)? Maybe internal memory handling will still make it better.

@asdacap
Contributor Author

asdacap commented Jul 3, 2023

Ah, my cgroup setup is broken right now. Can't seem to limit memory.

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

Tested with a 16 GB cgroup. Does not sync reliably, unfortunately.

@LukaszRozmej
Member

Tested with a 16 GB cgroup. Does not sync reliably, unfortunately.

Can you elaborate? Does it not sync at all? Does performance vary?

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

A cgroup configured with 16 GB for memory.max and 15 GB for memory.high (it'll slow down a lot at 15 GB and crash at 16 GB). Lighthouse is included in the cgroup. At around 70% of snap sync the GC rate increases significantly and it tends to hang. It might continue, but a lot of peers drop because of the hangs, likely caused by GC. However, it's likely going to end up with DB corruption, which is a known issue with XFS under a low memory limit. So I can now reproduce that again... which is nice? This seems to be the case even when disabling --Sync.TuneDbMode HeavyWrite, which is what I usually run with.

@LukaszRozmej
Member

A cgroup configured with 16 GB for memory.max and 15 GB for memory.high (it'll slow down a lot at 15 GB and crash at 16 GB). Lighthouse is included in the cgroup. At around 70% of snap sync the GC rate increases significantly and it tends to hang. It might continue, but a lot of peers drop because of the hangs, likely caused by GC. However, it's likely going to end up with DB corruption, which is a known issue with XFS under a low memory limit. So I can now reproduce that again... which is nice? This seems to be the case even when disabling --Sync.TuneDbMode HeavyWrite, which is what I usually run with.

Can we try optimizing snap sync GC with either:
a) explicit GC?
b) object reuse? (or arenas, like DotNetty)
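
To make the two suggestions concrete, a hedged sketch of what each could look like in C#; the type and method names are invented for illustration and none of this is from the Nethermind code base:

```csharp
using System;
using System.Buffers;

static class SnapSyncGcSketch
{
    // (b) Buffer reuse: rent scratch space instead of allocating a fresh
    // byte[] per snap response, which is what drives the GC rate up.
    public static void ProcessResponse(ReadOnlySpan<byte> payload)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(payload.Length);
        try
        {
            payload.CopyTo(buffer);
            // ... decode/process using the rented buffer ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    // (a) Explicit, compacting GC at a known quiet point (e.g. after a sync
    // batch) rather than letting memory pressure build until the process hangs.
    public static void CollectAfterBatch() =>
        GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true);
}
```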

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

Seems like it can sync fine without Lighthouse in the cgroup. Lighthouse seems to take quite a significant chunk of RAM.

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

Most of the heavy operations in snap sync are in PatriciaTrie/TrieStore, so trying to optimize snap sync for memory means trying to optimize PatriciaTrie/TrieStore.

@asdacap asdacap merged commit ec95054 into master Jul 31, 2023
61 checks passed
@asdacap asdacap deleted the perf/compaction-readahead branch July 31, 2023 10:35