
Perf/compaction readahead #5875

Merged
merged 4 commits into master from perf/compaction-readahead on Jul 31, 2023
Conversation

asdacap
Contributor

@asdacap asdacap commented Jun 30, 2023

  • It seems that the 4 kB block size increases the read IOPS requirement during snap sync.
  • This specifies a readahead size of 32 kB for compaction, which reduces read IOPS during sync by about 4x at the expense of some RAM.
  • The graph alternates the baseline with each setting: before, 32 kB readahead, before, 16 kB readahead, before, 64 kB readahead, before.

(Screenshot from 2023-06-30 21-40-40: read IOPS graph for the comparison described above)

Changes

  • Add an option to specify compaction readahead (see the sketch after this list).
  • Specify a 32 kB readahead for the state DB.
  • Clean up DbConfig a little.
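
For illustration only, a minimal sketch of what setting a compaction readahead on RocksDB options can look like from C# via RocksDbSharp. The `SetCompactionReadaheadSize` binding (wrapping RocksDB's `compaction_readahead_size` option) and the `StateDbOptionsSketch`/`compactionReadaheadBytes` names are assumptions for this sketch, not the actual code added in this PR:

```csharp
using RocksDbSharp;

static class StateDbOptionsSketch
{
    // Hypothetical helper; the real Nethermind DbConfig wiring differs.
    public static DbOptions Build(ulong compactionReadaheadBytes = 32 * 1024)
    {
        DbOptions options = new DbOptions()
            .SetCreateIfMissing(true);

        if (compactionReadaheadBytes > 0)
        {
            // Assumed binding for rocksdb_options_set_compaction_readahead_size:
            // larger sequential reads during compaction trade a little RAM
            // for fewer read IOPS, which is what the graph above measures.
            options.SetCompactionReadaheadSize(compactionReadaheadBytes);
        }

        return options;
    }
}
```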

Types of changes

What types of changes does your code introduce?

  • Optimization
  • Refactoring

Testing

Requires testing

  • Yes
  • No

If yes, did you write tests?

  • Yes
  • No

Notes on testing

  • Impact measured via collectd.

Requires documentation update

  • Yes
  • No

Requires explanation in Release Notes

  • Maybe?

User impact: Reduces the IOPS requirement during snap sync.

@MarekM25
Copy link
Contributor

MarekM25 commented Jun 30, 2023

"at expense of some RAM"

We want to target 16 GB machines with the default configs. Is it worth doing by default? I think Nethermind should work well with 16 GB RAM, and the user should be able to do full pruning with 16 GB RAM. Is the expense negligible? Is it still worth merging the PR in this form?

@asdacap
Contributor Author

asdacap commented Jun 30, 2023

Ok, what do you have in mind? The reduction in IOPS is significant.

@MarekM25
Contributor

Ok, what do you have in mind? The reduction in IOPS is significant.

With the default configs we want Nethermind to work well with 16 GB RAM (including a CL on the machine). What's more, I want 16 GB users to be able to run full pruning with some memory budget. So I'm just asking whether this PR is a step in the right direction. It depends on how big the additional RAM expense is and whether it is worth merging in this form.

@asdacap
Contributor Author

asdacap commented Jun 30, 2023

Ok, I'll just default it back to 0.

@MarekM25
Contributor

Ok, I'll just default it back to 0.

An open question (not in the scope of this PR):
Should we have a Nethermind config that works better on 32 GB RAM machines and support 16 GB by default?

@kamilchodola
Contributor

We should have detailed metrics for a change like this, not from a graphs perspective but in terms of how it impacts users:

  1. Sync time (where it can be improved, how much it improved, what spec was used)
  2. Block processing: if RAM usage is increased, how will it work on 16 GB RAM, 32 GB RAM, etc.?
  3. Is there any change in DB size?
  4. Any improvement/degradation of JSON-RPC because of this (a bigger block may mean longer access time? Just guessing).

We need to fine-tune, and I believe we have always had a target of working well on a 16 GB setup and on ARM, so keep that in mind for sure.

@kamilchodola
Contributor

@MarekM25 @asdacap
I think we should start thinking about whether this shouldn't be tackled from the other side.
We can't make the "default" work best on every setup, because we have plenty of parameters which, if not changed for some specific setups, will break performance or, worse, prevent syncing at all.
We should think about a tool that auto-chooses values for just a few (at the beginning) basic parameters which we can easily manage.

It could even be a separate tool: let it select which chain you plan to run a node on, scan your VM/local PC, and "generate" a config like "mainnet_optimized.cfg", where we just pick mainnet.cfg and apply a few more params (like an adjusted MemoryHint, RocksDB optimization, allowing more threads for networking if the network bandwidth is good enough, selecting more peers instead of a max of 100, etc.).

Then just add it to the docs; if someone asks support how to optimize, send it to them, have them generate the config and copy it, and that is all.

@benaadams
Member

We want to target 16 GB machines with the default configs.

Could we do something like we do for threads and CPU count, where the default is picked based on available RAM, and then you can override it by hand to whatever setting?
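
A minimal sketch of that idea in .NET, assuming `GC.GetGCMemoryInfo().TotalAvailableMemoryBytes` is a good enough proxy for machine RAM; the helper name and the 32 GB threshold are invented for illustration and are not part of this PR:

```csharp
using System;

static class ReadaheadDefaultsSketch
{
    // All names and thresholds here are hypothetical.
    public static ulong PickCompactionReadahead(ulong? overrideBytes = null)
    {
        if (overrideBytes is not null)
            return overrideBytes.Value; // an explicit user setting always wins

        // TotalAvailableMemoryBytes also respects container/cgroup limits.
        long totalRam = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes;
        const long ThirtyTwoGiB = 32L * 1024 * 1024 * 1024;

        // Keep the 16 GB default conservative (disabled); enable the 32 kB
        // readahead only when the machine clearly has RAM to spare.
        return totalRam >= ThirtyTwoGiB ? 32UL * 1024 : 0UL;
    }
}
```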

@MarekM25
Contributor

@kamilchodola @benaadams I think we're all on the same page that it would be nice to do something :) Let's discuss it next week!

@LukaszRozmej
Member

LukaszRozmej commented Jul 3, 2023

The reason it took > 16 GB is because it could. Could you test these changes on a 16 GB machine (with a CL on it)? Maybe internal memory handling will still make it better.

@asdacap
Contributor Author

asdacap commented Jul 3, 2023

Ah, my cgroup setup is broken right now. Can't seem to limit memory.

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

Tested with a 16 GB cgroup. Does not sync reliably, unfortunately.

@LukaszRozmej
Member

Tested with a 16 GB cgroup. Does not sync reliably, unfortunately.

Can you elaborate? Does it not sync at all? Does performance vary?

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

A cgroup configured with 16 GB for memory.max and 15 GB for memory.high (it'll slow down a lot at 15 GB and crash at 16 GB). Lighthouse is included in the cgroup. At around 70% of snap sync the GC rate increases significantly and it tends to hang. It might continue, but a lot of peers drop because of the hangs, likely caused by GC. However, it's likely going to end up with DB corruption, which is a known issue with XFS under a low memory limit. So I can now reproduce that again... which is nice? This seems to be the case even when disabling --Sync.TuneDbMode HeavyWrite, which is what I usually run with.

@LukaszRozmej
Member

A cgroup configured with 16 GB for memory.max and 15 GB for memory.high (it'll slow down a lot at 15 GB and crash at 16 GB). Lighthouse is included in the cgroup. At around 70% of snap sync the GC rate increases significantly and it tends to hang. It might continue, but a lot of peers drop because of the hangs, likely caused by GC. However, it's likely going to end up with DB corruption, which is a known issue with XFS under a low memory limit. So I can now reproduce that again... which is nice? This seems to be the case even when disabling --Sync.TuneDbMode HeavyWrite, which is what I usually run with.

Can we try optimizing snap sync GC with either:
a) explicit GC?
b) object reuse? (or arenas, like DotNetty)
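
To make the two suggestions concrete, a hedged sketch of what each could look like in C#; the type and method names are invented for illustration and none of this is from the Nethermind code base:

```csharp
using System;
using System.Buffers;

static class SnapSyncGcSketch
{
    // (b) Buffer reuse: rent scratch space instead of allocating a fresh
    // byte[] per snap response, which is what drives the GC rate up.
    public static void ProcessResponse(ReadOnlySpan<byte> payload)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(payload.Length);
        try
        {
            payload.CopyTo(buffer);
            // ... decode/process using the rented buffer ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    // (a) Explicit, compacting GC at a known quiet point (e.g. after a sync
    // batch) rather than letting memory pressure build until the process hangs.
    public static void CollectAfterBatch() =>
        GC.Collect(2, GCCollectionMode.Forced, blocking: true, compacting: true);
}
```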

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

Seems like it can sync fine without Lighthouse in the cgroup. Lighthouse seems to take quite a significant chunk of RAM.

@asdacap
Contributor Author

asdacap commented Jul 4, 2023

Most of the heavy operations in snap sync are in PatriciaTrie/TrieStore, so trying to optimize snap sync for memory means trying to optimize PatriciaTrie/TrieStore.

@asdacap asdacap merged commit ec95054 into master Jul 31, 2023
61 checks passed
@asdacap asdacap deleted the perf/compaction-readahead branch July 31, 2023 10:35