Profile the performance of in memory trie for shard 2 #10877

Open
Tracked by #46
bowenwang1996 opened this issue Mar 25, 2024 · 10 comments
Labels
A-benchmark Area: performance benchmarks A-stateless-validation Area: stateless validation

Comments

@bowenwang1996
Collaborator

Previously, we concluded that with optimizations in runtime, the bottleneck of apply is mostly storage operations, which in memory trie should help a lot with. It would be good to understand how much we gain by enabling in memory trie for a shard and what the remaining performance bottlenecks are.

@tayfunelmas
Contributor

Performed an initial set of measurements of apply_chunk latency. See the related dashboard here.
Enabled memtrie for shards 2 and 3; these spans indicate the time during which memtrie was enabled for these shards:
Screenshot

Observed a 3x-10x reduction (will repeat over a longer time window and with more shards later).

I also observed long load times for memtrie:

Done loading trie from flat state, took 55.47568243s shard_uid=s3.v3
Done loading trie from flat state, took 207.43764391s shard_uid=s2.v3

Will look at what is going on there.

@bowenwang1996
Collaborator Author

I also observed long load times for memtrie:

cc @robin-near

@tayfunelmas tayfunelmas added the A-stateless-validation Area: stateless validation label Apr 1, 2024
@tayfunelmas
Contributor

tayfunelmas commented Apr 2, 2024

Measured the different phases of constructing the in-memory trie. It looks like most of the time is spent reading the flat state values from the database (not constructing the trie in memory).
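To attribute load time to phases like this, each phase can be wrapped in its own timer. A minimal self-contained Rust sketch, not nearcore code: `read_flat_state` and `construct_trie` are hypothetical stand-ins for the real RocksDB iteration and memtrie construction.

```rust
use std::collections::BTreeMap;
use std::time::Instant;

// Hypothetical stand-ins: `db` plays the role of the flat-state column in
// RocksDB, and "construction" just collects entries into a map.
fn read_flat_state(db: &BTreeMap<Vec<u8>, Vec<u8>>) -> Vec<(Vec<u8>, Vec<u8>)> {
    db.iter().map(|(k, v)| (k.clone(), v.clone())).collect()
}

fn construct_trie(entries: &[(Vec<u8>, Vec<u8>)]) -> BTreeMap<Vec<u8>, Vec<u8>> {
    entries.iter().cloned().collect()
}

fn main() {
    let db: BTreeMap<Vec<u8>, Vec<u8>> = (0u32..100_000)
        .map(|i| (i.to_be_bytes().to_vec(), vec![0u8; 32]))
        .collect();

    // Phase 1: read values out of the "database".
    let t0 = Instant::now();
    let entries = read_flat_state(&db);
    let read_time = t0.elapsed();

    // Phase 2: build the in-memory structure.
    let t1 = Instant::now();
    let trie = construct_trie(&entries);
    let build_time = t1.elapsed();

    println!("read: {:?}, build: {:?}", read_time, build_time);
    assert_eq!(trie.len(), db.len());
}
```

Comparing the two elapsed times directly is what lets one claim "most of the time is in the read phase" rather than eyeballing aggregate spans.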

@walnut-the-cat
Contributor

@tayfunelmas, does the long trie loading (from flat storage) and construction time impact performance? I was assuming this would be done behind the scenes before the next epoch.

@tayfunelmas
Contributor

tayfunelmas commented Apr 4, 2024

It is called in two cases:

  1. On startup (e.g. after a restart of the node). (code) This load blocks the overall startup (the state sync/catchup happens after it).
  2. After catchup of a new shard. (code) This is done as part of catching up for the next epoch, so latency in this case is less of a problem than in the former case.

Besides, I found out that the latency is not only coming from iterating over the flat state in RocksDB (example profiling view). It also comes from the construction and hash computation of the memtrie: we encode/decode (serialize/deserialize) nodes between constructing the trie and computing hashes.
What we need to investigate is whether the mechanism causing the slowdown also contributes to latency after memtrie construction. Currently I am trying to find a way to speed up memtrie loading "without" changing this mechanism (which requires understanding why it was designed this way in the first place).
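The encode/decode round-trip between construction and hashing can be illustrated with a toy model. Everything here is hypothetical: `Node`, `serialize`/`deserialize`, and a non-cryptographic hash stand in for the real node format and hash; the point is only that hashing the in-memory node directly gives the same result without paying for the round-trip.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy trie node; the real memtrie nodes are more elaborate.
#[derive(Clone, Debug, Hash, PartialEq)]
struct Node {
    key: Vec<u8>,
    value: Vec<u8>,
}

// Toy length-prefixed encoding, standing in for the real node serialization.
fn serialize(node: &Node) -> Vec<u8> {
    let mut out = (node.key.len() as u32).to_le_bytes().to_vec();
    out.extend_from_slice(&node.key);
    out.extend_from_slice(&node.value);
    out
}

fn deserialize(bytes: &[u8]) -> Node {
    let key_len = u32::from_le_bytes(bytes[0..4].try_into().unwrap()) as usize;
    Node {
        key: bytes[4..4 + key_len].to_vec(),
        value: bytes[4 + key_len..].to_vec(),
    }
}

// Non-cryptographic toy hash; the real memtrie computes cryptographic hashes.
fn hash_node(node: &Node) -> u64 {
    let mut h = DefaultHasher::new();
    node.hash(&mut h);
    h.finish()
}

// What the loader effectively pays today: encode, decode, then hash.
fn hash_via_roundtrip(node: &Node) -> u64 {
    hash_node(&deserialize(&serialize(node)))
}

fn main() {
    let node = Node { key: b"shard2".to_vec(), value: b"state".to_vec() };
    // Same hash either way; the round-trip only adds allocation and copy cost.
    assert_eq!(hash_via_roundtrip(&node), hash_node(&node));
}
```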

@walnut-the-cat
Contributor

On startup (e.g. after a restart of the node). (code) This load blocks the overall startup (the state sync/catchup happens after it).

To make sure I understand the 'latency' here correctly: the latency here is about when a validator can participate in the consensus mechanism, right? This latency shouldn't have anything to do with 'how long it will take for a validator to perform chunk generation/validation'.

@tayfunelmas
Contributor

Yes, once the memtrie is loaded, this latency does not contribute to later operations such as block/chunk production or validation. In fact, this load code is specific to the one-off loading of the state and is separate from the rest of the memtrie operations performed during block/chunk generation or validation.

@staffik
Contributor

staffik commented Apr 5, 2024

We can move the memtrie loading part to a separate thread. I have a draft implementation. Will do it next week.

@tayfunelmas
Contributor

Do you want to move the entire load operation to a separate thread? How does that help? Assuming we are talking about node startup, the rest of the functionality needs to wait for the memtrie load anyway. I think we could instead parallelize certain parts of the load; for example, the hash computation could start earlier, in parallel, while the trie is being built (it is currently done after the tree is fully constructed). But I am not sure about moving the entire load to a different thread; I might be missing something.
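Overlapping hash computation with construction can be sketched as a producer/consumer pipeline. This is a toy model: the byte-vector "nodes" and `build_and_hash` are hypothetical, and in the real memtrie a node's hash depends on its children's hashes, so the real pipeline would need to respect leaf-to-root ordering.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::mpsc;
use std::thread;

// Toy hash standing in for the real cryptographic node hash.
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

// Build "nodes" on the main thread and hash each one on a worker thread as
// it arrives, instead of hashing only after the whole tree is constructed.
fn build_and_hash(num_nodes: u32) -> Vec<u64> {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let hasher = thread::spawn(move || {
        // Consumes nodes as they are produced, overlapping with construction.
        rx.iter().map(|node| hash_bytes(&node)).collect::<Vec<u64>>()
    });
    for i in 0..num_nodes {
        // Stand-in for constructing one memtrie node.
        tx.send(i.to_be_bytes().to_vec()).unwrap();
    }
    drop(tx); // close the channel so the worker's iterator ends
    hasher.join().unwrap()
}

fn main() {
    let hashes = build_and_hash(1_000);
    assert_eq!(hashes.len(), 1_000);
}
```

With a bounded channel (or one worker per subtree), the hashing work would no longer sit entirely on the critical path after construction finishes.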

@staffik
Contributor

staffik commented Apr 6, 2024

I was thinking about catchup. Yes, for startup it might be hard to do. For startup I thought that maybe we could start with the regular trie and load the memtrie in the background, but that is probably not possible, as we need the state not to change during the load.
