Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve (*MemCachedStore).Persist #2102

Merged
merged 3 commits into from
Aug 2, 2021
Merged

Improve (*MemCachedStore).Persist #2102

merged 3 commits into from
Aug 2, 2021

Conversation

roman-khimov
Copy link
Member

This is important in making our node performing the way it should. Now profiler data really means something and other optimizations will have more direct effect.

@roman-khimov roman-khimov added this to the v0.96.2 milestone Aug 1, 2021
@codecov
Copy link

codecov bot commented Aug 1, 2021

Codecov Report

Merging #2102 (b10aea3) into master (bbe4e9c) will increase coverage by 0.06%.
The diff coverage is 78.37%.

❗ Current head b10aea3 differs from pull request most recent head 443f3cf. Consider uploading reports for the commit 443f3cf to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2102      +/-   ##
==========================================
+ Coverage   84.08%   84.14%   +0.06%     
==========================================
  Files         288      288              
  Lines       27102    27113      +11     
==========================================
+ Hits        22788    22815      +27     
+ Misses       3064     3045      -19     
- Partials     1250     1253       +3     
Impacted Files Coverage Δ
pkg/core/blockchain.go 79.52% <50.00%> (-0.11%) ⬇️
pkg/core/storage/memcached_store.go 100.00% <100.00%> (ø)
pkg/services/notary/notary.go 92.85% <0.00%> (+0.79%) ⬆️
pkg/services/oracle/request.go 62.50% <0.00%> (+3.26%) ⬆️
pkg/services/oracle/oracle.go 86.48% <0.00%> (+9.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bbe4e9c...443f3cf. Read the comment docs.

Comment on lines 151 to 153
// tempstore.mem and tempstore.del are completely flushed now
// to tempstore.ps, so all KV pairs are the same and this
// substitution has no visible effects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Onjy true if PutBatch didn't return nil.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If PutBatch fails we're toast.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a comment then, we lose data without abitlity to retry, this is unexpected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually can recover by merging two maps, so I'll improve it (although it'll end up with OOM anyway, but that was the behaviour before the change also).

Persist by its definition doesn't change MemCachedStore visible state, all KV
pairs that were acessible via it before Persist remain accessible after
Persist. The only thing it does is flushing of the current set of KV pairs
from memory to peristent store. To do that it needs read-only access to the
current KV pair set, but technically it then replaces maps, so we have to use
full write lock which makes MemCachedStore inaccessible for the duration of
Persist. And Persist can take a lot of time, it's about disk access for
regular DBs.

What we do here is we create new in-memory maps for MemCachedStore before
flushing old ones to the persistent store. Then a fake persistent store is
created which actually is a MemCachedStore with old maps, so it has exactly
the same visible state. This Store is never accessed for writes, so we can
read it without taking any internal locks and at the same time we no longer
need write locks for original MemCachedStore, we're not using it. All of this
makes it possible to use MemCachedStore as normally reads are handled going
down to whatever level is needed and writes are handled by new maps. So while
Persist for (*Blockchain).dao does its most time-consuming work we can process
other blocks (reading data for transactions and persisting storeBlock caches
to (*Blockchain).dao).

The change was tested for performance with neo-bench (single node, 10 workers,
LevelDB) on two machines and block dump processing (RC4 testnet up to 62800
with VerifyBlocks set to false) on i7-8565U.

Reference results (bbe4e9c):

Ryzen 9 5950X:
RPS     23616.969 22817.086 23222.378  ≈ 23218   ± 1.72%
TPS     23047.316 22608.578 22735.540  ≈ 22797   ± 0.99%
CPU %      23.434    25.553    23.848  ≈    24.3 ± 4.63%
Mem MB    600.636   503.060   582.043  ≈   562   ± 9.22%

Core i7-8565U:
RPS     6594.007 6499.501 6572.902  ≈ 6555   ± 0.76%
TPS     6561.680 6444.545 6510.120  ≈ 6505   ± 0.90%
CPU %     58.452   60.568   62.474    ≈ 60.5 ± 3.33%
Mem MB   234.893  285.067  269.081   ≈ 263   ± 9.75%

DB restore:
real    0m22.237s 0m23.471s 0m23.409s  ≈ 23.04 ± 3.02%
user    0m35.435s 0m38.943s 0m39.247s  ≈ 37.88 ± 5.59%
sys      0m3.085s  0m3.360s  0m3.144s  ≈  3.20 ± 4.53%

After the change:

Ryzen 9 5950X:
RPS     27747.349 27407.726 27520.210  ≈ 27558   ± 0.63%  ↑ 18.69%
TPS     26992.010 26993.468 27010.966  ≈ 26999   ± 0.04%  ↑ 18.43%
CPU %      28.928    28.096    29.105  ≈    28.7 ± 1.88%  ↑ 18.1%
Mem MB    760.385   726.320   756.118  ≈   748   ± 2.48%  ↑ 33.10%

Core i7-8565U:
RPS     7783.229 7628.409 7542.340  ≈ 7651   ± 1.60%  ↑ 16.72%
TPS     7708.436 7607.397 7489.459  ≈ 7602   ± 1.44%  ↑ 16.85%
CPU %     74.899   71.020   72.697  ≈   72.9 ± 2.67%  ↑ 20.50%
Mem MB   438.047  436.967  416.350  ≈  430   ± 2.84%  ↑ 63.50%

DB restore:
real    0m20.838s 0m21.895s 0m21.794s  ≈ 21.51 ± 2.71%  ↓ 6.64%
user    0m39.091s 0m40.565s 0m41.493s  ≈ 40.38 ± 3.00%  ↑ 6.60%
sys      0m3.184s  0m2.923s  0m3.062s  ≈  3.06 ± 4.27%  ↓ 4.38%

It obviously uses more memory now and utilizes CPU more aggressively, but at
the same time it allows to improve all relevant metrics and finally reach a
situation where we process 50K transactions in less than second on Ryzen 9
5950X (going higher than 25K TPS). The other observation is much more stable
block time, on Ryzen 9 it's as close to 1 second as it could be.
It doesn't make any sense, in some situations it leads to a number of
goroutines created that will Persist one after another (as we can't Persist
concurrently). We can manage it better in a single thread.

This doesn't change performance in any way, but somewhat reduces resource
consumption. It was tested neo-bench (single node, 10 workers, LevelDB) on two
machines and block dump processing (RC4 testnet up to 62800 with VerifyBlocks
set to false) on i7-8565U.

Reference (b9be892):

Ryzen 9 5950X:
RPS     27747.349 27407.726 27520.210  ≈ 27558   ± 0.63%
TPS     26992.010 26993.468 27010.966  ≈ 26999   ± 0.04%
CPU %      28.928    28.096    29.105  ≈    28.7 ± 1.88%
Mem MB    760.385   726.320   756.118  ≈   748   ± 2.48%

Core i7-8565U:
RPS     7783.229 7628.409 7542.340  ≈ 7651   ± 1.60%
TPS     7708.436 7607.397 7489.459  ≈ 7602   ± 1.44%
CPU %     74.899   71.020   72.697  ≈   72.9 ± 2.67%
Mem MB   438.047  436.967  416.350  ≈  430   ± 2.84%

DB restore:
real    0m20.838s 0m21.895s 0m21.794s  ≈ 21.51 ± 2.71%
user    0m39.091s 0m40.565s 0m41.493s  ≈ 40.38 ± 3.00%
sys      0m3.184s  0m2.923s  0m3.062s  ≈  3.06 ± 4.27%

Patched:

Ryzen 9 5950X:
RPS     27636.957 27246.911 27462.036  ≈ 27449   ±  0.71%  ↓ 0.40%
TPS     27003.672 26993.468 27011.696  ≈ 27003   ±  0.03%  ↑ 0.01%
CPU %      28.562    28.475    28.012  ≈    28.3 ±  1.04%  ↓ 1.39%
Mem MB    627.007   648.110   794.895  ≈   690   ± 13.25%  ↓ 7.75%

Core i7-8565U:
RPS     7497.210 7527.797 7897.532  ≈ 7641   ±  2.92%  ↓ 0.13%
TPS     7461.128 7482.678 7841.723  ≈ 7595   ±  2.81%  ↓ 0.09%
CPU %     71.559   73.423   69.005  ≈   71.3 ±  3.11%  ↓ 2.19%
Mem MB   393.090  395.899  482.264  ≈  424   ± 11.96%  ↓ 1.40%

DB restore:
real    0m20.773s 0m21.583s 0m20.522s  ≈ 20.96 ±  2.65%  ↓ 2.56%
user    0m39.322s 0m42.268s 0m38.626s  ≈ 40.07 ±  4.82%  ↓ 0.77%
sys      0m3.006s  0m3.597s  0m3.042s  ≈  3.22 ± 10.31%  ↑ 5.23%
Using bc.dao here is wrong, it can contain unpersisted data.
@roman-khimov roman-khimov modified the milestones: v0.97.0, v0.97.1 Aug 2, 2021
@roman-khimov roman-khimov merged commit dfc514e into master Aug 2, 2021
@roman-khimov roman-khimov deleted the store4 branch August 2, 2021 17:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants