improvement: don't cache all data of header map in memory during IBD #2147

Merged Jul 22, 2020 (1 commit)


yangby-cryptape (Collaborator) commented Jul 1, 2020

  • Open a temporary rocksdb (default: ${ckb_data_dir}/tmp/ckb-tmp-*) to store the part of the header map that exceeds the configured quota.

  • Why not use the same rocksdb as block data?

    • It is only temporary data: everything is destroyed when ckb shuts down, so the data format can change in any commit and we never have to migrate it.

    • I don't want to create new columns, because that requires migrations. Too many columns also hurts performance.

    • If I reuse the existing columns, traversing all the data becomes difficult.

      At present, no data in the chain database has a key prefix.
      For example, if we used the key header-${header_hash} for headers, uncle-${uncle_hash} for uncles, and block-${block_hash} for blocks, we could store them all in one column and still traverse each kind easily by its key prefix (header-*, uncle-*, block-*).
      But we store almost all data under just ${hash}, so we can't tell the type of an entry if different kinds share a column.

      So I can't reuse the existing columns without breaking data traversal.

    • After IBD, there is about 450 MiB on disk for the current mainnet (height about 2_200_000) with the default rocksdb configuration (we can optimize the configuration later), even though the header map itself is very small most of the time.
      So we can simply delete the entire rocksdb instead of worrying about how to compress the data and free space inside it.

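The key-prefix argument above can be illustrated with a small sketch. It assumes a `BTreeMap` as a stand-in for one sorted rocksdb column; `Column`, `prefixed_key`, and `scan_prefix` are hypothetical names for illustration only and are not part of ckb.

```rust
use std::collections::BTreeMap;

/// Stand-in for a single sorted key-value column (an assumption, not ckb's store).
type Column = BTreeMap<Vec<u8>, Vec<u8>>;

/// Builds a key such as `header-<hash>` from a type prefix and a hash.
fn prefixed_key(prefix: &str, hash: &[u8]) -> Vec<u8> {
    let mut key = prefix.as_bytes().to_vec();
    key.extend_from_slice(hash);
    key
}

/// Returns every value whose key starts with `prefix`: seek to the prefix,
/// then stop at the first key that no longer matches.
fn scan_prefix<'a>(column: &'a Column, prefix: &str) -> Vec<&'a Vec<u8>> {
    let start = prefix.as_bytes().to_vec();
    column
        .range(start..)
        .take_while(|(key, _)| key.starts_with(prefix.as_bytes()))
        .map(|(_, value)| value)
        .collect()
}
```

With keys stored as bare `${hash}`, no such range exists for a single data type, which is why sharing an existing column would break traversal.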
  • I use two thresholds to control the balance between memory and disk:

    • The count of headers in memory is always less than or equal to primary_limit.
    • The rocksdb is closed when the count of headers in memory is less than primary_limit - backend_close_threshold and there are no headers in rocksdb.
  • I use a lot of panics, the same as we do in ckb-store. I trust rocksdb and assume it always works well.

  • I wrote two simple serialization and deserialization functions. It is a private data structure, so there is no need to make it public (by moving it to ckb-types) or to bring heavy serialization and deserialization dependencies into this crate.

  • Enable the stats feature to collect statistics. The result of syncing mainnet from 0 to 2247420 is:

    2020-07-06 15:53:32.875 +08:00 NetworkRuntime TRACE ckb_sync::types::header_map::kernel_lru  Header Map Statistics
    >       | storage | length  |  limit  | contain |   select   | insert  | delete  |
    >       |---------+---------+---------+---------+------------+---------+---------|
    >       | primary |     4974|   300000|  2252392|   196946199|  4243176|  4238382|
    >       | backend |        0|    false|        0|           0|  1990779|  2457795|
    
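The two thresholds from the description can be sketched roughly as follows. This is a minimal illustration, not the code in kernel_lru.rs: `TwoTierMap` is a hypothetical type, a `HashMap` stands in for the temporary rocksdb, and eviction is by insertion order rather than true LRU to keep the sketch short.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical two-tier map: `primary` is memory, `backend` stands in for
/// the temporary rocksdb (`None` means the backend is closed).
struct TwoTierMap {
    primary: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>, // insertion order, oldest first (simplified eviction)
    backend: Option<HashMap<u64, Vec<u8>>>,
    primary_limit: usize,
    backend_close_threshold: usize,
}

impl TwoTierMap {
    fn new(primary_limit: usize, backend_close_threshold: usize) -> Self {
        assert!(backend_close_threshold <= primary_limit);
        TwoTierMap {
            primary: HashMap::new(),
            order: VecDeque::new(),
            backend: None,
            primary_limit,
            backend_close_threshold,
        }
    }

    fn insert(&mut self, key: u64, value: Vec<u8>) {
        // Drop any stale copy that was spilled earlier.
        if let Some(backend) = self.backend.as_mut() {
            backend.remove(&key);
        }
        if self.primary.insert(key, value).is_none() {
            self.order.push_back(key);
        }
        // Threshold 1: keep at most `primary_limit` entries in memory,
        // spilling the oldest ones and opening the backend on demand.
        while self.primary.len() > self.primary_limit {
            let oldest = match self.order.pop_front() {
                Some(k) => k,
                None => break,
            };
            if let Some(v) = self.primary.remove(&oldest) {
                self.backend.get_or_insert_with(HashMap::new).insert(oldest, v);
            }
        }
    }

    fn remove(&mut self, key: u64) -> Option<Vec<u8>> {
        let found = match self.primary.remove(&key) {
            Some(v) => {
                self.order.retain(|k| *k != key);
                Some(v)
            }
            None => self.backend.as_mut().and_then(|b| b.remove(&key)),
        };
        self.maybe_close_backend();
        found
    }

    /// Threshold 2: close the backend only when it is empty and memory usage
    /// has dropped below `primary_limit - backend_close_threshold`.
    fn maybe_close_backend(&mut self) {
        let low_water = self.primary_limit - self.backend_close_threshold;
        let backend_empty = self.backend.as_ref().map_or(false, |b| b.is_empty());
        if backend_empty && self.primary.len() < low_water {
            self.backend = None;
        }
    }

    fn backend_is_open(&self) -> bool {
        self.backend.is_some()
    }
}
```

The gap between the two thresholds presumably keeps the backend from being opened and closed repeatedly while the in-memory count hovers around primary_limit.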

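For the hand-written serialization mentioned in the description, a fixed-width layout is one simple option. The sketch below is an assumption for illustration: `HeaderEntry` and its fields are hypothetical and do not reflect the actual private structure or byte layout used in the PR.

```rust
use std::convert::TryInto;

/// Hypothetical entry stored in the temporary database.
#[derive(Debug, PartialEq, Clone)]
struct HeaderEntry {
    hash: [u8; 32],
    number: u64,
    timestamp: u64,
}

impl HeaderEntry {
    /// 32-byte hash followed by two little-endian u64 values.
    const ENCODED_LEN: usize = 32 + 8 + 8;

    fn to_vec(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(Self::ENCODED_LEN);
        buf.extend_from_slice(&self.hash);
        buf.extend_from_slice(&self.number.to_le_bytes());
        buf.extend_from_slice(&self.timestamp.to_le_bytes());
        buf
    }

    /// Returns `None` when the input does not have the expected length.
    fn from_slice(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != Self::ENCODED_LEN {
            return None;
        }
        let hash: [u8; 32] = bytes[..32].try_into().ok()?;
        let number = u64::from_le_bytes(bytes[32..40].try_into().ok()?);
        let timestamp = u64::from_le_bytes(bytes[40..48].try_into().ok()?);
        Some(HeaderEntry { hash, number, timestamp })
    }
}
```

Because the data never outlives the process, such a layout can change between commits without a migration. The PR itself panics on failure; returning `Option` here is a choice made only for the sketch.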
yangby-cryptape (Collaborator, Author)

benchmark

nervos-bot-user (Collaborator)

Benchmark Result

  • TPS: 350.48
  • Samples Count: 51
  • CKB Version: 63e1cc8
  • Instance Type: c5.xlarge
  • Instances Count: 3
  • Bench Type: 2in2out
  • CKB Logger Filter: info,ckb=debug

driftluo previously approved these changes Jul 2, 2020
doitian added this to 👀 Awaiting review in CKB - Pull Requests Jul 6, 2020
yangby-cryptape added the s:hold (Status: Put this issue on hold.) label Jul 6, 2020
doitian (Member) commented Jul 6, 2020

It's better to default to the CKB data directory. Also expose the default in the generated config file.

yangby-cryptape (Collaborator, Author)

benchmark

yangby-cryptape removed the s:hold (Status: Put this issue on hold.) label Jul 7, 2020
yangby-cryptape (Collaborator, Author)

@quake I re-pushed this PR and updated the description. I tried several strategies and compared the statistics; this is the simplest one. My original strategy was too complex and brought only a small benefit.

nervos-bot-user (Collaborator)

Benchmark Result

  • TPS: 347.23
  • Samples Count: 51
  • CKB Version: d4c0ba5
  • Instance Type: c5.xlarge
  • Instances Count: 3
  • Bench Type: 2in2out
  • CKB Logger Filter: info,ckb=debug

Review thread on sync/src/types/mod.rs (resolved)
Review thread on sync/src/types/header_map/kernel_lru.rs (outdated, resolved)
yangby-cryptape force-pushed the pr/reduce-header-map branch 2 times, most recently from 92bad00 to 0ad104b, July 9, 2020 00:14
quake previously approved these changes Jul 9, 2020
driftluo previously approved these changes Jul 9, 2020
CKB - Pull Requests automation moved this from 👀 Awaiting review to ✅ Reviewer approved Jul 9, 2020
doitian (Member) commented Jul 22, 2020

@yangby-cryptape please resolve the conflicts

CKB - Pull Requests automation moved this from ✅ Reviewer approved to 👀 Awaiting review Jul 22, 2020
yangby-cryptape (Collaborator, Author)

@quake @driftluo @doitian Rebased this PR onto develop. No more updates.

CKB - Pull Requests automation moved this from 👀 Awaiting review to ✅ Reviewer approved Jul 22, 2020
doitian (Member) commented Jul 22, 2020

bors r=quake,driftluo

bors bot (Contributor) commented Jul 22, 2020

Build succeeded:

  • continuous-integration/travis-ci/push

bors bot merged commit 19dae26 into nervosnetwork:develop Jul 22, 2020
CKB - Pull Requests automation moved this from ✅ Reviewer approved to Done Jul 22, 2020
yangby-cryptape deleted the pr/reduce-header-map branch September 11, 2020 06:15
5 participants