
add prune beacon chain feature #3908

Merged: 2 commits merged into harmony-one:main from pruned_validator_node_disk_space on Jan 20, 2022

Conversation

LuttyYang
Contributor

Issue

harmony-one/bounties#84

Operational Checklist

  1. Does this PR introduce backward-incompatible changes to the on-disk data structure and/or the over-the-wire protocol? (If no, skip to question 8.)

    NO

  2. Does this PR introduce backward-incompatible changes NOT related to on-disk data structure and/or over-the-wire protocol? (If no, continue to question 11.)

    NO

  3. Does this PR introduce significant changes to the operational requirements of the node software, such as >20% increase in CPU, memory, and/or disk usage?

    NO

@LuttyYang LuttyYang force-pushed the pruned_validator_node_disk_space branch from c5e158d to fce50c2 on October 26, 2021 05:40
@LuttyYang LuttyYang requested review from rlan35 and LeoHChen and removed request for LeoHChen October 26, 2021 05:43
Contributor

@LeoHChen LeoHChen left a comment

Please fix the CI errors and present test results.

core/blockchain.go (outdated review thread, resolved)
@JackyWYX
Contributor

JackyWYX commented Oct 28, 2021

This PR change does not look like it has a large impact on storage size. From my understanding, the major contributor to storage is actually the state. Is there any benchmark of this PR to measure how much storage we are saving on mainnet?

@LuttyYang LuttyYang force-pushed the pruned_validator_node_disk_space branch from 64f3442 to 857f72c on October 31, 2021 06:01
@LuttyYang
Contributor Author

LuttyYang commented Oct 31, 2021

@JackyWYX You are right; the major contributor to storage is the validator snapshots (in the testnet shard 1 beacon chain they are almost 76% of the data; the mainnet snapshot is damaged, so that calculation will come later).

I added a feature to prune validator snapshots, tx, and cx data, but this requires @rlan35 to evaluate whether it is safe to delete.

Testnet shard 1 beacon chain data composition (the yellow part is prunable data, almost 90% of the size):
image

@LuttyYang
Contributor Author

LuttyYang commented Oct 31, 2021

In the testnet test, the DB size went from 43 GB to 8.7 GB.

image

Test video (at the beginning of pruning); you can see the disk usage slowly decreasing:

Screencast.2021-11-02.00.12.47.mp4

@rlan35
Contributor

rlan35 commented Nov 1, 2021

> @JackyWYX You are right; the major contributor to storage is the validator snapshots (in the testnet shard 1 beacon chain they are almost 76% of the data; the mainnet snapshot is damaged, so that calculation will come later).
>
> I added a feature to prune validator snapshots, tx, and cx data, but this requires @rlan35 to evaluate whether it is safe to delete.
>
> Testnet shard 1 beacon chain data composition (the yellow part is prunable data, almost 90% of the size) [image]

Testnet doesn't have many transactions and the testnet epoch is every 4 hours, so you will see the validatorSnapshot taking more space. But on mainnet the epoch is every 18 hours and there are lots of txns, so I suspect the situation will be a lot different there. Do you mind analyzing the storage cost for mainnet instead?
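
(Roughly: with an epoch every 4 hours, testnet goes through about 6 epochs per day, versus about 1.3 per day on mainnet with 18-hour epochs; combined with mainnet's much larger transaction volume, the snapshot share of storage should be far smaller there.)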

core/blockchain.go (outdated review thread, resolved)
cmd/harmony/main.go (outdated review thread, resolved)
@LuttyYang
Contributor Author

LuttyYang commented Nov 2, 2021

@rlan35

You are right! We can see the situation is a lot different there.

Mainnet shard 1 beacon chain data composition (the yellow part is prunable data, almost 60% of the size):

image

I also see a lot of unknown ("other") data in the database. Their keys are 32 bytes long, and I haven't found their source in the code yet. I guessed they were transaction data, but when I decode the hex I can't find these transactions.

Here are some samples:
Keys Hex: 6ab3de58182b6dcf0c8b31253391448c7935ccc59243302f925164c2155c3094
Value Hex: f39f3540171b6c0c960b71a7020d9f60077f6af931a8bbf590da0223dacf75c7af929114d4596f5b8cb26d1bad2cdd431e51d651

Keys Hex: 4c8eef247a6311b6e7856e427dda2a996f61bce864544f99711e57a8a8bd930e
Value Hex: f9013180a004d64835b820db52817e07de5fb7c3b3ae270530f565786fdbaedc40cea4fab4a0fc92b3b82ca72aec1ddac408475526a96e2d91b780978a132d02c14dbe620b5da0e10d74b13bf4e71631f0e0e727af875787310406ff5030fdbd7bab80e8bf7ada808080a035be599177102c05d6871a732494d274258a44b9f6c60cb1bb2e672099e0f9c8a06389d0f4f2dfc45dd7c0d09fa221d90f760b2f98e41aa77715fec5080618269680a062857465df814e3f47663c3a4adebc05813217266a77615e875a0375af5ae2afa0707a1a18b56c87f75649155acea750f357e4c9a18bbfe5db5202e46734a4b7bb8080a016e154a3f57f49ee4fd430465c2b9db9a9d905e2a5b93afa47074e0ff7a72316a0f80154a8d5b2a94bb815f1dbb2a7d9ca4fa79ea02166cbee3b5f56aba5f39fd980

Keys Hex: 57b4246adf4b54e875dc93eab4110248d0567794a700b92781015af4efb29ee5
Value Hex: f90211a08e778cd54bc7ba0d8ffc3c38969b704da48c62e27273f70fae3176feb28df708a05d9760eb6ef776e266ecd1da9740bc76bbc0fe1384b74728919bbe075299ec6ea0f46d9c99a2f3eae5d7dcd89817be6162e6011dd0a70df1fa757f0538fa7c0822a05ba0429828e80db268d016202ea6db9d4decb2d2297bc68f620aee8cfbd47d28a0b336865f8bb1228cc63778962d3b65f3f923273206e089d5c7bc9fcc51b69f80a053abf54a189e1f03067778456628f284324ac80d33db997e9573ee81ecdb6ccaa0358f56aa7deebac13ce473e48e654c475e9a605337e608c09b9b9ceae972795da084ddbf05b94e8e37f8fe6520889c844546634853245b533e3400400939cbba6ca068648ba8e09cd3e08c395774ac7d22b22bf8c86730397ef20e948e03fb4a70c3a03cd32572f2eb6a3246006e7f406e72dcfd1604772d1ceb66f016857e575b8c53a0227703cbed7f8faec7362c884a179301ad999d80095f4eb78b340136d6437fafa05d39cd0083c712ad87072610ab94f34eaf4b7f9117a0aae802650cafd63d75a1a0baeac6ad013c36ea449b17ea099645f7559522ea4d564922c76768360e2ef85aa0700876751332561790f298106e2448f67e68b6e269889c7670d12bc43c604555a050e55e16459e614cb749feea79301d5add78fb43a5d7c46e22dffa630fbb50c3a07a0b63459c2e5092e85eb4c41ad34aa48ea2179d5edebacbb3ad29527746508d80

This data makes our compaction require more I/O and take longer. Do you know what it is?

@rlan35
Contributor

rlan35 commented Nov 2, 2021

> I also see a lot of unknown ("other") data in the database. Their keys are 32 bytes long, and I haven't found their source in the code yet. […] This data makes our compaction require more I/O and take longer. Do you know what it is?

These are all the defined keys for the data entries: https://github.com/harmony-one/harmony/blob/main/core/rawdb/schema.go#L31. I don't see any key defined as a plain 32-byte hash. That's weird.

core/blockchain.go (outdated review thread, resolved)
@LuttyYang
Contributor Author

LuttyYang commented Nov 3, 2021

@rlan35

Mainnet shard 1: this reduces almost 110 GB of space.
image

There is still more that could be pruned, but first we need to know what the unknown data is.

core/blockchain.go (outdated review thread, resolved)
core/blockchain.go (outdated review thread, resolved)
@rlan35
Contributor

rlan35 commented Nov 10, 2021

Please make sure that the feature is turned off by default and only enabled if specifically configured. And also fix the Travis tests.

@LuttyYang
Contributor Author

LuttyYang commented Nov 11, 2021

> Please make sure that the feature is turned off by default and only enabled if specifically configured. And also fix the Travis tests.

@rlan35 It's turned off by default, and the Travis build has succeeded.

@LuttyYang
Contributor Author

LuttyYang commented Nov 11, 2021

@rlan35

I think the unknown data is Ethereum Merkle Patricia Trie data:

image
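
As a rough illustration (not part of this PR, and assuming Harmony's state trie follows go-ethereum's storage scheme), trie nodes are stored under the Keccak-256 hash of their RLP encoding, with no key prefix, which would explain 32-byte keys that match nothing in schema.go. A minimal Go sketch using one of the sampled values above:

package main

import (
	"encoding/hex"
	"fmt"

	"golang.org/x/crypto/sha3"
)

func main() {
	// One of the sampled "unknown" values from the DB above, hex-decoded.
	value, _ := hex.DecodeString("f39f3540171b6c0c960b71a7020d9f60077f6af931a8bbf590da0223dacf75c7af929114d4596f5b8cb26d1bad2cdd431e51d651")

	// Geth-style trie databases store each RLP-encoded node under the
	// Keccak-256 hash of that encoding, with no key prefix.
	h := sha3.NewLegacyKeccak256()
	h.Write(value)
	key := h.Sum(nil)

	// If the value really is a trie node, this 32-byte hash should equal the
	// key it was stored under (6ab3de58... in the sample above).
	fmt.Println(hex.EncodeToString(key))
}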

@JackyWYX JackyWYX self-requested a review November 13, 2021 05:24
core/blockchain_pruner.go (outdated review thread, resolved)
core/blockchain_pruner.go (outdated review thread, resolved)
core/blockchain_pruner.go (review thread, resolved)
core/blockchain_pruner.go (outdated review thread, resolved)
core/blockchain_pruner.go (review thread, resolved)
internal/configs/harmony/harmony.go (outdated review thread, resolved)
internal/params/config.go (outdated review thread, resolved)
core/blockchain_pruner.go (outdated review thread, resolved)
core/blockchain_pruner.go (outdated review thread, resolved)
core/blockchain.go (outdated review thread, resolved)
@JackyWYX
Contributor

BTW, discussion of the review comments is welcome.

@LuttyYang
Contributor Author

LuttyYang commented Nov 16, 2021

@JackyWYX You can use the following code to verify that iterating is faster and simpler than marking progress in the DB.

package main

import (
	"encoding/binary"
	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
	"os"
	"testing"
)

const (
	testDir  = "test_db"
	testOnce = 1000 // number of keys deleted per benchmark iteration
)

// keyName builds an 'H'-prefixed key from a 4-byte little-endian encoding of i.
func keyName(i int) []byte {
	key := make([]byte, 4)
	binary.LittleEndian.PutUint32(key, uint32(i))

	return append([]byte{'H'}, key...)
}

// prepareDB opens a fresh LevelDB at testDir and fills it with testTimes keys.
func prepareDB(testTimes int) *leveldb.DB {
	db, err := leveldb.OpenFile(testDir, nil)
	if err != nil {
		panic(err)
	}

	// first, fill data
	for i := 0; i < testTimes; i++ {
		key := keyName(i)
		err := db.Put(key, key, nil)
		if err != nil {
			panic(err)
		}
	}

	return db
}

func clearDB(db *leveldb.DB) {
	err := db.Close()
	if err != nil {
		panic(err)
	}

	err = os.RemoveAll(testDir)
	if err != nil {
		panic(err)
	}
}

func BenchmarkKeyName(b *testing.B) {
	for i := 0; i < b.N; i++ {
		keyName(i)
	}
}

// BenchmarkMarkInDB deletes keys by tracking the last-deleted position under a
// separate "Last" marker key (the mark-in-DB approach).
func BenchmarkMarkInDB(b *testing.B) {
	db := prepareDB(b.N * testOnce)
	defer clearDB(db)

	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		var lastBlockNum, j int
		lastBlockByte, err := db.Get([]byte("Last"), nil)
		if err != nil {
			if err != leveldb.ErrNotFound {
				panic(err)
			}
			lastBlockNum = 0
		} else {
			lastBlockNum = int(binary.LittleEndian.Uint32(lastBlockByte))
		}

		for j = lastBlockNum; j < lastBlockNum+testOnce; j++ {
			key := keyName(j)

			err := db.Delete(key, nil)
			if err != nil {
				panic(err)
			}
		}

		lastBlockByte = make([]byte, 4)
		binary.LittleEndian.PutUint32(lastBlockByte, uint32(j))
		err = db.Put([]byte("Last"), lastBlockByte, nil)
		if err != nil {
			panic(err)
		}
	}
}

// BenchmarkIterDB deletes keys by iterating over the 'H' prefix directly
// (the iteration approach); no extra marker key is needed.
func BenchmarkIterDB(b *testing.B) {
	db := prepareDB(b.N * testOnce)
	defer clearDB(db)

	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		j := 0

		iterator := db.NewIterator(util.BytesPrefix([]byte{'H'}), nil)
		// Advance the iterator before reading each key; without Next() the
		// iterator is not positioned and Key() would return an empty key.
		for j < testOnce && iterator.Next() {
			j++

			err := db.Delete(iterator.Key(), nil)
			if err != nil {
				panic(err)
			}
		}
		iterator.Release()
	}
}
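
To reproduce the comparison, the snippet can be saved as a Go test file (for example prune_bench_test.go; the name is just an illustration, but it must end in _test.go) and run with go test -bench . in that directory; each benchmark creates its own test_db directory and removes it afterwards.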

image

@OneUnitedPowerValidator

Is this still under consideration to be implemented?

@LuttyYang
Contributor Author

@OneUnitedPowerValidator Currently under review

@LeoHChen
Contributor

LeoHChen commented Dec 18, 2021

Is it a hard fork?

Can we enable this feature using a configuration flag, so that it can be dynamically enabled/disabled?

@LuttyYang
Contributor Author

@LeoHChen There is an optional configuration flag to enable it. But once enabled, if you want to disable it, you need to rclone the DB again.

@LeoHChen
Contributor

A few questions:

  1. does a validator need to sync from the very beginning to use this new feature?
  2. if this feature is turned on for an existing DB, will it prune the existing DB, or just reduce the size of a newly synced DB?
  3. can we create a new rclone repo with this feature on, synced from the genesis block, so that new validators can rclone a much smaller DB?

@LuttyYang
Contributor Author

@LeoHChen

  1. No, it can be enabled and disabled at any time.
  2. When enabled, it prunes the existing DB.
  3. Yes. Once this has run on mainnet without problems for a period of time, that would of course be best.

@soltrac

soltrac commented Jan 7, 2022

Do we have any ETA on this PR?

@jhd2best
Contributor

hey @LuttyYang @rlan35 @LeoHChen

I tested this PR as a validator on shard 1 of mainnet; the results are as follows:

First of all, our normal node's DB size on shard 1 is 929 GB.

Node 1

Using a binary built from this PR, synced from the latest block (at that time); the storage used now:
image

Node 2

Using a binary built from this PR, synced from block 0; the storage used now:
image

Both nodes can sign blocks normally (observed for 48 hours). It seems that if we start syncing from block 0, we save a lot of space. Next, the database synced from block 0 can be made into a snapshot for validators to download.

Contributor

@LeoHChen LeoHChen left a comment

Final round of review comments:

upload_to_test.sh (outdated review thread, resolved)
internal/params/config.go (outdated review thread, resolved)
core/rawdb/accessors_offchain.go (review thread, resolved)
core/rawdb/accessors_offchain.go (review thread, resolved)
core/blockchain_pruner.go (outdated review thread, resolved)
core/blockchain_pruner.go (review thread, resolved)
core/blockchain_pruner.go (review thread, resolved)
core/blockchain_pruner.go (review thread, resolved)
core/blockchain_pruner.go (review thread, resolved)
core/gen_genesis.go (outdated review thread, resolved)
@LuttyYang LuttyYang force-pushed the pruned_validator_node_disk_space branch from 90ce3fa to 5db4dc9 on January 18, 2022 08:45
@LuttyYang LuttyYang force-pushed the pruned_validator_node_disk_space branch from d55b97e to f5a7fe4 on January 18, 2022 09:20
node/node.go (review thread, resolved)
Contributor

@JackyWYX JackyWYX left a comment

I wonder how long it takes for the first iteration of pruning?

If the first iteration is interrupted with Ctrl+C, how will it affect the second iteration?

For example, the current block number is 10000. During the first pruning, I shut down the node when it has finished pruning blocks 0~1000. The next time I start the node, is it true that the pruning will panic when it prunes block 0?

core/blockchain_pruner.go (review thread, resolved)
return true
}

blockInfo := rawdb.ReadBlock(bp.db, hash, blockNum)
Contributor

It looks like blockInfo can be nil if a certain block was pruned in the last iteration. Could this happen? If we reach a block that was pruned previously, does the prune iteration continue or break?

@LuttyYang
Contributor Author

LuttyYang commented Jan 20, 2022

> I wonder how long it takes for the first iteration of pruning?
>
> If the first iteration is interrupted with Ctrl+C, how will it affect the second iteration?
>
> For example, the current block number is 10000. During the first pruning, I shut down the node when it has finished pruning blocks 0~1000. The next time I start the node, is it true that the pruning will panic when it prunes block 0?

It takes about a week for the first iteration. (After our testing, we do not recommend slowly deleting from an existing node; there is a lot of unknown data in the current DB. We recommend starting from a new snapshot instead.)

The iteration does not start from the first block every time; it starts from the last deleted block.
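
For illustration only, here is a minimal sketch of that resume-from-a-marker idea. The names (LastPrunedBlock, pruneRange, pruneBlock) are hypothetical, not the PR's actual identifiers, and a bare goleveldb handle stands in for the node's chain database:

package main

import (
	"encoding/binary"
	"fmt"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/storage"
)

// lastPrunedKey is a hypothetical marker key recording pruning progress.
var lastPrunedKey = []byte("LastPrunedBlock")

func readLastPruned(db *leveldb.DB) uint64 {
	b, err := db.Get(lastPrunedKey, nil)
	if err != nil {
		return 0 // nothing pruned yet, start from block 0
	}
	return binary.BigEndian.Uint64(b)
}

func writeLastPruned(db *leveldb.DB, n uint64) error {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, n)
	return db.Put(lastPrunedKey, b, nil)
}

// pruneRange deletes per-block data from the last recorded position up to
// (but not including) `to`, persisting progress after every block so that an
// interrupted run resumes where it left off instead of restarting at block 0.
func pruneRange(db *leveldb.DB, to uint64, pruneBlock func(uint64) error) error {
	for n := readLastPruned(db); n < to; n++ {
		if err := pruneBlock(n); err != nil {
			return err
		}
		if err := writeLastPruned(db, n+1); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// In-memory DB for demonstration; pruneBlock is a no-op stand-in for the
	// per-block deletions (headers, tx lookups, snapshots, ...) a real pruner does.
	db, _ := leveldb.Open(storage.NewMemStorage(), nil)
	defer db.Close()

	_ = pruneRange(db, 10_000, func(n uint64) error { return nil })
	fmt.Println("next run resumes from block", readLastPruned(db)) // 10000
}

Persisting the marker after every block keeps an interruption (Ctrl+C) cheap to recover from, at the cost of one extra small write per pruned block.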

@LeoHChen LeoHChen merged commit 55f8c76 into harmony-one:main Jan 20, 2022
@LuttyYang LuttyYang deleted the pruned_validator_node_disk_space branch January 21, 2022 05:59