Split validator databases #6048

rkapka · 2020-05-29T16:20:39Z

What type of PR is this?

Feature

What does this PR do? Why is it needed?
This PR introduces a new command: validator accounts split. It takes a source directory and a target directory as inputs and creates one validator database for each public key in the source directory's validator database. Each database is saved in a subdirectory of the target directory, named after the hex-encoded public key whose data was split into that database. The encoding is the same as the one used for naming validator private key files, which makes it easy to identify which database belongs to which key.
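
The directory layout described above can be sketched roughly as follows. This is only an illustration: `splitTargetDir` is a hypothetical helper, the keys are shortened made-up values, and the sketch assumes plain hex without any prefix, which this description does not spell out.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// splitTargetDir is a hypothetical helper that returns the subdirectory
// holding the database for a single public key. It assumes the name is the
// plain hex encoding of the key bytes with no prefix.
func splitTargetDir(targetDir string, pubKey []byte) string {
	return filepath.Join(targetDir, hex.EncodeToString(pubKey))
}

func main() {
	// Shortened, made-up keys; real validator public keys are much longer.
	pubKeys := [][]byte{{0x01, 0xab}, {0x02, 0xcd}}
	for _, key := range pubKeys {
		fmt.Println(splitTargetDir("target", key))
	}
}
```

With this layout, each key found in the source database maps to exactly one subdirectory of the target directory.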

Which issue(s) does this PR fix?
This PR addresses #5638 but does not close it yet. Two other operations remain to be implemented: importing and exporting data for selected keys.

rauljordan · 2020-05-29T16:42:11Z

validator/db/manage.go

+	return nil
+}
+
+func addAttestations(bucket *bolt.Bucket, attestations pubKeyAttestations) error {


This function seems to assume that it will be called immediately after addProposals, right? It doesn't feel right to make that assumption, as someone could use this function incorrectly for some other use case. Perhaps refactor to remove that risk.

Even though I do always add attestations after adding proposals, there is no direct dependency between them. One can add attestations before proposals, or even add one but not the other. Attestations and proposals live in separate buckets.

rauljordan · 2020-05-29T16:42:49Z

validator/db/manage.go

+	var allAttestations []pubKeyAttestations
+
+	for _, store := range stores {
+		if err := store.db.View(func(tx *bolt.Tx) error {


It's risky to clog up the boltdb thread by doing all these expensive operations within a single bolt transaction. Any chance we could do these as separate bolt transactions?

I found a way to do this, but it's not too elegant.

First, I open a transaction in which I extract all keys and assign them to a variable for later use. Then I open one transaction per stored key and get the proposals/attestations for that key.

There might be a better way to do this, but this is the best I could come up with.
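
A rough sketch of the two-phase approach described above, for readers who want to see the shape of it. To keep it self-contained it does not use bolt: `memStore` and its `View` method are hypothetical stand-ins that only mimic a read-only transaction, and `readHistories` is an illustrative name, not a function from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// memStore is a hypothetical in-memory stand-in for a bolt-backed store.
// View mimics a read-only transaction by holding a read lock for the
// duration of the callback.
type memStore struct {
	mu   sync.RWMutex
	data map[string][]string // public key -> history entries
}

func (s *memStore) View(fn func(data map[string][]string) error) error {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return fn(s.data)
}

// readHistories first collects all keys in one short transaction and then
// opens one transaction per key, so no single transaction does all the work.
func readHistories(s *memStore) (map[string][]string, error) {
	var keys []string
	if err := s.View(func(data map[string][]string) error {
		for k := range data {
			keys = append(keys, k)
		}
		return nil
	}); err != nil {
		return nil, err
	}

	result := make(map[string][]string, len(keys))
	for _, k := range keys {
		if err := s.View(func(data map[string][]string) error {
			// Copy the slice so the result does not alias store memory.
			result[k] = append([]string(nil), data[k]...)
			return nil
		}); err != nil {
			return nil, err
		}
	}
	return result, nil
}

func main() {
	s := &memStore{data: map[string][]string{"key1": {"proposal"}, "key2": {"attestation"}}}
	histories, err := readHistories(s)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(histories), "keys read")
}
```

The trade-off is that the per-key reads no longer see one consistent snapshot, which matters only if the source database can change between transactions.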

rauljordan · 2020-05-29T16:46:09Z

validator/db/manage_test.go

@@ -54,6 +58,220 @@ func TestMerge(t *testing.T) {
 	assertMergedStore(t, mergedStore, firstStorePubKeys, secondStorePubKeys, history)
 }

+func TestSplit(t *testing.T) {


This test is extremely long. Any way it can be simplified? Perhaps the setup can be a somewhat general helper function, or perhaps can be split into several tests and not just one big TestSplit. For example, TestSplit_OK, TestSplit_CouldNotDoSomething, etc.

I redesigned the tests and made the helper functions more granular so that they can be reused by the tests for both merging and splitting. I hope most of these changes will make it easier to write tests for future functionality, e.g. importing.

rkapka · 2020-05-31T19:25:33Z

I had to rename the flags related to merging and splitting to avoid duplicate names. I find it a bit awkward that the commands now have redundant flag names, e.g. validator accounts merge --merge-source-dirs="x" --merge-target-dir="y". It's obvious from the command name which action is being invoked, so there's no need to repeat it in the flag names. I believe that at most 4 flags (source-dir, source-dirs, target-dir and target-dirs) should be sufficient for all validator operations. However, this would require generic descriptions of the flags. Any thoughts on this?

codecov · 2020-05-31T19:39:06Z

Codecov Report

Merging #6048 into master will decrease coverage by 0.01%.
The diff coverage is 51.29%.

@@            Coverage Diff             @@
##           master    #6048      +/-   ##
==========================================
- Coverage   59.51%   59.50%   -0.02%     
==========================================
  Files         320      320              
  Lines       27023    27129     +106     
==========================================
+ Hits        16083    16143      +60     
- Misses       8749     8777      +28     
- Partials     2191     2209      +18

rauljordan · 2020-05-31T19:58:03Z

validator/db/manage.go

+	allAttestations []pubKeyAttestations) (err error) {
+
+	var storesToClose []*Store
+	defer (func() {


Suggested change

-	defer (func() {
+	defer func() {

rauljordan · 2020-05-31T19:58:08Z

validator/db/manage.go

+				err = errors.New(errorMessage)
+			}
+		}
+	})()


Suggested change

-	})()
+	}()

rauljordan · 2020-05-31T19:59:12Z

validator/db/manage.go

+			if err != nil {
+				err = errors.Wrapf(err, errorMessage)
+			} else {
+				err = errors.New(errorMessage)


You need to do something about these errors in this deferred function. The function will not run until createSplitTargetStores returns, so this error will have nowhere to go. I would suggest creating an error log here instead. There's no way to control or extract the outputs of deferred functions.

Actually, there is:
#6027 (comment)

Wow awesome! Thanks for sharing, good thing to know :)
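
For anyone reading this thread later, here is a minimal sketch of why the deferred assignment works: when the function uses a named return value, a deferred closure runs after the `return` statement has set that value and can still overwrite or wrap it. The `closer` type and the `process` function are hypothetical, and the sketch uses the standard library's `fmt.Errorf` with `%w` rather than `errors.Wrapf` so that it has no external dependencies.

```go
package main

import (
	"errors"
	"fmt"
)

// closer is a hypothetical stand-in for a database store.
type closer struct{ fail bool }

func (c *closer) Close() error {
	if c.fail {
		return errors.New("close failed")
	}
	return nil
}

// process has a named return value, so the deferred closure can modify
// the error that the caller finally receives.
func process(stores []*closer, workErr error) (err error) {
	defer func() {
		failed := false
		for _, s := range stores {
			if closeErr := s.Close(); closeErr != nil {
				failed = true
			}
		}
		if failed {
			const msg = "failed to close one or more databases"
			if err != nil {
				// Keep the original error and add the close failure.
				err = fmt.Errorf("%s: %w", msg, err)
			} else {
				err = errors.New(msg)
			}
		}
	}()
	// This sets err first; the deferred closure then runs and may change it.
	return workErr
}

func main() {
	fmt.Println(process([]*closer{{fail: true}}, nil))
}
```

Without the named return value, the assignment inside the deferred closure would have no effect on what the caller sees.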

nisdas · 2020-06-01T03:41:18Z

validator/accounts/account.go

@@ -225,17 +225,26 @@ func HandleEmptyKeystoreFlags(cliCtx *cli.Context, confirmPassword bool) (string
 func Merge(ctx context.Context, sourceDirectories []string, targetDirectory string) (err error) {
 	var sourceStores []*db.Store
 	defer func() {
+		errorMessage := "failed to close one or more source databases"


Can you make this error a package variable?
For example:

var errSource = errors.New("failed to close one or more source databases")

You can then reuse the error here and in other places instead of redefining it.

Making errors package variables would be helpful
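
A small sketch of what the package-level approach could look like. The names `errFailedToCloseSource`, `closeSources` and `isCloseSourceFailure` are made up for illustration and are not taken from this PR; the wrapping uses the standard library's `fmt.Errorf` with `%w`.

```go
package main

import (
	"errors"
	"fmt"
)

// errFailedToCloseSource is a hypothetical package-level error that can be
// reused wherever closing source databases fails.
var errFailedToCloseSource = errors.New("failed to close one or more source databases")

// closeSources pretends that the given number of stores failed to close and
// wraps the shared error with that detail.
func closeSources(failures int) error {
	if failures > 0 {
		return fmt.Errorf("%d store(s): %w", failures, errFailedToCloseSource)
	}
	return nil
}

// isCloseSourceFailure reports whether err is, or wraps, the shared error.
func isCloseSourceFailure(err error) bool {
	return errors.Is(err, errFailedToCloseSource)
}

func main() {
	err := closeSources(2)
	fmt.Println(err, isCloseSourceFailure(err))
}
```

Besides avoiding repeated string literals, a shared variable lets callers and tests compare against the error value instead of matching on its message.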

rkapka and others added 12 commits May 28, 2020 19:10

initial working implementation with a basic test

d6bc700

merge validator enhancements

0526e4b

added test dependency to db's build file

645ebd4

Merge branch 'master' into merge-validators-enhancements

2fe1284

changed formatting of public key

1b35e18

Merge branch 'master' into merge-validators-enhancements

f75ff8c

Merge branch 'master' into merge-validators-enhancements

26b5d2f

removed unused import

321c71a

Merge branch 'merge-validators-enhancements' into split-validator-dat…

e5a08ee

…abases # Conflicts: # validator/db/BUILD.bazel # validator/db/manage.go # validator/db/manage_test.go

tests and small fixes

c5e6748

extracted common functionality

7661ce3

Merge branch 'master' into split-validator-databases

3de3442

# Conflicts: # validator/accounts/account_test.go # validator/db/manage.go # validator/db/manage_test.go

rkapka requested a review from a team as a code owner May 29, 2020 16:20

rkapka requested review from rauljordan, terencechain and nisdas May 29, 2020 16:20

added missing test dependency to build file

e1bc53b

rauljordan requested changes May 29, 2020

nisdas reviewed May 31, 2020

rkapka added 3 commits May 31, 2020 20:57

added missing flags to main.go

d5b8883

applied code review suggestions

3748a65

renamed flags to avoid duplication

63a3a72

rauljordan reviewed May 31, 2020

Merge branch 'master' into split-validator-databases

e671449

rauljordan previously approved these changes May 31, 2020

rkapka added 2 commits May 31, 2020 22:15

Merge branch 'master' into split-validator-databases

de2fae5

removed redundant parenthesis

84372e1

rkapka dismissed rauljordan’s stale review via 84372e1 May 31, 2020 20:20

rauljordan previously approved these changes May 31, 2020

rkapka and others added 2 commits June 1, 2020 00:57

Merge branch 'master' into split-validator-databases

34e5330

Merge branch 'master' into split-validator-databases

788cf13

nisdas reviewed Jun 1, 2020

Merge branch 'master' into split-validator-databases

28cebef

rauljordan added 2 commits June 1, 2020 13:32

Merge branch 'master' into split-validator-databases

0aa3ba3

Merge branch 'master' into split-validator-databases

6cdb3dc

rauljordan previously approved these changes Jun 1, 2020

extracted defer errors to package-level variables

1819388

rkapka dismissed rauljordan’s stale review via 1819388 June 2, 2020 16:24

rkapka and others added 4 commits June 2, 2020 18:25

Merge branch 'master' into split-validator-databases

847332c

comply with error naming convention

e2d8b2f

removed incorrect import

09b3c12

Merge branch 'master' into split-validator-databases

c1fded2

rauljordan approved these changes Jun 2, 2020

rauljordan added the OK to merge label Jun 2, 2020

prylabs-bulldozer bot merged commit ac138ea into prysmaticlabs:master Jun 2, 2020

rkapka deleted the split-validator-databases branch June 3, 2020 15:45
