Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ledger: move accountdb to its own package #4428

Closed
wants to merge 30 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Aug 18, 2022

Summary

This PR moves accountdb, tests, LRU caches and all related stuff into its own package.
Some data types were exported as interfaces, others were hidden.
There are three types of DB access functions:

  1. Prepared queries for lookup
  2. Commit time functions
    • old account state loaders (AccountLoadOld and similar)
    • Rounds update
    • Accounts/Resources/Online commit functions
  3. Catchpoint state management, DB reader MT committer

Future work possible as separate PR:

  • make compact deltas more self-contained and get rid of exporting AccountLoadOld and Co
  • make accounts commit function (old/existing state vs new state comparison) part of business logic, not storage logic

Test Plan

Rely on existing tests

@ghost ghost self-assigned this Aug 18, 2022
@ghost ghost marked this pull request as draft August 18, 2022 20:42
Copy link
Contributor

@algorandskiy algorandskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass

  1. remove ledger/TestFirstStagePersistence.7233885710903406427 binary files
  2. maybe call accountdb package trackerdb ?

@@ -370,7 +371,7 @@ func printAccountsDatabase(databaseName string, fileHeader ledger.CatchpointFile
progress++
acctCount++
}
_, err = ledger.LoadAllFullAccounts(context.Background(), tx, balancesTable, resourcesTable, acctCb)
_, err = accountdb.LoadAllFullAccounts(context.Background(), tx, balancesTable, resourcesTable, acctCb)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: rewrite the function as iterator class with Next() and Close()

"github.com/algorand/go-algorand/protocol"
)

func accountsAll(tx *sql.Tx) (bals map[basics.Address]basics.AccountData, err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like can be replaced by the iterator from the suggestion from above.

"github.com/algorand/msgp/msgp"
)

type EncodedBalanceRecordV5 struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this file should go into catchpoint package.
Rationale: it does not to touch db, only local files

// accountsList contain the list of persistedAccountData, where the front ones are the most "fresh"
// and the ones on the back are the oldest.
accountsList *persistedAccountDataList
// accounts provides fast access to the various elements in the list by using the account address
// accounts provides fast access to the various elements in the list by using the account Address
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address -> address

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Something annoying that happened was that when using goland refactor, it modified comments that are unrelated to changed variables. Please let me know if you spot any more weird capitalization in the comments and I will fix them.

m.accountsList = newPersistedAccountList().allocateFreeNodes(pendingWrites)
m.accounts = make(map[basics.Address]*persistedAccountDataListNode, pendingWrites)
m.pendingAccounts = make(chan persistedAccountData, pendingWrites)
m.pendingAccounts = make(chan PersistedAccountData, pendingWrites)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a big win would be if we can keep PersistedAccountData internal.

  1. Usage in tracker/dcc : updatedPersistedAccounts []persistedAccountData could be an interface with a single method Len() and lruAccounts might have method writeAccounts taking this interface and casting to []persistedAccountData before iterating.
  2. persistedData, err = au.accountsq.lookup(addr) could also be an interface with
    Round()
    Valid()
    GetLedgerCoreAccountData()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put these interfaces in acctupdates etc, name them UpdatedAccounts, etc

// so that we can update the base account back.
dcc.updatedPersistedAccounts, dcc.updatedPersistedResources, err = accountsNewRound(tx, dcc.compactAccountDeltas, dcc.compactResourcesDeltas, dcc.compactCreatableDeltas, dcc.genesisProto, dbRound+basics.Round(offset))
dcc.updatedPersistedAccounts, dcc.updatedPersistedResources, err = accountdb.AccountsNewRound(tx, dcc.compactAccountDeltas, dcc.CompactResourcesDeltas, dcc.compactCreatableDeltas, dcc.genesisProto, dbRound+basics.Round(offset))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type UpdatedAccounts struct {
	Data interface{}
	Count uint64
}

@codecov
Copy link

codecov bot commented Sep 2, 2022

Codecov Report

Merging #4428 (3292fb7) into master (00b37f2) will decrease coverage by 1.56%.
The diff coverage is 59.84%.

@@            Coverage Diff             @@
##           master    #4428      +/-   ##
==========================================
- Coverage   55.28%   53.71%   -1.57%     
==========================================
  Files         398      398              
  Lines       50336    50288      -48     
==========================================
- Hits        27829    27013     -816     
- Misses      20178    21072     +894     
+ Partials     2329     2203     -126     
Impacted Files Coverage Δ
ledger/accountdb/accountdb.go 49.66% <ø> (ø)
ledger/accountdb/persistedaccts_list.go 92.00% <ø> (ø)
ledger/accountdb/persistedonlineaccts_list.go 92.00% <ø> (ø)
ledger/accountdb/persistedresources_list.go 92.00% <ø> (ø)
ledger/accountdb/trackerdb.go 0.00% <0.00%> (ø)
ledger/ledgercore/accountdata.go 0.00% <ø> (ø)
ledger/ledgercore/catchpointlabel.go 100.00% <ø> (ø)
ledger/ledgercore/statedelta.go 6.92% <0.00%> (-0.41%) ⬇️
ledger/voters.go 68.65% <0.00%> (-4.48%) ⬇️
ledger/accountdb/catchpoints.go 16.90% <16.90%> (ø)
... and 29 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@algorandskiy algorandskiy marked this pull request as ready for review September 13, 2022 16:16
// want to ensure that the "real" written value isn't being overridden by the value from the pending accounts.
round basics.Round
}

func (pad persistedAccountData) Addr() basics.Address {
Copy link
Contributor

@cce cce Sep 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@algorandskiy possible opportunity for fewer copies / performance — should we make these pointer receivers and then switch to returning references to persistedAccountData as PersistedAccountData implementations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. But when I tried it there is a problem with writing pointers into a cache. Having shared copies break some tests and possibly some non-test functionality

Copy link
Contributor

@icorderi icorderi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

A few suggestions for followup PRs:

  • move TrackerDB to its own package, similar to what was done with BlockDB
  • create a catchpoint package
  • create a cache package

@algorandskiy
Copy link
Contributor

This looks extremely hard to merge after boxes, closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants