Add wtxmgr package #217
Conversation
I'm currently adding APIs to query for all recorded details for a single transaction. This is done under a single view. At the moment the API looks like this: type TxDetails struct {
TxRecord
Block BlockMeta
Credits []Credit
Debits []Debit
}
// TxDetails returns all saved details regarding a transaction. In case of a
// hash collision, the most recent transaction with a matching hash is returned.
func (s *Store) TxDetails(txHash *wire.ShaHash) (*TxDetails, error) {
var details *TxDetails
err := scopedView(s.namespace, func(ns walletdb.Bucket) error {
var err error
details, err = s.txDetails(ns, txHash)
return err
})
return details, err
}
func (s *Store) txDetails(ns walletdb.Bucket, txHash *wire.ShaHash) (*TxDetails, error) {
// First, check whether there exists an unmined transaction with this
// hash. Use it if found.
v := existsRawUnmined(ns, txHash[:])
if v != nil {
return s.unminedTxDetails(ns, txHash, v)
}
// Otherwise, if there exists a mined transaction with this matching
// hash, skip over to the newest and begin fetching all details.
k, v := latestTxRecord(txHash)
if v == nil {
// not found
return nil, nil
}
// Read k/v, lookup all matching credits, debits.
return nil, nil
}
func (s *Store) unminedTxDetails(ns walletdb.Bucket, txHash *wire.ShaHash, v []byte) (*TxDetails, error) {
// ...
} This will only fetch details for one transaction at a time and is not suitable for fetching entire ranges of transactions, but perhaps some of this code can be reused for that.
Since the Credit type already implemented includes details which are duplicates of the TxRecord, I'm going to add some new types: type CreditRecord struct {
Index uint32
Spent bool
Change bool
}
type DebitRecord struct {
Amount btcutil.Amount
Index uint32
}
type TxDetails struct {
TxRecord
Block BlockMeta
Credits []CreditRecord
Debits []DebitRecord
} Further details about the transaction inputs and outputs can then be accessed by indexing the
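To illustrate the idea behind the slimmer record types: the store only keeps the output index and flags, and full output details are recovered by indexing the transaction's own outputs. This sketch uses simplified stand-ins for the wire types (TxOut, MsgTx, and creditValue are illustrative placeholders, not the actual API):

```go
package main

import "fmt"

// Minimal stand-ins for wire.TxOut / wire.MsgTx; the real package would
// use github.com/btcsuite/btcd/wire types instead.
type TxOut struct {
	Value    int64
	PkScript []byte
}
type MsgTx struct {
	TxOut []*TxOut
}

// CreditRecord mirrors the proposed type: only the output index and
// flags are stored, avoiding duplication of data already in TxRecord.
type CreditRecord struct {
	Index  uint32
	Spent  bool
	Change bool
}

// creditValue recovers a credit's amount by indexing the transaction's
// outputs with the stored index.
func creditValue(tx *MsgTx, rec CreditRecord) int64 {
	return tx.TxOut[rec.Index].Value
}

func main() {
	tx := &MsgTx{TxOut: []*TxOut{{Value: 5000}, {Value: 12345}}}
	rec := CreditRecord{Index: 1, Change: true}
	fmt.Println(creditValue(tx, rec)) // 12345
}
```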
API added to lookup rich details regarding a single transaction. I'll update the integration branch to use this. Longer term, we'll want to move ToJSON out of this package and use these types instead (but not the new method, since we know the block or if it's unmined). In fact, it may make sense to do that now, considering ToJSON is known broken when passed an unconfirmed transaction.
All of the remaining parts in the integration branch deal with the RPC server returning details about transaction history over ranges of transactions. I am thinking of reusing this code. I have something like this in mind: func (s *Store) RangeTransactions(begin, end int32, f func([]TxDetails)) error
err := w.TxStore.RangeTransactions(0, 300000, func(details []TxDetails) {
// ...
}) This will allow iteration over transactions one block at a time. The height range is inclusive, and the special height -1 may be used as a high bound to include unmined transactions. The db view will be held for the entire duration of the call to RangeTransactions. If necessary, I may also pass a context type to the function so it can return early without error. One downside here is how long the db view is held open.
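A rough self-contained sketch of the proposed iteration contract, with stand-in types in place of the real store (the callback here returns a stop flag and error, matching the shape the range helpers later settle on; all names are illustrative):

```go
package main

import "fmt"

// store is a hypothetical stand-in: transactions grouped by block
// height. The real implementation reads these from the walletdb
// namespace inside a single db view.
type store struct {
	blocks  map[int32][]string // height -> tx hashes (simplified)
	heights []int32            // sorted mined block heights
	unmined []string
}

// rangeTransactions calls f once per block with that block's details.
// The height range is inclusive, and end == -1 extends the range
// through unmined transactions. Returning true from f stops iteration
// early without error.
func (s *store) rangeTransactions(begin, end int32, f func([]string) (bool, error)) error {
	for _, h := range s.heights {
		if h < begin || (end != -1 && h > end) {
			continue
		}
		if stop, err := f(s.blocks[h]); err != nil || stop {
			return err
		}
	}
	if end == -1 && len(s.unmined) > 0 {
		_, err := f(s.unmined)
		return err
	}
	return nil
}

func main() {
	s := &store{
		blocks:  map[int32][]string{100: {"a"}, 101: {"b", "c"}},
		heights: []int32{100, 101},
		unmined: []string{"d"},
	}
	var seen []string
	s.rangeTransactions(0, -1, func(txs []string) (bool, error) {
		seen = append(seen, txs...)
		return false, nil
	})
	fmt.Println(seen) // [a b c d]
}
```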
I'll take a look at this tomorrow.
@davecgh informed me that, with bolt, a single Update may begin while Views are active. This is because pages are COW. So a long running view isn't as much of a problem as I explained above.
APIs settled (for now :)) and integration branch is compiling.
Probably not for the PR, but we did discuss moving the whole legacy folder under internal so it's no longer externally accessible. EDIT: Actually, I'll just make a separate issue for it now.
// key/value pairs and nested buckets in forward or backward order.
//
// This function is part of the walletdb.Bucket interface implementation.
func (b *bucket) Cursor() walletdb.Cursor { |
davecgh
Apr 3, 2015
Member
Can we break all of this Cursor functionality into a separate PR?
jrick
Apr 4, 2015
Author
Member
yep I'll split it out, let's get that merged first.
func putMinedBalance(ns walletdb.Bucket, amt btcutil.Amount) error {
v := make([]byte, 8)
byteOrder.PutUint64(v, uint64(amt)) |
davecgh
Apr 4, 2015
Member
In the grand scheme, this doesn't matter much, but the reason for BigEndian makes sense when doing cursor scans on keys, but shouldn't the values themselves be LittleEndian since there is an extremely high probability this will be running on a LE host arch?
jrick
Apr 4, 2015
Author
Member
Maybe. The only reason I'd see this mattering is if you wanted to optimize the call to binary.LittleEndian.PutUintXX
on little endian hosts, but it doesn't do that at the moment (at least in the source code, I haven't checked compiler output).
One thing I noticed in the db code is that it's using a ton of magic numbers. They are commented so it's fairly easy to follow, but I suspect it could become a pain to upgrade if a new field gets added and, say, the 44 in the block records needs to become 48. I'd make a few constants like
I considered that approach as well but then I have to keep a thousand different names in my head, instead of just consulting a single comment and calculating offsets in my head (which I don't find hard, perhaps others disagree). edit: I'll also mention that those magic numbers are only used in the code directly under the comment, and not off in other files, so when there needs to be an upgrade, only one section of the code needs to be considered for the new serializations.
seek := make([]byte, 4)
byteOrder.PutUint32(seek, ^uint32(0))
c := ns.Bucket(bucketBlocks).Cursor()
return blockIterator{c: c} |
davecgh
Apr 4, 2015
Member
return blockIterator{c: c, seek: seek}
Since the code is making assumptions about the size of the hash everywhere, I think it might be worthwhile to add an init panic if it's not. Something like: func init() {
if wire.HashSize != 32 {
panic("hash size is not 32 bytes as expected.")
}
} That way the code can assume 32-byte hashes (which aren't likely to change any time soon) without having any mysterious failures if the hash size ever changes in the future.
I prefer static assertions for these cases, but sure, an init check is fine.
Agreed a static assertion would be better. This should do the trick: var _ [32]byte = wire.ShaHash{}
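The difference between the two approaches is when the check fires: the init panic fails at program start, while the static assertion fails at build time. A self-contained sketch of the static version, using a stand-in type in place of wire.ShaHash:

```go
package main

import "fmt"

// shaHash stands in for wire.ShaHash, which is declared as [32]byte.
type shaHash [32]byte

// Static assertion: this var declaration only compiles if shaHash is
// assignable to [32]byte, so a future change to the hash size breaks
// the build instead of causing mysterious runtime failures.
var _ [32]byte = shaHash{}

func main() {
	fmt.Println(len(shaHash{})) // 32
}
```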
var v []byte
if rec.SerializedTx == nil {
txSize := rec.MsgTx.SerializeSize()
v = make([]byte, 8, 8+txSize) |
davecgh
Apr 4, 2015
Member
👍 Nice way here to avoid an extra copy.
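The trick being praised can be sketched in isolation: allocating the value slice with length 8 (for the header) but capacity 8+txSize means the later append of the serialized tx fills the existing backing array instead of reallocating and copying. Names here are illustrative, not the package's actual helpers:

```go
package main

import "fmt"

// serializeTxRecord mimics the pattern above: an 8-byte header slot
// followed by the serialized transaction, with the capacity chosen up
// front so append never reallocates.
func serializeTxRecord(received uint64, serializedTx []byte) []byte {
	v := make([]byte, 8, 8+len(serializedTx))
	// A real implementation would encode the header here, e.g.
	// byteOrder.PutUint64(v, received).
	_ = received
	return append(v, serializedTx...)
}

func main() {
	rec := serializeTxRecord(0, []byte{0xde, 0xad, 0xbe, 0xef})
	// len and cap are both 12: the append reused the original
	// allocation rather than growing a new one.
	fmt.Println(len(rec), cap(rec)) // 12 12
}
```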
// upgradeManager opens the tx store using the specified namespace or creates
// and initializes it if it does not already exist. It also provides
// facilities to upgrade the data in the namespace to newer versions.
func upgradeManager(namespace walletdb.Namespace) error { |
davecgh
Apr 4, 2015
Member
Everywhere else refers to it as the txstore. It seems that upgradeStore would be a more natural name for this.
davecgh
Apr 4, 2015
Member
Also, it looks like this was based on the old waddrmgr upgrade code. It was changed a bit to explicitly separate the open/create functionality. It now creates the buckets at creation time instead of lazily checking/creating them on open each time. That approach also makes the upgrade process easier to handle since new buckets are created specifically in a certain version when doing upgrades and will be created by default for all newly created managers at the current version.
}
err = tx.Commit()
if err != nil {
str := "commit failed" |
davecgh
Apr 4, 2015
Member
We need to check how bolt handles a commit failure. I think it might require a call to .Rollback in this scenario to release the page references associated with it, but I'm not 100% sure on that.
jrick
Apr 6, 2015
Author
Member
https://github.com/boltdb/bolt/blob/6d043164a90129f792615ee964f7b0956c2d38b7/db.go#L481-513
If the commit fails (and the tx's db pointer is not nil, I'm not sure on the exact circumstances where that is true) there is no call to rollback.
jrick
Apr 6, 2015
Author
Member
https://github.com/boltdb/bolt/blob/6d043164a90129f792615ee964f7b0956c2d38b7/tx.go#L131
Rollbacks are performed by Commit on all error paths.
I've finished reviewing the database code and it looks really solid overall. As mentioned before, I think it's a little heavy on the magic numbers, but the comments regarding the indices are good enough to follow what's going on. From an implementation perspective, it provides really nice properties for highly scalable access. In particular, I like how the transaction, credit, and debit record keys share the same prefix so prefix scans can be done to efficiently find all relevant records across buckets.
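The prefix-scan property being praised can be sketched without a real database: if credit and debit keys begin with the same block/txhash prefix as the tx record, a cursor can seek to the prefix and read only matching records. This stand-in uses a sorted slice in place of a db cursor (all names are illustrative):

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// prefixScan returns all keys sharing the given prefix from a sorted
// key set. Seek-then-iterate is exactly what a db cursor does, so
// related records cluster together and are found without a full scan.
func prefixScan(keys [][]byte, prefix []byte) [][]byte {
	// Seek: binary search to the first key >= prefix.
	i := sort.Search(len(keys), func(i int) bool {
		return bytes.Compare(keys[i], prefix) >= 0
	})
	var out [][]byte
	// Iterate forward while keys still share the prefix.
	for ; i < len(keys) && bytes.HasPrefix(keys[i], prefix); i++ {
		out = append(out, keys[i])
	}
	return out
}

func main() {
	keys := [][]byte{
		[]byte("aaaa-credit0"),
		[]byte("aaaa-debit0"),
		[]byte("bbbb-credit0"),
	}
	// Both records for tx "aaaa" are found; "bbbb" is never visited.
	fmt.Println(len(prefixScan(keys, []byte("aaaa")))) // 2
}
```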
// wallet's spendable balance) and are modeled using the Debit structure.
//
// Besides just saving transactions, bidirectional spend tracking is also
// performed on each credit and debit. Unlike packages such as btcdb, which |
davecgh
Apr 4, 2015
Member
This is true now, but, as you know, btcdb is currently being rewritten where it doesn't do any spend tracking on its own. Might want to avoid mentioning it here since this will no longer be true in a relatively short while.
jrick
Apr 4, 2015
Author
Member
This entire file was lifted from the old implementation and needs to be rewritten.
// InsertTx. After an insert, credits and debits may be attached to the
// returned transaction record using the AddCredit and AddDebits methods.
//
// Example use: |
davecgh
Apr 4, 2015
Member
I'd suggest creating a runnable example via an example_test.go file (see btcjson's examples for reference). Examples in doc.go almost always end up out-of-date and non-functional over time. By creating them as runnable examples, they show up under the examples category on godoc and also have the benefit of being executed when go test runs along with comparing their output to the expected output.
I'd personally prefer the JSON bits to live outside of the txstore. That is specific to the RPC server.
This will need a
The JSON stuff will move. It's here now because that's how txstore did it. The code in the PR currently needed it in this package so additional details could be queried out of the db as necessary, but in my current tree I've switched this to be a method of
I noticed the same thing about wallet yesterday while looking through it again as well (too much RPC server stuff)
@gsalgado What's the plan for txstore and wtxmgr in votingpool?
Rebased over hashing API changes.
// // Handle error
// }
//
// The elem's Spent field is not set to true if the credits is spent by an |
davecgh
Apr 24, 2015
Member
credits -> credit
// unmined transaction. To check for this case:
//
// k := canonicalOutPoint(&txHash, it.elem.Index)
// it.elem.Spent |= existsRawUnminedInput(ns, k) != nil |
davecgh
Apr 24, 2015
Member
|= should just be =
// Save the creation date of the store if it does not already
// exist.
v = ns.Get(rootCreateDate) |
davecgh
Apr 24, 2015
Member
No need to check for this. The code above already proved the namespace is empty.
}

// Write a zero balance if it does not already exist.
v = ns.Get(rootMinedBalance) |
davecgh
Apr 24, 2015
Member
Same here.
//
// Functions use the naming scheme `Op[Raw]Type[Field]`, which performs the
// operation `Op` on the type `Type`, optionally dealing with raw keys and
// values if `Raw` is used. Fetch and extract operations may only need read
davecgh
Apr 25, 2015
Member
may only need read -> may only need to read
return details, err
}

func (s *Store) txDetails(ns walletdb.Bucket, txHash *wire.ShaHash, recKey, recVal []byte) (*TxDetails, error) { |
davecgh
Apr 25, 2015
Member
This would be better defined before it's called by TxDetails and UniqueTxDetails.
str := "saved debit index exceeds number of inputs" | ||
return nil, storeError(ErrData, str, nil) | ||
} | ||
|
davecgh
Apr 25, 2015
Member
Minor Nit: This one has a blank line before appending while the credits above doesn't.
jrick
Apr 25, 2015
Author
Member
Both put a newline after a bounds check...
return &details, debIter.err
}

func (s *Store) unminedTxDetails(ns walletdb.Bucket, txHash *wire.ShaHash, v []byte) (*TxDetails, error) { |
davecgh
Apr 25, 2015
Member
This would also be better defined before it's called by TxDetails and UniqueTxDetails. Also, the code in TxDetails and UniqueTxDetails invokes this function before txDetails, so it should ideally be defined before txDetails as well.
jrick
Apr 25, 2015
Author
Member
I personally find putting internal implementations at the top of files to be more unreadable (what is this?? why do we need this?!?), but eh, everyone has their preferred style.
davecgh
Apr 25, 2015
Member
The vast majority of the code base does it that way, so I'm just aiming for consistency. Any pros and cons aside, consistency trumps personal preferences.
})
}

func (s *Store) rangeUnminedTransactions(ns walletdb.Bucket, f func([]TxDetails) (bool, error)) (brk bool, err error) { |
davecgh
Apr 25, 2015
Member
Same deal as above. Better defined before RangeTransactions, which uses it.
davecgh
Apr 25, 2015
Member
No need for the named returns here.
jrick
Apr 25, 2015
Author
Member
The named returns act as a form of documentation (brk being short for break, because that is a keyword). When true, it breaks out of the range.
Are they necessary? no. Do they hurt? no. Would a comment be better? maybe. But if you read the file in order you already have seen exactly how the function is being used :)
jrick
Apr 25, 2015
Author
Member
I'll make it a comment though, to avoid the pointless debate.
davecgh
Apr 25, 2015
Member
Thanks.
For what it's worth, I had no idea that brk was intended to mean break when reading it until I read the implementation. Had I been looking at that in godoc (which I often do locally with the internal flag enabled so I can see all unexported functions), I would not have known what it was for since there is no comment.
As you said, no reason to debate it, but I did want to point out that it did not have the desired effect.
return false, err
}

func (s *Store) rangeBlockTransactions(ns walletdb.Bucket, begin, end int32, f func([]TxDetails) (bool, error)) (brk bool, err error) { |
davecgh
Apr 25, 2015
Member
Same deal as above. Better defined before RangeTransactions, which uses it.
davecgh
Apr 25, 2015
Member
No need for the named returns here.
}
}

// TODO: increment credit amount for each credit (but those are unknown |
davecgh
Apr 25, 2015
Member
Reminder. There are several other TODOs, but they're only optimizations. This is logic related.
jrick
Apr 25, 2015
Author
Member
It's not (kinda..), although that isn't very clear from just this comment. It's referring to this TODO from AddCredits:
// TODO(jrick): This should not be necessary. Instead, pass the indexes
// that are known to contain credits when a transaction or merkleblock is
// inserted into the store.
Ideally credits would be marked during the transaction insert, not after and in a different DB transaction. It just doesn't work this way yet (do we want to finish that for this pr?)
davecgh
Apr 25, 2015
Member
Alright. Nah, let's get this merged. There are a lot of other blocked PRs waiting on it, and we can do these minor optimizations later.
err := ns.Bucket(bucketUnmined).ForEach(func(k, v []byte) error {
// TODO: Parsing transactions from the db may be a little
// expensive. It's possible the caller only wants the
// serialized tranasctions. |
davecgh
Apr 25, 2015
Member
tranasctions -> transactions
return txs, err
}

func (s *Store) unminedTxs(ns walletdb.Bucket) ([]*wire.MsgTx, error) { |
davecgh
Apr 25, 2015
Member
Define before use please.
func (s *Store) unminedTxs(ns walletdb.Bucket) ([]*wire.MsgTx, error) {
var unmined []*wire.MsgTx
err := ns.Bucket(bucketUnmined).ForEach(func(k, v []byte) error {
// TODO: Parsing transactions from the db may be a little |
davecgh
Apr 25, 2015
Member
Perhaps provide a separate func UnminedTxsRaw() ([][]byte, error)?
jrick
Apr 25, 2015
Author
Member
There aren't any callers that would be able to use this yet. btcrpcclient forces transactions to be passed as a *wire.MsgTx.
This looks ready to me. OK.
Merged 0087d38 into btcsuite:master.
bucketBlocks, 44, len(v))
return nil, storeError(ErrData, str, nil)
}
newv := append(v[:len(v):len(v)], txHash[:]...) |
tuxcanfly
Apr 27, 2015
Contributor
Can't this be simplified to append(v, ...)?
tuxcanfly
Apr 27, 2015
Contributor
Thanks, got it.
jrick
Apr 27, 2015
Author
Member
This is necessary because otherwise you may be writing to invalid memory. The v parameter in this case will often be the value bytes directly returned from bolt. And since slices returned from bolt have caps greater than their lengths, by not limiting the cap during an append, you would end up writing to who-knows-where. Best case scenario here would be a segfault. Worst case would be silently continuing while corrupting the DB.
tuxcanfly
Apr 27, 2015
Contributor
Thanks, yeah I realize now the third index is for limiting the cap.
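The aliasing hazard discussed in this thread can be demonstrated in isolation, with a plain slice standing in for bolt's mmap'd value bytes: a plain append writes into the spare capacity behind the slice, while the full slice expression v[:len(v):len(v)] caps the slice at its length and forces append to allocate fresh memory.

```go
package main

import "fmt"

// appendNeighbor shows what append writes into the backing array when
// the source slice has spare capacity aliasing db-owned memory.
func appendNeighbor(limitCap bool) byte {
	backing := []byte{1, 2, 3, 4, 5, 6}
	v := backing[:3] // len 3, cap 6: spare capacity aliases backing[3:]
	if limitCap {
		// Full slice expression: cap == len, so append must allocate
		// a new array and backing stays untouched.
		_ = append(v[:len(v):len(v)], 9)
	} else {
		// Plain append reuses the spare capacity, clobbering backing[3].
		_ = append(v, 9)
	}
	return backing[3]
}

func main() {
	fmt.Println(appendNeighbor(false)) // 9: neighboring byte overwritten
	fmt.Println(appendNeighbor(true))  // 4: original value preserved
}
```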
This change implements a scalable wallet transaction database with integrated spend tracking, based on walletdb.
Integration is being tracked in another PR: #234
This is not the final database change related to transactions. In the future, wtxmgr will be an internal package used by waddrmgr to store transactions and manage output spend tracking per account. However, in the interest of improving the current code on master in a timely manner, wtxmgr is currently being given its own namespace. This matches how the old transaction store was managed independently of addresses and accounts.
Known issues are marked with TODO comments. I believe all blockers have been implemented or fixed at this point, and the only remaining TODOs deal with performance issues which can be improved later (such as creating more new byte slices than necessary or parsing more details than needed).

Checklist for merge:
- txstore.Store.FindPreviousCredits (needed for votingpool to fetch PkScripts)
- ~~Make upgrade logic match that of newer waddrmgr~~ just nuke the bucket when this moves to waddrmgr