Add wtxmgr package #217

Merged
merged 4 commits into from Apr 25, 2015

Conversation

@jrick
Member

jrick commented Mar 31, 2015

This change implements a scalable wallet transaction database with integrated spend tracking, based on walletdb.

Integration is being tracked in another PR: #234

This is not the final database change related to transactions. In the future, wtxmgr will be an internal package used by waddrmgr to store transactions and manage output spend tracking per account. However, in the interest of improving the current code on master in a timely manner, wtxmgr is currently being given its own namespace. This matches with how the old transaction store was managed independent of addresses and accounts.

Known issues are marked with TODO comments. I believe all blockers have been implemented or fixed at this point, and the only remaining TODOs deal with performance issues which can be improved later (such as creating more new byte slices than necessary or parsing more details than needed).

Checklist for merge:

  • Add equivalent of txstore.Store.FindPreviousCredits (needed for votingpool to fetch PkScripts)
  • Make upgrade logic match that of newer waddrmgr (moot: just nuke the bucket when this moves to waddrmgr)
  • Increase test coverage
  • Add testable, runnable examples
@jrick

Member

jrick commented Apr 1, 2015

I'm currently adding APIs to query for all recorded details for a single transaction. This is done under a single view. At the moment the API looks like this:

type TxDetails struct {
        TxRecord
        Block   BlockMeta
        Credits []Credit
        Debits  []Debit
}

// TxDetails returns all saved details regarding a transaction.  In case of a
// hash collision, the most recent transaction with a matching hash is returned.
func (s *Store) TxDetails(txHash *wire.ShaHash) (*TxDetails, error) {
        var details *TxDetails
        err := scopedView(s.namespace, func(ns walletdb.Bucket) error {
                var err error
                details, err = s.txDetails(ns, txHash)
                return err
        })
        return details, err
}

func (s *Store) txDetails(ns walletdb.Bucket, txHash *wire.ShaHash) (*TxDetails, error) {
        // First, check whether there exists an unmined transaction with this
        // hash.  Use it if found.
        v := existsRawUnmined(ns, txHash[:])
        if v != nil {
                return s.unminedTxDetails(ns, txHash, v)
        }

        // Otherwise, if there exists a mined transaction with this matching
        // hash, skip over to the newest and begin fetching all details.
        k, v := latestTxRecord(txHash)
        if v == nil {
                // not found
                return nil, nil
        }

        // Read k/v, lookup all matching credits, debits.

        return nil, nil
}

func (s *Store) unminedTxDetails(ns walletdb.Bucket, txHash *wire.ShaHash, v []byte) (*TxDetails, error) {
        // ...
        return nil, nil
}

This will only fetch details for one transaction at a time and is not suitable for fetching entire ranges of transactions, but perhaps some of this code can be reused for that.

@jrick

Member

jrick commented Apr 1, 2015

Since the Credit type already implemented includes details which are duplicates of the TxRecord, I'm going to add some new types:

type CreditRecord struct {
        Index  uint32
        Spent  bool
        Change bool
}

type DebitRecord struct {
        Amount btcutil.Amount
        Index  uint32
}

type TxDetails struct {
        TxRecord
        Block   BlockMeta
        Credits []CreditRecord
        Debits  []DebitRecord
}

Further details about the transaction inputs and outputs can then be accessed by indexing the wire.MsgTx slices.

@jrick

Member

jrick commented Apr 1, 2015

API added to lookup rich details regarding a single transaction. I'll update the integration branch to use this for the gettransaction RPC.

Longer term, we'll want to move ToJSON out of this package and use these types instead (but not the new method, since we know the block or if it's unmined). In fact, it may make sense to do that now, considering ToJSON is known broken when passed an unconfirmed transaction.

@jrick

Member

jrick commented Apr 2, 2015

All of the remaining parts in the integration branch deal with the RPC server returning details about transaction history over ranges of transactions. I am thinking of reusing the TxDetails type for an API to handle these situations as well, but instead of requesting one transaction at a time, a []TxDetails will be returned, one block at a time, over some specified range of blocks.

I have something like this in mind:

func (s *Store) RangeTransactions(begin, end int32, f func([]TxDetails)) error

err := w.TxStore.RangeTransactions(0, 300000, func(details []TxDetails) {
        // ...
})

This will allow iteration over transactions one block at a time. The height range is inclusive, and the special height -1 may be used as a high bound to include unmined transactions. The db view will be held for the entire duration of the call to RangeTransactions. If necessary, I may also pass a context type to the function so it can return early without error.

One downside here is that TxDetails includes a BlockMeta, which will be the same for every detail returned at a time. This is inefficient and cumbersome to use, so maybe I'll split the BlockMeta out of TxDetails and also return a BlockMeta from the Store.TxDetails method. Another downside is that the view may be held for a long time as almost every detail for every transaction in this range is returned and then processed by the caller. While multiple views can run concurrently, writes will block, and this will stop all syncing until all the views return. However, the alternative of returning details from both before and after a change is probably worse.

@davecgh

Member

davecgh commented Apr 2, 2015

I'll take a look at this tomorrow.

@jrick

Member

jrick commented Apr 2, 2015

@davecgh informed me that, with bolt, a single Update may begin while Views are active, because pages are copy-on-write. So a long-running view isn't as much of a problem as I explained above.

@jrick

Member

jrick commented Apr 3, 2015

APIs settled (for now :)) and integration branch is compiling.

@davecgh

Member

davecgh commented Apr 3, 2015

Probably not for the PR, but we did discuss moving the whole legacy folder under internal so it's no longer externally accessible.

EDIT: Actually, I'll just make a separate issue for it now.

// key/value pairs and nested buckets in forward or backward order.
//
// This function is part of the walletdb.Bucket interface implementation.
func (b *bucket) Cursor() walletdb.Cursor {

@davecgh

davecgh Apr 3, 2015

Member

Can we break all of this Cursor functionality into a separate PR?

@jrick

jrick Apr 4, 2015

Member

yep I'll split it out, let's get that merged first.

func putMinedBalance(ns walletdb.Bucket, amt btcutil.Amount) error {
v := make([]byte, 8)
byteOrder.PutUint64(v, uint64(amt))

@davecgh

davecgh Apr 4, 2015

Member

In the grand scheme this doesn't matter much; BigEndian makes sense when doing cursor scans on keys, but shouldn't the values themselves be LittleEndian since there is an extremely high probability this will be running on a LE host arch?

@jrick

jrick Apr 4, 2015

Member

Maybe. The only reason I'd see this mattering is if you wanted to optimize the call to binary.LittleEndian.PutUintXX on little endian hosts, but it doesn't do that at the moment (at least in the source code, I haven't checked compiler output).

@davecgh

Member

davecgh commented Apr 4, 2015

One thing I noticed in the db code is that it's using a ton of magic numbers. They are commented, so it's fairly easy to follow, but I suspect it could become a pain to upgrade if a new field gets added so that, say, the 44 in the block records needs to become 48. I'd make a few constants like const blkRecTxOffset = 44.

@jrick

Member

jrick commented Apr 4, 2015

I considered that approach as well but then I have to keep a thousand different names in my head, instead of just consulting a single comment and calculating offsets in my head (which I don't find hard, perhaps others disagree).

edit: I'll also mention that those magic numbers are only used in the code directly under the comment, and not off in other files, so when there needs to be an upgrade, only one section of the code needs to be considered for the new serializations.

seek := make([]byte, 4)
byteOrder.PutUint32(seek, ^uint32(0))
c := ns.Bucket(bucketBlocks).Cursor()
return blockIterator{c: c}

@davecgh

davecgh Apr 4, 2015

Member

return blockIterator{c: c, seek: seek}

@davecgh

Member

davecgh commented Apr 4, 2015

Since the code is making assumptions about the size of the hash everywhere, I think it might be worthwhile to add an init panic if it's not. Something like:

func init() {
    if wire.HashSize != 32 {
        panic("hash size is not 32 bytes as expected.")
    }
}

That way the code can assume 32-byte hashes (which aren't likely to change any time soon) without having any mysterious failures if the hash size ever changes in the future.

@jrick

Member

jrick commented Apr 4, 2015

I prefer static assertions for these cases, but sure, an init check is fine.

@davecgh

Member

davecgh commented Apr 4, 2015

Agreed a static assertion would be better. This should do the trick:

var _ [32]byte = wire.ShaHash{}

var v []byte
if rec.SerializedTx == nil {
txSize := rec.MsgTx.SerializeSize()
v = make([]byte, 8, 8+txSize)

@davecgh

davecgh Apr 4, 2015

Member

👍 Nice way here to avoid an extra copy.

// upgradeManager opens the tx store using the specified namespace or creates
// and initializes it if it does not already exist. It also provides
// facilities to upgrade the data in the namespace to newer versions.
func upgradeManager(namespace walletdb.Namespace) error {

@davecgh

davecgh Apr 4, 2015

Member

Everywhere else refers to it as the txstore. It seems that upgradeStore would be a more natural name for this.

@davecgh

davecgh Apr 4, 2015

Member

Also, it looks like this was based on the old waddrmgr upgrade code. It was changed a bit to explicitly separate the open/create functionality. It now creates the buckets at creation time instead of lazily checking/creating them on open each time. That approach also makes the upgrade process easier to handle, since new buckets are created in a specific version when doing upgrades and will be created by default for all newly created managers at the current version.

}
err = tx.Commit()
if err != nil {
str := "commit failed"

@davecgh

davecgh Apr 4, 2015

Member

We need to check how bolt handles a commit failure. I think it might require a call to .Rollback in this scenario to release the page references associated with it, but I'm not 100% sure on that.

@jrick

jrick Apr 6, 2015

Member

https://github.com/boltdb/bolt/blob/6d043164a90129f792615ee964f7b0956c2d38b7/db.go#L481-513

If the commit fails (and the tx's db pointer is not nil, I'm not sure on the exact circumstances where that is true) there is no call to rollback.

@davecgh

Member

davecgh commented Apr 4, 2015

I've finished reviewing the database code and it looks really solid overall. As mentioned before, I think it's a little heavy on the magic numbers, but the comments regarding the indices are good enough to follow what's going on.

From an implementation perspective, it provides really nice properties for highly scalable access. In particular, I like how the transaction, credit, and debit record keys share the same prefix so prefix scans can be done to efficiently find all relevant records across buckets.

// wallet's spendable balance) and are modeled using the Debit structure.
//
// Besides just saving transactions, bidirectional spend tracking is also
// performed on each credit and debit. Unlike packages such as btcdb, which

@davecgh

davecgh Apr 4, 2015

Member

This is true now, but, as you know, btcdb is currently being rewritten where it doesn't do any spend tracking on its own. Might want to avoid mentioning it here since this will no longer be true in a relatively short while.

@jrick

jrick Apr 4, 2015

Member

This entire file was lifted from the old implementation and needs to be rewritten.

// InsertTx. After an insert, credits and debits may be attached to the
// returned transaction record using the AddCredit and AddDebits methods.
//
// Example use:

@davecgh

davecgh Apr 4, 2015

Member

I'd suggest creating runnable examples via an example_test.go file (see btcjson's examples for reference). Examples in doc.go almost always end up out-of-date and non-functional over time. By creating them as runnable examples, they show up under the examples category on godoc and also have the benefit of being executed when go test runs, with their output compared to the expected output.

@davecgh

Member

davecgh commented Apr 4, 2015

I'd personally prefer the JSON bits to live outside of the txstore. That is specific to the RPC server.

@davecgh

Member

davecgh commented Apr 4, 2015

This will need a README.md before we merge it as well.

@jrick

Member

jrick commented Apr 4, 2015

The JSON stuff will move. It's here now because that's how txstore did it. The code in the PR currently needed it in this package so additional details could be queried out of the db as necessary, but in my current tree I've switched this to be a method of TxDetails and it no longer needs the Store. Since fields in the TxDetails are all exported, I plan on moving this totally out of this package and into rpcserver.go, or perhaps the wallet package if it's needed there (that package also includes too much stuff specific to the rpc server).

@davecgh

Member

davecgh commented Apr 4, 2015

I noticed the same thing about wallet yesterday while looking through it again as well (too much RPC server stuff)

@jrick

Member

jrick commented Apr 6, 2015

@gsalgado What's the plan for txstore and wtxmgr in votingpool?

@jrick jrick force-pushed the jrick:jrick_wtxmgr branch from 4747c4c to 789cd79 Apr 17, 2015

@jrick

Member

jrick commented Apr 17, 2015

Rebased over hashing API changes.

@jrick jrick force-pushed the jrick:jrick_wtxmgr branch from 789cd79 to c81a607 Apr 20, 2015

@jrick jrick force-pushed the jrick:jrick_wtxmgr branch from c81a607 to 2877a05 Apr 20, 2015

// // Handle error
// }
//
// The elem's Spent field is not set to true if the credits is spent by an

@davecgh

davecgh Apr 24, 2015

Member

credits -> credit

// unmined transaction. To check for this case:
//
// k := canonicalOutPoint(&txHash, it.elem.Index)
// it.elem.Spent |= existsRawUnminedInput(ns, k) != nil

@davecgh

davecgh Apr 24, 2015

Member

|= should just be =

// Save the creation date of the store if it does not already
// exist.
v = ns.Get(rootCreateDate)

@davecgh

davecgh Apr 24, 2015

Member

No need to check for this. The code above already proved the namespace is empty.

}
// Write a zero balance if it does not already exist.
v = ns.Get(rootMinedBalance)

@davecgh

davecgh Apr 24, 2015

Member

Same here.

//
// Functions use the naming scheme `Op[Raw]Type[Field]`, which performs the
// operation `Op` on the type `Type`, optionally dealing with raw keys and
// values if `Raw` is used. Fetch and extract operations may only need read

@davecgh

davecgh Apr 25, 2015

Member

may only need read -> may only need to read

return details, err
}
func (s *Store) txDetails(ns walletdb.Bucket, txHash *wire.ShaHash, recKey, recVal []byte) (*TxDetails, error) {

@davecgh

davecgh Apr 25, 2015

Member

This would be better defined before it's called by TxDetails and UniqueTxDetails.

str := "saved debit index exceeds number of inputs"
return nil, storeError(ErrData, str, nil)
}

@davecgh

davecgh Apr 25, 2015

Member

Minor Nit: This one has a blank line before appending while the credits above doesn't.

@jrick

jrick Apr 25, 2015

Member

Both put a newline after a bounds check...

return &details, debIter.err
}
func (s *Store) unminedTxDetails(ns walletdb.Bucket, txHash *wire.ShaHash, v []byte) (*TxDetails, error) {

@davecgh

davecgh Apr 25, 2015

Member

This would also be better defined before it's called by TxDetails and UniqueTxDetails.

Also, the code in TxDetails and UniqueTxDetails invokes this function before txDetails, so it should ideally be defined before txDetails as well.

@jrick

jrick Apr 25, 2015

Member

I personally find putting internal implementations at the top of files to be more unreadable (what is this?? why do we need this?!?), but eh, everyone has their preferred style.

@davecgh

davecgh Apr 25, 2015

Member

The vast majority of the code base does it that way, so I'm just aiming for consistency. Any pros and cons aside, consistency trumps personal preferences.

})
}
func (s *Store) rangeUnminedTransactions(ns walletdb.Bucket, f func([]TxDetails) (bool, error)) (brk bool, err error) {

@davecgh

davecgh Apr 25, 2015

Member

Same deal as above. Better defined before RangeTransactions which uses it.

@davecgh

davecgh Apr 25, 2015

Member

No need for the named returns here.

@jrick

jrick Apr 25, 2015

Member

The named returns act as a form of documentation (brk being short for break, because that is a keyword). When true, it breaks out of the range.

Are they necessary? no. Do they hurt? no. Would a comment be better? maybe. But if you read the file in order you already have seen exactly how the function is being used :)

@jrick

jrick Apr 25, 2015

Member

I'll make it a comment though, to avoid the pointless debate.

@davecgh

davecgh Apr 25, 2015

Member

Thanks.

For what it's worth, I had no idea that brk was intended to mean break when reading it until I read the implementation. Had I been looking at that in godoc (which I often do locally with the internal flag enabled so I can see all unexported functions), I would not have known what it was for since there is no comment.

As you said, no reason to debate it, but I did want to point out that it did not have the desired effect.

return false, err
}
func (s *Store) rangeBlockTransactions(ns walletdb.Bucket, begin, end int32, f func([]TxDetails) (bool, error)) (brk bool, err error) {

@davecgh

davecgh Apr 25, 2015

Member

Same deal as above. Better defined before RangeTransactions which uses it.

@davecgh

davecgh Apr 25, 2015

Member

No need for the named returns here.

}
}
// TODO: increment credit amount for each credit (but those are unknown

@davecgh

davecgh Apr 25, 2015

Member

Reminder. There are several other TODOs, but they're only optimizations. This is logic related.

@jrick

jrick Apr 25, 2015

Member

It's not (kinda..), although that isn't very clear from just this comment. It's referring to this TODO from AddCredits:

// TODO(jrick): This should not be necessary.  Instead, pass the indexes
// that are known to contain credits when a transaction or merkleblock is
// inserted into the store.

Ideally credits would be marked during the transaction insert, not afterwards in a different DB transaction. It just doesn't work this way yet (do we want to finish that for this PR?)

@davecgh

davecgh Apr 25, 2015

Member

Alright. Nah, let's get this merged. There are a lot of other blocked PRs waiting on it, and we can do these minor optimizations later.

err := ns.Bucket(bucketUnmined).ForEach(func(k, v []byte) error {
// TODO: Parsing transactions from the db may be a little
// expensive. It's possible the caller only wants the
// serialized tranasctions.

@davecgh

davecgh Apr 25, 2015

Member

tranasctions -> transactions

return txs, err
}
func (s *Store) unminedTxs(ns walletdb.Bucket) ([]*wire.MsgTx, error) {

@davecgh

davecgh Apr 25, 2015

Member

Define before use please.

func (s *Store) unminedTxs(ns walletdb.Bucket) ([]*wire.MsgTx, error) {
var unmined []*wire.MsgTx
err := ns.Bucket(bucketUnmined).ForEach(func(k, v []byte) error {
// TODO: Parsing transactions from the db may be a little

@davecgh

davecgh Apr 25, 2015

Member

Perhaps provide a separate func UnminedTxsRaw() ([][]byte, error) {?

@jrick

jrick Apr 25, 2015

Member

There aren't any callers that would be able to use this yet. btcrpcclient forces transactions to be passed as a *wire.MsgTx.

@davecgh

Member

davecgh commented Apr 25, 2015

This looks ready to me. OK.

@jrick jrick force-pushed the jrick:jrick_wtxmgr branch from e12a191 to 0087d38 Apr 25, 2015

@conformal-deploy conformal-deploy merged commit 0087d38 into btcsuite:master Apr 25, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@jrick jrick deleted the jrick:jrick_wtxmgr branch Apr 25, 2015

bucketBlocks, 44, len(v))
return nil, storeError(ErrData, str, nil)
}
newv := append(v[:len(v):len(v)], txHash[:]...)

@tuxcanfly

tuxcanfly Apr 27, 2015

Contributor

Can't this be simplified to append(v, ...?

@davecgh

davecgh Apr 27, 2015

Member

It would not be guaranteed to be new in that case.

http://play.golang.org/p/aWoBXnZ-em

@tuxcanfly

tuxcanfly Apr 27, 2015

Contributor

Thanks, got it.

@jrick

jrick Apr 27, 2015

Member

This is necessary because otherwise you may be writing to invalid memory. The v parameter in this case will often be the value bytes directly returned from bolt. And since slices returned from bolt have caps greater than their lengths, by not limiting the cap during an append, you would end up writing to who-knows-where. Best case scenario here would be a segfault. Worst case would be silently continuing while corrupting the DB.

@tuxcanfly

tuxcanfly Apr 27, 2015

Contributor

Thanks, yeah I realize now the third index is for limiting the cap.

alexlyp added a commit to alexlyp/btcwallet that referenced this pull request May 27, 2016

alexlyp added a commit to alexlyp/btcwallet that referenced this pull request Jun 17, 2016

alexlyp added a commit to alexlyp/btcwallet that referenced this pull request Jun 24, 2016
