Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vote bits sync #32

Closed
wants to merge 15 commits into from
Closed

Vote bits sync #32

wants to merge 15 commits into from

Conversation

chappjc
Copy link
Member

@chappjc chappjc commented Sep 27, 2016

This implements vote bits (re)synchronization/repair. Please test and review. All of the new or modified functionality is tested on a testnet pool, but there may be more attack protections that should be made (walletRPCHandlerize any of this?).

In terms of behavior:

  • On startup of dcrstakepool, a full vote bits sync of all tickets owned by the pool is performed. It will be skipped if importscript gets run since that takes a long time.
  • /tickets POST (set bits) has a short timeout so a large number of tickets does not block page loading. TODO: implement a proper timeout with a channel and time.After select.
  • A flash message will indicate that a tickets sync is in progress. No tickets are shown in this period.
  • /tickets GET checks for consistency of vote bits between wallets and launches a resync when needed. A flash message is also used in this case, and no tickets are shown (as shown in image below).

Implementation:

  • A RWMutex is used so multiple readers can access ticket vote bits data, while a single writer can lock before doing a set of any vote bits.
  • An atomic is used to make a kind of "try lock" that the /tickets page can check without blocking.

It needs a dcrrpcclient version change for the new GetTickets RPC.

Issues:

  • getstakeinfo's immature includes unmined, whereas gettickets does not include mempool. This prevents the startup check from passing int(gsi.Live+gsi.Immature) != numLiveTickets while tickets are in mempool. All is well after tickets get mined. Workaround is to avoid such a check, and only check that the wallets agree with eachother w.r.t. gettickets.
  • On startup, if wallets are out of sync w.r.t. address indexes or scripts, then it is likely that SyncVoteBits will either report different numbers of tickets (before they are processing the import/sync) or report -4: syncing... here Rather than quit, the solution is to just skip the full ticket vote bits sync (check and repair). The vote bits are repaired on-demand via /tickets.
  • Look like at least 33% of this PR is done.

Startup auto-repair:
resync

User changing vote bits, if process exceeds 3 seconds:
bitschange

@@ -1117,6 +1313,11 @@ func newWalletSvrManager(walletHosts []string, walletCerts []string, walletUsers
return nil, err
}

err = wsm.SyncVoteBits()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be syncing address indexes and/or redeemscripts at this point. Need a wait loop? Otherwise you need to let it quit and try again later.

@chappjc
Copy link
Member Author

chappjc commented Sep 28, 2016

Updated PR with issues originally identified in comments.

Copy link
Contributor

@jolan jolan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you anticipate having time to work on the TODO items or it should I consider it as-is? I need to test and review in depth but so far it looks like a good enough start that it could be merged as-is and worked on a bit more by me or you.

func (controller *MainController) TicketsPost(c web.C, r *http.Request) (string, int) {
w := controller.rpcServers

// If already processing let /ticktets handle this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@@ -29,7 +29,7 @@ const (
// appBuild is defined as a variable so it can be overridden during the build
// process with '-ldflags "-X main.appBuild foo' if needed. It MUST only
// contain characters from semanticAlphabet per the semantic versioning spec.
var appBuild string
var appBuild = "Bravo"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want this changed since people should use the method in the comment, i.e.:

go build -ldflags "-X main.appBuild=Bravo"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left over. Will fix

@chappjc
Copy link
Member Author

chappjc commented Oct 3, 2016

I don't have any further plans.

@jolan
Copy link
Contributor

jolan commented Oct 4, 2016

I started testing this with dcrd/dcrwallet master and it's shaking out some bugs in those so will have to wait a bit. (Which means I will probably merge some other PRs and you'll need to rebase).

@jolan
Copy link
Contributor

jolan commented Oct 4, 2016

Hmm, I haven't been able to successfully test this on my testnet setup since it seemingly chokes on immature tickets.

I have dcrticketbuyer buying 3 tickets per block which go through the stake pool. I consistently hit errors on the immature tickets:

15:00:05 2016-10-04 [INF] DCRS: Beginning resync of vote bits for 1494 tickets.
Fatal error in controller init: -4: ticket ccb66efd102e2946dc9268568366bfb9090fc6c37e26af36c2f311d201ec0512 not found in store

16:04:40 2016-10-04 [INF] DCRS: Beginning resync of vote bits for 1491 tickets.
Fatal error in controller init: -4: ticket 26830a6097e53455ccc530215f1aa536ac788384f87c9b57b60d6afd5712b949 not found in store

16:06:45 2016-10-04 [INF] DCRS: Beginning resync of vote bits for 1492 tickets.
Fatal error in controller init: -4: ticket 6406b7af56092b033555744433c3565b40bd35276c97cd4b7f75095c8b213827 not found in store

16:10:46 2016-10-04 [INF] DCRS: Beginning resync of vote bits for 1494 tickets.
Fatal error in controller init: -4: ticket 01876be56c8fa86a1e84bcb477a2a0c49048a69617c7d3da0795f7dff164c220 not found in store

@chappjc
Copy link
Member Author

chappjc commented Oct 4, 2016

Haven't seen that. Maybe this requires txindex and addrindex? Are these unconfirmed/mempool or just immature?

@chappjc
Copy link
Member Author

chappjc commented Oct 4, 2016

I'll fire up my testnet pool and try to start and stop while buying tickets.

@jolan
Copy link
Contributor

jolan commented Oct 4, 2016

I'm not totally sure what's going on myself yet since I've been testing various other things and jumping back and forth between those and this.

It does seem to be just be immature tickets so far.

I have seen it happen with:

  • tickets that were just mined (1 confirmation) where the tickets show up in gettickets but not getticketsvotebits.
  • tickets that show in both gettickets and getticketsvotebits.

This was with latest dcrd/dcrwallet master so there could have been some sort of behavior change if you tested with a different version.

I'll re-test tomorrow with more focus on the details.

@chappjc
Copy link
Member Author

chappjc commented Oct 5, 2016

Oh, I recall that some functions will not list tickets until they have 2 confirms (validated as well as mined). rpctest revealed this, but I don't recall the details.

@cjepson
Copy link
Contributor

cjepson commented Oct 13, 2016

setticketsvotebits is being added, there's a PR now for it in dcrd:

decred/dcrd#422

This will let you set the votebits of all tickets on sync in a signal call, which should significantly improve performance. It will also make user updates much less brittle.

@chappjc
Copy link
Member Author

chappjc commented Oct 13, 2016

@cjepson I saw that go in. Thanks. I'll update this PR

@chappjc
Copy link
Member Author

chappjc commented Oct 14, 2016

@cjepson In tickets POST, I agree setting all in one call is a greate change, but I'm not sure it's what we want on re-sync. To detect mismatch, there's a loop over the tickets anyway. Unless most or all of the tickets are out of sync, it's not clear a full set is best.

@chappjc
Copy link
Member Author

chappjc commented Oct 14, 2016

@jolan GetTickets(true) returns unmined in the set, which causes the error on startup when ticket purchases are still unconfirmed. The solution may not be the most efficient, but it filters out those unconfirmed tickets so that sync works on startup.

@chappjc
Copy link
Member Author

chappjc commented Oct 14, 2016

@jolan @cjepson I'm running a branch with this (updated) PR, plus #40, #39, and #26.
All seems OK.

@chappjc
Copy link
Member Author

chappjc commented Oct 19, 2016

This whole vote bits issue is driving me crazy. People just screw with this relentlessly. The PR prevents desync, but it's still inefficient. With the last commits today, it should be OK, but the other repos need the implementations to support setticketsvotebits.

@chappjc
Copy link
Member Author

chappjc commented Oct 31, 2016

On account of rebase to master, needs re-testing.

Strict checks for user id errors to avoid possible panics.

If loop over tickets to set vote bits fails on NewHAshFromStr, just continue to set for valid tickets.  Return StatusOK if no errors and /tickets destination.

Set as Bravo build.
Rename GetLiveUserTickets to GetUnspentUserTickets.  The tickets from legacyrpc.gettickets() include live and immature.  Stakepooluserinfo status "live" also includes live an immature.  But getstakeinfo has "live" and "immature" seperate.

Also tweak some logic for efficiency.
-Add flash message to notify of resync in progress on /tickets page.  Update flash before parsing template.
-Add 3 second timeout for POST on /tickets.  Use goroutine with loop over setvotebits -- this should probably be refactored.
-Make /tickets POST check atomic flag for sync in progress and forward to /tickets GET for handling.
-Fx uninit map.
-Fix live+immature == numLiveTickets.
Only set vote bits for live tickets, was mistakenly all including already voted.

Log sync errors, such as cooldown, in walletRPCHandler loop.  Use channel to get error out of goroutine that does syncing in response to /tickets POST.

Address startup issues:
- Only consider mined tickets, not mempool.
- Check if importscript or other sync in progress before attempting vote bits sync.  New CheckWalletsReady().
TODO: Figure out cooldown in the context of batch set.
@jolan
Copy link
Contributor

jolan commented Nov 11, 2016

Can you please rebase this yet again? 😘

We would like this functionality and it can finally be given the proper testing/attention now.

@chappjc
Copy link
Member Author

chappjc commented Nov 11, 2016

In the next hour or two, yup.

@chappjc
Copy link
Member Author

chappjc commented Nov 11, 2016

@jolan Oh, what do I do about my deleted repo? Start a new PR from a new fork?

@jolan
Copy link
Contributor

jolan commented Nov 11, 2016

Oh right :( Yeah that would work best. I'll close this one.

@jolan jolan closed this Nov 11, 2016
@chappjc chappjc mentioned this pull request Nov 11, 2016
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants