New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server/auth: scoring for offline users must load history from DB #1083
Conversation
if !found { | ||
outcomes = newLatestMatchOutcomes(scoringMatchLimit) | ||
auth.matchOutcomes[user] = outcomes | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the gist of the bug. I suspect it was written (by me) with the thinking "no known outcomes, so this must be a new user with no previous outcomes" but the absence of an entry really means offline user.
// Make outcome entries for the user to optimize subsequent outcomes calls | ||
// while they are disconnected? This could lead to adding duplicate outcomes | ||
// with a concurrent connect/login or subsequent outcomes while offline. | ||
// | ||
// auth.matchOutcomes[user] = matchOutcomes | ||
// auth.preimgOutcomes[user] = piOutcomes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be safe with the new duplicate checks in the outcome structs' add
methods, but it's a little messy and these map entries are assumed not to be present for a disconnected user. (The entries would remain in the map for eternity if they never logged back in.)
Even for the largest matches table so far, the loadUserOutcomes
query is ~30ms on my (admittedly fast) machine and ~200ms on a closer-to-production-equivalent machine. Even the occasional string of a dozen back-to-back outcomes for offline users (match failures or preimage misses) should be tolerable. We can experiment with this optimization or even separate outcome maps if it becomes slow to hit the DB for each outcome for an offline user.
server/auth/auth.go
Outdated
} else if score < int32(auth.banScore) && !open { | ||
if err = auth.Unban(user); err == nil { | ||
log.Warnf("Restoring suspended account %v (score = %d).", acctInfo.ID, score) | ||
} else { | ||
log.Errorf("Failed to restore suspended account %v (score = %d): %v.", | ||
acctInfo.ID, score, err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
banScore
is an operator setting, but even so this branch would have automated account restoration on reconnect in the event that this bug manifested.
// Register the preimage miss violation, adjusting the user's score. | ||
m.auth.MissedPreimage(ord.User(), epochEnd, ord.ID()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved below the DB storage of the preimage miss.
// Credit the user for completing the swap, adjusting the user's score. | ||
if actor.user != counterParty.user || newStatus == order.MatchComplete { // if user is both sides, only credit on MatchComplete (taker done too) | ||
s.authMgr.SwapSuccess(actor.user, db.MatchID(match.Match), match.Quantity, redeemTime) // maybe call this in swapDone callback | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fairly inconsequential for a success, but moved below the DB store for consistency. Note that the relevant callback for match failure outcomes is the Inaction
call in (*Swapper).failMatch
, which was already called after the DB update via db.SetMatchInactive
.
An alternative approach that moves the responsibility squarely into the auth manager was to still attempt adding the outcome after loading match history from DB, allowing the add
method to filter out a duplicate add if it was already in the DB. Still may consider this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and working well for me.
Also verified fix on simnet. And deployed on dex-test (testnet). Will merge shortly to include in release-v0.2 |
Tests updated for the auto-unban change, with minor fix applied to avoid need to reconnect in this case. |
This resolves an elusive and long-standing issue where users would seemingly not have their score offset by match successes in their history. The cause is that offline users do not have entries in the
AuthManager
'smatchOutcomes
andpreimageOutcomes
maps, butregisterMatchOutcome
andregisterPreimageOutcome
, which are often called when a user is offline, handle the lack of map entries by creating fresh histories that just included the outcome being registered at the time.This PR resolves the issue by loading the user's match and preimage history from DB when an outcome is registered while they are offline and these outcome tracking map entries are non-existent. This involves some minor refactoring to the
loadUserScore
method to separate the DB loading component (loadUserOutcomes
) from score computation (integrateOutcomes
) and insertion of outcome map entries. I considered just deferring account scoring until the next login, but it is important for unbooking orders of suspended users.This also modifies
handleConnect
so that an account will automatically be reinstated if computed user score indicates their account should not be closed but it is. This is not just for bug recovery, but to allow operator changing the ban score threshold without having to manually apply the change to suspended accounts.This also modifies the
add
methods of both latest outcome tracking structures so that it is not possible to add duplicate matches or orders. This is done to ensure concurrent login/connect and match or swap outcome registration cannot create duplicate entries. With small outcome slices (~100 elements) this dumb search is inexpensive.