Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

market,dex: trade suspension #287

Merged
merged 3 commits into from
May 14, 2020
Merged

market,dex: trade suspension #287

merged 3 commits into from
May 14, 2020

Conversation

chappjc
Copy link
Member

@chappjc chappjc commented Apr 24, 2020

This deals with trade suspension on a per-market basis, and makes Market shutdown more courteous. This does not deal with Market resume, or clean Swapper shutdown or resume, which is going to be a significant task in itself.

Notable areas of work:

  • market: Graceful suspend (the epochPump and (*Market).Run work), with optional book purge and coin unlock (the book and coinlock work).
  • db: Book purge from persistent storage (mass booked -> revoked status, plus pseudo-cancels recording the actions)
  • msgjson, market: Add the TradeSuspension message, and send it to clients from server/dex.DEX to subscribed clients via comms/Server on Market suspend command.
  • admin: market info and suspend routes
  • market: refactor book router's update signals from market

Technically, a suspend involves:

  • Setting a future epoch index to be the last, after which orders are rejected.
  • Suspend of a market means return of the Market's Run goroutine after the final scheduled epoch. Markets can start Run again, and this is done in tests with the same Market instance, but a resume command to do so is note implemented in this work.
  • When (*Market).Run begins shutting down, it always allows the closed epoch processing pipeline (preimage collection and hand off of orders Swapper) to complete/drain so no submitted orders are lost.
  • The DEX manager commands a suspend for a market via the order OrderRouter, which submits the requests to the target Market and gets the suspend request result (final epoch and suspend time). It also has access to a new (*Market).Running method to short circuit the new order processing rather than have the order rejected downstream by the Market. Proxying all suspend requests through the OrderRouter puts it in a position to track running and suspended markets, and markets with pending suspension.

Note that this also improves the regular dcrdex shutdown in response to a CTRL+C (and cancellation of all Contexts), in that a Market's Run goroutine will always allow the closed epoch processing pipeline to completely drain and no received orders are just dropped. dcrdex process interrupt is unlike a suspend in that it does not wait for the current epoch to close, but the closed epochs always complete processing. Better coordination of subsystem shutdown by the DEX manager is still needed however.

Major next steps:

  • Resume. A suspend also stops book routers presently, but this needs to be reconsidered for resume handling (do we maintain book subscriptions, or require new subs?). Also, the order router may want to track running and suspended markets to facilitate resume and be smarter about when orders may be submitted, another task for resume.
  • The Swapper needs a clean shutdown/resume solution too, which I believe will be difficult or complex to do with the pg DB, so I'm considering options to dump tracked trades, swap statuses, etc. and everything Swapper needs to resume into a gob file or similar.

@buck54321
Copy link
Member

Took a cursory look. I'll just drop some random observations here.

  • What is the motivation for leaving the market running when there is a resume epoch? Presumably, a shutdown would entail changing network parameters. Will there be a method to do that without re-initializing the entire Market? Does the case when the epoch duration itself is being adjusted need special handling?

  • When and how does a shutdown also result in a order book dump (see also persist field of suspension notification)?

  • The client will need to be informed asap after Suspend. Also need to inform the client when they connect and a suspension is imminent. These both seem like jobs for the AuthManager.

@chappjc
Copy link
Member Author

chappjc commented Apr 29, 2020

  • What is the motivation for leaving the market running when there is a resume epoch? Presumably, a shutdown would entail changing network parameters. Will there be a method to do that without re-initializing the entire Market? Does the case when the epoch duration itself is being adjusted need special handling?

The idea of this suspend is either a temporary trading halt with no market config changes, or a graceful stop of the market. Changing market settings would involve re-reading markets.json file and the complications you mentioned. I'm not shooting to address that here, but I don't want to make that work harder either.

  • When and how does a shutdown also result in a order book dump (see also persist field of suspension notification)?

The order book is already in the main DB. I'm working in another branch now to reload the book.
EDIT: as for actively purging the orders when persist=false, that's a good point I haven't addressed because all the orders are still valid here (related to your first question).

  • The client will need to be informed asap after Suspend. Also need to inform the client when they connect and a suspension is imminent. These both seem like jobs for the AuthManager.

Hadn't considered the connect modification. The suspend notification would come though, yes.

@chappjc
Copy link
Member Author

chappjc commented Apr 29, 2020

In addition to the connect message change (e.g. "suspends": ["dcr_btc": 13241324]), I suppose the market object in the the config response should get a suspendepoch and maybe resumeepoch field with omitempty, although we describe startepoch as being "Maybe in the future e.g. after maintenance" which could be problematic if the market isn't scheduled to suspend for say another day and users connecting should be able to use it freely since it's actually running.

Considering the possibility for changed market or asset parameters, a reasonable way to handle that is for the client to request a new config after the market resumes. Alternatively, the config response could be further complicated with multiple entries for the same market but with different start epochs. That doesn't sound nice though.

Another complication I see is changing asset parameters that affect all markets using that asset.

@chappjc chappjc mentioned this pull request May 4, 2020
18 tasks
@chappjc
Copy link
Member Author

chappjc commented May 4, 2020

So I don't want to dwell on this too much since Swapper resume without dropping active swaps has to get done right away, and it's a big job. This PR mainly allows the market to shutdown without dropping queued closed epochs prior to handing off to negotiation, and doing it at a specified time.

To wrap this PR up, a persist bool option needs to be added that will revoke/unbook all booked orders. The PR as it stands guarantees that there will be no orders in epoch status. Future work should cover market config reload and on-demand resume. The auto-resume option to the suspend function in this PR may be removed altogether in the future. auto resume option removed as it is not well conceived.

@chappjc
Copy link
Member Author

chappjc commented May 6, 2020

Added the persist book option, with necessary book, db, and coinlock functionality. But keeping the PR as draft while I remove the resume option from the Suspend function. I believe resume is something best handled by the dex manager by tracking running and suspended market subsystems and manually (re)launching them via their Run goroutines.

@chappjc
Copy link
Member Author

chappjc commented May 7, 2020

There is no longer a resume time because it over-complicated the Suspend function and Market's epoch cycling. Resume can be implemented by restarting Market.Run, and although this is done in the test and works, there is no resume function added here.

The admin server has /market{marketName} (for status) and /market/{marketName}/suspend?t={asSoonAsMS}&persist={bool} endpoints added:

$  dexadm() { curl --cacert ~/.dcrdex/rpc.cert --basic -u user:pass https://127.0.0.1:6542/api/$1; }

$  dexadm market/not_a_market
unknown market "not_a_market"

$  dexadm market/dcr_btc
{
    "market": "dcr_btc",
    "running": true
}

$  dexadm market/dcr_btc/suspend?t=$(( (`date +%s`+10)*1000 ))
{
    "market": "dcr_btc",
    "finalEpoch": 264814345,
    "supendTime": "2020-05-07T21:14:36Z"
}

$  dexadm market/dcr_btc
{
    "market": "dcr_btc",
    "running": true
}

<wait>

$  date
Thu 07 May 2020 04:14:48 PM CDT

$  dexadm market/dcr_btc/suspend
market "dcr_btc" not running

$  dexadm market/dcr_btc
{
    "market": "dcr_btc",
    "running": false
}
16:14:36.000 [INF] MKT: Book router terminating for market "dcr_btc"
16:14:36.000 [DBG] MKT: epoch pump drained for market dcr_btc
16:14:36.000 [DBG] MKT: Market "dcr_btc" stopped.

@chappjc chappjc marked this pull request as ready for review May 8, 2020 14:38
Copy link
Member Author

@chappjc chappjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a bit of clean up and interactive testing to do, but the scope of the PR has gotten large enough. Market resume will be in a follow up.

Comment on lines 367 to 374
// Presently this kills the book router for this market.
// Depending on how resume is handled, the order feed may remain
// opened, allowing this router to wait for resume.
r.sendNote(route, subs, note)
break out
Copy link
Member Author

@chappjc chappjc May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some open questions about how resume will operate. We don't define a resume ntfn message, just startepoch in config, which is probably not sufficient for clients that never (re)connect. What would we like the client to do if they are subscribed to market X and it suspends? If persist=true, poll for the market status? I think the answer is a TradeResumption notification type, which this PR defines to aid in planning but does not use.

server/market/bookrouter.go Show resolved Hide resolved
server/market/orderrouter.go Show resolved Hide resolved
@chappjc
Copy link
Member Author

chappjc commented May 8, 2020

  • The client will need to be informed asap after Suspend. Also need to inform the client when they connect and a suspension is imminent. These both seem like jobs for the AuthManager.

Oh, so presently there's only the TradeSuspension message when the market actually suspends, not when it's scheduled, that should change, although resending the TradeSuspension at suspend time couldn't hurt. The connect response also does not say anything yet. Some missing pieces here...

@chappjc
Copy link
Member Author

chappjc commented May 8, 2020

@buck54321 After hacking on the comms part more, I think that communications regarding market status (running, suspend time, start/resume time) are better handled by the 'config' and 'orderbook' responses, than by 'connect'.

The 'connect' route and AuthManager are an awkward place to collect and send this information, as it has nothing to do with the user other than which markets they plan to use and that's an 'orderbook' thing. Note that we'd need similar acrobatics as with FeeChecker to get market status in AuthManager. The market and order book router have more direct access to this data, and the ability to notify of live changes to start/suspend schedule.

The 'config' route can be requested any time, unlike 'connect', but it is still on-demand.

On the other hand, the order book subscription can include live changes to market status, e.g. TradeSuspension. The initial market status can be added to the 'orderbook' response, along with the initial seq, epoch, and order book. (Also, if we keep subscriptions active after suspend, we could address notification of a newly-scheduled resume more directly than just the markets.startepoch part of the on-demand 'config' route.)

In short, I think it's best to include market status in the 'config' response as clients always pull this via (*Core).connectDEX, and in the 'orderbook' response, and send market status change messages (such as TradeSuspension) as part of an order book subscription.

@chappjc
Copy link
Member Author

chappjc commented May 8, 2020

To make the 'config' and 'orderbook' messaging suggestions above more concrete: eef7a43

@buck54321
Copy link
Member

That all seems to make sense. There is one case that I'm uncertain about, though. If a client has an active order on some market which will be shutting down, but they are not currently subscribed to the order book, they should probably still get a suspend notification.

@chappjc chappjc marked this pull request as draft May 11, 2020 13:13
@chappjc
Copy link
Member Author

chappjc commented May 11, 2020

If a client has an active order on some market which will be shutting down, but they are not currently subscribed to the order book, they should probably still get a suspend notification.

I suspect the right approach to the trade suspend notification is to send at the time it is scheduled and when the suspend actually happens, but my thinking on the above is that if they have an active order and they are connected, they will be subscribed to that market's orderbook. If they are not connected when the suspend is scheduled, the config route provides the suspend schedule... although now that I say this, the scheduled suspend would seem to be appropriate to the first orderbook response when the user subscribes to the market feed. Starting a chat in Matrix about this now...

Back in draft while I sort out the comms changes.

server/dex/dex.go Outdated Show resolved Hide resolved
@chappjc
Copy link
Member Author

chappjc commented May 11, 2020

As per issue #354, these changes send a TradeSuspension on scheduling the suspend in the server/dex.DEX manager via the server/comms.Server, and it includes scheduled suspend data in the 'config' response.

I need to retest everything, and add some test coverage for server/dex(*configResponse).setMktSuspend.

Also, the exact structure of the 'config' response, particularly the new "status" object for each market, is a little uncertain. e.g. Including the active epoch in this status object would be nice, but it would require a lot more active communication to push active epoch changes from Markets back to the DEX manager. The DEX manager could also cycle these values according to the known epoch durations and wall times, but that's probably less elegant and it also requires more work to do it.

// ResumptionRoute is the DEX-originating request-type message informing the
// client of an upcoming trade resumption. This is part of the
// subscription-based orderbook notification feed.
ResumptionRoute = "resumption"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused. Here for planning.

@@ -669,6 +690,7 @@ type OrderBook struct {
MarketID string `json:"marketid"`
Seq uint64 `json:"seq"`
Epoch uint64 `json:"epoch"`
// MarketStatus `json:"status"`// maybe
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here for consideration, but it should be redundant with 'config' response data and the TradeSuspension + TradeResumption ntfns.

Comment on lines +836 to +838
StartEpoch uint64 `json:"startepoch"`
FinalEpoch uint64 `json:"finalepoch,omitempty"`
Persist *bool `json:"persistbook,omitempty"` // nil and omitted when finalepoch is omitted
Copy link
Member Author

@chappjc chappjc May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially had a field ActiveEpoch uint64 json:"epoch", but due to the extra subsystem communication work I described this is not implemented.

dex/msgjson/types.go Show resolved Hide resolved
server/admin/api.go Outdated Show resolved Hide resolved
server/market/market.go Outdated Show resolved Hide resolved
Comment on lines 500 to 501
m.coinLockerBase.UnlockAll()
m.coinLockerQuote.UnlockAll()
Copy link
Member Author

@chappjc chappjc May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that this is not correct when there are other markets with a shared asset since these book locks are really shared asset coin lockers for just this reason. Need to selectively unlock a slice of orders...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in 19a9e77 by returning purged order IDs, and explicitly unlocking the coins for just those orders, not all coins for asset managed by the coin locker.

Comment on lines +648 to +662
// Signal to the book router of the suspend:
notifyChan <- &updateSignal{
action: suspendAction,
data: sigDataSuspend{
finalEpoch: m.suspendEpochIdx,
stopTime: epochCloseTime,
persistBook: m.persistBook,
},
}
Copy link
Member Author

@chappjc chappjc May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be necessary at all. It all depends on if the book router should stay running when the feeding market suspends, and how resume is implemented. My current thinking is that the DEX manager would handle the resume process, restarting the subsustems as it sees fit (noting that the book router's subscriptions would still persist).

}
}

func (r *OrderRouter) Suspend(asSoonAs time.Time, persistBooks bool) map[string]*SuspendEpoch {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used presently, but there is likely a need for a "suspend all" with a time rather than immediate sigint shutdown.

Comment on lines +179 to +183
// Spare some resources if the market is closed now. Any orders that make it
// through to a closed market will receive a similar error from SubmitOrder.
if !tunnel.Running() {
return msgjson.NewError(msgjson.MarketNotRunningError, "market closed to new orders")
}
Copy link
Member Author

@chappjc chappjc May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the present design, the order router commanded the suspension and thus can track the status of each market, at least nominally, so this will likely not query the MarketTunnel in the future. However, handleLimit and the other order handlers should catch orders submitted to suspended markets asap to save resources rather than sending them all the way to SubmitOrder to be bounced.

@chappjc
Copy link
Member Author

chappjc commented May 11, 2020

Updated and tested market/{marketName} admin route:

$   dexadm market/dcr_btc
{
    "market": "dcr_btc",
    "running": true,
    "epochlen": 6000,
    "activeepoch": 264871914,
    "startepoch": 264871914
}
$   dexadm market/dcr_btc/suspend
{
    "market": "dcr_btc",
    "finalepoch": 264871915,
    "supendtime": "2020-05-11T21:11:36Z"
}
$   dexadm market/dcr_btc
{
    "market": "dcr_btc",
    "running": false,
    "epochlen": 6000,
    "activeepoch": 0,
    "startepoch": 0,
    "finalepoch": 264871915,
    "persistbook": true
}
$   dexadm market/dcr_btc/suspend
market "dcr_btc" not running

@chappjc
Copy link
Member Author

chappjc commented May 12, 2020

The only remaining known issue here is the excessive coin unlocking mentioned in #287 (comment). This PR has been a bear, but I ironed out a number of related bugs and design issues in later commits, so it's worth it. Clicking ready for review, again, despite the open coin unlock issue.

@chappjc chappjc marked this pull request as ready for review May 12, 2020 22:25
Copy link
Member

@buck54321 buck54321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to go. Tests well.

Thanks for describing the admin commands with the little dexadm () function. I should add that to the wiki.

dex/msgjson/types.go Outdated Show resolved Hide resolved
type TradeResumption struct {
MarketID string `json:"marketid"`
StartEpoch uint64 `json:"startepoch"`
EpochLen uint64 `json:"epochlen,omitempty"` // maybe just ConfigChange bool `json:"configchange"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably, ConfigChange = true would trigger a new config request from every client. It would be nice if we could send only the necessary market and asset info right here instead. Nothing to change now though.

server/admin/server.go Outdated Show resolved Hide resolved
server/market/market.go Outdated Show resolved Hide resolved
Comment on lines +673 to +686
// Set the orderRouter field now since the main loop below receives on it,
// even though SubmitOrderAsync disallows sends on orderRouter when the
// market is not running.
m.orderRouter = make(chan *orderUpdateSignal, 32) // implicitly guarded by m.runMtx since Market is not running yet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're not restarting the Market anymore, I think this can move back to the constructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup now supports starting the market again with Run/Start with the same Market instance. In TestMarket_Suspend I did this as a proof of concept: https://github.com/decred/dcrdex/pull/287/files#diff-92957aac6dd698f2d846e3b99ed7281dR465-R499 Possibly this will not work out in resume, but a subsystem manager in DEX that can simple restart the same market instance is the idea.

Comment on lines -234 to +283
switch u.action {
case newEpochAction:
switch sigData := u.data.(type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct to say that updateSignal.action is just used for logging now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, and debugging I suppose. Really the type switch is all that's needed.

server/market/market.go Outdated Show resolved Hide resolved
Refactor epochPump (into epump.go) for suspend support.

Add Market.Suspend and Market.Running.

When persistBook is false, in-memory and persistent books are purged.
Add the PurgeBook method to Market to do this. PurgeBook calls the new
OrderArchiver.FlushBook method, clears the in-memory book, and unlocks
all coins. TODO: unlock just the book orders, not possible epoch orders.

Run shutdown sequence is now meticulous.

Use internal notifyChan with lifetime of Run, and async subscriber
broadcasts via the closure goroutine inside Run.

db: add OrderArchiver.FlushBook

Add the FlushBook method to the pg.Archiver type.  It changes all orders
for a market that have status booked to status revoked, and it creates
and stores the server-generated pseudocancel orders. It does this all in
a single db transaction.  This is similar to calling RevokeOrder
individually on all booked orders.

The new PurgeBook query template is added to do the bulk book-revoked
status change for a market.

coinlock: Add UnlockAll to CoinLocker interface.

book: add Clear method to Book

market: rework bookUpdateSignal -> updateAction+sigData

Remove bookUpdateSignal, which combined data used for many different
signals, and create updateSignal with action and data fields. Receivers
will type cast data into one of several sigData types that contain only
the data that is relevant to the action at hand.

msgjson, market: add TradeSuspension message

(*BookRouter).runBook now sends msgjson.TradeSuspension for
msgjson.SuspensionRoute when the market sends a sigDataSuspend
on the order feed.  (*Market).Run sends this signal when cycleEpoch hits
the target suspend epoch index.

market,dex: SuspendMarket methods for OrderRouter and DEX mgr

Add (*OrderRouter).SuspendMarket, called from the DEX manager in the new
server/dex.(*DEX) SuspendMarket method.

server/admin: suspend and market info routes

msgjson: suspend data in config

server/dex: on suspend update config resp and send TradeSuspension

market: Add Status type and (*Market).Status

dex: add MarketStatus and MarketStatuses methods for admin API

admin: add /markets

The markets route summarizes configured markets.  e.g.

$   dexadm markets
{
    "btc_ltc": {
        "running": true,
        "epochlen": 6000,
        "activeepoch": 264872312,
        "startepoch": 264872312
    },
    "dcr_btc": {
        "running": true,
        "epochlen": 7000,
        "activeepoch": 227033410,
        "startepoch": 227033410
    }
}

$   dexadm market/dcr_btc/suspend
{
    "market": "dcr_btc",
    "finalepoch": 227033424,
    "supendtime": "2020-05-11T21:52:55Z"
}

$   dexadm markets{
    "btc_ltc": {
        "running": true,
        "epochlen": 6000,
        "activeepoch": 264872328,
        "startepoch": 264872328
    },
    "dcr_btc": {
        "running": true,
        "epochlen": 7000,
        "activeepoch": 227033424,
        "startepoch": 227033424,
        "finalepoch": 227033424,
        "persistbook": true
    }
}

$   dexadm markets
{
    "btc_ltc": {
        "running": true,
        "epochlen": 6000,
        "activeepoch": 264872329,
        "startepoch": 264872329
    },
    "dcr_btc": {
        "running": false,
        "epochlen": 7000,
        "activeepoch": 0,
        "startepoch": 0,
        "finalepoch": 227033424,
        "persistbook": true
    }
}
@chappjc
Copy link
Member Author

chappjc commented May 14, 2020

Just a manual squash. No changes since review.

@chappjc chappjc merged commit cb02a99 into decred:master May 14, 2020
@chappjc chappjc deleted the suspend branch May 14, 2020 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants