Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: check order cancellations before picking a winner #2532

Open
MartinquaXD opened this issue Mar 18, 2024 · 9 comments
Open

chore: check order cancellations before picking a winner #2532

MartinquaXD opened this issue Mar 18, 2024 · 9 comments

Comments

@MartinquaXD
Copy link
Contributor

Background

Using off-chain mechanisms to cancel an order is inherently racy. If an order gets off-chain cancelled while a solver already tries to bring a solution for it on-chain it will get solved although it will be marked as cancelled in our DB.
This is a known and even documented issue but has not been causing too many problems because the order cancellation / replacement endpoints have not been used a lot.
However, with the initiative to make the frontend "feel" faster we will make more liberal use of these features which will probably result in more people run into these cancellation race conditions.

Details

To limit the time where the race condition can happen to a minimum we should probably check if all orders that a solver tries to settle are in fact still open and not cancelled. That way the race condition windows shrinks from the order being put into the next auction until the order being settled on chain to just the time where a solver tries to bring the solution on chain.
That means we reduce the race limit window by at least 15 seconds but on average longer than that.

Acceptance criteria

Before picking a winner we make sure that all orders the solver intends to settle have not been cancelled in the mean time.

Considerations

This is slightly unfortunate because currently solvers assume all orders in the auction are fair game so it could happen that they play fairly, compute an excellent solution but still don't win the auction.
But since it can be expected that these cancellation events are on average distributed fairly across all solvers none of them should have an advantage over the others.

@fleupold
Copy link
Contributor

Currently, cancelling an order (and thus voiding the current auction) costs some gas and is thus something that doesn't make sense to do at scale.
However, if by simply signing an off-chain message any participant in the auction can invalidate their order I'm worried this will happen much more frequently.

This, plus the fact that we cannot fully prevent the race condition as you describe make this solution in my opinion inferior to simply not including a replacement order in the auction if the original order has been traded already.

@MartinquaXD
Copy link
Contributor Author

MartinquaXD commented Mar 19, 2024

To implement your suggestion we could add a new column to the orders table like replaced_by that holds a single uid. Then when indexing trade events for orders we could recursively walk the list of replaced_by orders and cancel them. That seems not too complicated and should be safe to do since replacing orders happens atomically. The only case where this is not sufficient is if there is an open order that is older than a replacement order that got settled (i.e. some order is open in the earlier part of the linked list of replacements). But because replacements happen in a single DB transaction this can not happen. So at least in the mostly centralized case we have at the moment that solution seems sufficient to avoid double spending of replacement orders.
To catch the case where solvers try to use a previously cancelled order as a JIT order we would still have to check the cancellation state of the JIT orders before picking a winner. Discarding a solution because they used an existing cow swap order that was not part of the auction as a JIT order (cancelled or not) does seem reasonable to me since this is not really intended IMO.

I think this scheme would also work for partially executed cancelled orders. Although there is an argument to be made that the user might need to be informed that the remaining unfilled amount will not be filled anymore because of this scheme.
WDYT?

@fleupold
Copy link
Contributor

For the recursive case I think we should ask the frontend to send us a list of all order uids which should be mutually exclusive (for this order to make it into the batch).

This way wo don't have weird edge cases with expired orders but only include an order if none of the "replaced" orders are either still part of the auction or have a trade.

@anxolin
Copy link

anxolin commented Mar 20, 2024

Thanks @MartinquaXD for putting this together. I also did a writeup explaining the issue, and the potential mitigation that in our opinion would be good enough: https://www.notion.so/cownation/Canceling-Replace-Orders-Edge-Case-5f1d0d38e3f541e7b454f8e7faa7fee1?pvs=4

Basically, agree with most of the comments above:

  • Discarding a solution because it was canceled AFTER the competition started, I don't think is a good option. The only exception we could potentially do as a good to have, is to discard if the only order in the batch is canceled or replaced
  • Discarding a solution because it was canceled BEFORE the competition started is reasonable in my opinion. In the future could be part of the slashable behaviour (if we can't guarantee it with coolocation)
  • We are happy to provide as Felix says a list of mutually exclusive order. For example If we cancel A with B and B with C, we will send:
    • A = {appData ={}, ...}
    • B = {appData ={metadata: {replacedOreders: [A] }}, ...}
    • C = {appData ={metadata: {replacedOreders: [A, B]}}, ...}

Copy link

This issue has been marked as stale because it has been inactive a while. Please update this issue or it will be automatically closed.

@github-actions github-actions bot added the stale label May 20, 2024
@anxolin
Copy link

anxolin commented May 22, 2024

No need to fix this now, but let's not close it Mr. Stale bot

@fleupold
Copy link
Contributor

You need to remove the stale label on the issue in that case...

@fleupold fleupold removed the stale label May 22, 2024
Copy link

This issue has been marked as stale because it has been inactive a while. Please update this issue or it will be automatically closed.

Copy link

This issue has been marked as stale because it has been inactive a while. Please update this issue or it will be automatically closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants