Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AMMBid: use tecINTERNAL for 'impossible' errors #4674

Merged
merged 4 commits into from
Sep 1, 2023

Conversation

mDuo13
Copy link
Collaborator

@mDuo13 mDuo13 commented Aug 24, 2023

Modifies two error cases in AMMBid to return tecINTERNAL rather than another error code, to more clearly indicate that these types of errors should not be possible unless operating in some sort of unforeseen circumstances. Upgrade the related log message to warning level, since it likely indicates a bug.

Caution: I have not actually built this branch myself. Reviewers, please confirm that the code compiles and passes tests.

High Level Overview of Change

Modifies two specific transaction error cases in the processing for the AMMBid transactor:

  • tecAMM_BALANCE - In this case, this error (total LP Tokens outstanding is lower than the amount to be burned for the bid) is a subset of the case where the user doesn't have enough LP Tokens to pay for the bid. When this case is reached, the bidder's LP Tokens balance has already been checked first. The user's LP Tokens should always be a subset of total LP Tokens issued, so this should be impossible.
  • tecINSUFFICIENT_PAYMENT - In this case, the amount to be refunded as a result of the bid is greater than the price paid for the auction slot. This should never occur unless something is wrong with the math for calculating the refund amount.

Context of Change

Both error cases in question are "defense in depth" measures meant to protect against making things worse if the code has already reached a state that is supposed to be impossible, likely due to a bug elsewhere. (For example, some sort of race condition where multiple threads are accessing the same ledger entries in parallel without proper locks, simple memory corruption, or some sort of math failure elsewhere—rounding errors, missing range checks, and so on.)

If these cases have "normal" error codes, that suggests that QA should be able to add test cases that intentionally trigger them, and if they occur then the code is functioning as intended. But, in reality, it should be impossible for an integration test to trigger this case, and if you can trigger these reliably it's indicative of a bug that should be fixed elsewhere. (It's possible that this sort of thing could occur randomly due to, say, cosmic rays flipping bits at random, in which case reproducing the error is likely to be, well, very difficult.)

Any "shouldn't ever occur" checks should use an error code that categorically indicates a larger problem. This is similar to how tecINVARIANT_FAILED is a warning sign that something went wrong and likely could've been worse, but since there isn't a separate-level Invariant Check applying here, tecINTERNAL is the appropriate error code. The official description for tecINTERNAL in the docs is:

This error code should not normally be returned. If you can reproduce this error, please report an issue.

Type of Change

  • Refactor (non-breaking change that only restructures code)

This is "debatably" a transaction processing change since it could hypothetically change how transactions are processed if there's a bug we don't know about, but as far as I know it doesn't actually change anything that can actually happen. Regardless, as long as the changes are incorporated into the AMM amendment before said amendment is part of any stable release, no additional amendment is necessary.

Upgrade log level to warning since it indicates some kind of
(possibly network-wide) unexpected condition when these two
errors occur.
@mDuo13 mDuo13 requested a review from gregtatcam August 24, 2023 23:22
@mDuo13
Copy link
Collaborator Author

mDuo13 commented Aug 24, 2023

It seems I don't know how to run clang-format. I tried, but it didn't work. ¯_(ツ)_/¯

Copy link
Collaborator

@gregtatcam gregtatcam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM. Builds/passes the unit-tests. Approved, assuming the clang format errors are going to be fixed.
This works for me on OSX. Run from rippled folder: clang-format -style=format -i src/ripple/app/tx/impl/AMMBid.cpp. clang-format version is 15.0.7.

@intelliot intelliot requested a review from ximinez August 25, 2023 15:53
@intelliot
Copy link
Collaborator

I don't have permission to push to your branch, but I applied the clang-format patch and pushed the commit here: intelliot@925552f

Feel free to merge or cherry-pick it

@ximinez
Copy link
Collaborator

ximinez commented Aug 25, 2023

Just a quick comment before I dig into this, I think it would be more accurate to log those messages as "error".

@intelliot intelliot added this to the 1.12 milestone Aug 25, 2023
@intelliot
Copy link
Collaborator

notes:

@intelliot
Copy link
Collaborator

note: this is arguably an API change but it only affects the result code from submit / transaction processing.

@intelliot
Copy link
Collaborator

note: no impact on Clio

Copy link
Collaborator

@ximinez ximinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the thing about the log level, this looks fine.

Comment on lines 227 to 229
JLOG(ctx_.journal.warn())
<< "AMM Bid: LP Token burn exceeds AMM balance " << burn << " "
<< lptAMMBalance;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far as I can tell, every tecINTERNAL result that logs, logs as error() or higher, other than a few in AMM code, which is new, and which I'd also consider an oversight.

These two log messages should be changed to error(), and I wouldn't mind seeing those others changed to error(), too.

@intelliot
Copy link
Collaborator

notes:

  • this should log at fatal level
  • should also update other tecINTERNAL logs to be at the same fatal level (as long as they are expected to be impossible)

@mDuo13 mDuo13 requested a review from ximinez August 30, 2023 20:18
@mDuo13
Copy link
Collaborator Author

mDuo13 commented Aug 30, 2023

As for changing to log levels to fatal in other cases where we return tecINTERNAL because the error should be impossible—I'm leaving that to someone else in another PR. Maybe @ximinez?

@intelliot
Copy link
Collaborator

Yes, that should be a separate PR. No urgency, so I'm content to leave it as an open issue for now. Maybe a Good First Issue for a new contributor, even.

@intelliot
Copy link
Collaborator

@ximinez - please review this as well

@intelliot
Copy link
Collaborator

This is a "tiny" PR and review by @seelabs would be welcome, but not required.

Copy link
Collaborator

@ximinez ximinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just running through my Windows-based checks. I can't imagine any reason they'd fail.

@intelliot
Copy link
Collaborator

Just running through my Windows-based checks.

If those do turn up anything, feel free to open a new issue/PR. This one will be merged shortly...

@intelliot intelliot merged commit a61a88e into XRPLF:develop Sep 1, 2023
15 checks passed
@intelliot intelliot mentioned this pull request Sep 1, 2023
1 task
ckeshava pushed a commit to ckeshava/rippled that referenced this pull request Sep 22, 2023
Modify two error cases in AMMBid transactor to return `tecINTERNAL` to
more clearly indicate that these errors should not be possible unless
operating in unforeseen circumstances. It likely indicates a bug.

The log level has been updated to `fatal()` since it indicates a
(potentially network-wide) unexpected condition when either of these
errors occurs.

Details:

The two specific transaction error cases changed are:

- `tecAMM_BALANCE` - In this case, this error (total LP Tokens
  outstanding is lower than the amount to be burned for the bid) is a
  subset of the case where the user doesn't have enough LP Tokens to pay
  for the bid. When this case is reached, the bidder's LP Tokens balance
  has already been checked first. The user's LP Tokens should always be
  a subset of total LP Tokens issued, so this should be impossible.
- `tecINSUFFICIENT_PAYMENT` - In this case, the amount to be refunded as
  a result of the bid is greater than the price paid for the auction
  slot. This should never occur unless something is wrong with the math
  for calculating the refund amount.

Both error cases in question are "defense in depth" measures meant to
protect against making things worse if the code has already reached a
state that is supposed to be impossible, likely due to a bug elsewhere.

Such "shouldn't ever occur" checks should use an error code that
categorically indicates a larger problem. This is similar to how
`tecINVARIANT_FAILED` is a warning sign that something went wrong and
likely could've been worse, but since there isn't an Invariant Check
applying here, `tecINTERNAL` is the appropriate error code.

This is "debatably" a transaction processing change since it could
hypothetically change how transactions are processed if there's a bug we
don't know about.
ckeshava pushed a commit to ckeshava/rippled that referenced this pull request Sep 25, 2023
Modify two error cases in AMMBid transactor to return `tecINTERNAL` to
more clearly indicate that these errors should not be possible unless
operating in unforeseen circumstances. It likely indicates a bug.

The log level has been updated to `fatal()` since it indicates a
(potentially network-wide) unexpected condition when either of these
errors occurs.

Details:

The two specific transaction error cases changed are:

- `tecAMM_BALANCE` - In this case, this error (total LP Tokens
  outstanding is lower than the amount to be burned for the bid) is a
  subset of the case where the user doesn't have enough LP Tokens to pay
  for the bid. When this case is reached, the bidder's LP Tokens balance
  has already been checked first. The user's LP Tokens should always be
  a subset of total LP Tokens issued, so this should be impossible.
- `tecINSUFFICIENT_PAYMENT` - In this case, the amount to be refunded as
  a result of the bid is greater than the price paid for the auction
  slot. This should never occur unless something is wrong with the math
  for calculating the refund amount.

Both error cases in question are "defense in depth" measures meant to
protect against making things worse if the code has already reached a
state that is supposed to be impossible, likely due to a bug elsewhere.

Such "shouldn't ever occur" checks should use an error code that
categorically indicates a larger problem. This is similar to how
`tecINVARIANT_FAILED` is a warning sign that something went wrong and
likely could've been worse, but since there isn't an Invariant Check
applying here, `tecINTERNAL` is the appropriate error code.

This is "debatably" a transaction processing change since it could
hypothetically change how transactions are processed if there's a bug we
don't know about.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

4 participants