Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce "order book not crossed" invariant (for fuzzing only) #2959

Merged
merged 8 commits into from
Mar 30, 2021

Conversation

rokopt
Copy link
Contributor

@rokopt rokopt commented Mar 11, 2021

Introduce "order book not crossed" invariant

Resolves #2417

This PR introduces a new invariant which keeps track of the state of the order book as operations are applied, and asserts that it is not crossed.

There is currently no mechanism by which an invariant can be informed that an operation application for which its checkOnOperationApply() method had been called is being rolled back. Hence, we can not in general support an invariant such as this which maintains state across method calls. But we can when fuzzing, where setup operations are never rolled back and where we can explicitly tell invariants when to snapshot state (that is, after setup completes) and when to reset it to that state (after each fuzz test).

Therefore, this invariant is compiled in when fuzzing is configured, and not otherwise.

This PR updates #2210 to current code, adapting to some changes such as the introduction of InternalLedgerEntry and the rearranging of the fuzzing code, and adds support for passive offers.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@rokopt
Copy link
Contributor Author

rokopt commented Mar 11, 2021

This invariant does appear to have a major effect on fuzzer execution speed. Metrics after 20 minutes in current code without this PR:

master-without-order-book-not-crossed-invariant-for-20-min

And with this PR:

order-book-not-crossed-invariant-present-for-20-min

@rokopt rokopt mentioned this pull request Mar 11, 2021
6 tasks
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/test/InvariantTestUtils.cpp Outdated Show resolved Hide resolved
src/test/FuzzerImpl.cpp Show resolved Hide resolved
void
OrderBookIsNotCrossed::resetForFuzzer()
{
mOrderBook = mRolledBackOrderBook;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resetForFuzzer is likely very slow as it's copying a large data structure, yet it's unlikely that offers got changed.
I think we should perform the actual copy as late as possible: probably in OrderBookIsNotCrossed::check
(reset is probably still here though)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you reverted your change. From the looks of it, it was just not correct.
you need to always clear in resetForFuzzer (for stability) and CoW

src/invariant/OrderBookIsNotCrossed.cpp Show resolved Hide resolved
src/invariant/test/OrderBookIsNotCrossedTests.cpp Outdated Show resolved Hide resolved
src/invariant/test/OrderBookIsNotCrossedTests.cpp Outdated Show resolved Hide resolved
src/invariant/test/OrderBookIsNotCrossedTests.cpp Outdated Show resolved Hide resolved
src/invariant/test/OrderBookIsNotCrossedTests.cpp Outdated Show resolved Hide resolved
@rokopt
Copy link
Contributor Author

rokopt commented Mar 15, 2021

Alright, I've gone over the inherited files here and made a bunch of changes:

  • The part I found trickiest was just getting the code to build and run in as many situations as possible! We want it to build any time we have BUILD_TESTS configured, and to be able to run the invariant's own unit tests in that case, to minimize code rot. And we want to build and use it when fuzzing is configured. But we don't want the invariant to be active during any other unit tests, because it maintains state across calls to checkOnOperationApply() and the code base in general does not support invariants that do that. What I ended up doing was not registering the invariant in general because test configs enable .*, so if the invariant were registered, then it would be enabled during tests by default. Instead, I exported a registerAndEnableInvariant(), unlike most invariants which export a registerInvariant() and let the config manager decide which to enable. Then only those callers that explicitly know how to handle the order-book invariant's maintaining state can call registerAndEnableInvariant(). And two callers do so: the fuzzer and the invariant's own unit tests. The invariant's own unit tests also have to disable other invariants, because they modify ledger entries directly, which can break some other invariants. (That part is something that several other invariant tests do as well -- they set INVARIANT_CHECKS to just themselves.)
  • Rebased on master, which in particular includes the best-offer debugging code, with which there were a couple small conflicts.
  • Improved the passive-offer-crossing-detection algorithm to break early if possible.
  • Removed some weird C++ such as the references to temporaries and unnecessary at()s.
  • Removed using namespace and added conventional namespace use.
  • Added an assertion to makeUpdateList().
  • Deferred copying mRolledBackOrderBook as late as we can.
  • Made extractAssets() clearer (I hope).
  • Removed some changes which aren't necessary anymore after the above changes.

Copy link
Contributor

@jonjove jonjove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the tests yet because I think the implementation is going to change. You should add tests for some of the interesting floating-point edge cases I pointed out.

src/test/FuzzerImpl.cpp Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.h Outdated Show resolved Hide resolved
@rokopt
Copy link
Contributor Author

rokopt commented Mar 16, 2021

I've pushed changes to address all comments except on the floating-point reciprocal issue; these changes are small so I figure I may as well make them available for review while I think about the floating-point.

src/invariant/OrderBookIsNotCrossed.h Outdated Show resolved Hide resolved
@rokopt
Copy link
Contributor Author

rokopt commented Mar 18, 2021

After much interesting chat about arithmetic, I've pushed a commit to restore the ordering by double arithmetic within the invariant, as it's done in stellar-core: the salient change is now simply to do reciprocals in rational arithmetic before any conversion to double. In effect, once a value becomes a double, it becomes constant (it should not be used in any further computation, only in comparison).

@jonjove
Copy link
Contributor

jonjove commented Mar 18, 2021

@rokopt can you squash+rebase at this point? We've gone through a lot of iterations here, with considerable back-and-forth on changes, and I need to get a clean look at it.

@rokopt
Copy link
Contributor Author

rokopt commented Mar 18, 2021

@rokopt can you squash+rebase at this point? We've gone through a lot of iterations here, with considerable back-and-forth on changes, and I need to get a clean look at it.

Sure, done.

src/invariant/OrderBookIsNotCrossed.h Outdated Show resolved Hide resolved

namespace stellar
{
double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: you probably want this in the anonymous namespace or static.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed uses of static in favor of either the anonymous namespace or of membership in OrderBookNotCrossed, depending on whether any members of the latter need the types/functions.

src/invariant/OrderBookIsNotCrossed.cpp Outdated Show resolved Hide resolved
src/invariant/OrderBookIsNotCrossed.h Outdated Show resolved Hide resolved
src/invariant/test/OrderBookIsNotCrossedTests.cpp Outdated Show resolved Hide resolved
@rokopt
Copy link
Contributor Author

rokopt commented Mar 19, 2021

Note: in addition to responding to the latest code review comments, I've also pushed a commit to remove the mRollbackBeforeNextCheck optimization, because I discovered that it causes occasional crashes during long fuzzing runs. I don't understand why yet. We can either delay this PR until I've found a fix, or merge it and file an issue to perform that optimization (correctly) afterwards, as you prefer.

@rokopt rokopt requested a review from jonjove March 19, 2021 17:53
@stellar stellar deleted a comment from rokopt Mar 20, 2021
@stellar stellar deleted a comment from rokopt Mar 20, 2021
@stellar stellar deleted a comment from rokopt Mar 20, 2021
@rokopt
Copy link
Contributor Author

rokopt commented Mar 22, 2021

Note: in addition to responding to the latest code review comments, I've also pushed a commit to remove the mRollbackBeforeNextCheck optimization, because I discovered that it causes occasional crashes during long fuzzing runs. I don't understand why yet. We can either delay this PR until I've found a fix, or merge it and file an issue to perform that optimization (correctly) afterwards, as you prefer.

This optimization is restored now, in a corrected form.

@MonsieurNicolas
Copy link
Contributor

Looks like it's basically ready to go? Needs to be rebased and we can merge. @jonjove what do you think?

As a sanity check I didn't see an update on the fuzzer stability front (only one I see here is the one from very early on). So maybe @rokopt can post this in parallel?

@jonjove
Copy link
Contributor

jonjove commented Mar 25, 2021

@MonsieurNicolas I just went through and resolved all open comments from me. Should be ready for rebase/squash/merge. Agreed that I'd like to see the fuzzer statistics though.

This mechanism will allow for fuzzer-specific invariants
that support a mechanism for notifying them when setup
has completed and then when each new test starts.
Make the fuzzer call the new InvariantManager snapshot/reset
mechanism, as well as calling the existing one for
LedgerTxnRoot whenever BUILD_TESTS is configured.
A new invariant which keeps track of the state of the order
book as operations are applied, and asserts that it is not
crossed.

There is currently no mechanism by which an invariant can be
informed that an operation application for which its
checkOnOperationApply() method had been called is being
rolled back.  Hence, we can not in general support an
invariant such as this which maintains state across method
calls.  But we will be able to do so when fuzzing, where
setup operations are never rolled back and where we can
explicitly tell invariants when to snapshot state (that is,
after setup completes) and when to reset it to that state
(after each fuzz test).
@rokopt
Copy link
Contributor Author

rokopt commented Mar 26, 2021

I've squashed and rebased. Stability didn't change noticeably in this PR; it's still at the ~92% where it's been since the ledger-key mapping work:

afl-stats-with-squashed-order-book-not-crossed-pr

@MonsieurNicolas
Copy link
Contributor

mmm, I do have a question though. Is the "exec speed" really that slow compared to both master and the previous iteration of that PR?

I would not expect such a difference as the vast majority of fuzzing should not involve this invariant (it is, after all, unlikely that the fuzzer explores branches with offers).

{
updateOrderBook(ltxDelta);
auto assetPairs = extractAssetPairs(ltxDelta);
return assetPairs.size() > 0 ? check(assetPairs) : std::string{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the reason things are super slow is that assetPairs.size() > 0 is always true even when there are no changes to the order book from this operation. In fact we should not perform any work in extractAssetPairs if the operation didn't touch the order book.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why must assetPairs.size() be > 0? It looks to me as though if there are no OFFERs in LedgerTxnDelta, then it will be 0, and I put in some logs and they say that it is often 0.

One thing I mentioned to Jon about performance which I should bring up is that exec speed of fuzzing seems highly variable on my laptop, even from run to run without code changes (unlike stability, for example, which is generally within a fraction of a percent across many runs). Fuzzing does make my laptop fan roar, and I have a suspicion (though no confirmation) that there might be highly variable amounts of CPU-throttling going on depending on I-don't-know-what thermal factors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meh and I thought my head was straight in the morning :)

yeah this is extracting only from the current txn object so not it

@MonsieurNicolas
Copy link
Contributor

I think I found the bug, let me know if this makes sense @rokopt

@MonsieurNicolas
Copy link
Contributor

I ran the version with the invariant enabled and without based on the same initial corpus (so no variation because of that) and they performed the same way, so it was probably a fluke with your computer

@MonsieurNicolas
Copy link
Contributor

we'll merge as soon as master reopens

@jonjove
Copy link
Contributor

jonjove commented Mar 30, 2021

r+ dffc55f

@latobarita latobarita merged commit c64aa5a into stellar:master Mar 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cleanup and merge orderbook not crossed invariant
5 participants