Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add auditing #18

Closed

Conversation

zathras-crypto
Copy link

** NOT FINAL **
I've segwayed a bit for the last few days to build a kind of auditing for OmniCore. The main drivers were increased confidence in data and making finding bugs easier.

In a nutshell, the auditor maintains a series of caches for certain data from the state and then verifies this data has not changed unexpectedly during processing of a block.

The auditor is currently capable of detecting the following scenarios:

  • Property tokens created out of thin air. The auditor is notified when tokens are intentionally created in the state (such as by an issuance or a grant). If the number of tokens for a property increases without the auditor being notified (for example from a math error) the auditor will spot this.
  • Tokens disappearing. Basically the reverse of the above - things like revokes and fee burns can notify the auditor of a reduction in tokens for a property, but the auditor will raise a fail if the number of tokens drops unexpectedly.
  • Properties appearing/disappearing/non-sequential property creations. Again, pretty much the same logic - intentional property creations are notified to the auditor, so if property counts increase/decrease unexpectedly the auditor can spot this.
  • MetaDEx trades with zero values. Basically illegal values such as zero price, zero amount for sale,
    zero amount desired etc appearing in the metadex maps will be spotted by the auditor.
  • MetaDEx trades where the unit price has changed. Unit prices are not supposed to change as per spec, so the auditor tracks the original price for each trade and makes sure that partial trades are not affecting the price.
  • (WIP) MetaDEx trades that are crossed (should have matched) but were not traded. This is a bit rough around the edges, uses pricing to determine highest "buy" (inverse market) against lowest sell.

The auditor is thus able to identify these scenarios and we are able to see why the audit has failed, what block and in most cases what txid has triggered the issue. Idea being to complement the work the testing guys are doing and reduce the time spent identifying bugs so we can spend more time on diagnosis/remediation.

This is a WIP - it's functional, but not ready for merge just yet.

I'm no expert on self-auditing code, so please chime in with any feedback you may have. I'm specifically looking for feedback on two areas but comment on anything you think relevant of course :)

Behaviour
There are two questions I'd like to ask here.
The first; should the auditor be enabled for everyone all the time, or should it be enabled on demand?
The second; what should we do in the event of an audit failure?
The current behaviour is enabled for everyone (unless disabled with --disableauditor startup param) and to shutdown in the event of audit failure. These are obviously open to being changed though.

Failure modes
What other scenarios should we audit? Any suggestions?

@dexX7
Copy link
Member

dexX7 commented Apr 20, 2015

This is extremely interesting and useful for diagnostics.

Just a quick note: in my opinion, it would make more sense to have a more generalized signaling and data handling in Omni Core, and then add an audit module.

Most of the reporting would be helpful to interact with the UI as well, given that both the audit module, and the UI, should be notified, if balances change, a property is created, etc. In the broader picture, this also applies to the persistence layer, and the chain of actions should basically be:

  1. block notification
  2. transaction notification
  3. transaction parsing
  4. data interpretation
  5. sanity checking
  6. applying of effects (i.e. updating balances)
  7. persisting the information (maybe swapped with 6)
  8. updating the UI

An audit module should not be part of this chain, but act as an external, and decoupled entity, like a seperated observer, which has a different point of view.

Where I'm getting: if the audit module is integrated first, and further signaling is build around that module and code, then it looses it's "independent perspective". Hope this description makes sense. :)


The first; should the auditor be enabled for everyone all the time, or should it be enabled on demand?

Ideally, it should be enabled, because a bad state should be prevented by all means, but ...

The second; what should we do in the event of an audit failure?

... it really depends on the goal and scope here.

It was quite an enlighting moment, when you mentioned: "assuming we had an audit module, which detects a bad state, and shuts down the client, then we run into the risk, to shutdown all Omni Core clients out there".

I think, for further discussion, we should categorize events into two groups:

  1. There is some twisted logic in Omni Core, which violates the protocol rules.
    With global consensus, any protocol violation can basically become part of that consensus, but ideally, this should never ever happen in the first place, but there is a good chance, it actually does, during the development process, for example when adding new features.
    From that perspective: parts that are work in progress should be tested by an audit module, to avoid protocol violations in the wild. It's still an open question though, how protocol violations in the wild should be handled.
  2. There is some bad state, which is not caused by twisted logic.
    This could be the result of other factors, for example: data corruption on the disk, in memory, due to an ongoing attack or exploit, and such incidends do not affect the whole network.
    Events in this category should be caught, and initiate mitigation or defensive steps, like triggering a forced shutdown of the client.

@zathras-crypto
Copy link
Author

Just a quick note: in my opinion, it would make more sense to have a more generalized signaling and data handling in Omni Core, and then add an audit module.

Yeah, I do agree with your methodology here, perhaps we'll take this direction - I do see it as needing quite a lot of work though to refactor data handling in Omni Core - though we don't need to merge this yet thankfully, my main goal was to have a tool for picking up faults while we try and sort out MetaDEx and it can still do that without merging :)

@dexX7
Copy link
Member

dexX7 commented Apr 22, 2015

Ahh, please don't get me wrong.. :)

All those handlers actually look close to something that could be useful for, or become, a more generalized signaling system at some point, and I wasn't implying a signaling system should be created first, but I wanted to outline where it could go, and maybe to have that path in mind, when extending the audit module.

What also comes to my mind are the other related topics: an audit ledger to track actions, which could become an alternative to compare the global state and consensus, other than dumping and comparing balances, e.g. by using all valid transaction hashes, etc. etc. ... All this is sort of related, and you're pretty much paving the way right now, and this is basically what I wanted to highlight. :)

When a SIGHUP is received a bool is set true to force reopening of the log file.  After reopening the log this bool is flipped back to false.
Reusing the same bool between debug.log and mastercore.log means whichever is written to first after a SIGHUP will be reopened and the bool flipped back to false, meaning the other log will miss the request and not be reopened.
@zathras-crypto
Copy link
Author

Closing this PR, code was 18 months old. Ported to latest 0.0.11.2 here #427

bvbfan pushed a commit to bvbfan/omnicore that referenced this pull request Jun 6, 2023
fac04cb refactor: Add lock annotations to Active* methods (MacroFake)
fac15ff Fix logical race in rest_getutxos (MacroFake)
fa97a52 Fix UB/data-race in RPCNotifyBlockChange (MacroFake)
fa530bc Add ChainstateManager::GetMutex(), an alias for ::cs_main (MacroFake)

Pull request description:

  This fixes two issues:

  * A data race in `ActiveChain`, which returns a reference to the chain (a `std::vector`), which is not thread safe. See also below traceback.
  * A corrupt rest response, which returns a blockheight and blockhash, which are unrelated to each other and to the result, as the chain might advance between each call without cs_main held.

  The issues are fixed by taking cs_main and holding it for the required time.

  ```
  ==================
  WARNING: ThreadSanitizer: data race (pid=32335)
    Write of size 8 at 0x7b3c000008f0 by thread T22 (mutexes: write M131626, write M151, write M131553):
      #0 std::__1::enable_if<(is_move_constructible<CBlockIndex**>::value) && (is_move_assignable<CBlockIndex**>::value), void>::type std::__1::swap<CBlockIndex**>(CBlockIndex**&, CBlockIndex**&) /usr/lib/llvm-13/bin/../include/c++/v1/__utility/swap.h:39:7 (bitcoind+0x501239)
      OmniLayer#1 std::__1::vector<CBlockIndex*, std::__1::allocator<CBlockIndex*> >::__swap_out_circular_buffer(std::__1::__split_buffer<CBlockIndex*, std::__1::allocator<CBlockIndex*>&>&) /usr/lib/llvm-13/bin/../include/c++/v1/vector:977:5 (bitcoind+0x501239)
      OmniLayer#2 std::__1::vector<CBlockIndex*, std::__1::allocator<CBlockIndex*> >::__append(unsigned long) /usr/lib/llvm-13/bin/../include/c++/v1/vector:1117:9 (bitcoind+0x501239)
      OmniLayer#3 std::__1::vector<CBlockIndex*, std::__1::allocator<CBlockIndex*> >::resize(unsigned long) /usr/lib/llvm-13/bin/../include/c++/v1/vector:2046:15 (bitcoind+0x4ffe29)
      OmniLayer#4 CChain::SetTip(CBlockIndex*) src/chain.cpp:19:12 (bitcoind+0x4ffe29)
      OmniLayer#5 CChainState::ConnectTip(BlockValidationState&, CBlockIndex*, std::__1::shared_ptr<CBlock const> const&, ConnectTrace&, DisconnectedBlockTransactions&) src/validation.cpp:2748:13 (bitcoind+0x475d00)
      OmniLayer#6 CChainState::ActivateBestChainStep(BlockValidationState&, CBlockIndex*, std::__1::shared_ptr<CBlock const> const&, bool&, ConnectTrace&) src/validation.cpp:2884:18 (bitcoind+0x47739e)
      OmniLayer#7 CChainState::ActivateBestChain(BlockValidationState&, std::__1::shared_ptr<CBlock const>) src/validation.cpp:3011:22 (bitcoind+0x477baf)
      OmniLayer#8 node::ThreadImport(ChainstateManager&, std::__1::vector<fs::path, std::__1::allocator<fs::path> >, ArgsManager const&) src/node/blockstorage.cpp:883:30 (bitcoind+0x23cd74)
      OmniLayer#9 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7::operator()() const src/init.cpp:1657:9 (bitcoind+0x15863e)
      OmniLayer#10 decltype(static_cast<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&>(fp)()) std::__1::__invoke<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&>(AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&) /usr/lib/llvm-13/bin/../include/c++/v1/type_traits:3918:1 (bitcoind+0x15863e)
      OmniLayer#11 void std::__1::__invoke_void_return_wrapper<void, true>::__call<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&>(AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&) /usr/lib/llvm-13/bin/../include/c++/v1/__functional/invoke.h:61:9 (bitcoind+0x15863e)
      OmniLayer#12 std::__1::__function::__alloc_func<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, std::__1::allocator<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>, void ()>::operator()() /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:171:16 (bitcoind+0x15863e)
      OmniLayer#13 std::__1::__function::__func<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, std::__1::allocator<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>, void ()>::operator()() /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:345:12 (bitcoind+0x15863e)
      OmniLayer#14 std::__1::__function::__value_func<void ()>::operator()() const /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:498:16 (bitcoind+0x88891f)
      OmniLayer#15 std::__1::function<void ()>::operator()() const /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:1175:12 (bitcoind+0x88891f)
      OmniLayer#16 util::TraceThread(char const*, std::__1::function<void ()>) src/util/thread.cpp:18:9 (bitcoind+0x88891f)
      OmniLayer#17 decltype(static_cast<void (*>(fp)(static_cast<char const*>(fp0), static_cast<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>(fp0))) std::__1::__invoke<void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>(void (*&&)(char const*, std::__1::function<void ()>), char const*&&, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&&) /usr/lib/llvm-13/bin/../include/c++/v1/type_traits:3918:1 (bitcoind+0x157e6a)
      OmniLayer#18 void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, 2ul, 3ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>&, std::__1::__tuple_indices<2ul, 3ul>) /usr/lib/llvm-13/bin/../include/c++/v1/thread:280:5 (bitcoind+0x157e6a)
      OmniLayer#19 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7> >(void*) /usr/lib/llvm-13/bin/../include/c++/v1/thread:291:5 (bitcoind+0x157e6a)
    Previous read of size 8 at 0x7b3c000008f0 by main thread:
      #0 std::__1::vector<CBlockIndex*, std::__1::allocator<CBlockIndex*> >::size() const /usr/lib/llvm-13/bin/../include/c++/v1/vector:680:61 (bitcoind+0x15179d)
      OmniLayer#1 CChain::Tip() const src/./chain.h:449:23 (bitcoind+0x15179d)
      OmniLayer#2 ChainstateManager::ActiveTip() const src/./validation.h:927:59 (bitcoind+0x15179d)
      OmniLayer#3 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1841:35 (bitcoind+0x15179d)
      OmniLayer#4 AppInit(node::NodeContext&, int, char**) src/bitcoind.cpp:231:43 (bitcoind+0x133fd2)
      OmniLayer#5 main src/bitcoind.cpp:275:13 (bitcoind+0x133fd2)
    Location is heap block of size 232 at 0x7b3c00000870 allocated by main thread:
      #0 operator new(unsigned long) <null> (bitcoind+0x132668)
      OmniLayer#1 ChainstateManager::InitializeChainstate(CTxMemPool*, std::__1::optional<uint256> const&) src/validation.cpp:4851:21 (bitcoind+0x48e26b)
      OmniLayer#2 node::LoadChainstate(bool, ChainstateManager&, CTxMemPool*, bool, Consensus::Params const&, bool, long, long, long, bool, bool, std::__1::function<bool ()>, std::__1::function<void ()>) src/node/chainstate.cpp:31:14 (bitcoind+0x24de07)
      OmniLayer#3 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1438:32 (bitcoind+0x14e994)
      OmniLayer#4 AppInit(node::NodeContext&, int, char**) src/bitcoind.cpp:231:43 (bitcoind+0x133fd2)
      OmniLayer#5 main src/bitcoind.cpp:275:13 (bitcoind+0x133fd2)
    Mutex M131626 (0x7b3c00000898) created at:
      #0 pthread_mutex_lock <null> (bitcoind+0xda898)
      OmniLayer#1 std::__1::mutex::lock() <null> (libc++.so.1+0x49f35)
      OmniLayer#2 node::ThreadImport(ChainstateManager&, std::__1::vector<fs::path, std::__1::allocator<fs::path> >, ArgsManager const&) src/node/blockstorage.cpp:883:30 (bitcoind+0x23cd74)
      OmniLayer#3 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7::operator()() const src/init.cpp:1657:9 (bitcoind+0x15863e)
      OmniLayer#4 decltype(static_cast<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&>(fp)()) std::__1::__invoke<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&>(AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&) /usr/lib/llvm-13/bin/../include/c++/v1/type_traits:3918:1 (bitcoind+0x15863e)
      OmniLayer#5 void std::__1::__invoke_void_return_wrapper<void, true>::__call<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&>(AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&) /usr/lib/llvm-13/bin/../include/c++/v1/__functional/invoke.h:61:9 (bitcoind+0x15863e)
      OmniLayer#6 std::__1::__function::__alloc_func<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, std::__1::allocator<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>, void ()>::operator()() /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:171:16 (bitcoind+0x15863e)
      OmniLayer#7 std::__1::__function::__func<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, std::__1::allocator<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>, void ()>::operator()() /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:345:12 (bitcoind+0x15863e)
      OmniLayer#8 std::__1::__function::__value_func<void ()>::operator()() const /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:498:16 (bitcoind+0x88891f)
      OmniLayer#9 std::__1::function<void ()>::operator()() const /usr/lib/llvm-13/bin/../include/c++/v1/__functional/function.h:1175:12 (bitcoind+0x88891f)
      OmniLayer#10 util::TraceThread(char const*, std::__1::function<void ()>) src/util/thread.cpp:18:9 (bitcoind+0x88891f)
      OmniLayer#11 decltype(static_cast<void (*>(fp)(static_cast<char const*>(fp0), static_cast<AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>(fp0))) std::__1::__invoke<void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>(void (*&&)(char const*, std::__1::function<void ()>), char const*&&, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&&) /usr/lib/llvm-13/bin/../include/c++/v1/type_traits:3918:1 (bitcoind+0x157e6a)
      OmniLayer#12 void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, 2ul, 3ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7>&, std::__1::__tuple_indices<2ul, 3ul>) /usr/lib/llvm-13/bin/../include/c++/v1/thread:280:5 (bitcoind+0x157e6a)
      OmniLayer#13 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(char const*, std::__1::function<void ()>), char const*, AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7> >(void*) /usr/lib/llvm-13/bin/../include/c++/v1/thread:291:5 (bitcoind+0x157e6a)
    Mutex M151 (0x55aacb8ea030) created at:
      #0 pthread_mutex_init <null> (bitcoind+0xbed2f)
      OmniLayer#1 std::__1::recursive_mutex::recursive_mutex() <null> (libc++.so.1+0x49fb3)
      OmniLayer#2 __libc_start_main <null> (libc.so.6+0x29eba)
    Mutex M131553 (0x7b4c000042e0) created at:
      #0 pthread_mutex_init <null> (bitcoind+0xbed2f)
      OmniLayer#1 std::__1::recursive_mutex::recursive_mutex() <null> (libc++.so.1+0x49fb3)
      OmniLayer#2 std::__1::__unique_if<CTxMemPool>::__unique_single std::__1::make_unique<CTxMemPool, CBlockPolicyEstimator*, int const&>(CBlockPolicyEstimator*&&, int const&) /usr/lib/llvm-13/bin/../include/c++/v1/__memory/unique_ptr.h:728:32 (bitcoind+0x15c81d)
      OmniLayer#3 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1426:24 (bitcoind+0x14e7b4)
      OmniLayer#4 AppInit(node::NodeContext&, int, char**) src/bitcoind.cpp:231:43 (bitcoind+0x133fd2)
      OmniLayer#5 main src/bitcoind.cpp:275:13 (bitcoind+0x133fd2)
    Thread T22 'b-loadblk' (tid=32370, running) created by main thread at:
      #0 pthread_create <null> (bitcoind+0xbd5bd)
      OmniLayer#1 std::__1::__libcpp_thread_create(unsigned long*, void* (*)(void*), void*) /usr/lib/llvm-13/bin/../include/c++/v1/__threading_support:443:10 (bitcoind+0x155e06)
      OmniLayer#2 std::__1::thread::thread<void (*)(char const*, std::__1::function<void ()>), char const (&) [8], AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7, void>(void (*&&)(char const*, std::__1::function<void ()>), char const (&) [8], AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*)::$_7&&) /usr/lib/llvm-13/bin/../include/c++/v1/thread:307:16 (bitcoind+0x155e06)
      OmniLayer#3 AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) src/init.cpp:1656:29 (bitcoind+0x150164)
      OmniLayer#4 AppInit(node::NodeContext&, int, char**) src/bitcoind.cpp:231:43 (bitcoind+0x133fd2)
      OmniLayer#5 main src/bitcoind.cpp:275:13 (bitcoind+0x133fd2)
  SUMMARY: ThreadSanitizer: data race /usr/lib/llvm-13/bin/../include/c++/v1/__utility/swap.h:39:7 in std::__1::enable_if<(is_move_constructible<CBlockIndex**>::value) && (is_move_assignable<CBlockIndex**>::value), void>::type std::__1::swap<CBlockIndex**>(CBlockIndex**&, CBlockIndex**&)
  ==================
  ```

  From https://cirrus-ci.com/task/5612886578954240?logs=ci#L4868

ACKs for top commit:
  achow101:
    re-ACK fac04cb
  theStack:
    Code-review ACK fac04cb

Tree-SHA512: 9d619f99ff6373874c7ffe1db20674575605646b4b54b692fb54515a4a49f110a770026d7320ed6dfeaa7976be4cd89e93f821acdbf22c7662bd1c5be0cedcd2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants