Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: introduce CChainState::GetCoinsCacheSizeState #16945

Merged
merged 2 commits into from Jan 13, 2020

Conversation

@jamesob
Copy link
Member

jamesob commented Sep 23, 2019

This is part of the assumeutxo project:

Parent PR: #15606
Issue: #15605
Specification: https://github.com/jamesob/assumeutxo-docs/tree/master/proposal


This pulls out the routine for detection of how full the coins cache is from
FlushStateToDisk. We use this logic independently when deciding when to flush
the coins cache during UTXO snapshot activation (see here).

@jamesob jamesob changed the title refactoring: introduce CChainState::GetCoinsCacheSizeState refactor: introduce CChainState::GetCoinsCacheSizeState Sep 23, 2019
@DrahtBot

This comment has been minimized.

Copy link
Contributor

DrahtBot commented Sep 23, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17737 (Add ChainstateManager, remove BlockManager global by jamesob)
  • #15606 ([experimental] UTXO snapshots by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

ryanofsky left a comment

utACK e3ac4eb. Verified refactoring seems equivalent to current code

src/validation.cpp Outdated Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
Copy link
Member Author

jamesob left a comment

I'll see about writing some tests as well.

src/validation.cpp Outdated Show resolved Hide resolved
src/validation.cpp Outdated Show resolved Hide resolved
@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch 2 times, most recently from 9739ed8 to 3c66fb4 Sep 24, 2019
@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Sep 24, 2019

Thanks for the looks so far @ryanofsky @MarcoFalke. I've taken all proposed advice, as well as going further to parameterize the maximum coins cache and mempool target sizes. I've also added a unittest exercising this.

@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch 2 times, most recently from 320aca2 to 4708fcb Sep 24, 2019
@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch 3 times, most recently from beea475 to d780c6a Sep 25, 2019
@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Sep 25, 2019

Okay, after some fiddling the unittests are passing.

Copy link
Contributor

ryanofsky left a comment

utACK d780c6a. Changes since last review: adding test, getting rid of cached static size limit value. I left some suggestions, but think this is fine in it's current form.

chainstate.GetCoinsCacheSizeState(tx_pool, MAX_COINS_CACHE_BYTES, /*max_mempool_size_bytes*/ 0),
CoinsCacheSizeState::OK);

// If the initial memory allocations of cacheCoins don't match these common

This comment has been minimized.

Copy link
@ryanofsky

ryanofsky Sep 30, 2019

Contributor

It's not clear from the comment when this is expected to happen, like whether it happens on a certain type of plaform. Also it's not clear how if DynamicMemoryUsage() is just using sizeof internally, the test couldn't take this into account using the same sizeof expressions. Would at least extend comment to explain the issue.

This comment has been minimized.

Copy link
@jamesob

jamesob Oct 17, 2019

Author Member

I wanted to avoid duplicating the DynamicMemoryUsage() calculations explicitly because they're not totally trivial and it seemed like something that'd be bad to copy/paste. To be honest I don't understand why some 64-bit Travis platforms didn't slot into these initial allocations and so I can't think of what else to say here.

// cases, we can't really continue to make assertions about memory usage.
// End the test early.
if (view.DynamicMemoryUsage() != 32 && view.DynamicMemoryUsage() != 16) {
// Add a bunch of coins to see that we at least flip over to CRITICAL.

This comment has been minimized.

Copy link
@ryanofsky

ryanofsky Sep 30, 2019

Contributor

It might be clearer to move these fallback checks into a basic test and have the checks below be an extended test, running the basic test on all platforms and the extended test only on supported platforms. This would make the basic test more robust and easier to debug since it would run everywhere, and improve the extended test by making it shorter.

This comment has been minimized.

Copy link
@jamesob

jamesob Oct 17, 2019

Author Member

I tried breaking the extended tests out but the two end up sharing so much data that it's a pain to pass it all into the extended function. If you really think this is worthwhile I can think harder about how to break it up cleanly but in the meantime I don't think it's too egregious to leave as-is.

This comment has been minimized.

Copy link
@Sjors

Sjors Nov 27, 2019

Member

It should probably report when these tests are skipped, which seems to be the case on native 32 bit machines.

Why not inspect the value of view.DynamicMemoryUsage() to infer how many coins you need to add to reach CRITICAL? Alternatively, why not just keep adding coins until you do reach CRITICAL (with some upper bound to prevent an infinite loop if there's a serious bug)?

This comment has been minimized.

Copy link
@jamesob

jamesob Dec 5, 2019

Author Member

It should probably report when these tests are skipped

What are you thinking, BOOST_TEST_MESSAGE?

Why not inspect the value of view.DynamicMemoryUsage() to infer how many coins you need to add to reach CRITICAL?

I don't think it's that simple - different platforms/std::unordered_map implementations allocate differently, and so the current memory usage doesn't tell us enough about how memory usage will change as individual coins are added.

Alternatively, why not just keep adding coins until you do reach CRITICAL (with some upper bound to prevent an infinite loop if there's a serious bug)?

That's basically what I do below: https://github.com/bitcoin/bitcoin/pull/16945/files#diff-1c02ab3ed49056a4df577f3db9b53d30R64-R71

src/validation.cpp Outdated Show resolved Hide resolved
@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch from d780c6a to 028778a Oct 17, 2019
Copy link
Contributor

ryanofsky left a comment

Code review ACK 028778a. Changes since last review were getting rid of special -1 defaults values, tweaking the 64-bit test check.

@ariard
ariard approved these changes Oct 30, 2019
Copy link
Contributor

ariard left a comment

Code review ACK 028778a, tested validation_flush_tests on a 64-bit Linux platform.

Not sure about robustness of new test on all platforms...

//! Dictates whether we need to flush the cache to disk or not.
//!
//! @return the state of the size of the coins cache.
CoinsCacheSizeState GetCoinsCacheSizeState(const CTxMemPool& tx_pool)

This comment has been minimized.

Copy link
@ariard

ariard Oct 30, 2019

Contributor

Even if mempool is passed as a const reference can't we go further and pass only mempool memory usage ?

Also if wrapper is for test-only or convenience for its further usage inside PopulateAndValidateSnapshot add reason in function comment


//! No need to periodic flush if at least this much space still available.
static constexpr int64_t MAX_BLOCK_COINSDB_USAGE_BYTES = 10 * 1024 * 1024; // 10MB
int64_t large_threshold =

This comment has been minimized.

Copy link
@ariard

ariard Oct 30, 2019

Contributor

nit: nLargeThreshold (naming conv shouldn't be same with nTotalSpace?)

This comment has been minimized.

Copy link
@jamesob

jamesob Nov 5, 2019

Author Member

Snake case is preferred in new code per the styleguide (https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-c) -- I don't think naming new variables to match the naming convention of old ones is something we've asked for previously.

This comment has been minimized.

Copy link
@ariard

ariard Nov 7, 2019

Contributor

Ah thanks and so snaking case nTotalSpace?

This comment has been minimized.

Copy link
@Sjors

Sjors Nov 19, 2019

Member

total_space, cache_size and mempool_usage would make sense.

src/validation.h Outdated Show resolved Hide resolved
print_view_mem_usage(view);

// Only perform these checks on 64 bit hosts; I haven't done the math for 32.
if (is_64_bit) {

This comment has been minimized.

Copy link
@ariard

ariard Oct 30, 2019

Contributor

Hmmm I spent some time trying to do the math for 32 but without a 32-bit platform it's quite hard so I've tried to observe some allocation patterns thanks to print_view_mem_usage on 64-bit, but given there are allocation threshold effects on how std::unordered_map is behaving (common practices for allocators to multiply some magic numbers by 2 after being full) we can't do hard assumptions...

Anyway I think this test is really dependent on how the standard library is implemented to get the right max_coins_cache_bytes/max_mempool_size_bytes.

Have you tried to run this part of the test as it is on 32-bit platform ?

This comment has been minimized.

Copy link
@mzumsande

mzumsande Oct 31, 2019

Contributor

Just noting that the test fails on ARM (travis-log) because CoinsCacheSizeState::LARGE is not hit.

This comment has been minimized.

Copy link
@jamesob

jamesob Nov 5, 2019

Author Member

Yeah I'm beginning to think that this test is a fool's errand. I might early-exit if the first map expansion doesn't match some pattern, but in general everything feels a little flaky here.


// Passing non-zero max mempool usage should allow us more headroom.
BOOST_CHECK_EQUAL(
chainstate.GetCoinsCacheSizeState(tx_pool, MAX_COINS_CACHE_BYTES, /*max_mempool_size_bytes*/ 1 << 10),

This comment has been minimized.

Copy link
@ariard

ariard Oct 30, 2019

Contributor

nit: could use MAX_MEMPOOL_SIZE_BYTES = 1024, I like bitwise but maybe not everyone :p

This comment has been minimized.

Copy link
@jamesob

jamesob Nov 6, 2019

Author Member

I think this is fine as-is for tests.

@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch from 028778a to 513981f Nov 5, 2019
@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Nov 6, 2019

I've revised the new unittest so that it passes on Travis. I don't really follow the AppVeyor failure, so would appreciate any hints there.

@ryanofsky

This comment has been minimized.

Copy link
Contributor

ryanofsky commented Nov 6, 2019

I've revised the new unittest so that it passes on Travis. I don't really follow the AppVeyor failure, so would appreciate any hints there.

Appveyor failure is a known issue and should be fixed by #17384 (comment)

Copy link
Contributor

ryanofsky left a comment

Code review ACK 513981f. Changes since last review: replacing int64_t with size_t argument (not actually sure why, since it's called with an int64_t in the commit and only ever passed to std::max<int64_t>). Also: tweaking the
!is_64_bit test case to make it pass on travis. I lost track of the size calculations in the test a while ago, but they seem reasonable, and we can always scale back the test later if they turn out to be not stable.

@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Nov 6, 2019

Thanks for the review, Russ.

replacing int64_t with size_t argument

I did this to appease @ariard without thinking too hard about it but you're right, I think it doesn't make too much sense. Can revert if preferable.

@ariard

This comment has been minimized.

Copy link
Contributor

ariard commented Nov 7, 2019

Code review ACK 513981f.

Changes since last review: replacing int64_t with size_t argument (not actually sure why, since it's called with an int64_t in the commit and only ever passed to std::max<int64_t>).

I think more variable types need to be updated in GetCoinsCacheSizeState to be size_t to make sense. IIRC, it's better to use size_t as it let more flexibility to the compiler to optimize for the platform addresses size.

@fanquake fanquake added this to To do in Assume UTXO Nov 9, 2019
@laanwj laanwj added this to Blockers in High-priority for review Nov 14, 2019
@ryanofsky

This comment has been minimized.

Copy link
Contributor

ryanofsky commented Nov 18, 2019

Status of this PR? I guess it needs a little more review. Hopefully it shouldn't take much. The code change is simple even though the test is complicated.

@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Nov 18, 2019

Happy to make (or at least investigate) the size_t change if reviewers want, but also not eager to invalidate ACKs.

@ryanofsky

This comment has been minimized.

Copy link
Contributor

ryanofsky commented Nov 18, 2019

Happy to make (or at least investigate) the size_t change if reviewers want, but also not eager to invalidate ACKs.

I'm not at all concerned about the size_t thing, so feel to just push for merge or more review

Copy link
Member

Sjors left a comment

Code review ACK for the refactor 73e6503; I did not review the test commit. Nits are probably not worth losing acks.


//! No need to periodic flush if at least this much space still available.
static constexpr int64_t MAX_BLOCK_COINSDB_USAGE_BYTES = 10 * 1024 * 1024; // 10MB
int64_t large_threshold =

This comment has been minimized.

Copy link
@Sjors

Sjors Nov 19, 2019

Member

total_space, cache_size and mempool_usage would make sense.

max_coins_cache_size_bytes + std::max<int64_t>(max_mempool_size_bytes - nMempoolUsage, 0);

//! No need to periodic flush if at least this much space still available.
static constexpr int64_t MAX_BLOCK_COINSDB_USAGE_BYTES = 10 * 1024 * 1024; // 10MB

This comment has been minimized.

Copy link
@Sjors

Sjors Nov 19, 2019

Member

This is an odd place for a constant.

@ariard

This comment has been minimized.

Copy link
Contributor

ariard commented Nov 19, 2019

Happy to make (or at least investigate) the size_t change if reviewers want, but also not eager to invalidate ACKs.

Don't bother for the size_t, not worth invalidating ACKs.

@ryanofsky

This comment has been minimized.

Copy link
Contributor

ryanofsky commented Nov 21, 2019

Is this ready to be merged? It's a simple change that's had reviews it looks like from 5 people, including 3 reviews of the latest 513981f (Sjors ACK is for the latest code but has a different hash because he skipped the test-only commit).

@promag

This comment has been minimized.

Copy link
Member

promag commented Nov 24, 2019

Concept ACK.

@jamesob are you going to do something about https://github.com/bitcoin/bitcoin/pull/16945/files#r342743813?

@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Nov 26, 2019

@promag the tests have been fixed to be stable on CI (see the changes). I agree with @ryanofsky that this seems ready for merge, but if anyone else has any feedback they feel is blocking the PR, I'm happy to address.

Copy link
Member

Sjors left a comment

Having these tests just for 64 bit platforms is a good start. Making them work on different platforms goes beyond the scope of whats being refactored here, so can be done later.

return outp;
};

constexpr int COIN_SIZE = is_64_bit ? 80 : 64;

This comment has been minimized.

Copy link
@Sjors

Sjors Nov 27, 2019

Member

Some explanation would be useful here.

This comment has been minimized.

Copy link
@jamesob

jamesob Dec 17, 2019

Author Member

Is that change good or would you prefer to see something else?

// cases, we can't really continue to make assertions about memory usage.
// End the test early.
if (view.DynamicMemoryUsage() != 32 && view.DynamicMemoryUsage() != 16) {
// Add a bunch of coins to see that we at least flip over to CRITICAL.

This comment has been minimized.

Copy link
@Sjors

Sjors Nov 27, 2019

Member

It should probably report when these tests are skipped, which seems to be the case on native 32 bit machines.

Why not inspect the value of view.DynamicMemoryUsage() to infer how many coins you need to add to reach CRITICAL? Alternatively, why not just keep adding coins until you do reach CRITICAL (with some upper bound to prevent an infinite loop if there's a serious bug)?

Copy link
Member

promag left a comment

Code review ACK, could as well address @Sjors comments.

enum class CoinsCacheSizeState
{
//! The coins cache is in immediate need of a flush.
CRITICAL = 2,

This comment has been minimized.

Copy link
@promag

promag Dec 9, 2019

Member

Why you choose this order (2,1,0)?

This comment has been minimized.

Copy link
@jamesob

jamesob Dec 11, 2019

Author Member

Why not? It's so we can make statements like f8a21cd#diff-24efdb00bfbe56b140fb006b562cc70bR5331

This comment has been minimized.

Copy link
@promag
@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch from 513981f to 825ab6c Dec 11, 2019
@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Dec 11, 2019

I've pushed changes addressing @Sjors' feedback.

So far the ACK tally for the previous HEAD (and current HEAD, less comment changes) is

@promag - code review ACK
@Sjors - code review ACK
@ryanofsky - code review ACK
@ariard - code review ACK

@Sjors

This comment has been minimized.

Copy link
Member

Sjors commented Dec 12, 2019

Travis bloodbath :-(

This separates out some logic for detecting how full the coins cache is from
FlushStateToDisk. We'll want to reuse this logic when deciding when to flush
the coins cache during UTXO snapshot activation.
@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch from 825ab6c to 0c20cd4 Dec 12, 2019
@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Dec 12, 2019

Travis bloodbath :-(

Rebased to avoid carnage by fixing a silent merge conflict.

@jamesob jamesob force-pushed the jamesob:2019-09-au-shouldflush branch from 0c20cd4 to 02b9511 Dec 12, 2019
Copy link
Contributor

ryanofsky left a comment

Code review ACK 02b9511. Just rebase, new COIN_SIZE comment, and new test message since last review

@ryanofsky

This comment has been minimized.

Copy link
Contributor

ryanofsky commented Dec 17, 2019

Saw this in IRC:

it's discouraging to think that #16945 probably just would've sailed through review if I hadn't bothered to add a unittest

I understand the sentiment, but I also do think the unit test is hard to understand, and I would not like to have to debug it in the future. I'm comfortable acking it now with the hope that maybe it is the type of thing that's a little complicated to put in place, but once there will be reliable and useful. But if it started breaking in the future and the fix wasn't obvious I could see wanting to drop it instead of update it.

The basic problem is that is that instead of being a test of the just
GetCoinsCacheSizeState function, it is a test of that function plus all the DynamicMemoryUsage functions which are very platform specific. I think a more ideal unit test would either mock or wrap the the DynamicMemoryUsage functions to be independent of the implementations of those functions.

@jamesob

This comment has been minimized.

Copy link
Member Author

jamesob commented Dec 17, 2019

The basic problem is that is that instead of being a test of the just
GetCoinsCacheSizeState function, it is a test of that function plus all the DynamicMemoryUsage functions which are very platform specific. I think a more ideal unit test would either mock or wrap the the DynamicMemoryUsage functions to be independent of the implementations of those functions.

Personally I think it's more desirable to have wider/incidental coverage (at the expense of lower specificity in terms of what failed) because the coins cache is a critically important part of the system. We should be alerted if e.g. the size of the coins cache entries changes, since that has pretty drastic effects on performance. Our unittest coverage is paltry as it stands, and when that's the case I tend more towards adding integration- rather than strict unit-tests to get as much coverage out of what few tests we do have.

I'd much prefer to be hyper-aware of what's going on with the coins cache and have to update a few tests once in a while than risk being unaware of even a small change to such an important part of the system. If there are ways I can make the test clearer or better, I'm all ears, but I am very hesitant to look to mocks - not only because that will probably require substantive changes to code purely for the sake of testing, but because for a system like Bitcoin it seems preferable have overzealous test coverage than the illusion of it thanks to a constellation of mocks.

There is an argument to be made that a test about when to flush the cache isn't an appropriate place to make assertions about the size of its contents, but it was convenient and this stuff wasn't tested so I figured better have it somewhere than not at all.

@fanquake fanquake moved this from To do to In progress in Assume UTXO Jan 4, 2020
@ryanofsky

This comment has been minimized.

Copy link
Contributor

ryanofsky commented Jan 8, 2020

It seems like this has had enough review, but some of the previous reviews are out of date. @promag, @Sjors, @ariard maybe you could reack?

@ariard

This comment has been minimized.

Copy link
Contributor

ariard commented Jan 9, 2020

Code review ACK 02b9511.

Changes since 513981f, documenting better new test and printing an exit message in case of exotic archs.

laanwj added a commit that referenced this pull request Jan 13, 2020
02b9511 tests: add tests for GetCoinsCacheSizeState (James O'Beirne)
b17e91d refactoring: introduce CChainState::GetCoinsCacheSizeState (James O'Beirne)

Pull request description:

  This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11):

  Parent PR: #15606
  Issue: #15605
  Specification: https://github.com/jamesob/assumeutxo-docs/tree/master/proposal

  ---

  This pulls out the routine for detection of how full the coins cache is from
  FlushStateToDisk. We use this logic independently when deciding when to flush
  the coins cache during UTXO snapshot activation ([see here](231fb5f#diff-24efdb00bfbe56b140fb006b562cc70bR5275)).

ACKs for top commit:
  ariard:
    Code review ACK 02b9511.
  ryanofsky:
    Code review ACK 02b9511. Just rebase, new COIN_SIZE comment, and new test message since last review

Tree-SHA512: 8bdd78bf68a4a5d33a776e73fcc2857f050d6d102caa4997ed19ca25468c1358e6e728199d61b423033c02e6bc8f00a1d9da52cf17a2d37d70860fca9237ea7c
@laanwj laanwj removed this from Blockers in High-priority for review Jan 13, 2020
@laanwj laanwj merged commit 02b9511 into bitcoin:master Jan 13, 2020
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@fanquake fanquake moved this from In progress to Done in Assume UTXO Jan 13, 2020
sidhujag added a commit to syscoin/syscoin that referenced this pull request Jan 14, 2020
…zeState

02b9511 tests: add tests for GetCoinsCacheSizeState (James O'Beirne)
b17e91d refactoring: introduce CChainState::GetCoinsCacheSizeState (James O'Beirne)

Pull request description:

  This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11):

  Parent PR: bitcoin#15606
  Issue: bitcoin#15605
  Specification: https://github.com/jamesob/assumeutxo-docs/tree/master/proposal

  ---

  This pulls out the routine for detection of how full the coins cache is from
  FlushStateToDisk. We use this logic independently when deciding when to flush
  the coins cache during UTXO snapshot activation ([see here](bitcoin@231fb5f#diff-24efdb00bfbe56b140fb006b562cc70bR5275)).

ACKs for top commit:
  ariard:
    Code review ACK 02b9511.
  ryanofsky:
    Code review ACK 02b9511. Just rebase, new COIN_SIZE comment, and new test message since last review

Tree-SHA512: 8bdd78bf68a4a5d33a776e73fcc2857f050d6d102caa4997ed19ca25468c1358e6e728199d61b423033c02e6bc8f00a1d9da52cf17a2d37d70860fca9237ea7c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Assume UTXO
  
Done
9 participants
You can’t perform that action at this time.