Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use NuDB burst size #3662

Closed
wants to merge 1 commit into from
Closed

Conversation

miguelportilla
Copy link
Contributor

Use NuDB version 2.0.5:

High Level Overview of Change

NuDB insertion latency sometimes shows large spikes due to two defects in NuDB's write throttling code:

If a burst of inserts occurs immediately after a batch write finishes, they may temporarily exceed the proven sustained write
rate. This is common and doesn't require any throttling.

When a small batch write completes, the execution time may be dominated by overhead, resulting in an artificially low measurement of the sustained store rate. We don't want to replace our measurement from a large batch with that one.

NuDB 2.0.5 allows a burst size that can be set with a "set_burst" function. This PR uses that function to set the burst size according to the node size.

Type of Change

  • New feature (non-breaking change which adds functionality)
  • Refactor (non-breaking change that only restructures code)

Use NuDB version 2.0.5:
Copy link
Collaborator

@JoelKatz JoelKatz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Those reviewing this change should understand that the burst size parameter just tunes the way NuDB detects and measures the sustained write rate. It is a minimum to avoid poor performance when writes are small and in no way a limit on the write size. I've seen writes as large as 400MB with and without the burst size code (and NuDB needs to keep two writes in memory).

Copy link
Contributor

@nbougalis nbougalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left two comments; I'd appreciate a discussion on the size; the reformatting is at your discretion.

{SizedItem::txnDBCache, {{4, 12, 24, 64, 128}}},
{SizedItem::lgrDBCache, {{4, 8, 16, 32, 128}}},
{SizedItem::openFinalLimit, {{8, 16, 32, 64, 128}}},
{SizedItem::burstSize, {{4, 8, 16, 32, 48}}},
Copy link
Contributor

@nbougalis nbougalis Nov 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two points:

  1. We should think through these numbers and make sure that they are realistic for the given "node sizes". Why is "4MB" appropriate and why is x12 an appropriate factor when going from "tiny" to "huge"?

  2. Please consider reformatting this thusly:

// clang-format off
inline constexpr std::array<std::pair<SizedItem, std::array<int, 5>>, 12>
sizedItems
{{
    // FIXME: We should document each of these items separately, explaining
    //        exactly what they control and whether there exists an explicit
    //        config option that can be used to override the default.
    { SizedItem::sweepInterval,   {{     10,     30,     60,     90,     120 }} },
    { SizedItem::treeCacheSize,   {{ 128000, 256000, 512000, 768000, 2048000 }} },
    { SizedItem::treeCacheAge,    {{     30,     60,     90,    120,     900 }} },
    { SizedItem::ledgerSize,      {{     32,    128,    256,    384,     768 }} },
    { SizedItem::ledgerAge,       {{     30,     90,    180,    240,     900 }} },
    { SizedItem::ledgerFetch,     {{      2,      3,      4,      5,       8 }} },
    { SizedItem::nodeCacheSize,   {{  16384,  32768, 131072, 262144,  524288 }} },
    { SizedItem::nodeCacheAge,    {{     60,     90,    120,    900,    1800 }} },
    { SizedItem::hashNodeDBCache, {{      4,     12,     24,     64,     128 }} },
    { SizedItem::txnDBCache,      {{      4,     12,     24,     64,     128 }} },
    { SizedItem::lgrDBCache,      {{      4,      8,     16,     32,     128 }} },
    { SizedItem::openFinalLimit,  {{      8,     16,     32,     64,     128 }} }
    { SizedItem::burstSize,       {{      4,      8,     16,     32,      48 }} },
}};
// clang-format on

I love clang-format and having a consistent style but, like Morpheus says:

What you must learn is that these rules are no different from the rules of a computer system. Some of them can be bent, others can be broken.

And this is a case where rules can be bent and/or broken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4MB is the default NuDB size and a good minimum sized beneficial threshold. David observed increasing benefits up to 32MB, where diminishing returns begin. There is little gain with 48MB but that correlates with "huge".

@carlhua carlhua added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Nov 18, 2020
This was referenced Nov 19, 2020
@miguelportilla miguelportilla deleted the nudb_burst branch January 28, 2021 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants