Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase page size for large SQLite databases upon creation. #2635

Closed
wants to merge 1 commit into from

Conversation

mtrippled
Copy link
Collaborator

No description provided.

@@ -25,6 +25,9 @@ namespace ripple {
// Transaction database holds transactions and public keys
const char* TxnDBInit[] =
{
"PRAGMA journal_mode=OFF",
"PRAGMA page_size=65536;",
"VACUUM;",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VACUUM would be bad here...I think we just want the page_size in both inits.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mellery451 Good catch. For some reason I had problems initially setting the page_size and (wrongly) assumed that this was only called upon DB creation rather than every time rippled starts. Anyway, the placement of setting page_size seems to matter for it to take effect immediately upon database creation, so I put it at the beginning.

@ripplelabs-jenkins
Copy link
Collaborator

ripplelabs-jenkins commented Jul 28, 2018

Jenkins Build Summary

Built from this commit

Built at 20180807 - 18:48:13

Test Results

Build Type Log Result Status
msvc.Debug logfile 1031 cases, 0 failed, t: 528s PASS ✅
gcc.Debug
-Dcoverage=ON
logfile 1034 cases, 0 failed, t: 842s PASS ✅
docs logfile 1 cases, 0 failed, t: 2s PASS ✅
gcc.Release
-Dassert=ON,
MANUAL_TESTS=true
logfile 1101 cases, 0 failed, t: 924s PASS ✅
msvc.Debug,
NINJA_BUILD=true
logfile 1031 cases, 0 failed, t: 550s PASS ✅
clang.Debug logfile 1034 cases, 0 failed, t: 277s PASS ✅
rpm logfile 1034 cases, 0 failed, t: n/a PASS ✅
gcc.Debug logfile 1034 cases, 0 failed, t: 289s PASS ✅
gcc.Debug
-Dunity=OFF
logfile 1030 cases, 0 failed, t: 117s PASS ✅
clang.Debug
-Dunity=OFF
logfile 1030 cases, 0 failed, t: 695s PASS ✅
clang.Release
-Dassert=ON
logfile 1034 cases, 0 failed, t: 395s PASS ✅
gcc.Release
-Dassert=ON
logfile 1034 cases, 0 failed, t: 421s PASS ✅
msvc.Debug
-Dunity=OFF
logfile 1027 cases, 0 failed, t: 1151s PASS ✅
gcc.Debug
-Dstatic=OFF
logfile 1034 cases, 0 failed, t: 288s PASS ✅
gcc.Debug
-Dstatic=OFF -DBUILD_SHARED_LIBS=ON
logfile 1034 cases, 0 failed, t: 293s PASS ✅
gcc.Debug,
NINJA_BUILD=true
logfile 1034 cases, 0 failed, t: 263s PASS ✅
msvc.Release logfile 1031 cases, 0 failed, t: 391s PASS ✅
clang.Debug
-Dunity=OFF -Dsan=address,
PARALLEL_TESTS=false,
DEBUGGER=false
logfile 1030 cases, 0 failed, t: 1108s PASS ✅
clang.Debug
-Dunity=OFF -Dsan=undefined,
PARALLEL_TESTS=false
logfile 1030 cases, 0 failed, t: 954s PASS ✅

@codecov-io
Copy link

codecov-io commented Jul 28, 2018

Codecov Report

Merging #2635 into develop will decrease coverage by 0.1%.
The diff coverage is 3.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop   #2635      +/-   ##
==========================================
- Coverage    70.71%   70.6%   -0.11%     
==========================================
  Files          697     697              
  Lines        54556   54636      +80     
==========================================
- Hits         38580   38577       -3     
- Misses       15976   16059      +83
Impacted Files Coverage Δ
src/ripple/app/main/DBInit.cpp 100% <ø> (ø) ⬆️
src/ripple/app/main/Application.cpp 56.82% <2.85%> (-2.08%) ⬇️
src/ripple/app/main/Main.cpp 38.62% <4.25%> (-5.9%) ⬇️
src/ripple/basics/DecayingSample.h 76.31% <0%> (-10.53%) ⬇️
src/ripple/app/misc/SHAMapStoreImp.cpp 72.53% <0%> (-0.24%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 09050a8...937f4d2. Read the comment docs.

@seelabs
Copy link
Collaborator

seelabs commented Jul 30, 2018

The granularity of disk IO is the disk's block size. A typical block size is 4K. This change sets the sqlite page size to 64K. I think this means we do 16x more reads and writes than if we set the sqlite page size to 4K. Sqlite sets its default page size to 4K, and postgres sets its default page size to 8K. I'm a little worried about the performance implications of this change. Should I be or is there a reason it doesn't matter?

I agree that we need to change the pagesize from the current 1K. I'm not sure we need to to all the way to 64K. Maybe set it to 8K or 16K?

@mtrippled
Copy link
Collaborator Author

@seelabs Good point--I did a benchmark test with 64K pages and the results are quite bad: 1251/s (whereas normal vary between about 1500-1900). So next I will try 8KB.

Copy link
Contributor

@mellery451 mellery451 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@nbougalis
Copy link
Contributor

Most modern SSDs that I am aware of have pages that are 4K and never read single pages; they read blocks of pages. That's the theory anyways, and if you tested this and 4K and 8K pages result in better I/O patterns, then we should go for that.

@mtrippled
Copy link
Collaborator Author

I just tested 4KB page size to still perform well. 8 and 64 both cause noticeable throughput decline. Since 4KB seems to be the size for most SSD's and also I think linux I/O pages, this is the optimal number.

@seelabs
Copy link
Collaborator

seelabs commented Aug 1, 2018

@mtrippled What's the estimate for how long before we run out of pages with 4K pages? If that's too short we probably need to take the speed hit and use larger pages.

@mtrippled
Copy link
Collaborator Author

@mellery451 @seelabs I added a new commit with some more features. Please re-review this.

@@ -1034,6 +1034,41 @@ class ApplicationImp
<< "Remaining free disk space is less than 512MB";
signalStop ();
}

static std::atomic<std::uint32_t> pageSize {};
static std::atomic<std::uint32_t> maxPages;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using lambdas to initialize these statics would simplify things. It would get rid of the std::atomic and we could make it const. Something like:

            auto db = mTxnDB->checkoutDb();
            static std::uint32_t const pageSize = [&]{
                std::uint32_t ps;
                *db << "PRAGMA page_size;", soci::into(ps);
                return ps;
            }();
            static std::uint32_t const maxPages = [&]{
                std::uint32_t mp;
                *db << "PRAGMA max_page_count;" , soci::into(mp);
                return mp;
            }();

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Good idea to simplify things. Also glad that C++11 lets us do this safely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the parameter list is empty, you can use static auto const pageSize =

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -342,6 +346,9 @@ int run (int argc, char** argv)
("nodetoshard", "Import node store into shards")
("replay","Replay a ledger close.")
("start", "Start from a fresh Ledger.")
("vacuum", po::value<std::string>(),
"VACUUM the transaction db. Optional string argument specifies "
"temporary directory path.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter to vacuum is not optioal, it's required.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

"Note that this activity can take multiple days, "
"depending on database size."
"page_size count max: "
<< pageSize << " " << pageCount << " " << maxPages;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A stringstream isn't necessary, you can stream into the journal.
eg.

JLOG(m_journal.fatal()) <<
    "Free SQLite space for transaction db is less than "
    "512MB. To fix this, rippled must be executed with the "
    "vacuum <tmpdir> parameter before restarting. "
    "Note that this activity can take multiple days, "
    "depending on database size."
    "page_size count max: " <<
    pageSize << " " <<
    pageCount << " " <<
    maxPages;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. As I was writing that code, I was thinking to myself "wouldn't it be nice if the journal object did this."

Copy link
Collaborator

@seelabs seelabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@seelabs
Copy link
Collaborator

seelabs commented Aug 7, 2018

Nit: The commit message needs to be amended. There needs to be a blank line between the subject and the body, and the subject should end in a colon instead of a period. Run git log --oneline to see how it looks.

Increase page size for SQLite transaction database upon creation
Provide diagnostics for transaction db page usage.
Shut down rippled gracefullly if transaction db is running out of pages.
Add new rippled maintenance command line option to cause new page size
to take effect.
@mtrippled mtrippled added the Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required. label Aug 7, 2018
@mtrippled
Copy link
Collaborator Author

@seelabs commit message fixed

JLOG(m_journal.error())
<< "Error checking transaction db file size: "
<< ec.message();
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this return prevents us from doing the remaining sweeps and cleanup in the event of the fs error.

}

auto db = mTxnDB->checkoutDb();
static auto const pageSize = [&]{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could these lambdas be made into a single [&] get_pragma_value(char const* fieldname) -> uint32_t and then called three times (page_size, max_page_count, page_count)?

if (vm.count("vacuum"))
{
DatabaseCon::Setup dbSetup = setup_DatabaseCon(*config);
if (dbSetup.standAlone)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have used config->standalone() myself, but either way works I guess.

@@ -488,6 +495,81 @@ int run (int argc, char** argv)
crypto_prng().load_state(entropy.string ());
}

if (vm.count("vacuum"))
{
DatabaseCon::Setup dbSetup = setup_DatabaseCon(*config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not critical, but this vacuum code seems like a good candidate for a small separate function, possibly placed in SociDB.cpp (where there are already a few util-like functions). Up to you, but it might also make it possible to write a test for it then as well.

catch (soci::soci_error const& e)
{
std::cerr << "SQLite error: " << e.what() << '\n';
return 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice this return value is different than the other failure cases...it doesn't matter although I wondered if we impart meaning to exit values in this case.

@seelabs
Copy link
Collaborator

seelabs commented Aug 9, 2018

In 1.1.0-rc1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Passed Passed code review & PR owner thinks it's ready to merge. Perf sign-off may still be required.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants