Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlite: Disallow writing from multiple SQLiteBatchs #29112

Merged
merged 3 commits into from Feb 8, 2024

Conversation

achow101
Copy link
Member

@achow101 achow101 commented Dec 18, 2023

The way that we have configured SQLite to run means that only one database transaction can be open at a time. Typically, each individual read and write operation will be its own transaction that is opened and committed automatically by SQLite. However, sometimes we want these operations to be batched into a multi-statement transaction, so SQLiteBatch::TxnBegin, SQLiteBatch::TxnCommit, and SQLiteBatch::TxnAbort are used to manage the transaction of the database.

However, once a db transaction is begun with one SQLiteBatch, any operations performed by another SQLiteBatch will also occur within the same transaction. Furthermore, those other SQLiteBatchs will not be expecting a transaction to be active, and will abort it once the SQLiteBatch is destructed. This is problematic as it will prevent some data from being written, and also cause the SQLiteBatch that opened the transaction in the first place to be in an unexpected state and throw an error.

To avoid this situation, we need to prevent the multiple batches from writing at the same time. To do so, I've implemented added a CSemaphore within SQLiteDatabase which will be used by any SQLiteBatch trying to do a write operation. wait() is called by TxnBegin, and at the beginning of WriteKey, EraseKey, and ErasePrefix. post() is called in TxnCommit, TxnAbort and at the end of WriteKey, EraseKey, and ErasePrefix. To avoid deadlocking on TxnBegin() followed by a WriteKey(), `SQLiteBatch will now also track whether a transaction is in progress so that it knows whether to use the semaphore.

This issue is not a problem for BDB wallets since BDB uses WAL and provides transaction objects that must be used if an operation is to occur within a transaction. Specifically, we either pass a transaction pointer, or a nullptr, to all BDB operations, and this allows for concurrent transactions so it doesn't have this problem.

Fixes #29110

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 18, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK furszy, ryanofsky, josibake

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@maflcko
Copy link
Member

maflcko commented Dec 19, 2023

                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/wallet_descriptor.py", line 68, in test_concurrent_writes
                                       assert_equal(cache_records, 10000)
                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 57, in assert_equal
                                       raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
                                   AssertionError: not(0 == 10000)

@achow101 achow101 force-pushed the sqlite-concurrent-writes branch 2 times, most recently from a5475a2 to 70f8870 Compare December 19, 2023 20:52
Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found possible deadlock causes and crafted a test that exercises one of them: furszy@20344b9. Feel free to pull it here.

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of completeness and to avoid potential issues, have expanded the test coverage to exercise the previously mentioned "concurrent database transactions deadlock" behavior. All yours: furszy@a80cc56. It verifies that the database handler cannot commit/abort changes occurring in a different database handler.

Also, while was working on this, discovered another deadlocking cause.. test furszy@68e77b5. As it is a bdb-only issue, we could address it in another PR. But, it will mean that furszy@a80cc56 will need to be adapted to only run on sqlite (replace TestDatabases() for sqlite db creation call).

src/wallet/test/db_tests.cpp Outdated Show resolved Hide resolved
@achow101
Copy link
Member Author

achow101 commented Jan 3, 2024

Also, while was working on this, discovered another deadlocking cause.. test furszy@68e77b5. As it is a bdb-only issue, we could address it in another PR. But, it will mean that furszy@a80cc56 will need to be adapted to only run on sqlite (replace TestDatabases() for sqlite db creation call).

Hmm, seems like BDB doesn't behave how I thought it did. It looks like we can't have concurrent transactions that try to operate on the same data, at least not without changing the isolation level, and I don't think it's worth the effort to figure that out when we're planning on removing BDB soon anyways. I've taken your test and updated it to just use SQLite.

I don't think this particular issue is a problem for us however. The fact that it deadlocks in this particular test case is probably a good thing since it means that BDB is handling locking internally. If there were two threads trying to read/write at the same time, BDB would force one thread to wait for the other, as this PR is doing for SQLite. That's the behavior that we want. It's only a deadlock (and test failure) since the test is trying to do the behavior of two threads within one thread.

src/wallet/test/db_tests.cpp Outdated Show resolved Hide resolved
Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this particular issue is a problem for us however. The fact that it deadlocks in this particular test case is probably a good thing since it means that BDB is handling locking internally. If there were two threads trying to read/write at the same time, BDB would force one thread to wait for the other, as this PR is doing for SQLite. That's the behavior that we want. It's only a deadlock (and test failure) since the test is trying to do the behavior of two threads within one thread.

The issue I envision is nested WalletBatch within the same thread. We need to be careful on not introducing an inner WalletBatch creation when there is an existing WalletBatch on any upper level.

Still, agree on not doing the changes here. Isn't related. I should revive #25297 after we finish #28574.

Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible deadlock can arise when the txn abortion fails during the destruction of the batch object.
Essentially, if the txn abortion fails and the object is destructed, the semaphore remains locked indefinitely, leading to a deadlock in any subsequent write operation.

Made a test replicating the behavior; cherry-pick 5d2b79a and fbd59f8 (from https://github.com/furszy/bitcoin-core/commits/sqlite-concurrent-writes/ )

@achow101
Copy link
Member Author

Another possible deadlock can arise when the txn abortion fails during the destruction of the batch object. Essentially, if the txn abortion fails and the object is destructed, the semaphore remains locked indefinitely, leading to a deadlock in any subsequent write operation.

Made a test replicating the behavior; cherry-pick 5d2b79a and fbd59f8 (from https://github.com/furszy/bitcoin-core/commits/sqlite-concurrent-writes/ )

Changed TxnAbort and TxnCommit to always release the lock. Also pulled the test, and modified it to test TxnCommit and explicit TxnAbort

Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed TxnAbort and TxnCommit to always release the lock.

This breaks the db txn isolation property.
The db txn is still active at the engine level, the lock shouldn't be released when the sqlite statement fails to execute (only when the abort/commit command executes successfully). Otherwise, other concurrent handlers are allowed to access db at the same time and, in the worst possible outcome, dump information to disk that should have been discarded.

Check this out 7117099. Made a test that exercises the behavior. Feel free to pull it.

Only the handler who originated the db txn should be able to commit and/or abort such txn, not other handlers. If other handlers were able to commit txns that they did not start, they could be saving incomplete information to disk and affecting other processes.

Aside from that, thanks for pulling all the tests commits. We are getting closer.

@achow101
Copy link
Member Author

The db txn is still active at the engine level, the lock shouldn't be released when the sqlite statement fails to execute (only when the abort/commit command executes successfully). Otherwise, other concurrent handlers are allowed to access db at the same time and, in the worst possible outcome, dump information to disk that should have been discarded.

If it fails to abort when the batch is destructed, then we're deadlocked. I don't see how this can be resolved without deadlocking and still preserving isolation.

@furszy
Copy link
Member

furszy commented Jan 15, 2024

If it fails to abort when the batch is destructed, then we're deadlocked. I don't see how this can be resolved without deadlocking and still preserving isolation.

Working on a solution. Will push it in few hours.

@achow101
Copy link
Member Author

Reverted this to b316a81 as #29253 deals with the problem of failing TxnAbort() and TxnCommit().

achow101 added a commit that referenced this pull request Jan 31, 2024
…ctions

b298242 test: sqlite, add coverage for dangling to-be-reverted db txns (furszy)
fc0e747 sqlite: guard against dangling to-be-reverted db transactions (furszy)
472d2ca sqlite: introduce HasActiveTxn method (furszy)
dca874e sqlite: add ability to interrupt statements (furszy)
fdf9f66 test: wallet db, exercise deadlock after write failure (furszy)

Pull request description:

  Discovered while was reviewing #29112, specifically #29112 (review).

  If the db handler that initiated the database transaction is destroyed,
  the ongoing transaction cannot be left dangling when the db txn fails
  to abort. It must be forcefully reverted; otherwise, any subsequent
  db handler executing a write operation will dump the dangling,
  to-be-reverted transaction data to disk.

  This not only breaks the isolation property but also results in the
  improper storage of incomplete information on disk, impacting
  the wallet consistency.

  This PR fixes the issue by resetting the db connection, automatically
  rolling back the transaction (per https://www.sqlite.org/c3ref/close.html)
  when the handler object is being destroyed and the txn abortion failed.

  Testing Notes
  Can verify the failure by reverting the fix e5217fe and running the test.
  It will fail without e5217fe and pass with it.

ACKs for top commit:
  achow101:
    ACK b298242
  ryanofsky:
    Code review ACK b298242. Just fix for exec result codes and comment update since last review.

Tree-SHA512: 44ba0323ab21440e79e9d7791bc1c56a8873c8bd3e8f6a85641b91576e1293011fa8032d8ae5b0580f3fb7a949356f7b9676693d7ceffa617aaad9f6569993eb
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review fe617be. Looks good but I think there is still a race in SQLiteBatch::Close and a semaphore leak in SQLiteBatch::TxnBegin. Still reviewing, mostly just looked at the first commit so far.

src/wallet/sqlite.h Outdated Show resolved Hide resolved
src/wallet/sqlite.h Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code reviewed, looking good. Left a last comment.

Also, have slightly modified my previous unit test to validate the changes (ba333a2).
It is not really needed but feel free to take it if you like it (conceptually, it is similar to your test, the diff is that it is verifying the writes execution order and checks that when a batch finishes writing data to disk, another batch in a concurrent process can read the data immediately).

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved
@josibake
Copy link
Member

josibake commented Feb 6, 2024

Concept ACK

Started reviewing, looks good. Will continue review tomorrow. Might be worth updating the last part of the description with the insights from #29112 (comment) ?

achow101 and others added 3 commits February 6, 2024 12:24
A SQLiteBatch need to wait for any other batch to finish writing before
it can begin writing, otherwise db txn state may be incorrectly
modified. To enforce this, each SQLiteDatabase has a semaphore which
acts as a lock and is acquired by a batch when it begins a write, erase,
or a transaction, and is released by it when it is done.

To avoid deadlocking on itself for writing during a transaction,
SQLiteBatch also keeps track of whether it has begun a transaction.
There are occasions where a multi-statement tx is begun in one batch,
and a second batch is created which does a normal write (without a
multi-statement tx). These should not conflict with each other and all
of the data should end up being written to disk.
Verifying that a database handler can't commit/abort changes
occurring in a different database handler.
Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK cfcb9b1

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK cfcb9b1. This looks great and I think it is ready for merge. Just holding off because josibake seemed ready to review #29112 (comment) and might have more feedback.

I did leave some comments but they are all minor and can be ignored. I was was also wondering about this in the PR description:

re: #29112 (comment)

To avoid this situation, we need to prevent the multiple batches from writing at the same time. Although SQLite does have facilities for doing that, they require the Write Ahead Log (WAL) mode, which is a feature that we explicitly chose to disable since it spread wallet data to multiple files. As such, we need to be handling the concurrency ourselves.

This doesn't seem accurate to me. As long as all the batch objects are sharing a sqlite3* m_db pointer and using the same connection to the database, a semaphore or mutex needs to be used to keep their transactions separate. This would be true whether the database is in WAL or journaling mode. Also if each batch object had it's own sqlite3* m_db connection, the semaphore should not be needed regardless of the database mode, IIUC.

@@ -33,6 +36,41 @@ def skip_test_if_missing_module(self):
self.skip_if_no_sqlite()
self.skip_if_no_py_sqlite3()

def test_concurrent_writes(self):
self.log.info("Test sqlite concurrent writes are in the correct order")
self.restart_node(0, extra_args=["-unsafesqlitesync=0"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "tests: Test for concurrent writes with db tx" (548ecd1)

Could there be a comment to explain unsafesqlitesync? Is it just to make the test faster, or is it actually important to the functioning of the test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The option's help text states Set SQLite synchronous=OFF to disable waiting for the database to sync to disk. This is unsafe and can cause data loss and corruption. This option is only used by tests to improve their performance (default: false)

bilingual_str error;
const auto& database = MakeSQLiteDatabase(m_path_root / "sqlite", options, status, error);

std::unique_ptr<DatabaseBatch> handler = Assert(database)->MakeBatch();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1)

Would suggest calling this "handler1" instead of "handler" because otherwise when the comments below refer to a "handler" is not clear if they are referring to this variable specifically or a handler generically.

Also s/handler/batch/g throughout this function would make it more meaningful and consistent with the rest of wallet code, IMO

BOOST_CHECK(handler2->Exists(key));

// Attempt to commit the handler txn calling the handler2 methods.
// Which, must not be possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1)

Would be good to clarify why it is not possible. Maybe "Which must not be possible because batch2->TxnBegin() was never called."

@@ -279,8 +279,48 @@ BOOST_AUTO_TEST_CASE(txn_close_failure_dangling_txn)
BOOST_CHECK(!batch2->Exists(key));
}

#endif // USE_SQLITE
BOOST_AUTO_TEST_CASE(concurrent_txn_dont_interfere)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1

Nothing about this test seems sqlite-specific. Seems like it would be nicer to call TestDatabases and be generic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was originally generic and it was moved to sqlite-only because of a bdb bug, see #29112 (review).

Comment on lines +51 to +52
# ChainStateFlushed which will trigger a write to the db, hopefully at the
# same time that the topup still has an open db transaction.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "tests: Test for concurrent writes with db tx" (548ecd1)

This seems fine, but you could probably increase the probability of this by calling keypoolrefill in a loop until gettxoutsetinfo returns instead of calling it once.

@josibake
Copy link
Member

josibake commented Feb 7, 2024

ACK cfcb9b1

Looks good, nothing to add. Happy to reACK if you end up taking ryanofsky's suggestions.

@DrahtBot DrahtBot removed the request for review from josibake February 7, 2024 10:03
@achow101
Copy link
Member Author

achow101 commented Feb 7, 2024

To avoid this situation, we need to prevent the multiple batches from writing at the same time. Although SQLite does have facilities for doing that, they require the Write Ahead Log (WAL) mode, which is a feature that we explicitly chose to disable since it spread wallet data to multiple files. As such, we need to be handling the concurrency ourselves.

This doesn't seem accurate to me. As long as all the batch objects are sharing a sqlite3* m_db pointer and using the same connection to the database, a semaphore or mutex needs to be used to keep their transactions separate. This would be true whether the database is in WAL or journaling mode. Also if each batch object had it's own sqlite3* m_db connection, the semaphore should not be needed regardless of the database mode, IIUC.

Indeed, I misunderstood the documentation. Dropped the sentence.


Going to leave the nits for now.

@ryanofsky ryanofsky self-assigned this Feb 8, 2024
@ryanofsky ryanofsky merged commit 801ef07 into bitcoin:master Feb 8, 2024
16 checks passed
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 14, 2024
A SQLiteBatch need to wait for any other batch to finish writing before
it can begin writing, otherwise db txn state may be incorrectly
modified. To enforce this, each SQLiteDatabase has a semaphore which
acts as a lock and is acquired by a batch when it begins a write, erase,
or a transaction, and is released by it when it is done.

To avoid deadlocking on itself for writing during a transaction,
SQLiteBatch also keeps track of whether it has begun a transaction.

Github-Pull: bitcoin#29112
Rebased-From: 395bcd2
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 14, 2024
There are occasions where a multi-statement tx is begun in one batch,
and a second batch is created which does a normal write (without a
multi-statement tx). These should not conflict with each other and all
of the data should end up being written to disk.

Github-Pull: bitcoin#29112
Rebased-From: 548ecd1
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 14, 2024
Verifying that a database handler can't commit/abort changes
occurring in a different database handler.

Github-Pull: bitcoin#29112
Rebased-From: cfcb9b1
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 14, 2024
A SQLiteBatch need to wait for any other batch to finish writing before
it can begin writing, otherwise db txn state may be incorrectly
modified. To enforce this, each SQLiteDatabase has a semaphore which
acts as a lock and is acquired by a batch when it begins a write, erase,
or a transaction, and is released by it when it is done.

To avoid deadlocking on itself for writing during a transaction,
SQLiteBatch also keeps track of whether it has begun a transaction.

Github-Pull: bitcoin#29112
Rebased-From: 395bcd2
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 14, 2024
There are occasions where a multi-statement tx is begun in one batch,
and a second batch is created which does a normal write (without a
multi-statement tx). These should not conflict with each other and all
of the data should end up being written to disk.

Github-Pull: bitcoin#29112
Rebased-From: 548ecd1
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Mar 14, 2024
Verifying that a database handler can't commit/abort changes
occurring in a different database handler.

Github-Pull: bitcoin#29112
Rebased-From: cfcb9b1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ci: failure in wallet_basic.py --descriptors
6 participants