sqlite: Disallow writing from multiple `SQLiteBatch`s #29112

achow101 · 2023-12-18T22:27:10Z

The way that we have configured SQLite to run means that only one database transaction can be open at a time. Typically, each individual read and write operation will be its own transaction that is opened and committed automatically by SQLite. However, sometimes we want these operations to be batched into a multi-statement transaction, so SQLiteBatch::TxnBegin, SQLiteBatch::TxnCommit, and SQLiteBatch::TxnAbort are used to manage the transaction of the database.

However, once a db transaction is begun with one SQLiteBatch, any operations performed by another SQLiteBatch will also occur within the same transaction. Furthermore, those other SQLiteBatchs will not be expecting a transaction to be active, and will abort it once the SQLiteBatch is destructed. This is problematic as it will prevent some data from being written, and also cause the SQLiteBatch that opened the transaction in the first place to be in an unexpected state and throw an error.

To avoid this situation, we need to prevent the multiple batches from writing at the same time. To do so, I've implemented added a CSemaphore within SQLiteDatabase which will be used by any SQLiteBatch trying to do a write operation. wait() is called by TxnBegin, and at the beginning of WriteKey, EraseKey, and ErasePrefix. post() is called in TxnCommit, TxnAbort and at the end of WriteKey, EraseKey, and ErasePrefix. To avoid deadlocking on TxnBegin() followed by a WriteKey(), `SQLiteBatch will now also track whether a transaction is in progress so that it knows whether to use the semaphore.

This issue is not a problem for BDB wallets since BDB uses WAL and provides transaction objects that must be used if an operation is to occur within a transaction. Specifically, we either pass a transaction pointer, or a nullptr, to all BDB operations, and this allows for concurrent transactions so it doesn't have this problem.

Fixes #29110

DrahtBot · 2023-12-18T22:27:13Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	furszy, ryanofsky, josibake

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

maflcko · 2023-12-19T10:23:21Z

                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/wallet_descriptor.py", line 68, in test_concurrent_writes
                                       assert_equal(cache_records, 10000)
                                     File "/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 57, in assert_equal
                                       raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
                                   AssertionError: not(0 == 10000)

furszy

Found possible deadlock causes and crafted a test that exercises one of them: furszy@20344b9. Feel free to pull it here.

src/wallet/sqlite.cpp

furszy

For the sake of completeness and to avoid potential issues, have expanded the test coverage to exercise the previously mentioned "concurrent database transactions deadlock" behavior. All yours: furszy@a80cc56. It verifies that the database handler cannot commit/abort changes occurring in a different database handler.

Also, while was working on this, discovered another deadlocking cause.. test furszy@68e77b5. As it is a bdb-only issue, we could address it in another PR. But, it will mean that furszy@a80cc56 will need to be adapted to only run on sqlite (replace TestDatabases() for sqlite db creation call).

src/wallet/test/db_tests.cpp

achow101 · 2024-01-03T00:23:07Z

Also, while was working on this, discovered another deadlocking cause.. test furszy@68e77b5. As it is a bdb-only issue, we could address it in another PR. But, it will mean that furszy@a80cc56 will need to be adapted to only run on sqlite (replace TestDatabases() for sqlite db creation call).

Hmm, seems like BDB doesn't behave how I thought it did. It looks like we can't have concurrent transactions that try to operate on the same data, at least not without changing the isolation level, and I don't think it's worth the effort to figure that out when we're planning on removing BDB soon anyways. I've taken your test and updated it to just use SQLite.

I don't think this particular issue is a problem for us however. The fact that it deadlocks in this particular test case is probably a good thing since it means that BDB is handling locking internally. If there were two threads trying to read/write at the same time, BDB would force one thread to wait for the other, as this PR is doing for SQLite. That's the behavior that we want. It's only a deadlock (and test failure) since the test is trying to do the behavior of two threads within one thread.

src/wallet/test/db_tests.cpp

furszy

I don't think this particular issue is a problem for us however. The fact that it deadlocks in this particular test case is probably a good thing since it means that BDB is handling locking internally. If there were two threads trying to read/write at the same time, BDB would force one thread to wait for the other, as this PR is doing for SQLite. That's the behavior that we want. It's only a deadlock (and test failure) since the test is trying to do the behavior of two threads within one thread.

The issue I envision is nested WalletBatch within the same thread. We need to be careful on not introducing an inner WalletBatch creation when there is an existing WalletBatch on any upper level.

Still, agree on not doing the changes here. Isn't related. I should revive #25297 after we finish #28574.

furszy

Another possible deadlock can arise when the txn abortion fails during the destruction of the batch object.
Essentially, if the txn abortion fails and the object is destructed, the semaphore remains locked indefinitely, leading to a deadlock in any subsequent write operation.

Made a test replicating the behavior; cherry-pick 5d2b79a and fbd59f8 (from https://github.com/furszy/bitcoin-core/commits/sqlite-concurrent-writes/ )

achow101 · 2024-01-11T22:04:27Z

Another possible deadlock can arise when the txn abortion fails during the destruction of the batch object. Essentially, if the txn abortion fails and the object is destructed, the semaphore remains locked indefinitely, leading to a deadlock in any subsequent write operation.

Made a test replicating the behavior; cherry-pick 5d2b79a and fbd59f8 (from https://github.com/furszy/bitcoin-core/commits/sqlite-concurrent-writes/ )

Changed TxnAbort and TxnCommit to always release the lock. Also pulled the test, and modified it to test TxnCommit and explicit TxnAbort

furszy

Changed TxnAbort and TxnCommit to always release the lock.

This breaks the db txn isolation property.
The db txn is still active at the engine level, the lock shouldn't be released when the sqlite statement fails to execute (only when the abort/commit command executes successfully). Otherwise, other concurrent handlers are allowed to access db at the same time and, in the worst possible outcome, dump information to disk that should have been discarded.

Check this out 7117099. Made a test that exercises the behavior. Feel free to pull it.

Only the handler who originated the db txn should be able to commit and/or abort such txn, not other handlers. If other handlers were able to commit txns that they did not start, they could be saving incomplete information to disk and affecting other processes.

Aside from that, thanks for pulling all the tests commits. We are getting closer.

achow101 · 2024-01-15T16:04:44Z

The db txn is still active at the engine level, the lock shouldn't be released when the sqlite statement fails to execute (only when the abort/commit command executes successfully). Otherwise, other concurrent handlers are allowed to access db at the same time and, in the worst possible outcome, dump information to disk that should have been discarded.

If it fails to abort when the batch is destructed, then we're deadlocked. I don't see how this can be resolved without deadlocking and still preserving isolation.

furszy · 2024-01-15T23:11:52Z

If it fails to abort when the batch is destructed, then we're deadlocked. I don't see how this can be resolved without deadlocking and still preserving isolation.

Working on a solution. Will push it in few hours.

achow101 · 2024-01-16T20:15:16Z

Reverted this to b316a81 as #29253 deals with the problem of failing TxnAbort() and TxnCommit().

…ctions b298242 test: sqlite, add coverage for dangling to-be-reverted db txns (furszy) fc0e747 sqlite: guard against dangling to-be-reverted db transactions (furszy) 472d2ca sqlite: introduce HasActiveTxn method (furszy) dca874e sqlite: add ability to interrupt statements (furszy) fdf9f66 test: wallet db, exercise deadlock after write failure (furszy) Pull request description: Discovered while was reviewing #29112, specifically #29112 (review). If the db handler that initiated the database transaction is destroyed, the ongoing transaction cannot be left dangling when the db txn fails to abort. It must be forcefully reverted; otherwise, any subsequent db handler executing a write operation will dump the dangling, to-be-reverted transaction data to disk. This not only breaks the isolation property but also results in the improper storage of incomplete information on disk, impacting the wallet consistency. This PR fixes the issue by resetting the db connection, automatically rolling back the transaction (per https://www.sqlite.org/c3ref/close.html) when the handler object is being destroyed and the txn abortion failed. Testing Notes Can verify the failure by reverting the fix e5217fe and running the test. It will fail without e5217fe and pass with it. ACKs for top commit: achow101: ACK b298242 ryanofsky: Code review ACK b298242. Just fix for exec result codes and comment update since last review. Tree-SHA512: 44ba0323ab21440e79e9d7791bc1c56a8873c8bd3e8f6a85641b91576e1293011fa8032d8ae5b0580f3fb7a949356f7b9676693d7ceffa617aaad9f6569993eb

ryanofsky

Code review fe617be. Looks good but I think there is still a race in SQLiteBatch::Close and a semaphore leak in SQLiteBatch::TxnBegin. Still reviewing, mostly just looked at the first commit so far.

src/wallet/sqlite.h

src/wallet/sqlite.cpp

furszy

Code reviewed, looking good. Left a last comment.

Also, have slightly modified my previous unit test to validate the changes (ba333a2).
It is not really needed but feel free to take it if you like it (conceptually, it is similar to your test, the diff is that it is verifying the writes execution order and checks that when a batch finishes writing data to disk, another batch in a concurrent process can read the data immediately).

src/wallet/sqlite.cpp

josibake · 2024-02-06T17:02:39Z

Concept ACK

Started reviewing, looks good. Will continue review tomorrow. Might be worth updating the last part of the description with the insights from #29112 (comment) ?

A SQLiteBatch need to wait for any other batch to finish writing before it can begin writing, otherwise db txn state may be incorrectly modified. To enforce this, each SQLiteDatabase has a semaphore which acts as a lock and is acquired by a batch when it begins a write, erase, or a transaction, and is released by it when it is done. To avoid deadlocking on itself for writing during a transaction, SQLiteBatch also keeps track of whether it has begun a transaction.

There are occasions where a multi-statement tx is begun in one batch, and a second batch is created which does a normal write (without a multi-statement tx). These should not conflict with each other and all of the data should end up being written to disk.

Verifying that a database handler can't commit/abort changes occurring in a different database handler.

furszy

ACK cfcb9b1

ryanofsky

Code review ACK cfcb9b1. This looks great and I think it is ready for merge. Just holding off because josibake seemed ready to review #29112 (comment) and might have more feedback.

I did leave some comments but they are all minor and can be ignored. I was was also wondering about this in the PR description:

re: #29112 (comment)

To avoid this situation, we need to prevent the multiple batches from writing at the same time. Although SQLite does have facilities for doing that, they require the Write Ahead Log (WAL) mode, which is a feature that we explicitly chose to disable since it spread wallet data to multiple files. As such, we need to be handling the concurrency ourselves.

This doesn't seem accurate to me. As long as all the batch objects are sharing a sqlite3* m_db pointer and using the same connection to the database, a semaphore or mutex needs to be used to keep their transactions separate. This would be true whether the database is in WAL or journaling mode. Also if each batch object had it's own sqlite3* m_db connection, the semaphore should not be needed regardless of the database mode, IIUC.

ryanofsky · 2024-02-07T01:45:51Z

test/functional/wallet_descriptor.py

@@ -33,6 +36,41 @@ def skip_test_if_missing_module(self):
        self.skip_if_no_sqlite()
        self.skip_if_no_py_sqlite3()

+    def test_concurrent_writes(self):
+        self.log.info("Test sqlite concurrent writes are in the correct order")
+        self.restart_node(0, extra_args=["-unsafesqlitesync=0"])


In commit "tests: Test for concurrent writes with db tx" (548ecd1)

Could there be a comment to explain unsafesqlitesync? Is it just to make the test faster, or is it actually important to the functioning of the test?

The option's help text states Set SQLite synchronous=OFF to disable waiting for the database to sync to disk. This is unsafe and can cause data loss and corruption. This option is only used by tests to improve their performance (default: false)

ryanofsky · 2024-02-07T01:50:45Z

src/wallet/test/db_tests.cpp

+    bilingual_str error;
+    const auto& database = MakeSQLiteDatabase(m_path_root / "sqlite", options, status, error);
+
+    std::unique_ptr<DatabaseBatch> handler = Assert(database)->MakeBatch();


In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1)

Would suggest calling this "handler1" instead of "handler" because otherwise when the comments below refer to a "handler" is not clear if they are referring to this variable specifically or a handler generically.

Also s/handler/batch/g throughout this function would make it more meaningful and consistent with the rest of wallet code, IMO

ryanofsky · 2024-02-07T01:53:58Z

src/wallet/test/db_tests.cpp

+    BOOST_CHECK(handler2->Exists(key));
+
+    // Attempt to commit the handler txn calling the handler2 methods.
+    // Which, must not be possible.


In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1)

Would be good to clarify why it is not possible. Maybe "Which must not be possible because batch2->TxnBegin() was never called."

ryanofsky · 2024-02-07T02:07:12Z

src/wallet/test/db_tests.cpp

@@ -279,8 +279,48 @@ BOOST_AUTO_TEST_CASE(txn_close_failure_dangling_txn)
    BOOST_CHECK(!batch2->Exists(key));
 }

-#endif // USE_SQLITE
+BOOST_AUTO_TEST_CASE(concurrent_txn_dont_interfere)


In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1

Nothing about this test seems sqlite-specific. Seems like it would be nicer to call TestDatabases and be generic.

This test was originally generic and it was moved to sqlite-only because of a bdb bug, see #29112 (review).

ryanofsky · 2024-02-07T02:11:19Z

test/functional/wallet_descriptor.py

+            # ChainStateFlushed which will trigger a write to the db, hopefully at the
+            # same time that the topup still has an open db transaction.


In commit "tests: Test for concurrent writes with db tx" (548ecd1)

This seems fine, but you could probably increase the probability of this by calling keypoolrefill in a loop until gettxoutsetinfo returns instead of calling it once.

josibake · 2024-02-07T10:03:52Z

ACK cfcb9b1

Looks good, nothing to add. Happy to reACK if you end up taking ryanofsky's suggestions.

achow101 · 2024-02-07T16:24:38Z

To avoid this situation, we need to prevent the multiple batches from writing at the same time. Although SQLite does have facilities for doing that, they require the Write Ahead Log (WAL) mode, which is a feature that we explicitly chose to disable since it spread wallet data to multiple files. As such, we need to be handling the concurrency ourselves.

This doesn't seem accurate to me. As long as all the batch objects are sharing a sqlite3* m_db pointer and using the same connection to the database, a semaphore or mutex needs to be used to keep their transactions separate. This would be true whether the database is in WAL or journaling mode. Also if each batch object had it's own sqlite3* m_db connection, the semaphore should not be needed regardless of the database mode, IIUC.

Indeed, I misunderstood the documentation. Dropped the sentence.

Going to leave the nits for now.

A SQLiteBatch need to wait for any other batch to finish writing before it can begin writing, otherwise db txn state may be incorrectly modified. To enforce this, each SQLiteDatabase has a semaphore which acts as a lock and is acquired by a batch when it begins a write, erase, or a transaction, and is released by it when it is done. To avoid deadlocking on itself for writing during a transaction, SQLiteBatch also keeps track of whether it has begun a transaction. Github-Pull: bitcoin#29112 Rebased-From: 395bcd2

There are occasions where a multi-statement tx is begun in one batch, and a second batch is created which does a normal write (without a multi-statement tx). These should not conflict with each other and all of the data should end up being written to disk. Github-Pull: bitcoin#29112 Rebased-From: 548ecd1

Verifying that a database handler can't commit/abort changes occurring in a different database handler. Github-Pull: bitcoin#29112 Rebased-From: cfcb9b1

A SQLiteBatch need to wait for any other batch to finish writing before it can begin writing, otherwise db txn state may be incorrectly modified. To enforce this, each SQLiteDatabase has a semaphore which acts as a lock and is acquired by a batch when it begins a write, erase, or a transaction, and is released by it when it is done. To avoid deadlocking on itself for writing during a transaction, SQLiteBatch also keeps track of whether it has begun a transaction. Github-Pull: bitcoin#29112 Rebased-From: 395bcd2

There are occasions where a multi-statement tx is begun in one batch, and a second batch is created which does a normal write (without a multi-statement tx). These should not conflict with each other and all of the data should end up being written to disk. Github-Pull: bitcoin#29112 Rebased-From: 548ecd1

Verifying that a database handler can't commit/abort changes occurring in a different database handler. Github-Pull: bitcoin#29112 Rebased-From: cfcb9b1

achow101 added Needs backport (26.x) Wallet and removed Needs backport (26.x) labels Dec 18, 2023

achow101 force-pushed the sqlite-concurrent-writes branch from f79468e to a460b50 Compare December 18, 2023 22:36

DrahtBot added the CI failed label Dec 18, 2023

This was referenced Dec 19, 2023

wallet: reenable sethdseed for descriptor wallets #29054

Closed

wallet: Allow users to create a wallet that encrypts all database records #28142

Closed

achow101 force-pushed the sqlite-concurrent-writes branch 2 times, most recently from a5475a2 to 70f8870 Compare December 19, 2023 20:52

DrahtBot removed the CI failed label Dec 19, 2023

furszy reviewed Dec 20, 2023

View reviewed changes

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved

achow101 force-pushed the sqlite-concurrent-writes branch from 70f8870 to 689ed43 Compare December 21, 2023 00:26

DrahtBot mentioned this pull request Dec 21, 2023

Remove the legacy wallet and BDB dependency #28710

Draft

furszy reviewed Dec 23, 2023

View reviewed changes

src/wallet/test/db_tests.cpp Outdated Show resolved Hide resolved

achow101 force-pushed the sqlite-concurrent-writes branch from 689ed43 to f1b5b25 Compare January 3, 2024 00:21

furszy reviewed Jan 4, 2024

View reviewed changes

src/wallet/test/db_tests.cpp Outdated Show resolved Hide resolved

furszy reviewed Jan 4, 2024

View reviewed changes

achow101 force-pushed the sqlite-concurrent-writes branch from f1b5b25 to b316a81 Compare January 4, 2024 14:54

furszy reviewed Jan 8, 2024

View reviewed changes

furszy mentioned this pull request Jan 8, 2024

wallet: simplify and batch zap wallet txes process #28987

Merged

maflcko added this to the 27.0 milestone Jan 11, 2024

achow101 force-pushed the sqlite-concurrent-writes branch from b316a81 to 7b0fcea Compare January 11, 2024 22:04

furszy reviewed Jan 15, 2024

View reviewed changes

DrahtBot added the CI failed label Jan 15, 2024

achow101 force-pushed the sqlite-concurrent-writes branch from 7b0fcea to b316a81 Compare January 16, 2024 20:14

DrahtBot removed the CI failed label Jan 16, 2024

furszy mentioned this pull request Jan 31, 2024

rpc: Drop migratewallet experimental warning #28037

Merged

DrahtBot added the Needs rebase label Jan 31, 2024

achow101 force-pushed the sqlite-concurrent-writes branch from b316a81 to fe617be Compare February 1, 2024 02:07

DrahtBot removed the Needs rebase label Feb 1, 2024

ryanofsky reviewed Feb 1, 2024

View reviewed changes

achow101 force-pushed the sqlite-concurrent-writes branch from fe617be to bdf348f Compare February 1, 2024 19:12

furszy reviewed Feb 5, 2024

View reviewed changes

src/wallet/sqlite.cpp Outdated Show resolved Hide resolved

achow101 and others added 3 commits February 6, 2024 12:24

test: wallet, coverage for concurrent db transactions

cfcb9b1

Verifying that a database handler can't commit/abort changes occurring in a different database handler.

achow101 force-pushed the sqlite-concurrent-writes branch from bdf348f to cfcb9b1 Compare February 6, 2024 17:24

furszy reviewed Feb 6, 2024

View reviewed changes

DrahtBot requested a review from josibake February 6, 2024 21:58

ryanofsky approved these changes Feb 7, 2024

View reviewed changes

DrahtBot removed the request for review from josibake February 7, 2024 10:03

ryanofsky self-assigned this Feb 8, 2024

ryanofsky merged commit 801ef07 into bitcoin:master Feb 8, 2024
16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sqlite: Disallow writing from multiple `SQLiteBatch`s #29112

sqlite: Disallow writing from multiple `SQLiteBatch`s #29112

achow101 commented Dec 18, 2023 •

edited

DrahtBot commented Dec 18, 2023 •

edited

maflcko commented Dec 19, 2023

furszy left a comment •

edited

furszy left a comment

achow101 commented Jan 3, 2024 •

edited

furszy left a comment

furszy left a comment •

edited

achow101 commented Jan 11, 2024

furszy left a comment •

edited

achow101 commented Jan 15, 2024

furszy commented Jan 15, 2024

achow101 commented Jan 16, 2024

ryanofsky left a comment

furszy left a comment

josibake commented Feb 6, 2024

furszy left a comment •

edited

ryanofsky left a comment

ryanofsky Feb 7, 2024

achow101 Feb 7, 2024

ryanofsky Feb 7, 2024

ryanofsky Feb 7, 2024

ryanofsky Feb 7, 2024

furszy Feb 7, 2024

ryanofsky Feb 7, 2024

josibake commented Feb 7, 2024

achow101 commented Feb 7, 2024 •

edited

		# ChainStateFlushed which will trigger a write to the db, hopefully at the
		# same time that the topup still has an open db transaction.

sqlite: Disallow writing from multiple SQLiteBatchs #29112

sqlite: Disallow writing from multiple SQLiteBatchs #29112

Conversation

achow101 commented Dec 18, 2023 • edited

DrahtBot commented Dec 18, 2023 • edited

Code Coverage

Reviews

Conflicts

maflcko commented Dec 19, 2023

furszy left a comment • edited

Choose a reason for hiding this comment

furszy left a comment

Choose a reason for hiding this comment

achow101 commented Jan 3, 2024 • edited

furszy left a comment

Choose a reason for hiding this comment

furszy left a comment • edited

Choose a reason for hiding this comment

achow101 commented Jan 11, 2024

furszy left a comment • edited

Choose a reason for hiding this comment

achow101 commented Jan 15, 2024

furszy commented Jan 15, 2024

achow101 commented Jan 16, 2024

ryanofsky left a comment

Choose a reason for hiding this comment

furszy left a comment

Choose a reason for hiding this comment

josibake commented Feb 6, 2024

furszy left a comment • edited

Choose a reason for hiding this comment

ryanofsky left a comment

Choose a reason for hiding this comment

ryanofsky Feb 7, 2024

Choose a reason for hiding this comment

achow101 Feb 7, 2024

Choose a reason for hiding this comment

ryanofsky Feb 7, 2024

Choose a reason for hiding this comment

ryanofsky Feb 7, 2024

Choose a reason for hiding this comment

ryanofsky Feb 7, 2024

Choose a reason for hiding this comment

furszy Feb 7, 2024

Choose a reason for hiding this comment

ryanofsky Feb 7, 2024

Choose a reason for hiding this comment

josibake commented Feb 7, 2024

achow101 commented Feb 7, 2024 • edited

sqlite: Disallow writing from multiple `SQLiteBatch`s #29112

sqlite: Disallow writing from multiple `SQLiteBatch`s #29112

achow101 commented Dec 18, 2023 •

edited

DrahtBot commented Dec 18, 2023 •

edited

furszy left a comment •

edited

achow101 commented Jan 3, 2024 •

edited

furszy left a comment •

edited

furszy left a comment •

edited

furszy left a comment •

edited

achow101 commented Feb 7, 2024 •

edited