New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqlite: Disallow writing from multiple SQLiteBatch
s
#29112
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
f79468e
to
a460b50
Compare
|
a5475a2
to
70f8870
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found possible deadlock causes and crafted a test that exercises one of them: furszy@20344b9. Feel free to pull it here.
70f8870
to
689ed43
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of completeness and to avoid potential issues, have expanded the test coverage to exercise the previously mentioned "concurrent database transactions deadlock" behavior. All yours: furszy@a80cc56. It verifies that the database handler cannot commit/abort changes occurring in a different database handler.
Also, while was working on this, discovered another deadlocking cause.. test furszy@68e77b5. As it is a bdb-only issue, we could address it in another PR. But, it will mean that furszy@a80cc56 will need to be adapted to only run on sqlite (replace TestDatabases()
for sqlite db creation call).
689ed43
to
f1b5b25
Compare
Hmm, seems like BDB doesn't behave how I thought it did. It looks like we can't have concurrent transactions that try to operate on the same data, at least not without changing the isolation level, and I don't think it's worth the effort to figure that out when we're planning on removing BDB soon anyways. I've taken your test and updated it to just use SQLite. I don't think this particular issue is a problem for us however. The fact that it deadlocks in this particular test case is probably a good thing since it means that BDB is handling locking internally. If there were two threads trying to read/write at the same time, BDB would force one thread to wait for the other, as this PR is doing for SQLite. That's the behavior that we want. It's only a deadlock (and test failure) since the test is trying to do the behavior of two threads within one thread. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this particular issue is a problem for us however. The fact that it deadlocks in this particular test case is probably a good thing since it means that BDB is handling locking internally. If there were two threads trying to read/write at the same time, BDB would force one thread to wait for the other, as this PR is doing for SQLite. That's the behavior that we want. It's only a deadlock (and test failure) since the test is trying to do the behavior of two threads within one thread.
The issue I envision is nested WalletBatch
within the same thread. We need to be careful on not introducing an inner WalletBatch
creation when there is an existing WalletBatch
on any upper level.
Still, agree on not doing the changes here. Isn't related. I should revive #25297 after we finish #28574.
f1b5b25
to
b316a81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possible deadlock can arise when the txn abortion fails during the destruction of the batch object.
Essentially, if the txn abortion fails and the object is destructed, the semaphore remains locked indefinitely, leading to a deadlock in any subsequent write operation.
Made a test replicating the behavior; cherry-pick 5d2b79a and fbd59f8 (from https://github.com/furszy/bitcoin-core/commits/sqlite-concurrent-writes/ )
b316a81
to
7b0fcea
Compare
Changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed TxnAbort and TxnCommit to always release the lock.
This breaks the db txn isolation property.
The db txn is still active at the engine level, the lock shouldn't be released when the sqlite statement fails to execute (only when the abort/commit command executes successfully). Otherwise, other concurrent handlers are allowed to access db at the same time and, in the worst possible outcome, dump information to disk that should have been discarded.
Check this out 7117099. Made a test that exercises the behavior. Feel free to pull it.
Only the handler who originated the db txn should be able to commit and/or abort such txn, not other handlers. If other handlers were able to commit txns that they did not start, they could be saving incomplete information to disk and affecting other processes.
Aside from that, thanks for pulling all the tests commits. We are getting closer.
If it fails to abort when the batch is destructed, then we're deadlocked. I don't see how this can be resolved without deadlocking and still preserving isolation. |
Working on a solution. Will push it in few hours. |
7b0fcea
to
b316a81
Compare
…ctions b298242 test: sqlite, add coverage for dangling to-be-reverted db txns (furszy) fc0e747 sqlite: guard against dangling to-be-reverted db transactions (furszy) 472d2ca sqlite: introduce HasActiveTxn method (furszy) dca874e sqlite: add ability to interrupt statements (furszy) fdf9f66 test: wallet db, exercise deadlock after write failure (furszy) Pull request description: Discovered while was reviewing #29112, specifically #29112 (review). If the db handler that initiated the database transaction is destroyed, the ongoing transaction cannot be left dangling when the db txn fails to abort. It must be forcefully reverted; otherwise, any subsequent db handler executing a write operation will dump the dangling, to-be-reverted transaction data to disk. This not only breaks the isolation property but also results in the improper storage of incomplete information on disk, impacting the wallet consistency. This PR fixes the issue by resetting the db connection, automatically rolling back the transaction (per https://www.sqlite.org/c3ref/close.html) when the handler object is being destroyed and the txn abortion failed. Testing Notes Can verify the failure by reverting the fix e5217fe and running the test. It will fail without e5217fe and pass with it. ACKs for top commit: achow101: ACK b298242 ryanofsky: Code review ACK b298242. Just fix for exec result codes and comment update since last review. Tree-SHA512: 44ba0323ab21440e79e9d7791bc1c56a8873c8bd3e8f6a85641b91576e1293011fa8032d8ae5b0580f3fb7a949356f7b9676693d7ceffa617aaad9f6569993eb
b316a81
to
fe617be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review fe617be. Looks good but I think there is still a race in SQLiteBatch::Close and a semaphore leak in SQLiteBatch::TxnBegin. Still reviewing, mostly just looked at the first commit so far.
fe617be
to
bdf348f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code reviewed, looking good. Left a last comment.
Also, have slightly modified my previous unit test to validate the changes (ba333a2).
It is not really needed but feel free to take it if you like it (conceptually, it is similar to your test, the diff is that it is verifying the writes execution order and checks that when a batch finishes writing data to disk, another batch in a concurrent process can read the data immediately).
Concept ACK Started reviewing, looks good. Will continue review tomorrow. Might be worth updating the last part of the description with the insights from #29112 (comment) ? |
A SQLiteBatch need to wait for any other batch to finish writing before it can begin writing, otherwise db txn state may be incorrectly modified. To enforce this, each SQLiteDatabase has a semaphore which acts as a lock and is acquired by a batch when it begins a write, erase, or a transaction, and is released by it when it is done. To avoid deadlocking on itself for writing during a transaction, SQLiteBatch also keeps track of whether it has begun a transaction.
There are occasions where a multi-statement tx is begun in one batch, and a second batch is created which does a normal write (without a multi-statement tx). These should not conflict with each other and all of the data should end up being written to disk.
Verifying that a database handler can't commit/abort changes occurring in a different database handler.
bdf348f
to
cfcb9b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK cfcb9b1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK cfcb9b1. This looks great and I think it is ready for merge. Just holding off because josibake seemed ready to review #29112 (comment) and might have more feedback.
I did leave some comments but they are all minor and can be ignored. I was was also wondering about this in the PR description:
re: #29112 (comment)
To avoid this situation, we need to prevent the multiple batches from writing at the same time. Although SQLite does have facilities for doing that, they require the Write Ahead Log (WAL) mode, which is a feature that we explicitly chose to disable since it spread wallet data to multiple files. As such, we need to be handling the concurrency ourselves.
This doesn't seem accurate to me. As long as all the batch objects are sharing a sqlite3* m_db
pointer and using the same connection to the database, a semaphore or mutex needs to be used to keep their transactions separate. This would be true whether the database is in WAL or journaling mode. Also if each batch object had it's own sqlite3* m_db
connection, the semaphore should not be needed regardless of the database mode, IIUC.
@@ -33,6 +36,41 @@ def skip_test_if_missing_module(self): | |||
self.skip_if_no_sqlite() | |||
self.skip_if_no_py_sqlite3() | |||
|
|||
def test_concurrent_writes(self): | |||
self.log.info("Test sqlite concurrent writes are in the correct order") | |||
self.restart_node(0, extra_args=["-unsafesqlitesync=0"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "tests: Test for concurrent writes with db tx" (548ecd1)
Could there be a comment to explain unsafesqlitesync? Is it just to make the test faster, or is it actually important to the functioning of the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option's help text states Set SQLite synchronous=OFF to disable waiting for the database to sync to disk. This is unsafe and can cause data loss and corruption. This option is only used by tests to improve their performance (default: false)
bilingual_str error; | ||
const auto& database = MakeSQLiteDatabase(m_path_root / "sqlite", options, status, error); | ||
|
||
std::unique_ptr<DatabaseBatch> handler = Assert(database)->MakeBatch(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1)
Would suggest calling this "handler1" instead of "handler" because otherwise when the comments below refer to a "handler" is not clear if they are referring to this variable specifically or a handler generically.
Also s/handler/batch/g throughout this function would make it more meaningful and consistent with the rest of wallet code, IMO
BOOST_CHECK(handler2->Exists(key)); | ||
|
||
// Attempt to commit the handler txn calling the handler2 methods. | ||
// Which, must not be possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1)
Would be good to clarify why it is not possible. Maybe "Which must not be possible because batch2->TxnBegin() was never called."
@@ -279,8 +279,48 @@ BOOST_AUTO_TEST_CASE(txn_close_failure_dangling_txn) | |||
BOOST_CHECK(!batch2->Exists(key)); | |||
} | |||
|
|||
#endif // USE_SQLITE | |||
BOOST_AUTO_TEST_CASE(concurrent_txn_dont_interfere) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "test: wallet, coverage for concurrent db transactions" (cfcb9b1
Nothing about this test seems sqlite-specific. Seems like it would be nicer to call TestDatabases
and be generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was originally generic and it was moved to sqlite-only because of a bdb bug, see #29112 (review).
# ChainStateFlushed which will trigger a write to the db, hopefully at the | ||
# same time that the topup still has an open db transaction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "tests: Test for concurrent writes with db tx" (548ecd1)
This seems fine, but you could probably increase the probability of this by calling keypoolrefill in a loop until gettxoutsetinfo returns instead of calling it once.
ACK cfcb9b1 Looks good, nothing to add. Happy to reACK if you end up taking ryanofsky's suggestions. |
Indeed, I misunderstood the documentation. Dropped the sentence. Going to leave the nits for now. |
A SQLiteBatch need to wait for any other batch to finish writing before it can begin writing, otherwise db txn state may be incorrectly modified. To enforce this, each SQLiteDatabase has a semaphore which acts as a lock and is acquired by a batch when it begins a write, erase, or a transaction, and is released by it when it is done. To avoid deadlocking on itself for writing during a transaction, SQLiteBatch also keeps track of whether it has begun a transaction. Github-Pull: bitcoin#29112 Rebased-From: 395bcd2
There are occasions where a multi-statement tx is begun in one batch, and a second batch is created which does a normal write (without a multi-statement tx). These should not conflict with each other and all of the data should end up being written to disk. Github-Pull: bitcoin#29112 Rebased-From: 548ecd1
Verifying that a database handler can't commit/abort changes occurring in a different database handler. Github-Pull: bitcoin#29112 Rebased-From: cfcb9b1
A SQLiteBatch need to wait for any other batch to finish writing before it can begin writing, otherwise db txn state may be incorrectly modified. To enforce this, each SQLiteDatabase has a semaphore which acts as a lock and is acquired by a batch when it begins a write, erase, or a transaction, and is released by it when it is done. To avoid deadlocking on itself for writing during a transaction, SQLiteBatch also keeps track of whether it has begun a transaction. Github-Pull: bitcoin#29112 Rebased-From: 395bcd2
There are occasions where a multi-statement tx is begun in one batch, and a second batch is created which does a normal write (without a multi-statement tx). These should not conflict with each other and all of the data should end up being written to disk. Github-Pull: bitcoin#29112 Rebased-From: 548ecd1
Verifying that a database handler can't commit/abort changes occurring in a different database handler. Github-Pull: bitcoin#29112 Rebased-From: cfcb9b1
The way that we have configured SQLite to run means that only one database transaction can be open at a time. Typically, each individual read and write operation will be its own transaction that is opened and committed automatically by SQLite. However, sometimes we want these operations to be batched into a multi-statement transaction, so
SQLiteBatch::TxnBegin
,SQLiteBatch::TxnCommit
, andSQLiteBatch::TxnAbort
are used to manage the transaction of the database.However, once a db transaction is begun with one
SQLiteBatch
, any operations performed by anotherSQLiteBatch
will also occur within the same transaction. Furthermore, those otherSQLiteBatch
s will not be expecting a transaction to be active, and will abort it once theSQLiteBatch
is destructed. This is problematic as it will prevent some data from being written, and also cause theSQLiteBatch
that opened the transaction in the first place to be in an unexpected state and throw an error.To avoid this situation, we need to prevent the multiple batches from writing at the same time. To do so, I've implemented added a
CSemaphore
withinSQLiteDatabase
which will be used by anySQLiteBatch
trying to do a write operation.wait()
is called byTxnBegin
, and at the beginning ofWriteKey
,EraseKey
, andErasePrefix
.post()
is called inTxnCommit
,TxnAbort
and at the end ofWriteKey
,EraseKey
, andErasePrefix
. To avoid deadlocking onTxnBegin()
followed by aWriteKey()
, `SQLiteBatch will now also track whether a transaction is in progress so that it knows whether to use the semaphore.This issue is not a problem for BDB wallets since BDB uses WAL and provides transaction objects that must be used if an operation is to occur within a transaction. Specifically, we either pass a transaction pointer, or a nullptr, to all BDB operations, and this allows for concurrent transactions so it doesn't have this problem.
Fixes #29110