database: Potential data race for mutual exclusion #9193

hinto-janai · 2024-02-21T19:35:31Z

What

There may be a data race when needing to acquire mutual exclusive access to monerod's database, e.g. when resizing.

Invariant

When resizing LMDB's memory map, the caller must ensure they have mutual exclusive access to it.

As per mdb_env_set_mapsize() docs:

This function [...] may be called at later times if no transactions are active in this process. Note that the library does not check for this condition, the caller must ensure it explicitly.

An error is returned if there are other write transactions and presumably UB will occur if read transaction(s) exist.

Implementation

The solution to this was implemented in #289, and is still used today. My understanding of this code is:

There are 2 atomic values used to achieve mutual exclusive access to the database:

monero/src/blockchain_db/lmdb/db_lmdb.cpp

Lines 354 to 355 in 059028a

    
           std::atomic<uint64_t> mdb_txn_safe::num_active_txns{0}; 
        
           std::atomic_flag mdb_txn_safe::creation_gate = ATOMIC_FLAG_INIT;

The atomic bool is used to indicate "do not enter, we are resizing". Before starting a transaction, this bool will be spinned on until it is false. It is set to true when LMDB is resizing:

monero/src/blockchain_db/lmdb/db_lmdb.cpp

Lines 448 to 451 in 059028a

    
           void mdb_txn_safe::prevent_new_txns() 
        
           { 
        
             while (creation_gate.test_and_set()); 
        
           }

When resizing, this prevents new transactions from starting.

To handle currently active transactions, each transaction will num_active_txns++ after successfully passing the atomic bool, and will num_active_txns-- when done. The thread resizing will spin until num_active_txns is 0, indicating there are no more transactions:

monero/src/blockchain_db/lmdb/db_lmdb.cpp

Lines 453 to 456 in 059028a

    
           void mdb_txn_safe::wait_no_active_txns() 
        
           { 
        
             while (num_active_txns > 0); 
        
           }

Now, with no new transactions allowed, and all current ones gone, we should have mutual exclusive access to the database and resizing should be OK.

Problem

2 atomic operations back-to-back are not atomic. Other threads are free to execute in-between 2 atomic operations. What may occur in the above implementation is such:

  Thread 1 (resizing)                             Thread 2 (normal read tx)
          |                                                 |
          |                                                 |
          |                                           lmdb_txn_begin()
          |                                     "not resizing, OK to start tx"
  prevent_new_txns()                                        |
"no new tx's can start"                                     |
          |                                                 |
 wait_no_active_txns()                                      |
  "spin until 0 tx's"                                       |
          |                                          num_active_txns++
          |                                                 |
 mdb_env_set_mapsize()   <--- ⚠️ unsafe ⚠️ --->   start_doing_tx_stuff()

There is space in-between Thread 2 succesfully entering a transaction and updating num_active_txns.

If Thread 2 is scheduled out by the OS after lmdb_txn_begin() but before num_active_txns++ (unlikely but non-zero chance) Thread 1 will incorrectly assume it has mutual exclusive access and start the resize.

monero/src/blockchain_db/lmdb/db_lmdb.cpp

Lines 606 to 608 in 059028a

    
           mdb_txn_safe::wait_no_active_txns(); 
        
           int result = mdb_env_set_mapsize(m_env, new_mapsize);

Some crashes occurring near a resize that could be explained by this:

The text was updated successfully, but these errors were encountered:

0xFFFC0000 · 2024-02-22T08:31:12Z

In case you use blockchain.cpp to interact with db_lmdb.cpp, 9181 does fix this. Since no read transaction can start while there is a write transaction going on, and write transactions cannot start until acquires an exclusive lock on db.

vtnerd · 2024-02-22T18:25:24Z

@hinto-janai I believe you mis-read the code:

while (creation_gate.test_and_set());
   num_active_txns++;
   creation_gate.clear();

Thread1 is blocked from advancing until creation_gate.clear() is called. This is basically just a spin mutex protecting num_active_txns.

The patch by @0xFFFC0000 is likely overkill, and not needed (that's just my knee-jerk reaction at this point).

selsta · 2024-02-22T18:29:12Z

@vtnerd this isn't related to this specific issue but sometimes when there is heavy RPC traffic it's possible that the node can't add new blocks and falls behind due to the current locking mechanism. moneromooo suggested a while ago that a rw lock would help here.

vtnerd · 2024-02-22T18:29:58Z

And to describe why the example doesn't work as @hinto-janai suggested - lmdb_txn_begin() should have creation_gate as true, so Thread1 is blocked in prevent_new_txs because the prior value is true until after num_active_txns++ is modified.

vtnerd · 2024-02-22T18:35:33Z

@vtnerd this isn't related to this specific issue but sometimes when there is heavy RPC traffic it's possible that the node can't add new blocks and falls behind due to the current locking mechanism. moneromooo suggested a while ago that a rw lock would help here.

Yes, this may have some resource starvation issues since it's a dirty spin lock instead of some queued lock. But there is no data race afaik.

I'm not certain that #9181 fixes this though, I think the original spin lock code would have to be removed as well.

0xFFFC0000 · 2024-02-22T18:40:14Z

The patch by @0xFFFC0000 is likely overkill, and not needed (that's just my knee-jerk reaction at this point).

I believe if you read the PR carefully, you will realize that PR is addressing another issue. As a side-effect, it does solve the data race problem on lmdb_db too if you are accessing lmdb via blockchain.cpp APIs.

If you want to go into more technical depth, I am happy to do it :)

vtnerd · 2024-02-22T18:42:18Z

I'm not certain that #9181 fixes this though, I think the original spin lock code would have to be removed as well.

Scratch that, #9181 probably works, but we just wasting cycles on atomic operations.

vtnerd · 2024-02-22T18:45:12Z

@0xFFFC0000 well there is no data race where @hinto-janai describes, so you are saying there is yet another one? I'm doubting this as well, but sure why not enlighten me.

0xFFFC0000 · 2024-02-22T18:56:24Z

@0xFFFC0000 well there is no data race where @hinto-janai describes, so you are saying there is yet another one? I'm doubting this as well, but sure why not enlighten me.

I did specifically talk about rw access from blockchain.cpp.

About the data race, directly from lmdb_db, as hinto explains in his comment, I have not tried to prove (or disprove) it either.

Keep in mind, if you are interacting with lmdb_db via blockchain.cpp, that data race cannot happen due to rwlock.

Though I am tracking you and @hinto-janai discussion, to understand the situation.

vtnerd · 2024-02-22T18:58:36Z

And sorry for being a bit rude, I'm just crabby that I have to review this DB change :/

0xFFFC0000 · 2024-02-22T19:03:07Z

And sorry for being a bit rude, I'm just crabby that I have to review this DB change :/

No no, I didn't even realize or notice anything rude from you :)

We were just discussing technical matters as always. Don't mention it at all. Looking forward to continuing our discussion. Because (possibility of) data race is a multi-level issue, we have to address it.

hyc · 2024-02-22T19:08:12Z

@vtnerd is correct in his first reply #9193 (comment). Because of the first creation_gate.test_and_set() when mdb_txn_safe begins, prevent_new_txns() cannot proceed until the creation_gate.clear(). There is no race there.

vtnerd · 2024-02-22T19:12:00Z

@0xFFFC0000 is this about the potential starvation issue that @selsta mentioned or some other data race? I still don't know why #9181 was created (perhaps the discussion should move to that PR). It sounds like #9181 was not created due to the specific issue referenced by @hinto-janai in this issue.

0xFFFC0000 · 2024-02-22T19:14:46Z

It sounds like #9181 was not created due to the specific issue referenced by @hinto-janai in this issue.

Exactly, the 9181 is for a different issue. But a side-effect of that locking is, as I mentioned in my first comment, if there was a data race in lmdb_db, it would've not happened if you were accessing it via Blockchain.cpp APIs, since it does have its own locking mechanism.

I am available to discuss it on the PR page.

hinto-janai · 2024-02-22T22:37:26Z

@vtnerd Ah you are correct. I was looking at lmdb_resized() where that constructor doesn't run so the 2 atomics being checked appear like an ABA problem. Although, every transaction (in db_lmdb.cpp at least) seems to be guarded by mdb_txn_safe at some point in the call stack.

There's BlockchainLMDB::do_resize() which doesn't use mdb_txn_safe but seems like that's guarded by a lock.

selsta added the database label Feb 22, 2024

hinto-janai closed this as not planned Won't fix, can't repro, duplicate, stale Feb 22, 2024

Rucknium mentioned this issue Feb 27, 2024

Monero Research Lab Meeting - Wed 21 February 2024, 17:00 UTC monero-project/meta#970

Closed

0xFFFC0000 mentioned this issue Feb 29, 2024

cryptonote_core: introducing rwlock mechanism #9181

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

database: Potential data race for mutual exclusion #9193

database: Potential data race for mutual exclusion #9193

hinto-janai commented Feb 21, 2024

0xFFFC0000 commented Feb 22, 2024 •

edited

vtnerd commented Feb 22, 2024

selsta commented Feb 22, 2024

vtnerd commented Feb 22, 2024 •

edited

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024

vtnerd commented Feb 22, 2024

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024 •

edited

hyc commented Feb 22, 2024 •

edited

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024 •

edited

hinto-janai commented Feb 22, 2024

database: Potential data race for mutual exclusion #9193

database: Potential data race for mutual exclusion #9193

Comments

hinto-janai commented Feb 21, 2024

What

Invariant

Implementation

Problem

0xFFFC0000 commented Feb 22, 2024 • edited

vtnerd commented Feb 22, 2024

selsta commented Feb 22, 2024

vtnerd commented Feb 22, 2024 • edited

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024

vtnerd commented Feb 22, 2024

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024 • edited

hyc commented Feb 22, 2024 • edited

vtnerd commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024 • edited

hinto-janai commented Feb 22, 2024

0xFFFC0000 commented Feb 22, 2024 •

edited

vtnerd commented Feb 22, 2024 •

edited

0xFFFC0000 commented Feb 22, 2024 •

edited

hyc commented Feb 22, 2024 •

edited

0xFFFC0000 commented Feb 22, 2024 •

edited