MySQLOnRocksDB/mysql-5.6
forked from facebook/mysql-5.6

Loading…
SELECT ... LOCK IN SHARE MODE doesn't work #44
So,
SELECT ... LOCK IN SHARE MODE should place read locks on rows it returns.
(a read lock is compatible with another read lock. write lock is not compatible
with either read or a write lock)
Tasks to do:
- Lock_table and Row_lock should support read locks
- ha_rocksdb should recognize LOCK IN SHARE MODE and request read locks.
Looking at how to change LockTable/Row_locks to support read-write locks.
MySQL has rwlock.h, but in include/atomic/rwlock.h one can find
#define my_atomic_rwlock_init(name) pthread_mutex_init(& (name)->rw, 0)
that is, these rw locks are actually mutexes.
On using pthread_rwlock_*
pthread_rwlock_t provides RW-locks.
The API provides functions pthread_rwlock_timedwrlock() and
pthread_rwlock_timedrdlock() which wait for the lock and support timeouts.
One problem is that there is no way for other thread to abort the wait.
Current exclusive-locking code uses {thd_enter_cond(); mysql_cond_timedwait()}
which allows KILL command to abort the wait by signaling the condition.
Another issue is that a user of pthread_rwlock* will need to keep track
of what locks he already has. Man page for pthread_rwlock_timedrdlock() says:
The calling thread may deadlock if at the time the call is made it holds
a write lock on rwlock.
This can be fixed by keeping tracks of the write locks that are acquired by
the current transaction and simply not trying to acquire a read lock if we
already have a write lock on the same row.
Man page for pthread_rwlock_timedwrlock() says:
The calling thread may deadlock if at the time the call is made it holds
the read-write lock.
That is, there seems to be no way to "upgrade" a read lock into a write lock.
On lock recursivity.
The user can issue
SELECT * FROM t1 WHERE pk=1 FOR UPDATE;
SELECT * FROM t1 WHERE pk=1 LOCK IN SHARE MODE;
in arbitrary order. (Current) exclusive locks allow multiple get_lock() calls which must be followed by multiple release_lock() calls.
Row_lock::owner_data and Row_lock::busy store information about how many times the owner has acquired the lock.
The situation gets more complex when we have multiple readers who may have acquired read locks multiple times, as well as the writer who may also hold read locks.
Possible ways out:
1. Support recursive locks.
2. Demand that the callers don't acquire locks they already have.
== Recursive locks ==
Row_lock should remember who has locked it (and how many times).
Currently Row_lock stores info about the thread that holds the write lock, and how many times it has acquired it. We will also need a list of read lock owners + counts.
(Q: is it ok to have a list or array or any other structure?)
Advantages:
- Lock subsystem will have simple API.
Disadvantages:
- Locking Internals will be heavy and complex. If there are a lot of read locks, we will be inefficient.
(however, transaction will do its lock lookups while not holding the mutex. as opposed to scanning the read lock holders which will be done while holding row lock's mutex)
== Non-recursive locks ==
Before a transaction attempts to get a lock, it should check it against the locks it has.
Currently a transaction stores locks it has acquired in an array. The array may have duplicates.
We will need to switch to std::unordered_map (or something similar), and we will also need to keep a list of locks acquired by the current statement (if the statement fails, we will need to release those)
Advantages:
- Lock_table implementation is simpler.
Disadvantages:
- Lock_table API will be complex.
- Transactions will need to maintain hash tables of their locks, which is an extra expense ( number-of-locks-transaction-holds is typically greater than number-of-read-lock-holders-for-a-certain-row).
Now, implementing a recursive r/w locks with capability to upgrade read lock + write lock. The implementation has a limit of how many read locks can be held for a given row. This limitation can be lifted in the future.
Implementation: https://reviews.facebook.net/D38265
Limitations:
- There is max. limit on the number of read locks one can have
- There is one
mysql_cond_tcondition, and everyone who wishes to acquire a read or a write lock wait for it. Then, mysql_cond_broadcast() is used to wake up all of the waiters. This may cause unneeded wake-ups. - The question of read lockers starving a write locker is left to OS thread scheduling.
|
|
spetrunia |
Issue #44: SELECT ... LOCK IN SHARE MODE
…
Summary: Add support for read and read write locks in RocksDB's LockTable. The implementation is more concerned with correctness than with concurrency. Test Plan: Added unit test, enabled MTR tests Reviewers: maykov, jtolmer, yoshinorim, hermanlee4 Reviewed By: hermanlee4 Differential Revision: https://reviews.facebook.net/D38265 |
f270f10
|
|
|
spetrunia |
Issue #44: SELECT ... LOCK IN SHARE MODE
…
Summary: Add support for read and read write locks in RocksDB's LockTable. The implementation is more concerned with correctness than with concurrency. Test Plan: Added unit test, enabled MTR tests Reviewers: maykov, jtolmer, yoshinorim, hermanlee4 Reviewed By: hermanlee4 Differential Revision: https://reviews.facebook.net/D38265 |
cb52d29
|
create table r1 (id1 int, id2 int, value int, primary key (id1, id2)) engine=rocksdb;
insert into r1 values (1,1,1), (2,2,2), (3,3,3), (4,4,4), (5,5,5);
T1:
begin;
update r1 set value=value+1 where id1=2 and id2=2;
T2:
begin;
select * from r1 where id1=2 and id2=2 lock in share mode;
=> not blocked