Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Only one related transaction should be allowed to exist when making a DDL statement #171

Closed
KKould opened this issue Mar 18, 2024 · 19 comments
Assignees
Labels
bug Something isn't working

Comments

@KKould
Copy link
Member

KKould commented Mar 18, 2024

Feature Request

When DDL exists in multiple transactions, it may cause the data modification to be incorrect but still able to be submitted.(Write conflict detection requires the key to be the same)

@KKould KKould added the bug Something isn't working label Mar 18, 2024
@crwen
Copy link
Member

crwen commented Mar 19, 2024

maybe some lock manager?

@KKould
Copy link
Member Author

KKould commented Mar 19, 2024

Yes, it is consistent with the table lock in MySQL. When a DDL operation occurs, no other DDL or DML is allowed.

Tips: I think DML and DDL are operations similar to RWLock, DML can be run by multiple DML transactions simultaneously, but DDL runs a single DDL transaction simultaneously in DML and DDL
Tips 2: If there are duplicates in DML, Storage will detect and throw an exception by itself, This is already implemented so you don't need to worry about this

This may be a relatively big job, so before starting this issue, could you please fix the nestloopjoin pr problem first?

@crwen
Copy link
Member

crwen commented Mar 19, 2024

could you please fix the nestloopjoin pr problem first?

I think the main problem is in the binder layer, or are you referring to the problem of explain?

@KKould
Copy link
Member Author

KKould commented Mar 19, 2024 via email

@KKould
Copy link
Member Author

KKould commented Mar 19, 2024 via email

@KKould KKould assigned KKould and unassigned KKould Mar 20, 2024
@crwen
Copy link
Member

crwen commented Mar 22, 2024

I have some problems in adding locks for methods of Transaction. Here is my understanding:

  1. read, read_by_index, append, delete, table, drop_data should add read lock
  2. add_column, drop_column, create_table should add write lock
  3. how about add_index, del_index and add_index_meta? I found both ddl(insert, update) and dml(create index, add column) use add_index. Maybe add_index_meta should add write lock, others should be read lock.
  4. should meta-related methods be locked? Which lock should be used?(table_metas, table_meta_path, save_table_meta, meta_load)

@KKould
Copy link
Member Author

KKould commented Mar 22, 2024

I tried to implement it today and discovered something. DDL is executed as a separate transaction in MySQL. If a transaction has not yet been submitted after executing DQL or DML, etc., executing DDL at this time will implicitly submit the transaction. And to do a separate transaction for the DDL (which means only using RWLock), my plan is to add a new attribute to the DataBase

pub struct Database<S: Storage> {
    pub storage: S,
    functions: Arc<Functions>,
    mdl: Arc<RwLock<()>>
}

I chose to learn MySQL and do it on the upper layer of the storage engine, and table_cache is based on your PR

@crwen
Copy link
Member

crwen commented Mar 22, 2024

If a transaction has not yet been submitted after executing DQL or DML, etc., executing DDL at this time will implicitly submit the transaction.

I think DDL will be blocked if a DQL or DML does not submit. (MySQL 8.3.0)

my plan is to add a new attribute to the DataBase

This will lock the whole database.

@KKould
Copy link
Member Author

KKould commented Mar 22, 2024

yes, I originally meant within the same transaction, but if it is multiple transactions, then it is what you said

@KKould KKould mentioned this issue Mar 22, 2024
9 tasks
@KKould
Copy link
Member Author

KKould commented Mar 22, 2024

This will lock the whole database.

yes, so it is guaranteed that no other operations are allowed to be performed when DDL is executed
In MySQL, mdl often produces deadlocks

@crwen
Copy link
Member

crwen commented Mar 22, 2024

In MySQL, mdl often produces deadlocks

As far as I know, mdl is table level lock, and it is 2PL that causes deadlocks.

@KKould
Copy link
Member Author

KKould commented Mar 22, 2024

In MySQL, mdl often produces deadlocks

As far as I know, mdl is table level lock, and it is 2PL that causes deadlocks.

https://stackoverflow.com/questions/76484118/why-mysql-mdlmeta-data-lock-led-to-a-deadlock

@KKould KKould self-assigned this Mar 22, 2024
@crwen
Copy link
Member

crwen commented Mar 22, 2024

It's kind of like a hierarchical lock and intention lock, but it is still a table level lock in most case, and multiple ddl on different table can execute concurrently.

session1                                        session2
select * from t;
                                               alter table t2 add column d int(ok)
                                               alter table t add column d int(block)

+-------------+--------------------+----------------+---------------------+
| OBJECT_TYPE | OBJECT_SCHEMA      | OBJECT_NAME    | LOCK_TYPE           |
+-------------+--------------------+----------------+---------------------+
| TABLE       | test               | t              | SHARED_READ         | 
| GLOBAL      | NULL               | NULL           | INTENTION_EXCLUSIVE | 
| BACKUP LOCK | NULL               | NULL           | INTENTION_EXCLUSIVE |
| SCHEMA      | test               | NULL           | INTENTION_EXCLUSIVE |
| TABLE       | test               | t              | SHARED_UPGRADABLE   |
| TABLESPACE  | NULL               | test/t         | INTENTION_EXCLUSIVE |
| TABLE       | test               | #sql-790d_b    | EXCLUSIVE           | 
| TABLE       | test               | t              | EXCLUSIVE           |
| TABLE       | performance_schema | metadata_locks | SHARED_READ         | 
+-------------+--------------------+----------------+---------------------+

I don't think implicitly submiting is a good thing, you can't even rollback(even when you kill it, it is crazy).

@KKould
Copy link
Member Author

KKould commented Mar 22, 2024

Yes, you are right, so I did not do implicit commit in pr, but refused to execute DDL in transaction

@KKould
Copy link
Member Author

KKould commented Mar 22, 2024

Oh, I made a mistake, I should have used table level locks

@KKould
Copy link
Member Author

KKould commented Mar 23, 2024

@crwen do you have any ideas for implementing table level locks? I don't have any good ideas now

@crwen
Copy link
Member

crwen commented Mar 25, 2024

@crwen do you have any ideas for implementing table level locks? I don't have any good ideas now

I have no idea. But tables are known only when binding or after, maybe lock can be added when executing or acessing the storage.

@KKould
Copy link
Member Author

KKould commented Mar 25, 2024

I think we can first make mdl a global lock, restrict DDL to global serialization, and downgrade it to table level lock as a perf issue

@crwen
Copy link
Member

crwen commented Mar 25, 2024

I think we can first make mdl a global lock, restrict DDL to global serialization, and downgrade it to table level lock as a perf issue

So maybe #177 could be merged now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants