Abort readers of prepared and then aborted transaction #8654

alyapunov · 2023-05-15T04:51:55Z

When a transaction is prepared (called commit and yielded, waiting for WAL) it is marked as 'prepared' (untill commit finishes and it becomes 'committed'). Other transactions are allowed to read changes of prepared transaction (depending of RO/RW state and isolation level though). The problem is that if prepared transaction is aborted (because of WAL error) all its readers must be aborted too. This behavior must be identical for vinyl and memtx engines.

Reproduce for insertion

box.cfg{memtx_use_mvcc_engine = true}
local s = box.schema.space.create('test')
s:create_index('pk')
s:create_index('sk', {parts = {{2, 'uint'}}})
local txn_proxy = require('txn_proxy')

local tx1 = txn_proxy.new()
local tx2 = txn_proxy.new()
tx1:begin()
tx2:begin()
tx1('s:replace{1, 1}')
tx2('s:replace{2, 2}')
box.error.injection.set('ERRINJ_WAL_WRITE', true)
local fib = fiber.create(tx1.commit, tx1)
fib:set_joinable(true)
-- select by any index must read {1, 1} and lead to the same behavior.
tx2('s:select{1}')
--tx2('s.index[0]:select{1}')
--tx2('s.index[1]:select{1}')
fib:join()
box.error.injection.set('ERRINJ_WAL_WRITE', false)
tx2:commit() -- must fail, but it does not.

Reproduce for deletion

box.cfg{memtx_use_mvcc_engine = true}
local s = box.schema.space.create('test')
s:create_index('pk')
s:create_index('sk', {parts = {{2, 'uint'}}})
local txn_proxy = require('txn_proxy')

s:replace{1, 1, 1}
local tx1 = txn_proxy.new()
local tx2 = txn_proxy.new()
tx1:begin()
tx2:begin()
tx1('s:delete{1}')
tx2('s:replace{2, 2}')
box.error.injection.set('ERRINJ_WAL_WRITE', true)
fib = fiber.create(tx1.commit, tx1)
fib:set_joinable(true)
-- select by any index must read {} and lead to the same behavior.
tx2('s:select{1}')
--tx2('s.index[0]:select{1}')
--tx2('s.index[1]:select{1}')
fib:join()
box.error.injection.set('ERRINJ_WAL_WRITE', false)
tx2:commit() -- must fail, but it does not.

The text was updated successfully, but these errors were encountered:

That's strange, but in this test in a group of simple test cases there are test cases that checks replaces, updates and deletes, but occasionally there's no test case that checks inserts. Fix it and add simple test cases for inserts. No logical changes. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=new test case NO_CHANGELOG=new test case

If addition of a tuple is rolled back while the corresponding story is needed for something else (for example it stores a read set of another transaction) - the story cannot be deleted. Now there's a special flag `rollbacked` that is set to true for such stories, and the flag must be considered in places where history chains are scanned. That approach also requires psn to be set for rolled-back transactions, which suprisingly not as simple as it to say. All that makes the code complicated and hard to maintain. There's another approach for managing rolled back stories: simply set their del_pos to a low enough value (lower than any existing transaction's PSN) and (if necessary) push them to the end of history chain. Such a story would be invisible to any transaction due to already existing mechanisms, that's what is needed. In order to provide "low enough" del_pls it will be natural to assign real PSN starting from some predefined value, so any value below that predefined value will be less that any existing PSN and thus "low enough". Implement this more simple approach. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Conflict trackers are used to store information that if some transaction is committed then some another transaction must be aborted. This happens when the first transaction writes some key while the other reads the same key. On the other hand there are another trackers - read trackers - that are designed to handle exactly the same situation. That's why conflict trackers can be simply replaced with read trackers. That would allow to remove conflict trackers as not needed anymore. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Remove runtime allocation error handling and use panic-on-fail versions of allocation functions. Reasons for that: * Memory error handling was never tested an probably doesn't work. * Some return codes was ignored so the code had obvious flaws. * Rollback in case of memory error made some code overcomplicated. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=no new functionality added NO_TEST=no new functionality added NO_CHANGELOG=no new functionalily added

Before this patch there were several different places in the code that deal with referencing tuple in space, setting in_index member and marking the story as retained or not. But logically all above is about the same - about placing a story to the top of a chain, i.e. the first story in version list to which index points. This commit refactors these things a bit. This mostly relates to two functions - memtx_tx_story_new and memtx_tx_story_link_top. Changes in memtx_tx_story_new are based on the fact that if a story is created by tuple, it is or immediately will be at the top of chain. Considering this we can omit argument `is_referenced_to_pk` and always create a story ready to be in top of chain. If a story is already in the top - nothing esle is need; if it is to become the top - memtx_tx_story_link_top must be called after. Further, linking to top of chain is needed exactly in two cases: * if a story just created by memtx_tx_story_new must become a top * if a chain is reordered involving the top story (the top and the next stories are swapped) These two cases are logically very close but still different. Even more, previously there ware two functions for that: memtx_tx_story_new_light and memtx_tx_story_new correspondingly. This commit introduces one function for that (although with one more argument) that also incapsulates activities about referencing tuples and marking stories as retained. After this patch the rules are logical and simple: * if a tuple is inserted - call _story_new and _link_top. * if a story of existring tuple is needed - call _story_new. * if chain is reordered involving top story - call _link_top. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

There's a special 'point hole' mechanism in mvcc transactional manager that manages point gap reads by full key when no raw tuple was found in the index. For instance, it's the only way to collect gap reads for non-tree indexes. Once a new tuple is inserted to the index, the read records are transfered to the normal read set in the corresponding story. Actually after that the 'point hole' record in no more needed. So let's remove it. While we are here, improve name and comments. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

The only place where this static function is used is more general static function - memtx_tx_track_read_story. This is a bit confusing - usually slow variant stand for public inline 'fast' method. So merge both functions in one. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

In our SQL implementation temporary spaces are used. They come to MVCC engine in two variants - NULL or ephemeral. In both cases MVCC engine must not do anything, there are several checks for that in different parts of code. Normalize these checks and make them similar to each other. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

In futher commit this list will be used for tracking all read gaps, not only 'nearby'. Since this rename has rather huge diff, let's make it in separate commit. No logical changes. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Now read trackers are used both for cases when a transaction has read an existing value and it has read nothing (read by key but there was no visible tuple in this place). For latter case an addition index_mask was used to idetify by which index was the select. Along with that there was per-index interval gap trackers. This patch divides area of resposibility between read trackers and gap tracker in the following way: * Reads of existing visible values are stored in read trackers. * Reads of non-existing or non-visible values are store in gap trackers. This new approach allows to provide new invariants: gap trackers are stored only at top of chain, and read trackers are stored only at topmost committed story in chain. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

The function memtx_tx_story_delete is expected to delete fully unlinked stories and thus should not try to unlink something by itself. So remove unlink and add asserts instead. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

It was an ugly solution when MVCC engine requires outside engine to set this flag which is not convinient. Remove it and use mode arguments to set up proper read trackers. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

The latter flag is a bit wider: it reveals not only inserting statements after deleting by the same transaction, but also replacing and deleting statements after all kinds of previois changes but the same transaction. This extended behavoir will be used in futher commits. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Almost completely rewrite, simplify and comment this part of code. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

There's rather seldom case when a transaction is rolled back from prepared state. This happens when WAL fails, or perhaps in other similar cases. By desing other RW transactions and transactions with READ_COMMITTED isolation level are allowed to read prepared state. All these transactions must be aborted in case of rollback of prepared transaction since they definitely have read no more possible database state. This patch implements this abortion. Closed tarantool#8654 NO_DOC=bugfix

If addition of a tuple is rolled back while the corresponding story is needed for something else (for example it stores a read set of another transaction) - the story cannot be deleted. Now there's a special flag `rollbacked` that is set to true for such stories, and the flag must be considered in places where history chains are scanned. That approach also requires psn to be set for rolled-back transactions, which surprisingly not as simple as it to say. All that makes the code complicated and hard to maintain. There's another approach for managing rolled back stories: simply set their del_pos to a low enough value (lower than any existing transaction's PSN) and (if necessary) push them to the end of history chain. Such a story would be invisible to any transaction due to already existing mechanisms, that's what is needed. In order to provide "low enough" del_pls it will be natural to assign real PSN starting from some predefined value, so any value below that predefined value will be less that any existing PSN and thus "low enough". Implement this more simple approach. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Conflict trackers are used to store information that if some transaction is committed then some another transaction must be aborted. This happens when the first transaction writes some key while the other reads the same key. On the other hand there are another trackers - read trackers - that are designed to handle exactly the same situation. That's why conflict trackers can be simply replaced with read trackers. That would allow to remove conflict trackers as not needed anymore. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Remove runtime allocation error handling and use panic-on-fail versions of allocation functions. Reasons for that: * Memory error handling was never tested an probably doesn't work. * Some return codes was ignored so the code had obvious flaws. * Rollback in case of memory error made some code overcomplicated. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=no new functionality added NO_TEST=no new functionality added NO_CHANGELOG=no new functionalily added

Before this patch there were several different places in the code that deal with referencing tuple in space, setting in_index member and marking the story as retained or not. But logically all above is about the same - about placing a story to the top of a chain, i.e. the first story in version list to which index points. This commit refactors these things a bit. This mostly relates to two functions - memtx_tx_story_new and memtx_tx_story_link_top. Changes in memtx_tx_story_new are based on the fact that if a story is created by tuple, it is or immediately will be at the top of chain. Considering this we can omit argument `is_referenced_to_pk` and always create a story ready to be in top of chain. If a story is already in the top - nothing else is need; if it is to become the top - memtx_tx_story_link_top must be called after. Further, linking to top of chain is needed exactly in two cases: * if a story just created by memtx_tx_story_new must become a top * if a chain is reordered involving the top story (the top and the next stories are swapped) These two cases are logically very close but still different. Even more, previously there ware two functions for that: memtx_tx_story_new_light and memtx_tx_story_new correspondingly. This commit introduces one function for that (although with one more argument) that also incapsulates activities about referencing tuples and marking stories as retained. After this patch the rules are logical and simple: * if a tuple is inserted - call _story_new and _link_top. * if a story of existring tuple is needed - call _story_new. * if chain is reordered involving top story - call _link_top. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

There's a special 'point hole' mechanism in mvcc transactional manager that manages point gap reads by full key when no raw tuple was found in the index. For instance, it's the only way to collect gap reads for non-tree indexes. Once a new tuple is inserted to the index, the read records are transferred to the normal read set in the corresponding story. Actually after that the 'point hole' record in no more needed. So let's remove it. While we are here, improve name and comments. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

The only place where this static function is used is more general static function - memtx_tx_track_read_story. This is a bit confusing - usually slow variant stand for public inline 'fast' method. So merge both functions in one. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

In our SQL implementation temporary spaces are used. They come to MVCC engine in two variants - NULL or ephemeral. In both cases MVCC engine must not do anything, there are several checks for that in different parts of code. Normalize these checks and make them similar to each other. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

In further commit this list will be used for tracking all read gaps, not only 'nearby'. Since this rename has rather huge diff, let's make it in separate commit. No logical changes. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Now read trackers are used both for cases when a transaction has read an existing value and it has read nothing (read by key but there was no visible tuple in this place). For latter case an addition index_mask was used to idetify by which index was the select. Along with that there was per-index interval gap trackers. This patch divides area of responsibility between read trackers and gap tracker in the following way: * Reads of existing visible values are stored in read trackers. * Reads of non-existing or non-visible values are store in gap trackers. This new approach allows to provide new invariants: gap trackers are stored only at top of chain, and read trackers are stored only at topmost committed story in chain. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

There's a special 'point hole' mechanism in mvcc transactional manager that manages point gap reads by full key when no raw tuple was found in the index. For instance, it's the only way to collect gap reads for non-tree indexes. Once a new tuple is inserted to the index, the read records are transferred to the normal read set in the corresponding story. Actually after that the 'point hole' record in no more needed. So let's remove it. While we are here, drop unused point_holes_size, improve names and comments. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 62c6563)

The only place where this static function is used is more general static function - memtx_tx_track_read_story. This is a bit confusing - usually slow variant stand for public inline 'fast' method. So merge both functions in one. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 32e41b7)

In our SQL implementation temporary spaces are used. They come to MVCC engine in two variants - NULL or ephemeral. In both cases MVCC engine must not do anything, there are several checks for that in different parts of code. Normalize these checks and make them similar to each other. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 86a8155)

In further commit this list will be used for tracking all read gaps, not only 'nearby'. Since this rename has rather huge diff, let's make it in separate commit. No logical changes. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit d3feb69)

Now read trackers are used both for cases when a transaction has read an existing value and it has read nothing (read by key but there was no visible tuple in this place). For latter case an additional index_mask was used to identify from which index the read was done. Along with that there was per-index interval gap trackers. This patch divides area of responsibility between read trackers and gap tracker in the following way: * Reads of existing visible values are stored in read trackers. * Reads of non-existing or non-visible values are store in gap trackers. This new approach allows to provide new invariants: gap trackers are stored only at top of chain, and read trackers are stored only at topmost committed story in chain. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 7b8b78b)

Now there are three kinds of very close trackers: * The transaction have read some tuple that is not committed and thus not visible. This kind is now stored as gap_item with is_nearby = false. * The transaction made a select or range scan, reading a key or range between two adjacent tuples of the index. This kind is stored as gap_iteam with is_nearby = true. * A transaction completed a full scan of unordered index. This kind is stored as full_scan_item. All these trackers serve for the same thing: to record a fact that a transaction read something but didn't see anything. There are some problems with the current solution: * gap_item with is_nearby = false has several unused members that just consume space. * bool is_nearby flag for type descriptin is an ugly solution. * full_scan_item is separated from logically close items. This commit joins all these trackers under one base (that is struct gap_item_base) and solves problems above. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit f8d97a2)

The function memtx_tx_story_delete is expected to delete fully unlinked stories and thus should not try to unlink something by itself. So remove unlink and add asserts instead. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 0706740)

It was an ugly solution when MVCC engine requires outside engine to set this flag which is not convenient. Remove it and use mode arguments to set up proper read trackers. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 7414973)

The latter flag is a bit wider: it reveals not only inserting statements after deleting by the same transaction, but also replacing and deleting statements after all kinds of previois changes but the same transaction. This extended behavior will be used in further commits. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 3cfa675)

Almost completely rewrite, simplify and comment this part of code. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 63da3be)

Almost completely rewrite, simplify and comment this part of code. Part of tarantool#8648 Part of tarantool#8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit e901507)

There's case when a transaction is rolled back from prepared state. This happens when WAL fails, synchronized replication confirmation failure or perhaps in other similar cases. By design other RW transactions and transactions with READ_COMMITTED isolation level are allowed to read prepared state. All these transactions must be aborted in case of rollback of prepared transaction since they definitely have read no more possible database state. This patch implements this abortion. Closed tarantool#8654 NO_DOC=bugfix (cherry picked from commit 5498690)

That's strange, but in this test in a group of simple test cases there are test cases that checks replaces, updates and deletes, but occasionally there's no test case that checks inserts. Fix it and add simple test cases for inserts. No logical changes. Part of #8648 Part of #8654 NO_DOC=new test case NO_CHANGELOG=new test case (cherry picked from commit 37b4561)

If addition of a tuple is rolled back while the corresponding story is needed for something else (for example it stores a read set of another transaction) - the story cannot be deleted. Now there's a special flag `rollbacked` that is set to true for such stories, and the flag must be considered in places where history chains are scanned. That approach also requires psn to be set for rolled-back transactions, which surprisingly not as simple as it to say. All that makes the code complicated and hard to maintain. There's another approach for managing rolled back stories: simply set their del_psn to a low enough value (lower than any existing transaction's PSN) and (if necessary) push them to the end of history chain. Such a story would be invisible to any transaction due to already existing mechanisms, that's what is needed. In order to provide "low enough" del_psn it will be natural to assign real PSN starting from some predefined value, so any value below that predefined value will be less that any existing PSN and thus "low enough". Implement this more simple approach. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit ba394a5)

Conflict trackers are used to store information that if some transaction is committed then some another transaction must be aborted. This happens when the first transaction writes some key while the other reads the same key. On the other hand there are another trackers - read trackers - that are designed to handle exactly the same situation. That's why conflict trackers can be simply replaced with read trackers. That would allow to remove conflict trackers as not needed anymore. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit a6c2b9f)

Remove runtime allocation error handling and use panic-on-fail versions of allocation functions. Reasons for that: * Memory error handling was never tested an probably doesn't work. * Some return codes was ignored so the code had obvious flaws. * Rollback in case of memory error made some code overcomplicated. Part of #8648 Part of #8654 NO_DOC=no new functionality added NO_TEST=no new functionality added NO_CHANGELOG=no new functionalily added (cherry picked from commit c951c9d)

Before this patch there were several different places in the code that deal with referencing tuple in space, setting in_index member and marking the story as retained or not. But logically all above is about the same - about placing a story to the top of a chain, i.e. the first story in version list to which index points. This commit refactors these things a bit. This mostly relates to two functions - memtx_tx_story_new and memtx_tx_story_link_top. Changes in memtx_tx_story_new are based on the fact that if a story is created by tuple, it is or immediately will be at the top of chain. Considering this we can omit argument `is_referenced_to_pk` and always create a story ready to be in top of chain. If a story is already in the top - nothing else is needed; if it is to become the top - memtx_tx_story_link_top must be called after. Further, linking to top of chain is needed exactly in two cases: * if a story just created by memtx_tx_story_new must become a top * if a chain is reordered involving the top story (the top and the next stories are swapped) These two cases are logically very close but still different. Even more, previously there were two functions for that: memtx_tx_story_link_top_light and memtx_tx_story_link_top correspondingly. This commit introduces one function for that (although with one more argument) that also incapsulates activities about referencing tuples and marking stories as retained. After this patch the rules are logical and simple: * if a tuple is inserted - call _story_new and _link_top(.. true). * if a story of existing clean tuple is needed - call _story_new. * if a chain is reordered involving top story - _link_top(.. false). Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 202340b)

There's a special 'point hole' mechanism in mvcc transactional manager that manages point gap reads by full key when no raw tuple was found in the index. For instance, it's the only way to collect gap reads for non-tree indexes. Once a new tuple is inserted to the index, the read records are transferred to the normal read set in the corresponding story. Actually after that the 'point hole' record in no more needed. So let's remove it. While we are here, drop unused point_holes_size, improve names and comments. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 62c6563)

The only place where this static function is used is more general static function - memtx_tx_track_read_story. This is a bit confusing - usually slow variant stand for public inline 'fast' method. So merge both functions in one. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 32e41b7)

In our SQL implementation temporary spaces are used. They come to MVCC engine in two variants - NULL or ephemeral. In both cases MVCC engine must not do anything, there are several checks for that in different parts of code. Normalize these checks and make them similar to each other. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 86a8155)

In further commit this list will be used for tracking all read gaps, not only 'nearby'. Since this rename has rather huge diff, let's make it in separate commit. No logical changes. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit d3feb69)

Now read trackers are used both for cases when a transaction has read an existing value and it has read nothing (read by key but there was no visible tuple in this place). For latter case an additional index_mask was used to identify from which index the read was done. Along with that there was per-index interval gap trackers. This patch divides area of responsibility between read trackers and gap tracker in the following way: * Reads of existing visible values are stored in read trackers. * Reads of non-existing or non-visible values are store in gap trackers. This new approach allows to provide new invariants: gap trackers are stored only at top of chain, and read trackers are stored only at topmost committed story in chain. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 7b8b78b)

Now there are three kinds of very close trackers: * The transaction have read some tuple that is not committed and thus not visible. This kind is now stored as gap_item with is_nearby = false. * The transaction made a select or range scan, reading a key or range between two adjacent tuples of the index. This kind is stored as gap_iteam with is_nearby = true. * A transaction completed a full scan of unordered index. This kind is stored as full_scan_item. All these trackers serve for the same thing: to record a fact that a transaction read something but didn't see anything. There are some problems with the current solution: * gap_item with is_nearby = false has several unused members that just consume space. * bool is_nearby flag for type descriptin is an ugly solution. * full_scan_item is separated from logically close items. This commit joins all these trackers under one base (that is struct gap_item_base) and solves problems above. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit f8d97a2)

The function memtx_tx_story_delete is expected to delete fully unlinked stories and thus should not try to unlink something by itself. So remove unlink and add asserts instead. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 0706740)

It was an ugly solution when MVCC engine requires outside engine to set this flag which is not convenient. Remove it and use mode arguments to set up proper read trackers. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 7414973)

The latter flag is a bit wider: it reveals not only inserting statements after deleting by the same transaction, but also replacing and deleting statements after all kinds of previois changes but the same transaction. This extended behavior will be used in further commits. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 3cfa675)

Almost completely rewrite, simplify and comment this part of code. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 63da3be)

Almost completely rewrite, simplify and comment this part of code. Part of #8648 Part of #8654 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit e901507)

There's case when a transaction is rolled back from prepared state. This happens when WAL fails, synchronized replication confirmation failure or perhaps in other similar cases. By design other RW transactions and transactions with READ_COMMITTED isolation level are allowed to read prepared state. All these transactions must be aborted in case of rollback of prepared transaction since they definitely have read no more possible database state. This patch implements this abortion. Closed #8654 NO_DOC=bugfix (cherry picked from commit 5498690)

alyapunov added bug Something isn't working mvcc labels May 15, 2023

alyapunov changed the title ~~Abort readed of prepared and then aborted transaction~~ Abort reades of prepared and then aborted transaction May 15, 2023

alyapunov mentioned this issue May 16, 2023

memtx: mvcc abort readers of prepared #8665

Merged

alyapunov changed the title ~~Abort reades of prepared and then aborted transaction~~ Abort readers of prepared and then aborted transaction May 16, 2023

alyapunov mentioned this issue Jun 23, 2023

Gh 8654 memtx mvcc abort readers of prepared 2.10 #8806

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Abort readers of prepared and then aborted transaction #8654

Abort readers of prepared and then aborted transaction #8654

alyapunov commented May 15, 2023 •

edited

Abort readers of prepared and then aborted transaction #8654

Abort readers of prepared and then aborted transaction #8654

Comments

alyapunov commented May 15, 2023 • edited

alyapunov commented May 15, 2023 •

edited