Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abort readers of prepared and then aborted transaction #8654

Closed
alyapunov opened this issue May 15, 2023 · 0 comments · Fixed by #8665
Closed

Abort readers of prepared and then aborted transaction #8654

alyapunov opened this issue May 15, 2023 · 0 comments · Fixed by #8665
Assignees
Labels
2.10 Target is 2.10 and all newer release/master branches bug Something isn't working mvcc

Comments

@alyapunov
Copy link
Contributor

alyapunov commented May 15, 2023

When a transaction is prepared (called commit and yielded, waiting for WAL) it is marked as 'prepared' (untill commit finishes and it becomes 'committed'). Other transactions are allowed to read changes of prepared transaction (depending of RO/RW state and isolation level though). The problem is that if prepared transaction is aborted (because of WAL error) all its readers must be aborted too. This behavior must be identical for vinyl and memtx engines.

Reproduce for insertion
box.cfg{memtx_use_mvcc_engine = true}
local s = box.schema.space.create('test')
s:create_index('pk')
s:create_index('sk', {parts = {{2, 'uint'}}})
local txn_proxy = require('txn_proxy')

local tx1 = txn_proxy.new()
local tx2 = txn_proxy.new()
tx1:begin()
tx2:begin()
tx1('s:replace{1, 1}')
tx2('s:replace{2, 2}')
box.error.injection.set('ERRINJ_WAL_WRITE', true)
local fib = fiber.create(tx1.commit, tx1)
fib:set_joinable(true)
-- select by any index must read {1, 1} and lead to the same behavior.
tx2('s:select{1}')
--tx2('s.index[0]:select{1}')
--tx2('s.index[1]:select{1}')
fib:join()
box.error.injection.set('ERRINJ_WAL_WRITE', false)
tx2:commit() -- must fail, but it does not.
Reproduce for deletion
box.cfg{memtx_use_mvcc_engine = true}
local s = box.schema.space.create('test')
s:create_index('pk')
s:create_index('sk', {parts = {{2, 'uint'}}})
local txn_proxy = require('txn_proxy')

s:replace{1, 1, 1}
local tx1 = txn_proxy.new()
local tx2 = txn_proxy.new()
tx1:begin()
tx2:begin()
tx1('s:delete{1}')
tx2('s:replace{2, 2}')
box.error.injection.set('ERRINJ_WAL_WRITE', true)
fib = fiber.create(tx1.commit, tx1)
fib:set_joinable(true)
-- select by any index must read {} and lead to the same behavior.
tx2('s:select{1}')
--tx2('s.index[0]:select{1}')
--tx2('s.index[1]:select{1}')
fib:join()
box.error.injection.set('ERRINJ_WAL_WRITE', false)
tx2:commit() -- must fail, but it does not.
@alyapunov alyapunov added bug Something isn't working mvcc labels May 15, 2023
@alyapunov alyapunov changed the title Abort readed of prepared and then aborted transaction Abort reades of prepared and then aborted transaction May 15, 2023
@alyapunov alyapunov changed the title Abort reades of prepared and then aborted transaction Abort readers of prepared and then aborted transaction May 16, 2023
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
That's strange, but in this test in a group of simple test cases
there are test cases that checks replaces, updates and deletes,
but occasionally there's no test case that checks inserts.

Fix it and add simple test cases for inserts.

No logical changes.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=new test case
NO_CHANGELOG=new test case
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
If addition of a tuple is rolled back while the corresponding
story is needed for something else (for example it stores a read
set of another transaction) - the story cannot be deleted.
Now there's a special flag `rollbacked` that is set to true
for such stories, and the flag must be considered in places
where history chains are scanned. That approach also requires
psn to be set for rolled-back transactions, which suprisingly
not as simple as it to say. All that makes the code complicated
and hard to maintain.

There's another approach for managing rolled back stories: simply
set their del_pos to a low enough value (lower than any existing
transaction's PSN) and (if necessary) push them to the end of
history chain. Such a story would be invisible to any transaction
due to already existing mechanisms, that's what is needed.

In order to provide "low enough" del_pls it will be natural to
assign real PSN starting from some predefined value, so any value
below that predefined value will be less that any existing PSN and
thus "low enough".

Implement this more simple approach.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
Conflict trackers are used to store information that if some
transaction is committed then some another transaction must be
aborted. This happens when the first transaction writes some
key while the other reads the same key. On the other hand there
are another trackers - read trackers - that are designed to
handle exactly the same situation. That's why conflict trackers
can be simply replaced with read trackers.

That would allow to remove conflict trackers as not needed
anymore.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
Remove runtime allocation error handling and use panic-on-fail
versions of allocation functions. Reasons for that:
* Memory error handling was never tested an probably doesn't work.
* Some return codes was ignored so the code had obvious flaws.
* Rollback in case of memory error made some code overcomplicated.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=no new functionality added
NO_TEST=no new functionality added
NO_CHANGELOG=no new functionalily added
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
Before this patch there were several different places in the code
that deal with referencing tuple in space, setting in_index member
and marking the story as retained or not. But logically all above
is about the same - about placing a story to the top of a chain,
i.e. the first story in version list to which index points.

This commit refactors these things a bit. This mostly relates to
two functions - memtx_tx_story_new and memtx_tx_story_link_top.

Changes in memtx_tx_story_new are based on the fact that if a story
is created by tuple, it is or immediately will be at the top of
chain. Considering this we can omit argument `is_referenced_to_pk`
and always create a story ready to be in top of chain. If a story
is already in the top - nothing esle is need; if it is to become
the top - memtx_tx_story_link_top must be called after.

Further, linking to top of chain is needed exactly in two cases:
* if a story just created by memtx_tx_story_new must become a top
* if a chain is reordered involving the top story (the top and the
  next stories are swapped)
These two cases are logically very close but still different.
Even more, previously there ware two functions for that:
memtx_tx_story_new_light and memtx_tx_story_new correspondingly.
This commit introduces one function for that (although with one
more argument) that also incapsulates activities about referencing
tuples and marking stories as retained.

After this patch the rules are logical and simple:
* if a tuple is inserted - call _story_new and _link_top.
* if a story of existring tuple is needed - call _story_new.
* if chain is reordered involving top story - call _link_top.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
There's a special 'point hole' mechanism in mvcc transactional
manager that manages point gap reads by full key when no raw
tuple was found in the index. For instance, it's the only way
to collect gap reads for non-tree indexes.

Once a new tuple is inserted to the index, the read records are
transfered to the normal read set in the corresponding story.
Actually after that the 'point hole' record in no more needed.

So let's remove it.

While we are here, improve name and comments.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
The only place where this static function is used is more general
static function - memtx_tx_track_read_story. This is a bit
confusing - usually slow variant stand for public inline 'fast'
method.

So merge both functions in one.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
In our SQL implementation temporary spaces are used. They come to
MVCC engine in two variants - NULL or ephemeral. In both cases
MVCC engine must not do anything, there are several checks for
that in different parts of code.

Normalize these checks and make them similar to each other.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
In futher commit this list will be used for tracking all read
gaps, not only 'nearby'. Since this rename has rather huge
diff, let's make it in separate commit.

No logical changes.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
Now read trackers are used both for cases when a transaction has
read an existing value and it has read nothing (read by key but
there was no visible tuple in this place). For latter case an
addition index_mask was used to idetify by which index was the
select. Along with that there was per-index interval gap trackers.

This patch divides area of resposibility between read trackers
and gap tracker in the following way:
* Reads of existing visible values are stored in read trackers.
* Reads of non-existing or non-visible values are store in gap
  trackers.

This new approach allows to provide new invariants: gap trackers
are stored only at top of chain, and read trackers are stored
only at topmost committed story in chain.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
The function memtx_tx_story_delete is expected to delete fully
unlinked stories and thus should not try to unlink something by
itself. So remove unlink and add asserts instead.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
It was an ugly solution when MVCC engine requires outside engine
to set this flag which is not convinient.

Remove it and use mode arguments to set up proper read trackers.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
The latter flag is a bit wider: it reveals not only inserting
statements after deleting by the same transaction, but also
replacing and deleting statements after all kinds of previois
changes but the same transaction. This extended behavoir will
be used in futher commits.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
Almost completely rewrite, simplify and comment this part of code.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
Almost completely rewrite, simplify and comment this part of code.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 16, 2023
There's rather seldom case when a transaction is rolled back from
prepared state. This happens when WAL fails, or perhaps in other
similar cases. By desing other RW transactions and transactions
with READ_COMMITTED isolation level are allowed to read prepared
state. All these transactions must be aborted in case of rollback
of prepared transaction since they definitely have read no more
possible database state.

This patch implements this abortion.

Closed tarantool#8654

NO_DOC=bugfix
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
If addition of a tuple is rolled back while the corresponding
story is needed for something else (for example it stores a read
set of another transaction) - the story cannot be deleted.
Now there's a special flag `rollbacked` that is set to true
for such stories, and the flag must be considered in places
where history chains are scanned. That approach also requires
psn to be set for rolled-back transactions, which surprisingly
not as simple as it to say. All that makes the code complicated
and hard to maintain.

There's another approach for managing rolled back stories: simply
set their del_pos to a low enough value (lower than any existing
transaction's PSN) and (if necessary) push them to the end of
history chain. Such a story would be invisible to any transaction
due to already existing mechanisms, that's what is needed.

In order to provide "low enough" del_pls it will be natural to
assign real PSN starting from some predefined value, so any value
below that predefined value will be less that any existing PSN and
thus "low enough".

Implement this more simple approach.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
Conflict trackers are used to store information that if some
transaction is committed then some another transaction must be
aborted. This happens when the first transaction writes some
key while the other reads the same key. On the other hand there
are another trackers - read trackers - that are designed to
handle exactly the same situation. That's why conflict trackers
can be simply replaced with read trackers.

That would allow to remove conflict trackers as not needed
anymore.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
Remove runtime allocation error handling and use panic-on-fail
versions of allocation functions. Reasons for that:
* Memory error handling was never tested an probably doesn't work.
* Some return codes was ignored so the code had obvious flaws.
* Rollback in case of memory error made some code overcomplicated.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=no new functionality added
NO_TEST=no new functionality added
NO_CHANGELOG=no new functionalily added
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
Before this patch there were several different places in the code
that deal with referencing tuple in space, setting in_index member
and marking the story as retained or not. But logically all above
is about the same - about placing a story to the top of a chain,
i.e. the first story in version list to which index points.

This commit refactors these things a bit. This mostly relates to
two functions - memtx_tx_story_new and memtx_tx_story_link_top.

Changes in memtx_tx_story_new are based on the fact that if a story
is created by tuple, it is or immediately will be at the top of
chain. Considering this we can omit argument `is_referenced_to_pk`
and always create a story ready to be in top of chain. If a story
is already in the top - nothing else is need; if it is to become
the top - memtx_tx_story_link_top must be called after.

Further, linking to top of chain is needed exactly in two cases:
* if a story just created by memtx_tx_story_new must become a top
* if a chain is reordered involving the top story (the top and the
  next stories are swapped)
These two cases are logically very close but still different.
Even more, previously there ware two functions for that:
memtx_tx_story_new_light and memtx_tx_story_new correspondingly.
This commit introduces one function for that (although with one
more argument) that also incapsulates activities about referencing
tuples and marking stories as retained.

After this patch the rules are logical and simple:
* if a tuple is inserted - call _story_new and _link_top.
* if a story of existring tuple is needed - call _story_new.
* if chain is reordered involving top story - call _link_top.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
There's a special 'point hole' mechanism in mvcc transactional
manager that manages point gap reads by full key when no raw
tuple was found in the index. For instance, it's the only way
to collect gap reads for non-tree indexes.

Once a new tuple is inserted to the index, the read records are
transferred to the normal read set in the corresponding story.
Actually after that the 'point hole' record in no more needed.

So let's remove it.

While we are here, improve name and comments.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
The only place where this static function is used is more general
static function - memtx_tx_track_read_story. This is a bit
confusing - usually slow variant stand for public inline 'fast'
method.

So merge both functions in one.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
In our SQL implementation temporary spaces are used. They come to
MVCC engine in two variants - NULL or ephemeral. In both cases
MVCC engine must not do anything, there are several checks for
that in different parts of code.

Normalize these checks and make them similar to each other.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
In further commit this list will be used for tracking all read
gaps, not only 'nearby'. Since this rename has rather huge
diff, let's make it in separate commit.

No logical changes.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue May 17, 2023
Now read trackers are used both for cases when a transaction has
read an existing value and it has read nothing (read by key but
there was no visible tuple in this place). For latter case an
addition index_mask was used to idetify by which index was the
select. Along with that there was per-index interval gap trackers.

This patch divides area of responsibility between read trackers
and gap tracker in the following way:
* Reads of existing visible values are stored in read trackers.
* Reads of non-existing or non-visible values are store in gap
  trackers.

This new approach allows to provide new invariants: gap trackers
are stored only at top of chain, and read trackers are stored
only at topmost committed story in chain.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
There's a special 'point hole' mechanism in mvcc transactional
manager that manages point gap reads by full key when no raw
tuple was found in the index. For instance, it's the only way
to collect gap reads for non-tree indexes.

Once a new tuple is inserted to the index, the read records are
transferred to the normal read set in the corresponding story.
Actually after that the 'point hole' record in no more needed.

So let's remove it.

While we are here, drop unused point_holes_size, improve names
and comments.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 62c6563)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
The only place where this static function is used is more general
static function - memtx_tx_track_read_story. This is a bit
confusing - usually slow variant stand for public inline 'fast'
method.

So merge both functions in one.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 32e41b7)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
In our SQL implementation temporary spaces are used. They come to
MVCC engine in two variants - NULL or ephemeral. In both cases
MVCC engine must not do anything, there are several checks for
that in different parts of code.

Normalize these checks and make them similar to each other.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 86a8155)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
In further commit this list will be used for tracking all read
gaps, not only 'nearby'. Since this rename has rather huge
diff, let's make it in separate commit.

No logical changes.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit d3feb69)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
Now read trackers are used both for cases when a transaction has
read an existing value and it has read nothing (read by key but
there was no visible tuple in this place). For latter case an
additional index_mask was used to identify from which index the
read was done. Along with that there was per-index interval gap
trackers.

This patch divides area of responsibility between read trackers
and gap tracker in the following way:
* Reads of existing visible values are stored in read trackers.
* Reads of non-existing or non-visible values are store in gap
  trackers.

This new approach allows to provide new invariants: gap trackers
are stored only at top of chain, and read trackers are stored
only at topmost committed story in chain.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 7b8b78b)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
Now there are three kinds of very close trackers:
* The transaction have read some tuple that is not committed and
  thus not visible. This kind is now stored as gap_item with
  is_nearby = false.
* The transaction made a select or range scan, reading a key or
  range between two adjacent tuples of the index. This kind is
  stored as gap_iteam with is_nearby = true.
* A transaction completed a full scan of unordered index. This
  kind is stored as full_scan_item.

All these trackers serve for the same thing: to record a fact
that a transaction read something but didn't see anything.

There are some problems with the current solution:
* gap_item with is_nearby = false has several unused members that
  just consume space.
* bool is_nearby flag for type descriptin is an ugly solution.
* full_scan_item is separated from logically close items.

This commit joins all these trackers under one base (that is
struct gap_item_base) and solves problems above.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit f8d97a2)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
The function memtx_tx_story_delete is expected to delete fully
unlinked stories and thus should not try to unlink something by
itself. So remove unlink and add asserts instead.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 0706740)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
It was an ugly solution when MVCC engine requires outside engine
to set this flag which is not convenient.

Remove it and use mode arguments to set up proper read trackers.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 7414973)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
The latter flag is a bit wider: it reveals not only inserting
statements after deleting by the same transaction, but also
replacing and deleting statements after all kinds of previois
changes but the same transaction. This extended behavior will
be used in further commits.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 3cfa675)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
Almost completely rewrite, simplify and comment this part of code.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 63da3be)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
Almost completely rewrite, simplify and comment this part of code.

Part of tarantool#8648
Part of tarantool#8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit e901507)
alyapunov added a commit to alyapunov/tarantool that referenced this issue Jun 23, 2023
There's case when a transaction is rolled back from prepared
state. This happens when WAL fails, synchronized replication
confirmation failure or perhaps in other similar cases. By design
other RW transactions and transactions with READ_COMMITTED
isolation level are allowed to read prepared state. All these
transactions must be aborted in case of rollback of prepared
transaction since they definitely have read no more possible
database state.

This patch implements this abortion.

Closed tarantool#8654

NO_DOC=bugfix

(cherry picked from commit 5498690)
alyapunov added a commit that referenced this issue Jun 23, 2023
That's strange, but in this test in a group of simple test cases
there are test cases that checks replaces, updates and deletes,
but occasionally there's no test case that checks inserts.

Fix it and add simple test cases for inserts.

No logical changes.

Part of #8648
Part of #8654

NO_DOC=new test case
NO_CHANGELOG=new test case

(cherry picked from commit 37b4561)
alyapunov added a commit that referenced this issue Jun 23, 2023
If addition of a tuple is rolled back while the corresponding
story is needed for something else (for example it stores a read
set of another transaction) - the story cannot be deleted.
Now there's a special flag `rollbacked` that is set to true
for such stories, and the flag must be considered in places
where history chains are scanned. That approach also requires
psn to be set for rolled-back transactions, which surprisingly
not as simple as it to say. All that makes the code complicated
and hard to maintain.

There's another approach for managing rolled back stories: simply
set their del_psn to a low enough value (lower than any existing
transaction's PSN) and (if necessary) push them to the end of
history chain. Such a story would be invisible to any transaction
due to already existing mechanisms, that's what is needed.

In order to provide "low enough" del_psn it will be natural to
assign real PSN starting from some predefined value, so any value
below that predefined value will be less that any existing PSN and
thus "low enough".

Implement this more simple approach.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit ba394a5)
alyapunov added a commit that referenced this issue Jun 23, 2023
Conflict trackers are used to store information that if some
transaction is committed then some another transaction must be
aborted. This happens when the first transaction writes some
key while the other reads the same key. On the other hand there
are another trackers - read trackers - that are designed to
handle exactly the same situation. That's why conflict trackers
can be simply replaced with read trackers.

That would allow to remove conflict trackers as not needed
anymore.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit a6c2b9f)
alyapunov added a commit that referenced this issue Jun 23, 2023
Remove runtime allocation error handling and use panic-on-fail
versions of allocation functions. Reasons for that:
* Memory error handling was never tested an probably doesn't work.
* Some return codes was ignored so the code had obvious flaws.
* Rollback in case of memory error made some code overcomplicated.

Part of #8648
Part of #8654

NO_DOC=no new functionality added
NO_TEST=no new functionality added
NO_CHANGELOG=no new functionalily added

(cherry picked from commit c951c9d)
alyapunov added a commit that referenced this issue Jun 23, 2023
Before this patch there were several different places in the code
that deal with referencing tuple in space, setting in_index member
and marking the story as retained or not. But logically all above
is about the same - about placing a story to the top of a chain,
i.e. the first story in version list to which index points.

This commit refactors these things a bit. This mostly relates to
two functions - memtx_tx_story_new and memtx_tx_story_link_top.

Changes in memtx_tx_story_new are based on the fact that if a story
is created by tuple, it is or immediately will be at the top of
chain. Considering this we can omit argument `is_referenced_to_pk`
and always create a story ready to be in top of chain. If a story
is already in the top - nothing else is needed; if it is to become
the top - memtx_tx_story_link_top must be called after.

Further, linking to top of chain is needed exactly in two cases:
* if a story just created by memtx_tx_story_new must become a top
* if a chain is reordered involving the top story (the top and the
  next stories are swapped)
These two cases are logically very close but still different.
Even more, previously there were two functions for that:
memtx_tx_story_link_top_light and memtx_tx_story_link_top
correspondingly. This commit introduces one function for that
(although with one more argument) that also incapsulates
activities about referencing tuples and marking stories as
retained.

After this patch the rules are logical and simple:
* if a tuple is inserted - call _story_new and _link_top(.. true).
* if a story of existing clean tuple is needed - call _story_new.
* if a chain is reordered involving top story - _link_top(.. false).

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 202340b)
alyapunov added a commit that referenced this issue Jun 23, 2023
There's a special 'point hole' mechanism in mvcc transactional
manager that manages point gap reads by full key when no raw
tuple was found in the index. For instance, it's the only way
to collect gap reads for non-tree indexes.

Once a new tuple is inserted to the index, the read records are
transferred to the normal read set in the corresponding story.
Actually after that the 'point hole' record in no more needed.

So let's remove it.

While we are here, drop unused point_holes_size, improve names
and comments.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 62c6563)
alyapunov added a commit that referenced this issue Jun 23, 2023
The only place where this static function is used is more general
static function - memtx_tx_track_read_story. This is a bit
confusing - usually slow variant stand for public inline 'fast'
method.

So merge both functions in one.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 32e41b7)
alyapunov added a commit that referenced this issue Jun 23, 2023
In our SQL implementation temporary spaces are used. They come to
MVCC engine in two variants - NULL or ephemeral. In both cases
MVCC engine must not do anything, there are several checks for
that in different parts of code.

Normalize these checks and make them similar to each other.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 86a8155)
alyapunov added a commit that referenced this issue Jun 23, 2023
In further commit this list will be used for tracking all read
gaps, not only 'nearby'. Since this rename has rather huge
diff, let's make it in separate commit.

No logical changes.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit d3feb69)
alyapunov added a commit that referenced this issue Jun 23, 2023
Now read trackers are used both for cases when a transaction has
read an existing value and it has read nothing (read by key but
there was no visible tuple in this place). For latter case an
additional index_mask was used to identify from which index the
read was done. Along with that there was per-index interval gap
trackers.

This patch divides area of responsibility between read trackers
and gap tracker in the following way:
* Reads of existing visible values are stored in read trackers.
* Reads of non-existing or non-visible values are store in gap
  trackers.

This new approach allows to provide new invariants: gap trackers
are stored only at top of chain, and read trackers are stored
only at topmost committed story in chain.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 7b8b78b)
alyapunov added a commit that referenced this issue Jun 23, 2023
Now there are three kinds of very close trackers:
* The transaction have read some tuple that is not committed and
  thus not visible. This kind is now stored as gap_item with
  is_nearby = false.
* The transaction made a select or range scan, reading a key or
  range between two adjacent tuples of the index. This kind is
  stored as gap_iteam with is_nearby = true.
* A transaction completed a full scan of unordered index. This
  kind is stored as full_scan_item.

All these trackers serve for the same thing: to record a fact
that a transaction read something but didn't see anything.

There are some problems with the current solution:
* gap_item with is_nearby = false has several unused members that
  just consume space.
* bool is_nearby flag for type descriptin is an ugly solution.
* full_scan_item is separated from logically close items.

This commit joins all these trackers under one base (that is
struct gap_item_base) and solves problems above.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit f8d97a2)
alyapunov added a commit that referenced this issue Jun 23, 2023
The function memtx_tx_story_delete is expected to delete fully
unlinked stories and thus should not try to unlink something by
itself. So remove unlink and add asserts instead.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 0706740)
alyapunov added a commit that referenced this issue Jun 23, 2023
It was an ugly solution when MVCC engine requires outside engine
to set this flag which is not convenient.

Remove it and use mode arguments to set up proper read trackers.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 7414973)
alyapunov added a commit that referenced this issue Jun 23, 2023
The latter flag is a bit wider: it reveals not only inserting
statements after deleting by the same transaction, but also
replacing and deleting statements after all kinds of previois
changes but the same transaction. This extended behavior will
be used in further commits.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 3cfa675)
alyapunov added a commit that referenced this issue Jun 23, 2023
Almost completely rewrite, simplify and comment this part of code.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 63da3be)
alyapunov added a commit that referenced this issue Jun 23, 2023
Almost completely rewrite, simplify and comment this part of code.

Part of #8648
Part of #8654

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit e901507)
alyapunov added a commit that referenced this issue Jun 23, 2023
There's case when a transaction is rolled back from prepared
state. This happens when WAL fails, synchronized replication
confirmation failure or perhaps in other similar cases. By design
other RW transactions and transactions with READ_COMMITTED
isolation level are allowed to read prepared state. All these
transactions must be aborted in case of rollback of prepared
transaction since they definitely have read no more possible
database state.

This patch implements this abortion.

Closed #8654

NO_DOC=bugfix

(cherry picked from commit 5498690)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.10 Target is 2.10 and all newer release/master branches bug Something isn't working mvcc
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants