Recovery: force recovery should try to recover mixed transactions from WAL #7932
Comments
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
Mar 30, 2023
See the docbot request for details.

Closes tarantool#7932

@TarantoolBot document
Title: correct recovery of mixed transactions

Transaction boundaries in the WAL were introduced in 2.1.2 (`872d6f1`), but the replication applier started to apply transactions atomically only in 2.2.1 (`e94ba2e`), so within this range transactions may be mixed in the WAL. This caused no problems until 2.8.1 because recovery still worked row by row, but in 2.8.1 (`9311113`) it became transactional as well, so starting from 2.8.1 Tarantool fails to handle mixed transactions from the WAL.

Mixed transactions are an unexpected scenario in modern Tarantool, so they should not be handled in normal mode, only in force recovery mode. However, the current force recovery treats all rows of an unexpected transaction as unexpected and thus skips the entire transaction row by row, although there is enough information in this case to handle it in a more data-friendly way.
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
Apr 27, 2023
See the docbot request for details.

Closes tarantool#7932

@TarantoolBot document
Title: correct recovery of mixed transactions

Transaction boundaries in the WAL were introduced in 2.1.2 (`872d6f1`), but the replication applier started to apply transactions atomically only in 2.2.1 (`e94ba2e`), so within this range transactions may be mixed in the WAL. This caused no problems until 2.8.1 because recovery still worked row by row, but in 2.8.1 (`9311113`) it became transactional as well, so starting from 2.8.1 Tarantool fails to handle mixed transactions from the WAL.

Mixed transactions are an unexpected scenario in modern Tarantool, so they should not be handled in normal mode, only in force recovery mode. However, the current force recovery treats all rows of an unexpected transaction as unexpected and thus skips the entire transaction row by row, although there is enough information in this case to handle it in a more data-friendly way.

Note: to restore replication, recover the node with the broken xlog by setting `force_recovery` to `true`, then rebootstrap the replica.
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
May 16, 2023
A new macro has been introduced because OOM errors are not handled in the code.

Needed for tarantool#7932

NO_DOC=internal
NO_TEST=internal
NO_CHANGELOG=internal
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
May 16, 2023
See the docbot request for details.

Closes tarantool#7932

@TarantoolBot document
Title: correct recovery of mixed transactions

Transaction boundaries in the WAL were introduced in 2.1.2 (`872d6f1`), but the replication applier started to apply transactions atomically only in 2.2.1 (`e94ba2e`), so within this range transactions may be mixed in the WAL. This caused no problems until 2.8.1 because recovery still worked row by row, but in 2.8.1 (`9311113`) it became transactional as well, so starting from 2.8.1 Tarantool fails to handle mixed transactions from the WAL.

Mixed transactions are an unexpected scenario in modern Tarantool, so they should not be handled in normal mode, only in force recovery mode. However, the current force recovery treats all rows of an unexpected transaction as unexpected and thus skips the entire transaction row by row, although there is enough information in this case to handle it in a more data-friendly way.

Note: Let there be two nodes (`node#1` and `node#2`), with data replicated from `node#1` to `node#2`. Suppose that at some point `node#1` is restoring data from an xlog containing mixed transactions. To replicate data from `node#1` to `node#2`, do the following:

1) Disable `node#2`
2) Start `node#1` with `force_recovery` set to `true`
3) Connect `node#2`
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
May 22, 2023
See the docbot request for details.

Closes tarantool#7932

@TarantoolBot document
Title: correct recovery of mixed transactions

Transaction boundaries in the WAL were introduced in 2.1.2 (`872d6f1`), but the replication applier started to apply transactions atomically only in 2.2.1 (`e94ba2e`), so within this range transactions may be mixed in the WAL. This caused no problems until 2.8.1 because recovery still worked row by row, but in 2.8.1 (`9311113`) it became transactional as well, so starting from 2.8.1 Tarantool fails to handle mixed transactions from the WAL.

Mixed transactions are an unexpected scenario in modern Tarantool, so they should not be handled in normal mode, only in force recovery mode. However, the current force recovery treats all rows of an unexpected transaction as unexpected and thus skips the entire transaction row by row, although there is enough information in this case to handle it in a more data-friendly way.

Note: Let there be two nodes (`node#1` and `node#2`), with data replicated from `node#1` to `node#2`. Suppose that at some point `node#1` is restoring data from an xlog containing mixed transactions. To replicate data from `node#1` to `node#2`, do the following:

1. Stop `node#2` and delete all xlog files from it
2. Restart `node#1` with `force_recovery` set to `true`
3. Make `node#2` rejoin `node#1` (tarantoolgh-7932).
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
May 22, 2023
See the docbot request for details.

Closes tarantool#7932

@TarantoolBot document
Title: correct recovery of mixed transactions

Transaction boundaries in the WAL were introduced in 2.1.2 (`872d6f1`), but the replication applier started to apply transactions atomically only in 2.2.1 (`e94ba2e`), so within this range transactions may be mixed in the WAL. This caused no problems until 2.8.1 because recovery still worked row by row, but in 2.8.1 (`9311113`) it became transactional as well, so starting from 2.8.1 Tarantool fails to handle mixed transactions from the WAL.

Mixed transactions are an unexpected scenario in modern Tarantool, so they should not be handled in normal mode, only in force recovery mode. However, the current force recovery treats all rows of an unexpected transaction as unexpected and thus skips the entire transaction row by row, although there is enough information in this case to handle it in a more data-friendly way.

Note: Let there be two nodes (`node#1` and `node#2`), with data replicated from `node#1` to `node#2`. Suppose that at some point `node#1` is restoring data from an xlog containing mixed transactions. To replicate data from `node#1` to `node#2`, do the following:

1. Stop `node#2` and delete all xlog files from it
2. Restart `node#1` with `force_recovery` set to `true`
3. Make `node#2` rejoin `node#1`.
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
May 31, 2023
See the docbot request for details.

Closes tarantool#7932

@TarantoolBot document
Title: correct recovery of mixed transactions

This patch implements correct recovery of mixed transactions. To enable it, set `box.cfg.force_recovery` to `true`. To keep the old behavior, don't set the `force_recovery` option.

What to do when one node feeds the other an xlog with mixed transactions?

Let there be two nodes (`node#1` and `node#2`), with data replicated from `node#1` to `node#2`. Suppose that at some point `node#1` is restoring data from an xlog containing mixed transactions. To replicate data from `node#1` to `node#2`, do the following:

1. Stop `node#2` and delete all xlog files from it
2. Restart `node#1` with `force_recovery` set to `true`
3. Make `node#2` rejoin `node#1`.
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
Jun 15, 2023
New macros have been introduced because OOM errors are not handled in the code.

Needed for tarantool#7932

NO_DOC=internal
NO_TEST=internal
NO_CHANGELOG=internal
sergepetrenko
pushed a commit
that referenced
this issue
Jun 19, 2023
New macros have been introduced because OOM errors are not handled in the code.

Needed for #7932

NO_DOC=internal
NO_TEST=internal
NO_CHANGELOG=internal
sergepetrenko
pushed a commit
that referenced
this issue
Jun 19, 2023
See the docbot request for details.

Closes #7932

@TarantoolBot document
Title: correct recovery of mixed transactions

This patch implements correct recovery of mixed transactions. To enable it, set `box.cfg.force_recovery` to `true`. To keep the old behavior, don't set the `force_recovery` option.

What to do when one node feeds the other an xlog with mixed transactions?

Let there be two nodes (`node#1` and `node#2`), with data replicated from `node#1` to `node#2`. Suppose that at some point `node#1` is restoring data from an xlog containing mixed transactions. To replicate data from `node#1` to `node#2`, do the following:

1. Stop `node#2` and delete all xlog files from it
2. Restart `node#1` with `force_recovery` set to `true`
3. Make `node#2` rejoin `node#1`.
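For illustration, step 2 of the procedure above might look like the following fragment of the `node#1` init script. This is only a sketch: `force_recovery` is the option named in the request, and any other options the instance normally needs (listen address, replication settings and so on) are assumed to stay as they already are.

```lua
-- node#1 init script (sketch): enable force recovery for this start so that
-- mixed transactions found in the xlog are recovered instead of failing.
box.cfg{
    force_recovery = true,
}
```

Once `node#1` has started, `node#2` can be started again and rejoined as described in step 3.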
sergepetrenko
added
2.11
Target is 2.11 and all newer release/master branches
and removed
2.10
Target is 2.10 and all newer release/master branches
labels
Jun 19, 2023
Decided with @sergos this shouldn't go to 2.10, because it's too big of a change.
sergepetrenko
pushed a commit
that referenced
this issue
Jun 19, 2023
See the docbot request for details.

Closes #7932

@TarantoolBot document
Title: correct recovery of mixed transactions

This patch implements correct recovery of mixed transactions. To enable it, set `box.cfg.force_recovery` to `true`. To keep the old behavior, don't set the `force_recovery` option.

What to do when one node feeds the other an xlog with mixed transactions?

Let there be two nodes (`node#1` and `node#2`), with data replicated from `node#1` to `node#2`. Suppose that at some point `node#1` is restoring data from an xlog containing mixed transactions. To replicate data from `node#1` to `node#2`, do the following:

1. Stop `node#2` and delete all xlog files from it
2. Restart `node#1` with `force_recovery` set to `true`
3. Make `node#2` rejoin `node#1`.

(cherry picked from commit 2b1c871)
yanshtunder
added a commit
to yanshtunder/tarantool
that referenced
this issue
Jul 24, 2023
Added a new parameter `is_sync` to `box.begin()`, `box.commit()` and `box.atomic()`. Setting `is_sync = true` makes the transaction synchronous. `box.commit()` cannot change `is_sync` to `false` if the transaction was opened as synchronous or has writes to synchro spaces. By default, `is_sync = false`.

Examples:

```Lua
-- Sync transactions
box.atomic({is_sync = true}, function() ... end)

box.begin({is_sync = true}) ... box.commit({is_sync = false})

box.begin({is_sync = false}) ... box.commit({is_sync = true})

box.begin({is_sync = true}) ... box.commit({is_sync = true})

-- Async transactions
box.atomic({is_sync = false}, function() ... end)

box.begin({is_sync = false}) ... box.commit({is_sync = false})

box.begin() ... box.commit()
```

Closes tarantool#7932

NO_DOC=internal
sergepetrenko
added a commit
to sergepetrenko/tarantool
that referenced
this issue
Aug 24, 2023
Force recovery first tries to collect all rows of a transaction into a single list, and only then applies those rows. The problem was that it collected rows based on the row's `replica_id`. For local rows `replica_id` is set to 0, but such rows can actually be part of a transaction coming from any instance. Fix recovery of such rows.

Follow-up tarantool#8746
Follow-up tarantool#7932

NO_DOC=bugfix
NO_CHANGELOG=the broken behaviour couldn't be seen due to bug tarantool#8746
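To make the grouping problem described in this commit message concrete, here is a small plain-Lua sketch. It is not Tarantool's recovery code: the row fields `txn_id`, `op` and `is_commit` and the helper `group_rows` are hypothetical names invented for the example, and keying by `txn_id` is only an assumed stand-in for whatever the actual fix uses.

```lua
-- Hypothetical row model, for illustration only.
-- Groups rows into lists keyed by the given field name.
local function group_rows(rows, key)
    local groups = {}
    for _, row in ipairs(rows) do
        local k = row[key]
        if groups[k] == nil then
            groups[k] = {}
        end
        table.insert(groups[k], row)
    end
    return groups
end

-- One transaction from instance 1 that also writes to a local space:
-- the local row carries replica_id = 0.
local rows = {
    {replica_id = 1, txn_id = 10, op = 'insert'},
    {replica_id = 0, txn_id = 10, op = 'local insert'},
    {replica_id = 1, txn_id = 10, op = 'replace', is_commit = true},
}

-- Keyed by replica_id, the transaction is split across two lists.
local by_replica = group_rows(rows, 'replica_id')
-- Keyed by a per-transaction identifier, all rows stay together.
local by_txn = group_rows(rows, 'txn_id')

print(#by_replica[1], #by_replica[0])
print(#by_txn[10])
```

The point of the sketch is only that a key derived from `replica_id` cannot keep a transaction's local rows together with the rest of its rows.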
sergepetrenko
added a commit
that referenced
this issue
Oct 9, 2023
Force recovery first tries to collect all rows of a transaction into a single list, and only then applies those rows. The problem was that it collected rows based on the row's `replica_id`. For local rows `replica_id` is set to 0, but such rows can actually be part of a transaction coming from any instance. Fix recovery of such rows.

Follow-up #8746
Follow-up #7932

NO_DOC=bugfix
NO_CHANGELOG=the broken behaviour couldn't be seen due to bug #8746
sergepetrenko
added a commit
that referenced
this issue
Oct 9, 2023
Force recovery first tries to collect all rows of a transaction into a single list, and only then applies those rows. The problem was that it collected rows based on the row's `replica_id`. For local rows `replica_id` is set to 0, but such rows can actually be part of a transaction coming from any instance. Fix recovery of such rows.

Follow-up #8746
Follow-up #7932

NO_DOC=bugfix
NO_CHANGELOG=the broken behaviour couldn't be seen due to bug #8746

(cherry picked from commit 85df1c9)
sergepetrenko
pushed a commit
to sergepetrenko/tarantool
that referenced
this issue
Oct 9, 2023
New macros have been introduced because OOM errors are not handled in the code.

Needed for tarantool#7932

NO_DOC=internal
NO_TEST=internal
NO_CHANGELOG=internal

(cherry picked from commit 5385714)
Transaction boundaries in the WAL were introduced in 2.1.2 (872d6f1), but the replication applier started to apply transactions atomically only in 2.2.1 (e94ba2e), so within this range transactions may be mixed in the WAL. This caused no problems until 2.8.1 because recovery still worked row by row, but in 2.8.1 (9311113) it became transactional as well, so starting from 2.8.1 Tarantool fails to handle mixed transactions from the WAL.
Recently this problem arose here
I find it difficult to provide 100% reproducer here, but at least you need:
Mixed transactions are obviously an unexpected scenario in modern Tarantool, so we should not handle them in normal mode, only in force recovery mode. But the current force recovery treats all rows of an unexpected transaction as unexpected and thus skips the entire transaction row by row. However, it seems we have enough information in this case to handle it in a more data-friendly way, so I'd suggest improving force recovery to avoid data loss, or at least try to avoid
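As a rough illustration of what "enough information" could mean here, the plain-Lua sketch below regroups interleaved rows into whole transactions instead of skipping them. It is a toy model, not the proposed implementation: the row fields `txn_id`, `op` and `is_commit` and the function `regroup_mixed` are hypothetical names chosen for the example.

```lua
-- Toy model: each row names its transaction and the last row of a
-- transaction is marked with is_commit. All names are hypothetical.
local function regroup_mixed(rows)
    local pending = {}   -- txn_id -> rows collected so far
    local complete = {}  -- finished transactions, in commit order
    for _, row in ipairs(rows) do
        local list = pending[row.txn_id]
        if list == nil then
            list = {}
            pending[row.txn_id] = list
        end
        table.insert(list, row)
        if row.is_commit then
            -- The transaction is whole now; it can be applied atomically.
            table.insert(complete, list)
            pending[row.txn_id] = nil
        end
    end
    -- Anything left in pending never saw its commit row.
    return complete, pending
end

-- Rows of two transactions interleaved in the log.
local wal = {
    {txn_id = 1, op = 'a1'},
    {txn_id = 2, op = 'b1'},
    {txn_id = 1, op = 'a2', is_commit = true},
    {txn_id = 2, op = 'b2', is_commit = true},
}

local complete, pending = regroup_mixed(wal)
for i, txn in ipairs(complete) do
    local ops = {}
    for _, row in ipairs(txn) do
        table.insert(ops, row.op)
    end
    print(i, table.concat(ops, ','))
end
```

In this model both transactions are recovered in full, whereas skipping every row of an "unexpected" transaction would lose all four rows.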