Sync transaction rolls back async and local ones #7903

Closed
rosik opened this issue Nov 7, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@rosik
Contributor

rosik commented Nov 7, 2022

Tarantool version

In general, all. I've tested Tarantool 2.11.0-entrypoint-671-gdec0e0221.

Bug description

Losing replication_synchro_quorum causes pending transactions to be rolled back, including async and local ones.

Steps to reproduce

_G.log  = require('log')
_G.json = require('json')
_G.fiber = require('fiber')

box.cfg({
    election_mode = "manual",
    memtx_use_mvcc_engine = true,
    replication_synchro_quorum = 2,
    replication_synchro_timeout = 3,
    listen = 3301,
})

box.schema.space.create('test_normal'):create_index('pk')
box.schema.space.create('test_local', {is_local = true}):create_index('pk')
box.schema.space.create('test_temp', {temporary = true}):create_index('pk')
box.schema.space.create('test_sync', {is_sync = true}):create_index('pk')

local txn_isolation = "read-confirmed"

function f(sp, v)
    fiber.name(sp.name)

    log.info("box.begin(%s)", txn_isolation)
    box.begin({txn_isolation = txn_isolation})

    if v then
        log.info("box.space.%s:replace() ...", sp.name)
        sp:replace(v)
    end
    -- log.info(
    --     "box.space.test_sync:select() == %s",
    --     json.encode(box.space.test_sync:select(nil, {limit = 100}))
    -- )

    log.info("committing ...", sp.name)
    box.on_commit(function() log.info("committed", sp.name) end)
    box.on_rollback(function() log.info("rolled back", sp.name) end)
    box.commit()
end

-- require('console').start()

fiber.new(f, box.space.test_sync, {1})
fiber.sleep(0.1)

fiber.new(f, box.space.test_temp, {2})
fiber.new(f, box.space.test_local, {3})
fiber.new(f, box.space.test_normal, {4})

Actual behavior

2022-11-03 20:32:52.576 [3363655] main/113/test_sync I> box.begin(read-confirmed)
2022-11-03 20:32:52.576 [3363655] main/113/test_sync I> box.space.test_sync:replace() ...
2022-11-03 20:32:52.576 [3363655] main/113/test_sync I> committing ...

2022-11-03 20:32:52.677 [3363655] main/114/test_temp I> box.begin(read-confirmed)
2022-11-03 20:32:52.678 [3363655] main/114/test_temp I> box.space.test_temp:replace() ...
2022-11-03 20:32:52.678 [3363655] main/114/test_temp I> committing ...
2022-11-03 20:32:52.678 [3363655] main/114/test_temp I> committed

2022-11-03 20:32:52.678 [3363655] main/115/test_local I> box.begin(read-confirmed)
2022-11-03 20:32:52.678 [3363655] main/115/test_local I> box.space.test_local:replace() ...
2022-11-03 20:32:52.678 [3363655] main/115/test_local I> committing ...

2022-11-03 20:32:52.678 [3363655] main/116/test_normal I> box.begin(read-confirmed)
2022-11-03 20:32:52.678 [3363655] main/116/test_normal I> box.space.test_normal:replace() ...
2022-11-03 20:32:52.678 [3363655] main/116/test_normal I> committing ...

2022-11-03 20:32:52.678 [3363655] main I> entering the event loop

2022-11-03 20:32:55.578 [3363655] main/113/test_sync I> rolled back
2022-11-03 20:32:55.578 [3363655] main/113/test_sync I> rolled back
2022-11-03 20:32:55.578 [3363655] main/113/test_sync I> rolled back

2022-11-03 20:32:55.578 [3363655] main/113/test_sync txn_limbo.c:295 E> ER_SYNC_QUORUM_TIMEOUT: Quorum collection for a synchronous transaction is timed out
2022-11-03 20:32:55.578 [3363655] main/116/test_normal txn_limbo.c:316 E> ER_SYNC_ROLLBACK: A rollback for a synchronous transaction is received
2022-11-03 20:32:55.578 [3363655] main/115/test_local txn_limbo.c:316 E> ER_SYNC_ROLLBACK: A rollback for a synchronous transaction is received

Expected behavior

I believe neither local nor async transactions should be affected.
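
For anyone who wants to tell the two failures in the log apart from application code, here is a minimal sketch. It assumes the value raised by box.commit() is a box.error object with a numeric `code` field, and that the constants box.error.SYNC_QUORUM_TIMEOUT and box.error.SYNC_ROLLBACK correspond to the ER_SYNC_QUORUM_TIMEOUT and ER_SYNC_ROLLBACK errors shown above. The helper name commit_and_classify is hypothetical and is not part of Tarantool.

```lua
-- Hypothetical helper (not a Tarantool API): commit the current transaction
-- and report why the commit failed, if it did.
local function commit_and_classify()
    local ok, err = pcall(box.commit)
    if ok then
        return 'committed'
    end
    -- Assumption: the raised value is a box.error object exposing `code`.
    -- Plain string errors carry no code and fall through to 'other_error'.
    local t = type(err)
    local code = (t == 'cdata' or t == 'table') and err.code or nil
    if code ~= nil and code == box.error.SYNC_QUORUM_TIMEOUT then
        -- The synchronous transaction itself did not collect a quorum in time.
        return 'quorum_timeout', err
    elseif code ~= nil and code == box.error.SYNC_ROLLBACK then
        -- This transaction was rolled back together with an earlier sync one.
        return 'sync_rollback', err
    end
    return 'other_error', err
end
```

In the script above, this would replace the bare box.commit() call inside f(). It only classifies the outcome and does not change which transactions get rolled back.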

@rosik added the bug label Nov 7, 2022
@rosik
Contributor Author

rosik commented Nov 7, 2022

P.S. This issue emerged from #7592. I decided to open a new one because it concerns the synchro queue owner, whereas the previous one was about local transactions on a read-only replica.

@rosik
Contributor Author

rosik commented Nov 7, 2022

If there are both sync and async spaces, it's reasonable to assume that their transactions interleave. Under quorum loss, an async transaction has a chance to be committed only right after the rollback and before the next sync transaction arrives. Catching that moment seems nearly impossible to me. As a result, making a synchronous transaction is in effect a well-documented denial-of-service attack.

@sergepetrenko
Collaborator

sergepetrenko commented Nov 8, 2022

We don't commit local / async transactions right away, because they might touch the same data synchronous transactions touch. For example,

box.begin() box.space.test_sync:replace{1} box.space.test_local:replace{2} box.commit() -- hangs waiting for quorum in another fiber.
...
box.begin() box.space.test_local:select{2} box.space.test_local:insert{3} box.commit() -- can't complete right away, because it might depend on the synchronous transaction above.

Actually, when MVCC is used, the second transaction will be rolled back by a conflict even if we don't wait for the sync transaction. So it seems we can let local transactions bypass the limbo, at least when MVCC is turned on. I don't think we should do so for async (not local) transactions though.
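
The ordering argument can be illustrated with a toy model in plain Lua. This is only a sketch of the behavior described in this thread, not Tarantool's txn_limbo.c: the names new_limbo, submit, confirm and rollback are hypothetical, and the model ignores fibers, WAL writes, and transactions on temporary spaces.

```lua
-- Toy model of a commit queue ("limbo"); NOT Tarantool's implementation.
local Limbo = {}
Limbo.__index = Limbo

local function new_limbo()
    return setmetatable({queue = {}}, Limbo)
end

-- Submit a transaction. An async one completes right away only when nothing
-- is waiting in the queue; otherwise it waits behind the earlier entries.
function Limbo:submit(name, is_sync)
    local txn = {name = name, is_sync = is_sync, status = 'pending'}
    if not is_sync and #self.queue == 0 then
        txn.status = 'committed'
    else
        table.insert(self.queue, txn)
    end
    return txn
end

-- Quorum collected: commit the oldest entry and the async entries queued
-- behind it, stopping at the next sync entry.
function Limbo:confirm()
    local first = true
    while #self.queue > 0 do
        local txn = self.queue[1]
        if txn.is_sync and not first then
            break
        end
        table.remove(self.queue, 1)
        txn.status = 'committed'
        first = false
    end
end

-- Quorum timeout: every queued entry is rolled back, sync or not.
function Limbo:rollback()
    for _, txn in ipairs(self.queue) do
        txn.status = 'rolled_back'
    end
    self.queue = {}
end

-- Replay of the reported scenario.
local limbo = new_limbo()
local sync_txn = limbo:submit('test_sync', true)
local local_txn = limbo:submit('test_local', false)
local normal_txn = limbo:submit('test_normal', false)
limbo:rollback()
print(sync_txn.status, local_txn.status, normal_txn.status)
```

In this model an async transaction commits immediately only while the queue is empty, which is the narrow window mentioned earlier in the thread. Letting local transactions bypass the queue would amount to a different branch in submit().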

@kyukhin
Contributor

kyukhin commented Nov 11, 2022

This is by design and we do not have plans to change it.

@kyukhin closed this as not planned Nov 11, 2022