All triggers, visible to a user, crash on non-trivial changes #4264

Gerold103 · 2019-05-31T23:32:05Z

box.cfg{}
s = box.schema.create_space('test')
pk = s:create_index('pk')
t1 = function() print('t1') end
t2 = function() s:on_replace(nil, t1) s:on_replace(nil, t2) print('t2') end

s:on_replace(t1)
s:on_replace(t2)
s:replace{1}

This leads to surprising results. It is a gamble to bet how this code will crash next time, or whether it will crash at all. The only certain thing is that this code corrupts memory. And the example is not artificial: IMO a user should be able to decide, inside a trigger, to drop all the triggers. The results are:

tarantool> s:replace{1}
t2
---
- error: attempt to call a nil value
...

Or

tarantool> s:replace{1}
t2
---
- [1]
...

Or

tarantool> s:replace{1}
t2
Process 83468 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x0000000100150004 tarantool`tarantool_lua_init(tarantool_bin="?u\x80\x1d\x01", argc=1, argv=0x0000000104019038) at init.c:486:3
   483 			} else {
   484 				lua_pop(L, 1); /* nil */
   485 			}
-> 486 			lua_pop(L, 1); /* chunkname */
   487 		}
   488 		lua_pop(L, 1); /* _LOADED */
   489

Or anything else. This is because all public triggers are run by the function trigger_run, which uses rlist_foreach_safe, but the latter can't withstand deletion of the element that follows the current one.
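To illustrate the failure mode, here is a minimal sketch, assuming a hand-written circular list rather than Tarantool's real rlist. The names node, list_del and run_cached_next are hypothetical, and the sketch assumes that deleting an item leaves it pointing at itself. The walk caches the next pointer before the "callback" runs, the way a "safe" foreach does, and a step limit keeps the demonstration finite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular list; a stand-in, not Tarantool's rlist. */
struct node {
    struct node *prev, *next;
    int id;
};

static void
list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void
list_add_tail(struct node *head, struct node *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Unlink the item and leave it pointing at itself (assumed behaviour). */
static void
list_del(struct node *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = item;
}

/*
 * Walk the list with a cached next pointer. The first "callback" removes
 * victim (may be NULL). Ids of visited nodes are stored in visited[];
 * at most max_steps nodes are visited. Returns the number of visits.
 */
static int
run_cached_next(struct node *head, struct node *victim, int *visited,
                int max_steps)
{
    assert(max_steps > 0);
    int n = 0;
    struct node *cur = head->next, *next = cur->next;
    while (cur != head && n < max_steps) {
        visited[n++] = cur->id;
        if (n == 1 && victim != NULL)
            list_del(victim);
        /* The cached pointer may now refer to an unlinked node. */
        cur = next;
        next = cur->next;
    }
    return n;
}
```

With nodes 1, 2, 3 and node 2 as the victim, the walk visits 1 and then keeps visiting 2 until the step limit is reached: the removed element still runs, and the walk never gets back to the head.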

Possible solution: introduce struct rlist_safe, or do not use rlist for triggers at all. When someone wants to delete a trigger, it should be marked for deletion, but actually deleted only on the next attempt to run the list, for example.
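The "mark now, delete on the next run" idea could look roughly like the sketch below. It is only an illustration under assumptions: struct trig, struct trig_list and every function name here are hypothetical and are not Tarantool's API. Memory is owned by the caller, and re-adding a cleared trigger before the sweep is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trigger with a deferred-deletion mark. */
struct trig {
    struct trig *next;
    void (*run)(struct trig *self, void *event);
    bool is_deleted;
};

struct trig_list {
    struct trig *first;
    /* Number of runs in progress; unlinking is allowed only at 0. */
    int run_depth;
};

/* Add to the head, so the newest trigger runs first. */
static void
trig_list_add(struct trig_list *list, struct trig *t)
{
    t->is_deleted = false;
    t->next = list->first;
    list->first = t;
}

/* Clearing only marks the trigger; it stays linked until the next run. */
static void
trig_clear(struct trig *t)
{
    t->is_deleted = true;
}

/* Physically unlink everything that was marked earlier. */
static void
trig_list_sweep(struct trig_list *list)
{
    assert(list->run_depth == 0);
    struct trig **link = &list->first;
    while (*link != NULL) {
        if ((*link)->is_deleted)
            *link = (*link)->next;
        else
            link = &(*link)->next;
    }
}

/* Run all live triggers; returns how many were actually called. */
static int
trig_list_run(struct trig_list *list, void *event)
{
    if (list->run_depth == 0)
        trig_list_sweep(list);
    list->run_depth++;
    int count = 0;
    for (struct trig *t = list->first; t != NULL; t = t->next) {
        if (t->is_deleted)
            continue;
        t->run(t, event);
        count++;
    }
    list->run_depth--;
    return count;
}

/* Demo callback: drops every trigger of the list passed as the event. */
static void
drop_all(struct trig *self, void *event)
{
    (void)self;
    struct trig_list *list = event;
    for (struct trig *t = list->first; t != NULL; t = t->next)
        trig_clear(t);
}

/* Demo callback that does nothing. */
static void
noop(struct trig *self, void *event)
{
    (void)self;
    (void)event;
}
```

Because nothing is unlinked while a run is in progress, the next pointer read by the loop always belongs to a linked node. The price is that cleared triggers stay in memory until the following run.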

Gerold103 · 2019-05-31T23:34:36Z

Affected places: space:on_replace, before_replace, box.on_commit, box.on_rollback, box.session.on_connect, and all the others using lbox_trigger_reset. They are public.

alyapunov · 2019-06-01T09:16:34Z

* Vladislav Shpilevoy <notifications@github.com> [19/06/01 06:03]:

Affected places: space:on_replace, before_replace, box.on_commit, box.on_rollback, box.session.on_connect, and all the others using lbox_trigger_reset. They are public.

I believe this is documented even. I remember discussing the problem with PeterG when triggers were introduced. IMO fixing this is just not worth it. We can simply state the behaviour is undefined. In Tarantool you can trivially crash the system (ffi.cast and dereference a null pointer), so we are not obliged to fix all of the crash scenarios just because it-is-a-crash.

…

-- Konstantin Osipov, Moscow, Russia

Gerold103 · 2019-06-01T09:21:40Z

It is a bug, reproduced without any C manipulations. If you consider it not important, then you can move it to wishlist.

Gerold103 · 2019-06-26T20:01:05Z

As I understand, Kirill considers it a bug, as I do. Here are my thoughts on how we could resolve it.
I thought about a new structure, struct rlist_mutable, which is fully mutable: any item can be added to or removed from any place any number of times, even during iteration.

It should contain a flag bool is_changed, which is set when the list is changed. During iteration over the list, the loop checks whether the flag was set between iterations. If it was, the loop restarts, skips all the items up to the current one, and continues normal operation.

Problems with my approach are:

What to do if exactly the current item was removed? We won't find it on the re-roll. For the same reason we can't rely on any other item. Probably, we should skip until the current item is found, but no more items than were already scanned. Or add another flag, bool is_seen, to each item, which is set during iteration and unset after it. The latter is a problem, because it does not allow iterating in multiple loops at once, and it requires two iterations.
struct rlist_mutable probably won't be compatible with struct rlist, and we won't be able to use the functions from rlist.h. But probably I am wrong here.

This mind-blowing way is really hard to implement and support. There is another one, but less flexible. Let's define

struct rlist_mutable {
        /* The underlying list head. */
        struct rlist base;
        /* Incremented on each change of the list. */
        int version;
};

The version is stored in the head only and is incremented on each change. Before iteration the version is saved. If the version changes during iteration, the loop is aborted. We will document this behaviour: an attempt to change a trigger list during iteration stops the current processing of that list.
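A minimal sketch of the versioned variant, assuming a self-contained list instead of the real struct rlist. struct link, struct list_mutable and all function names are hypothetical, chosen only for illustration. Every modification goes through the head so that the version can be bumped, and the iteration stops as soon as it notices a change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rlist. */
struct link {
    struct link *prev, *next;
};

struct list_mutable {
    struct link base;
    /* Incremented on each change of the list. */
    int version;
};

typedef void (*visit_f)(struct list_mutable *list, struct link *item,
                        void *arg);

static void
list_mutable_create(struct list_mutable *list)
{
    list->base.prev = list->base.next = &list->base;
    list->version = 0;
}

static void
list_mutable_add_tail(struct list_mutable *list, struct link *item)
{
    item->prev = list->base.prev;
    item->next = &list->base;
    list->base.prev->next = item;
    list->base.prev = item;
    list->version++;
}

static void
list_mutable_del(struct list_mutable *list, struct link *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = item;
    list->version++;
}

/*
 * Visit items in order. Returns true if the whole list was processed,
 * false if a callback changed the list and the loop was aborted.
 */
static bool
list_mutable_foreach(struct list_mutable *list, visit_f visit, void *arg)
{
    assert(visit != NULL);
    int saved_version = list->version;
    struct link *cur = list->base.next;
    while (cur != &list->base) {
        visit(list, cur, arg);
        if (list->version != saved_version)
            return false;
        /* No change happened, so cur is still linked. */
        cur = cur->next;
    }
    return true;
}

/* Demo callback: counts visits and removes the item after the current. */
static void
count_and_del_next(struct list_mutable *list, struct link *item, void *arg)
{
    int *visits = arg;
    (*visits)++;
    if (item->next != &list->base)
        list_mutable_del(list, item->next);
}
```

The loop never follows a pointer that was read before a callback ran, so it cannot step onto a removed item. The cost is the documented limitation above: the triggers that follow the modifying one are not run in that pass.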

Totktonada · 2019-06-27T11:36:11Z

I believe this is documented even.

I don't see anything about changing triggers inside triggers here and here.

@lenkis Can you verify and proceed with that?

kyukhin · 2019-07-18T13:00:45Z

IMHO, we need to ban this. No segfaults are allowed.

kostja · 2019-07-19T11:58:53Z

Kirill, this is contrary to triage guidelines. Please do not waste time on matters with low impact.

locker · 2022-07-06T10:14:20Z

Looks like documented undefined behavior:

I don't think we need to fix this, because:

Nobody requested this kind of functionality.
Implementing/fixing it would complicate the code.

kyukhin · 2022-07-06T10:17:11Z

This is documented UB, closing.

sergepetrenko · 2022-07-26T11:34:52Z

It's not necessary for a trigger to clear another trigger to crash. For example, a trigger may yield (this is allowed, I believe?) and while it yields some other code might clear the next-to-be-run trigger. This will result in a crash (actually, in an infinite loop).

struct trigger is about to get a new field, and it's mandatory that this field is specified in all initializers. Let's better switch to explicit trigger_create() in all initialization code, to avoid specifying the new field everywhere. Part-of tarantool#4264 NO_DOC=refactoring NO_CHANGELOG=refactoring NO_TEST=refactoring

This patch fixes a number of issues with trigger_clear() while the trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a crash

Closes tarantool#4264

Make trigger_fiber_run return an error, when it occurs, so that the calling code decides how to log it. Also, while I'm at it, simplify trigger_fiber_run's code a bit. In-scope-of tarantool#4264 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

struct trigger is about to get a new field, and it's mandatory that this field is specified in all initializers. Let's introduce a macro to avoid adding every new field to all the initializers and at the same time keep the benefits of static initialization. Also while we're at it fix `lbox_trigger_reset` setting all trigger fields manually. Part-of tarantool#4264 NO_DOC=refactoring NO_CHANGELOG=refactoring NO_TEST=refactoring

This patch fixes a number of issues with trigger_clear() while the trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a crash
3) swapping trigger list head before the last trigger is run causes an infinite loop or a crash (see space_swap_triggers() in alter.cc, which had worked all this time by miracle: space _space on_replace trigger swaps its own head during local recovery, and that had only worked because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run. This list may be iterated by `rlist_shift_entry`, which doesn't suffer from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically unlimited number of concurrent trigger_runs for the same trigger list. The new approach requires the trigger to be in as many run lists as there are concurrent trigger_runs, which results in quite a big refactoring.

Add a luatest-based test and a unit test.

Closes tarantool#4264
NO_DOC=bugfix
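The run-list idea from this commit message can be sketched as follows. This is a simplified illustration, not the code of the patch: struct link, struct trig and all function names are hypothetical, and the sketch assumes only one run of a trigger list at a time, while the message says the real change has to support many concurrent runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rlist. */
struct link {
    struct link *prev, *next;
};

static void
link_create(struct link *l)
{
    l->prev = l->next = l;
}

static int
link_empty(const struct link *head)
{
    return head->next == head;
}

static void
link_add_tail(struct link *head, struct link *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Unlink and reinitialize; harmless for an already unlinked item. */
static void
link_del(struct link *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    link_create(item);
}

struct trig {
    struct link link;     /* Membership in the trigger list. */
    struct link run_link; /* Membership in the current run list. */
    void (*run)(struct trig *self, void *event);
};

#define trig_from(ptr, member) \
    ((struct trig *)((char *)(ptr) - offsetof(struct trig, member)))

static void
trig_create(struct trig *t, void (*run)(struct trig *, void *))
{
    link_create(&t->link);
    link_create(&t->run_link);
    t->run = run;
}

/* Clearing removes the trigger from both lists, so it will not run. */
static void
trig_clear(struct trig *t)
{
    link_del(&t->link);
    link_del(&t->run_link);
}

/* Returns the number of triggers that were actually called. */
static int
trig_list_run(struct link *triggers, void *event)
{
    struct link run_list;
    link_create(&run_list);
    for (struct link *l = triggers->next; l != triggers; l = l->next) {
        struct trig *t = trig_from(l, link);
        /* Assumption of this sketch: no other run is in progress. */
        assert(link_empty(&t->run_link));
        link_add_tail(&run_list, &t->run_link);
    }
    int count = 0;
    while (!link_empty(&run_list)) {
        /* Always take the head; no pointer is cached across a call. */
        struct trig *t = trig_from(run_list.next, run_link);
        link_del(&t->run_link);
        t->run(t, event);
        count++;
    }
    return count;
}

/* Demo callback: clears every trigger of the list passed as the event. */
static void
drop_all(struct trig *self, void *event)
{
    (void)self;
    struct link *triggers = event;
    while (!link_empty(triggers))
        trig_clear(trig_from(triggers->next, link));
}

/* Demo callback that does nothing. */
static void
noop(struct trig *self, void *event)
{
    (void)self;
    (void)event;
}
```

Since each step pops the head of the run list, a trigger cleared by an earlier callback is simply no longer there when its turn would have come.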

Our triggers support recursive invocation: for example, an on_replace trigger on a space may do a replace in the same space. However, this is not tested and might get broken easily. Let's add a corresponding test. In-scope-of tarantool#4264 NO_DOC=testing NO_CHANGELOG=testing

Unit test compilation with `#define UNIT_TAP_COMPATIBLE 1` might fail with an error complaining that <stdarg.h> is not included. Fix this. In-scope-of tarantool#4264 NO_CHANGELOG=testing stuff NO_DOC=testing stuff

cord_exit should be always called in the exiting thread. It's a single place to call all the thread-specific module deinitialization routines. In-scope-of tarantool#4264 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring

Our triggers support recursive invocation: for example, an on_replace trigger on a space may do a replace in the same space. However, this is not tested and might get broken easily. Let's add a corresponding test. In-scope-of #4264 NO_DOC=testing NO_CHANGELOG=testing (cherry picked from commit bf852b4)

Unit test compilation with `#define UNIT_TAP_COMPATIBLE 1` might fail with an error complaining that <stdarg.h> is not included. Fix this. In-scope-of #4264 NO_CHANGELOG=testing stuff NO_DOC=testing stuff (cherry picked from commit b9fd455)

cord_exit should be always called in the exiting thread. It's a single place to call all the thread-specific module deinitialization routines. In-scope-of #4264 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit 35b724c)

Make trigger_fiber_run return an error, when it occurs, so that the calling code decides how to log it. Also, while I'm at it, simplify trigger_fiber_run's code a bit. In-scope-of #4264 NO_DOC=refactoring NO_TEST=refactoring NO_CHANGELOG=refactoring (cherry picked from commit ca59d30)

struct trigger is about to get a new field, and it's mandatory that this field is specified in all initializers. Let's introduce a macro to avoid adding every new field to all the initializers and at the same time keep the benefits of static initialization. Also while we're at it fix `lbox_trigger_reset` setting all trigger fields manually. Part-of #4264 NO_DOC=refactoring NO_CHANGELOG=refactoring NO_TEST=refactoring (cherry picked from commit 2040d1f)

This patch fixes a number of issues with trigger_clear() while the trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a crash
3) swapping trigger list head before the last trigger is run causes an infinite loop or a crash (see space_swap_triggers() in alter.cc, which had worked all this time by miracle: space _space on_replace trigger swaps its own head during local recovery, and that had only worked because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run. This list may be iterated by `rlist_shift_entry`, which doesn't suffer from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically unlimited number of concurrent trigger_runs for the same trigger list. The new approach requires the trigger to be in as many run lists as there are concurrent trigger_runs, which results in quite a big refactoring.

Add a luatest-based test and a unit test.

Closes #4264
NO_DOC=bugfix
(cherry picked from commit 607cb55)

Gerold103 added the crash and bug labels May 31, 2019

kyukhin added this to the 2.2.0 milestone Jun 4, 2019

Totktonada assigned ImeevMA Jun 18, 2019

kostja removed the bug label Jun 26, 2019

kostja removed this from the 2.2.0 milestone Jun 26, 2019

kostja unassigned ImeevMA Jun 26, 2019

kyukhin added this to the 2.3.0 milestone Jul 18, 2019

kostja modified the milestones: 2.3.1, wishlist Aug 6, 2019

kyukhin removed this from the wishlist milestone Jun 24, 2022

kyukhin added the teamC label Jul 1, 2022

alyapunov self-assigned this Jul 1, 2022

alyapunov added the triggers label Jul 5, 2022

R-omk mentioned this issue Jul 5, 2022

Crash and undefined behavior in transactions triggers on_commit /on_rollback #7331

Closed

kyukhin closed this as not planned Jul 6, 2022

kyukhin added the wontfix label Jul 6, 2022

kyukhin unassigned alyapunov Jul 6, 2022

locker closed this as completed in #7550 Aug 25, 2022

locker added this to the 2.10.2 milestone Aug 25, 2022

sergos added the 5sp label Aug 31, 2022
