Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All triggers, visible to a user, crash on non-trivial changes #4264

Closed
Gerold103 opened this issue May 31, 2019 · 11 comments · Fixed by #7550
Closed

All triggers, visible to a user, crash on non-trivial changes #4264

Gerold103 opened this issue May 31, 2019 · 11 comments · Fixed by #7550
Assignees
Labels
bug Something isn't working crash triggers

Comments

@Gerold103
Copy link
Collaborator

box.cfg{}
s = box.schema.create_space('test')
pk = s:create_index('pk')
t1 = function() print('t1') end
t2 = function() s:on_replace(nil, t1) s:on_replace(nil, t2) print('t2') end

s:on_replace(t1)
s:on_replace(t2)
s:replace{1}

This leads to surprising results. It could be a gamble to bet how that code will crash next time. Or will not crash. The only certain thing is that this code corrupts memory. While the example is not artificial. IMO a user should be able in a trigger to decide to drop all the triggers. Results are:

tarantool> s:replace{1}
t2
---
- error: attempt to call a nil value
...

Or

tarantool> s:replace{1}
t2
---
- [1]
...

Or

tarantool> s:replace{1}
t2
Process 83468 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x0000000100150004 tarantool`tarantool_lua_init(tarantool_bin="?u\x80\x1d\x01", argc=1, argv=0x0000000104019038) at init.c:486:3
   483 			} else {
   484 				lua_pop(L, 1); /* nil */
   485 			}
-> 486 			lua_pop(L, 1); /* chunkname */
   487 		}
   488 		lua_pop(L, 1); /* _LOADED */
   489 	

Or anything else. This is because all public triggers are run with function trigger_run, which uses rlist_foreach_safe, but the latter can't withstand deletion of an element after current.

Possible solution - introduce struct rlist_safe, or even do not use rlist for triggers at all. When a one wants to delete a trigger, it should be marked for deletion, but deleted only on a next attempt to run, for example.

@Gerold103 Gerold103 added crash bug Something isn't working labels May 31, 2019
@Gerold103
Copy link
Collaborator Author

Affected places: space:on_replace, before_replace, box.on_commit, box.on_rollback, box.session.on_connect, and all the others using lbox_trigger_reset. They are public.

@alyapunov
Copy link
Contributor

alyapunov commented Jun 1, 2019 via email

@Gerold103
Copy link
Collaborator Author

It is a bug, reproduced without any C manipulations. If you consider it not important, then you can move it to wishlist.

@kyukhin kyukhin added this to the 2.2.0 milestone Jun 4, 2019
@Gerold103
Copy link
Collaborator Author

As I understand, Kirill considers it a bug, as I do. Here are my thoughts on how we could resolve it.
I thought about a new structure struct rlist_mutable, which is allowed to be totally mutable, be changed anyhow, which could allow to add or remove any item from any place any number of times.

It should contain a flag bool is_changed, which is set when the list is changed. During iteration over the list the cycle will check if the flag was set between iterations. If it did, then the cycle restarts, skips all the items until the current one, and continues normal operation.

Problems with my approach are

  1. What to do, if exactly the current item was removed? We won't find it on re-roll. Because of the same reason we can't rely on any other items. Probably, we should skip till the current item and not more number of items than already scanned. Or add another flag bool is_seen to each item, which is set during iteration, and unset after. The latter is a problem, because it does not allow to iterate in multiple cycles, and it requires two iterations.
  2. struct rlist_mutable probably won't be compatible with struct rlist, and we won't be able to use functions from rlist.h. But probably I am wrong here.

This mindblowing way is really hard to implement and support. There is another one, but less flexible. Lets define

struct rlist_mutable {
        struct base;
        int version;
};

Version is contained in the head only, incremented on each change. Before iteration the version is saved. If during iteration the version was changed, then abort the cycle. We will document that behaviour, that attempts to change a trigger list during iteration will stop its current processing.

@kostja kostja removed the bug Something isn't working label Jun 26, 2019
@kostja kostja removed this from the 2.2.0 milestone Jun 26, 2019
@Totktonada
Copy link
Member

I believe this is documented even.

I don't see anything about changing triggers inside triggers here and here.

@lenkis Can you verify and proceed with that?

@kyukhin
Copy link
Contributor

kyukhin commented Jul 18, 2019

IMHO, we need to ban this. No segfaults are allowed.

@kyukhin kyukhin added this to the 2.3.0 milestone Jul 18, 2019
@kostja
Copy link
Contributor

kostja commented Jul 19, 2019

Kirill, this is contrary to triage guidelines. Please do not waste time on matters with low impact.

@locker
Copy link
Member

locker commented Jul 6, 2022

Looks like documented undefined behavior:

image

I don't think we need to fix this, because:

  1. Nobody requested this kind of functionality.
  2. Implementing/fixing it would complicate the code.

@kyukhin
Copy link
Contributor

kyukhin commented Jul 6, 2022

This is documented UB, closing.

@kyukhin kyukhin closed this as not planned Won't fix, can't repro, duplicate, stale Jul 6, 2022
@kyukhin kyukhin added the wontfix This will not be worked on label Jul 6, 2022
@sergepetrenko
Copy link
Collaborator

It's not necessary for a trigger to clear another trigger to crash. For example, a trigger may yield (this is allowed, I believe?) and while it yields some other code might clear the next-to-be-run trigger. This will result in a crash (actually, in an infinite loop).

sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Jul 27, 2022
struct trigger is about to get a new field, and it's mandatory that this
field is specified in all initializers. Let's better switch to explicit
trigger_create() in all initialization code, to avoid specifying the new
field everywhere.

Part-of tarantool#4264

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Jul 27, 2022
This patch fixes a number of issues with trigger_clear() while the
trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a
   crash

Closes tarantool#4264
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 15, 2022
Make trigger_fiber_run return an error, when it occurs, so that the
calling code decides how to log it.
Also, while I'm at it, simplify trigger_fiber_run's code a bit.

In-scope-of tarantool#4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 15, 2022
struct trigger is about to get a new field, and it's mandatory that this
field is specified in all initializers. Let's introduce a macro to avoid
adding every new field to all the initializers and at the same time keep
the benefits of static initialization.

Also while we're at it fix `lbox_trigger_reset` setting all trigger
fileds manually.

Part-of tarantool#4264

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 15, 2022
This patch fixes a number of issues with trigger_clear() while the
trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a
   crash
3) swapping trigger list head before the last trigger is run causes an
   infinite loop or a crash (see space_swap_triggers() in alter.cc, which
   had worked all this time by miracle: space _space on_replace trigger
   swaps its own head during local recovery, and that had only worked
   because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run.
This list may be iterated by `rlist_shift_entry`, which doesn't suffer
from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically
unlimited number of concurrent trigger_runs for the same trigger list.
The new approach requires the trigger to be in as many run lists as
there are concurrent trigger_runs, which results in quite a big
refactoring.

Add a luatest-based test and a unit test.

Closes tarantool#4264

NO_DOC=bugfix
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 24, 2022
Our triggers support recursive invocation: for example, an on_replace
trigger on a space may do a replace in the same space.

However, this is not tested and might get broken easily. Let's add a
corresponding test.

In-scope-of tarantool#4264

NO_DOC=testing
NO_CHANGELOG=testing
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 24, 2022
Unit test compilation with `#define UNIT_TAP_COMPATIBLE 1` might fail
with an error complaining that <stdarg.h> is not included. Fix this.

In-scope-of tarantool#4264

NO_CHANGELOG=testing stuff
NO_DOC=testing stuff
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 24, 2022
cord_exit should be always called in the exiting thread. It's a single
place to call all the thread-specific module deinitalization routines.

In-scope-of tarantool#4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 24, 2022
Make trigger_fiber_run return an error, when it occurs, so that the
calling code decides how to log it.
Also, while I'm at it, simplify trigger_fiber_run's code a bit.

In-scope-of tarantool#4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 24, 2022
struct trigger is about to get a new field, and it's mandatory that this
field is specified in all initializers. Let's introduce a macro to avoid
adding every new field to all the initializers and at the same time keep
the benefits of static initialization.

Also while we're at it fix `lbox_trigger_reset` setting all trigger
fileds manually.

Part-of tarantool#4264

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring
sergepetrenko added a commit to sergepetrenko/tarantool that referenced this issue Aug 24, 2022
This patch fixes a number of issues with trigger_clear() while the
trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a
   crash
3) swapping trigger list head before the last trigger is run causes an
   infinite loop or a crash (see space_swap_triggers() in alter.cc, which
   had worked all this time by miracle: space _space on_replace trigger
   swaps its own head during local recovery, and that had only worked
   because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run.
This list may be iterated by `rlist_shift_entry`, which doesn't suffer
from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically
unlimited number of concurrent trigger_runs for the same trigger list.
The new approach requires the trigger to be in as many run lists as
there are concurrent trigger_runs, which results in quite a big
refactoring.

Add a luatest-based test and a unit test.

Closes tarantool#4264

NO_DOC=bugfix
locker pushed a commit that referenced this issue Aug 25, 2022
Our triggers support recursive invocation: for example, an on_replace
trigger on a space may do a replace in the same space.

However, this is not tested and might get broken easily. Let's add a
corresponding test.

In-scope-of #4264

NO_DOC=testing
NO_CHANGELOG=testing
locker pushed a commit that referenced this issue Aug 25, 2022
Unit test compilation with `#define UNIT_TAP_COMPATIBLE 1` might fail
with an error complaining that <stdarg.h> is not included. Fix this.

In-scope-of #4264

NO_CHANGELOG=testing stuff
NO_DOC=testing stuff
locker pushed a commit that referenced this issue Aug 25, 2022
cord_exit should be always called in the exiting thread. It's a single
place to call all the thread-specific module deinitalization routines.

In-scope-of #4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
locker pushed a commit that referenced this issue Aug 25, 2022
Make trigger_fiber_run return an error, when it occurs, so that the
calling code decides how to log it.
Also, while I'm at it, simplify trigger_fiber_run's code a bit.

In-scope-of #4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
locker pushed a commit that referenced this issue Aug 25, 2022
struct trigger is about to get a new field, and it's mandatory that this
field is specified in all initializers. Let's introduce a macro to avoid
adding every new field to all the initializers and at the same time keep
the benefits of static initialization.

Also while we're at it fix `lbox_trigger_reset` setting all trigger
fileds manually.

Part-of #4264

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring
locker pushed a commit that referenced this issue Aug 25, 2022
This patch fixes a number of issues with trigger_clear() while the
trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a
   crash
3) swapping trigger list head before the last trigger is run causes an
   infinite loop or a crash (see space_swap_triggers() in alter.cc, which
   had worked all this time by miracle: space _space on_replace trigger
   swaps its own head during local recovery, and that had only worked
   because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run.
This list may be iterated by `rlist_shift_entry`, which doesn't suffer
from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically
unlimited number of concurrent trigger_runs for the same trigger list.
The new approach requires the trigger to be in as many run lists as
there are concurrent trigger_runs, which results in quite a big
refactoring.

Add a luatest-based test and a unit test.

Closes #4264

NO_DOC=bugfix
@locker locker added this to the 2.10.2 milestone Aug 25, 2022
locker pushed a commit that referenced this issue Aug 25, 2022
Our triggers support recursive invocation: for example, an on_replace
trigger on a space may do a replace in the same space.

However, this is not tested and might get broken easily. Let's add a
corresponding test.

In-scope-of #4264

NO_DOC=testing
NO_CHANGELOG=testing

(cherry picked from commit bf852b4)
locker pushed a commit that referenced this issue Aug 25, 2022
Unit test compilation with `#define UNIT_TAP_COMPATIBLE 1` might fail
with an error complaining that <stdarg.h> is not included. Fix this.

In-scope-of #4264

NO_CHANGELOG=testing stuff
NO_DOC=testing stuff

(cherry picked from commit b9fd455)
locker pushed a commit that referenced this issue Aug 25, 2022
cord_exit should be always called in the exiting thread. It's a single
place to call all the thread-specific module deinitalization routines.

In-scope-of #4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit 35b724c)
locker pushed a commit that referenced this issue Aug 25, 2022
Make trigger_fiber_run return an error, when it occurs, so that the
calling code decides how to log it.
Also, while I'm at it, simplify trigger_fiber_run's code a bit.

In-scope-of #4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring

(cherry picked from commit ca59d30)
locker pushed a commit that referenced this issue Aug 25, 2022
struct trigger is about to get a new field, and it's mandatory that this
field is specified in all initializers. Let's introduce a macro to avoid
adding every new field to all the initializers and at the same time keep
the benefits of static initialization.

Also while we're at it fix `lbox_trigger_reset` setting all trigger
fileds manually.

Part-of #4264

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring

(cherry picked from commit 2040d1f)
locker pushed a commit that referenced this issue Aug 25, 2022
This patch fixes a number of issues with trigger_clear() while the
trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a
   crash
3) swapping trigger list head before the last trigger is run causes an
   infinite loop or a crash (see space_swap_triggers() in alter.cc, which
   had worked all this time by miracle: space _space on_replace trigger
   swaps its own head during local recovery, and that had only worked
   because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run.
This list may be iterated by `rlist_shift_entry`, which doesn't suffer
from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically
unlimited number of concurrent trigger_runs for the same trigger list.
The new approach requires the trigger to be in as many run lists as
there are concurrent trigger_runs, which results in quite a big
refactoring.

Add a luatest-based test and a unit test.

Closes #4264

NO_DOC=bugfix

(cherry picked from commit 607cb55)
@sergos sergos added the 5sp label Aug 31, 2022
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
Our triggers support recursive invocation: for example, an on_replace
trigger on a space may do a replace in the same space.

However, this is not tested and might get broken easily. Let's add a
corresponding test.

In-scope-of tarantool#4264

NO_DOC=testing
NO_CHANGELOG=testing
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
Unit test compilation with `#define UNIT_TAP_COMPATIBLE 1` might fail
with an error complaining that <stdarg.h> is not included. Fix this.

In-scope-of tarantool#4264

NO_CHANGELOG=testing stuff
NO_DOC=testing stuff
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
cord_exit should be always called in the exiting thread. It's a single
place to call all the thread-specific module deinitalization routines.

In-scope-of tarantool#4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
Make trigger_fiber_run return an error, when it occurs, so that the
calling code decides how to log it.
Also, while I'm at it, simplify trigger_fiber_run's code a bit.

In-scope-of tarantool#4264

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
struct trigger is about to get a new field, and it's mandatory that this
field is specified in all initializers. Let's introduce a macro to avoid
adding every new field to all the initializers and at the same time keep
the benefits of static initialization.

Also while we're at it fix `lbox_trigger_reset` setting all trigger
fileds manually.

Part-of tarantool#4264

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=refactoring
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
This patch fixes a number of issues with trigger_clear() while the
trigger list is being run:
1) clearing the next-to-be-run trigger doesn't prevent it from being run
2) clearing the next-to-be-run trigger causes an infinite loop or a
   crash
3) swapping trigger list head before the last trigger is run causes an
   infinite loop or a crash (see space_swap_triggers() in alter.cc, which
   had worked all this time by miracle: space _space on_replace trigger
   swaps its own head during local recovery, and that had only worked
   because the trigger by luck was the last to run)

This is fixed by adding triggers in a separate run list on trigger_run.
This list may be iterated by `rlist_shift_entry`, which doesn't suffer
from any of the problems mentioned above.

While being bad in a number of ways, old approach supported practically
unlimited number of concurrent trigger_runs for the same trigger list.
The new approach requires the trigger to be in as many run lists as
there are concurrent trigger_runs, which results in quite a big
refactoring.

Add a luatest-based test and a unit test.

Closes tarantool#4264

NO_DOC=bugfix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working crash triggers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants