core: deadlock in module initialization #497

Open
mworrell opened this Issue Jan 11, 2013 · 28 comments

Comments

Projects
None yet
4 participants
Owner

mworrell commented Jan 11, 2013

When the mod_survey is initialized from the admin then there is a timeout error.

This timeout error occurs because the mod_search is listening to the #rsc_update_done{} notification. This is done in its own process, so handled asynchronously.

Problem is that the initialization, and from that the #rsc_update_done{} notification, is done inside a transaction. All changes to the database are not yet committed.

mod_search is requesting the path of the non-committed "survey" category. That starts a memo entry in the depcache. The resulting category query is then waiting for the transaction that is inserting category.

When mod_survey then inserts the next category ("poll"), which is a sub-category of the previous "survey" category, it also needs the path of the the "survey" category. Which it can't get because the depcache then queues this request for the result of the memo call from the mod_search.

After the depcache gave a timeout this all ends.

How shall we solve this?

I am thinking of:

  • Adding a mechanism to handle all #rsc_update_done{} notifications after the transaction is committed
  • Change mod_search to do some work inside the calling process (using a pid_observe_rsc_update_done/2 function) before dispatching it to the mod_search process.

What do you think?

cc @arjan @kaos @mmzeeman

Owner

arjan commented Jan 11, 2013

I think solution nr. 1 is better. Since it is just a notify, the observers should have no influence on the transaction.

Owner

mworrell commented Jan 11, 2013

We can collect some "todo" work in the db connection process.

I was thinking of two lists: one to be done on commit and another on rollback.

Owner

mworrell commented Jan 11, 2013

Should we then add a new kind of z_notifier function? z_notifier:on_commit(.....) ?

Owner

mworrell commented Jan 11, 2013

or rather: z_db:on_commit(fun(C) -> .... end, Context)

Owner

arjan commented Jan 11, 2013

Can't we just move this particular notification out of the transaction?

I would not add new stuff to z_db at this point because @mmzeeman is working on new db stuff anyway

Owner

kaos commented Jan 11, 2013

I'm with Arjan, why not have the rsc_update_done event sent /after/ the transaction has been committed.
To me that feels like the right thing to do any way. Why send a done event for something that is not yet committed?

Owner

mmzeeman commented Jan 11, 2013

Can't we just move this particular notification out of the transaction?

I would not add new stuff to z_db at this point because @mmzeeman is
working on new db stuff anyway


Reply to this email directly or view it on GitHub:
#497 (comment)

No worries, stuff like that is not difficult to add.

I would also like to extend the db layer with some other calls too, like
make a call to do backups.

Maas

Owner

mmzeeman commented Jan 11, 2013

I'm with Arjan, why not have the rsc_update_done event sent /after/ the
transaction has been committed.
To me that feels like the right thing to do any way. Why send a done event
for something that is not yet committed?


Reply to this email directly or view it on GitHub:
#497 (comment)

That is a bit of a race condition indeed. In channel I make sure
notifications are sent after the transaction.

One way to do that is add parameters to the transaction function with
functions to call after the commit or something like that. So you do
something like

F = function(Ctx) ->
db stuff
end,
AfterCommit = function(R) ->
z_notifier:notify(...) end,

z_db:transaction(F, AfterCommit)

Also mind that sql transactions can be retried when possible.

Owner

mworrell commented Jan 11, 2013

Problem is that this is a nested transaction. Inside the rsc update routines the notification is sent after the insert transaction. But the module schema init has a transaction around the rsc update one.

The notification needs to be delayed till after the outer schema init transaction.

Sent from my iPhone

On 11 jan. 2013, at 12:22, Andreas Stenius notifications@github.com wrote:

I'm with Arjan, why not have the rsc_update_done event sent /after/ the transaction has been committed.
To me that feels like the right thing to do any way. Why send a done event for something that is not yet committed?


Reply to this email directly or view it on GitHub.

Owner

mmzeeman commented Jan 11, 2013

O yeah, tiny but important detail.... Then collecting
things-to-do-after-commit in the context is the only option.

Maas

Problem is that this is a nested transaction. Inside the rsc update
routines the notification is sent after the insert transaction. But the
module schema init has a transaction around the rsc update one.

The notification needs to be delayed till after the outer schema init
transaction.

Sent from my iPhone

On 11 jan. 2013, at 12:22, Andreas Stenius notifications@github.com
wrote:

I'm with Arjan, why not have the rsc_update_done event sent /after/ the
transaction has been committed.
To me that feels like the right thing to do any way. Why send a done
event for something that is not yet committed?


Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHub:
#497 (comment)

Owner

arjan commented Jan 11, 2013

yup..

Owner

kaos commented Jan 11, 2013

Or make z_notifier transaction aware, with functions for sending notifications that should be sent outside of any transactions... So that notification can be sent inside the transaction, but z_notifier will queue them until the transaction is done (and maybe discard them in case of a rollback rather than a commit).

Owner

mworrell commented Jan 11, 2013

Though there are some z_notifier things you want to do during a transaction.
So an explicit tasks list to be done on commit (or rollback) is needed.

As the transaction is something managed by the db connection, I think we need to collect those lists in the connection process.

Owner

mmzeeman commented Jan 11, 2013

+1

Or make z_notifier transaction aware, with functions for sending
notifications that should be sent outside of any transactions... So that
notification can be sent inside the transaction, but z_notifier will queue
them until the transaction is done (and maybe discard them in case of a
rollback rather than a commit).


Reply to this email directly or view it on GitHub:
#497 (comment)

Owner

kaos commented Jan 11, 2013

Yep. could be like z_db:on_commit(F, Ctx) and z_db:on_rollback(F, Ctx). That way you can queue any kind of operation wrapped in a fun expression. Maybe also a z_db:on_end_transaction(F, Ctx) for completeness, if you want F to be called regardless of commit or rollback.

Owner

mworrell commented Jan 11, 2013

We need a bit more processing than just some notifications after the transaction has been committed.
I prefer a more generic solution involving a task queue on commit/rollback.

Owner

kaos commented Jan 11, 2013

That's what I suggested, but expressed with different words....
[edit] and possibly with options on the tasks if they are to be run on commit, rollback or both.

Owner

mworrell commented Jan 11, 2013

:-) our comments crossed each other

Owner

kaos commented Jan 11, 2013

Ah, ok :). Good to know I wasn't writing gibberish. :p

Owner

mmzeeman commented Jan 11, 2013

Indeed, the connection should do that.

Btw, the new db layer is not that far away. Especially not the postgres version. It supports customizable connection processes... I only don't have time to do it right now. Maybe in the weekend, but I have some other projects waiting too.

Owner

mmzeeman commented Jan 11, 2013

@marc I fished up some email correspondence with you from december 2011 about exactly this. It contains another idea for collecting the things-to-do-after the commit... They can also be placed in the cache....

Owner

mmzeeman commented Jan 11, 2013

Looking at m_rsc_update, there are indeed some other things which should be done after the transaction finishes.

  • Flushing the depcache. Currently there is race condition, the rcs might be retrieved between the flush and the commit so the old value will be cached.
  • Starting a category renumbering (again a sub transaction), Shouldn't the transaction be cleared here? Now it could use the connection of the transaction in the pivot task. Not sure about that though.
  • and notifying.

Basically all code with side-effects other then direct db related stuff should be done after the transaction finishes.

It might be better to refactor the rsc update code first before deciding to add the after commit functions to the connection. Maybe it isn't even needed.

Owner

mworrell commented Jan 13, 2013

When did we have that correspondence, then I check my e-mail as well :)

Owner

mworrell commented Jan 13, 2013

@mmzeeman the problem is the nested transaction. All updates are done within the transaction that initializes the module's data model. Currently all steps you mention are happening after the m_rsc_update transaction.

Owner

mmzeeman commented Jan 13, 2013

21 december 2011

Then it wasn't a nested transaction though.

When did we have that correspondence, then I check my e-mail as well :)


Reply to this email directly or view it on GitHub.

Owner

arjan commented Jan 30, 2013

so guys, whats the proposed solution here?

Owner

mworrell commented Jan 30, 2013

I propose to collect all "post commit/rollback" notifications in the db process.
To be handled by the z_db transaction function.
(Unless the db process crashes, in which case the lists are gone).

Owner

arjan commented Mar 14, 2013

postponing this to the new db layer impl

arjan added a commit that referenced this issue Apr 17, 2014

mod_survey: Work around mod_survey initialization error
Prevent database deadlock by doing the queries async after a delay. Not
the most beautiful solution, but it appears to work.

Fixes #734

See issue #497 for the discussion on the "proper" solution.

@arjan arjan modified the milestones: Release 0.11, Release 0.10 Apr 17, 2014

@arjan arjan modified the milestones: Release 1.0, Release 0.11 Sep 30, 2014

@mworrell mworrell modified the milestones: Roadmap, 1.0 Feb 9, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment