Race around mnesia:create_schema, mnesia:create_table #1

Open
squaremo opened this Issue Mar 11, 2010 · 3 comments

@squaremo
Collaborator

On my mac, RabbitMQ with rabbithub as a plugin often (nearly always, in fact) fails to start.
It aborts in rabbithub_app:setup_schema while calling mnesia:create_table, and gives {bad_type, rabbithub_lease, disc_copies, rabbit@localhost} as the reason.

It's easy to elicit this from mnesia:
$ erl

1> mnesia:start().
ok
2> mnesia:create_table(foo, [{attributes, [bar, baz]}, {disc_copies, [node()]}]).
{aborted,{bad_type,foo,disc_copies,nonode@nohost}}

This doesn't happen if you've called mnesia:create_schema/1 before starting mnesia. But wait -- that's exactly what setup_schema does!

I suspect there's a race between RabbitMQ calling mnesia:create_schema/1 and mnesia:start/0, and rabbithub making the same two calls and then trying to create the table.

@squaremo
Collaborator

It's not just my mac. I can make this happen with the following steps, in rabbitmq-server:

  • remove all but rabbithub from plugins/
  • ./scripts/rabbitmq-activate-plugins
  • make cleandb
  • make run

To make it work again,

  • remove rabbithub from plugins/
  • ./scripts/rabbitmq-activate-plugins
  • make cleandb; make run; then q(). in the Erlang shell to quit
  • reinstate plugins/rabbithub
  • ./scripts/rabbitmq-activate-plugins
  • make run

@squaremo
Collaborator

This looks like a bug (er, feature?) in mnesia:

$ erl
Erlang R13B02 (erts-5.7.3) [source] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.3 (abort with ^G)
1> mnesia:start().
ok
2> mnesia:create_schema([node()]).
{error,{nonode@nohost,{already_exists,nonode@nohost}}}
3> mnesia:create_table(foo, [{attributes, [bar, baz]}, {disc_copies, [node()]}]).
{aborted,{bad_type,foo,disc_copies,nonode@nohost}}

The problem is that create_schema reports 'already_exists', even though it didn't in fact create the schema! The rabbithub code matches on that and proceeds, presuming it means what it says.
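
One way to see the mismatch from code, rather than from the bad_type abort, is to ask mnesia how the schema table itself is stored. The following is a minimal sketch, not rabbithub code: the module and function names are hypothetical, and it assumes mnesia is already running on the local node. In the situation shown in the transcript above (mnesia started with no schema on disc), it should report ram_copies, which is why a disc_copies table is refused.

```erlang
-module(schema_check).
-export([schema_storage/0]).

%% Hypothetical helper for illustration only; not part of rabbithub.
%% Assumes mnesia has already been started on this node.
%% Returns the storage type of the local schema table, e.g.
%% ram_copies when mnesia was started without a schema on disc,
%% or disc_copies once create_schema has really taken effect.
schema_storage() ->
    mnesia:table_info(schema, storage_type).
```

Checking this value before calling mnesia:create_table would distinguish "a schema exists on disc" from "create_schema said already_exists".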

The solution is to call mnesia:stop() before create_schema; if mnesia is stopped, it will create the schema properly. Mnesia then has to be started again before calling create_table.
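
The stop-then-create sequence could look roughly like the sketch below. It is an illustration under assumptions, not the actual rabbithub fix: the module and function names are made up, it only handles the single local node, and stopping mnesia from inside a plugin may not be acceptable while RabbitMQ itself is using it.

```erlang
-module(schema_setup).
-export([ensure_disc_schema/0]).

%% Hypothetical helper for illustration only; not part of rabbithub.
%% Stops mnesia, creates the on-disc schema for the local node if it
%% is missing, then starts mnesia again so tables can be created.
ensure_disc_schema() ->
    %% mnesia:stop/0 returns 'stopped' whether or not mnesia was running.
    stopped = mnesia:stop(),
    Result = case mnesia:create_schema([node()]) of
                 ok ->
                     ok;
                 %% With mnesia stopped, already_exists means a schema
                 %% really is present on disc, so it is safe to go on.
                 {error, {_Node, {already_exists, _}}} ->
                     ok;
                 {error, Reason} ->
                     {error, Reason}
             end,
    %% Restart mnesia regardless, so the node is left in a usable state.
    ok = mnesia:start(),
    Result.
```

After ensure_disc_schema() returns ok, the create_table call from the transcript, with {disc_copies, [node()]}, is expected to go through instead of aborting with bad_type.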

@tonyg
Owner

So looking at git blame rabbithub.erl, it looks like the boot_step was added at around the same time you were posting this report. I suspect the boot step fixes the issue: is that true? Can you still reproduce it with the latest revision?