Application *locks* has stopped on double write-lock #19

Closed
ddosia opened this issue Oct 21, 2015 · 8 comments

ddosia commented Oct 21, 2015

I have two actors which run at approximately the same time. Each of them begins a transaction. Each acquires a read lock on the same oid(). Then the first tries to upgrade its read lock to a write lock; the second does the same, and the application crashes immediately.
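Condensed into a single shell, the interleaving looks roughly like this (just a sketch, not my actual two-node setup below: the spawned funs stand in for the two actors, and the sleep only approximates the "same time" scheduling):

{ok, _} = application:ensure_all_started(locks),
Upgrade = fun() ->
              {Agent, {ok, []}} = locks:begin_transaction(),
              {ok, _} = locks:lock(Agent, [table], read),
              timer:sleep(100),                  %% let both actors hold read locks
              locks:lock(Agent, [table], write)  %% the upgrade attempt
          end,
spawn(Upgrade),
spawn(Upgrade).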

Logs of the first actor:

Erlang R16B03-1 (erts-5.10.4) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.4  (abort with ^G)
(n1@dch-mbp)1> application:ensure_all_started(locks).
{ok,[locks]}
(n1@dch-mbp)2> {Agent, TrRes} = locks:begin_transaction().
{<0.46.0>,{ok,[]}}
(n1@dch-mbp)3> locks:lock(Agent, [table], read).
{ok,[]}
(n1@dch-mbp)4> locks:lock(Agent, [table], write).
=ERROR REPORT==== 21-Oct-2015::14:45:19 ===
** Generic server locks_server terminating 
** Last message in was {'$gen_cast',{surrender,[table],<0.55.0>}}
** When Server state == {st,{locks_server_locks,locks_server_agents},
                            {dict,2,16,16,8,80,48,
                                  {[],[],[],[],[],[],[],[],[],[],[],[],[],[],
                                   [],[]},
                                  {{[],[],[],[],[],[],[],
                                    [[<0.55.0>|#Ref<0.0.0.76>]],
                                    [],[],[],[],[],[],
                                    [[<0.46.0>|#Ref<0.0.0.69>]],
                                    []}}},
                            <0.44.0>}
** Reason for termination == 
** {function_clause,[{locks_server,queue_entries_,
                                   [[{entry,<0.55.0>,<0.53.0>,4,direct}]],
                                   [{file,"src/locks_server.erl"},{line,211}]},
                     {locks_server,queue_entries_,1,
                                   [{file,"src/locks_server.erl"},{line,214}]},
                     {locks_server,queue_entries_,1,
                                   [{file,"src/locks_server.erl"},{line,214}]},
                     {locks_server,queue_entries_,1,
                                   [{file,"src/locks_server.erl"},{line,212}]},
                     {locks_server,queue_entries,1,
                                   [{file,"src/locks_server.erl"},{line,207}]},
                     {locks_server,notify,3,
                                   [{file,"src/locks_server.erl"},{line,193}]},
                     {locks_server,handle_cast,2,
                                   [{file,"src/locks_server.erl"},{line,142}]},
                     {gen_server,handle_msg,5,
                                 [{file,"gen_server.erl"},{line,604}]}]}

=INFO REPORT==== 21-Oct-2015::14:45:19 ===
    application: locks
    exited: shutdown
    type: temporary
** exception error: {cannot_lock_objects,[{req,[table],
                                               read,
                                               ['n1@dch-mbp'],
                                               0,all},
                                          {req,[table],write,['n1@dch-mbp'],1,all}]}
     in function  locks_agent:await_reply/1 (src/locks_agent.erl, line 397)
     in call from locks_agent:lock_/6 (src/locks_agent.erl, line 380)
(n1@dch-mbp)5> application:which_applications().
[{stdlib,"ERTS  CXC 138 10","1.19.4"},
 {kernel,"ERTS  CXC 138 10","2.16.4"}]

Logs of the second actor:

Erlang R16B03-1 (erts-5.10.4) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.4  (abort with ^G)
(n2@dch-mbp)1> 
User switch command
 --> r 'n1@dch-mbp'
 --> c
Eshell V5.10.4  (abort with ^G)
(n1@dch-mbp)1> {Agent, TrRes} = locks:begin_transaction().
{<0.55.0>,{ok,[]}}
(n1@dch-mbp)2> locks:lock(Agent, [table], read).
{ok,[]}
(n1@dch-mbp)3> locks:lock(Agent, [table], write).
** exception error: {cannot_lock_objects,[{req,[table],
                                               read,
                                               ['n1@dch-mbp'],
                                               0,all},
                                          {req,[table],write,['n1@dch-mbp'],1,all}]}
     in function  locks_agent:await_reply/1 (src/locks_agent.erl, line 397)
     in call from locks_agent:lock_/6 (src/locks_agent.erl, line 380)

I am new to locks, so I am trying to learn how it works. I do need lock upgrade functionality, which is why I was curious about this. Maybe I am missing something, and what I did goes against the very basics of what locks should do.

uwiger commented Oct 21, 2015

Could you try the PR above (#20)? I added a test case, which failed before this fix.

ddosia commented Oct 22, 2015

It doesn't crash any more, but now it hangs forever on both sides when I try to acquire the write lock.
My naive understanding of a lock upgrade is this: both agents acquire a read lock; the first tries to upgrade to a write lock, which implies that it releases its read lock and moves to the end of the queue; the second does the same and queues up behind the first. Maybe I misunderstand how lock upgrade works? And why doesn't the deadlock detection mechanism prevent this?
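Spelled out as a timeline, the behaviour I expected (this is only my assumption of the semantics, not necessarily what locks promises):

t0  A1 read([table])   -> granted            queue: A1:r
t1  A2 read([table])   -> granted            queue: A1:r, A2:r
t2  A1 write([table])  -> drops its read,    queue: A2:r, A1:w
                          goes to the back
t3  A2 write([table])  -> drops its read,    queue: A1:w, A2:w
                          goes to the back
t4  A1 is granted the write lock; when it is done, A2 gets it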

uwiger commented Oct 31, 2015

It's not a question of the deadlock resolution algorithm, but rather of the lock upgrade semantics. Specifically, the locks_server handles the trivial case of upgrade when there's one read lock, but when there are several read locks, it can't differentiate between agents that want nothing more than a read lock and agents that are holding a read lock but hoping to upgrade.
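Schematically (simplified queue states, not the actual locks_server representation):

One read lock held (the trivial case):

    [table] holders: A1:read
    A1 requests write  -> no other holder, so the upgrade can be granted in place

Several read locks held (the ambiguous case):

    [table] holders: A1:read, A2:read
    A1 requests write  -> the server must wait for A2's read lock to go away,
                          but it cannot tell whether A2 is a plain reader that
                          will release, or an upgrader waiting for A1's read
                          lock in turn; if both upgrade, each waits on the other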

uwiger commented Oct 31, 2015

I'm at a Halloween party, so probably not sober enough to tackle the issue right now, nor would it likely be socially acceptable. ;-)

If contribs are offered, I'll gratefully review them. Otherwise, I'll take a look at this later.

uwiger commented Oct 31, 2015

Another problem is that the test case needs to verify that the two write lock requests reach different results (currently, they both time out, which is wrong).
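Roughly, the check might look like this (a sketch only; the function name and the message-passing scaffolding here are assumptions, not the actual test suite):

double_upgrade(_Config) ->
    Parent = self(),
    Run = fun() ->
              {A, {ok, []}} = locks:begin_transaction(),
              {ok, _} = locks:lock(A, [table], read),
              timer:sleep(100),  %% make sure both hold read locks first
              Parent ! {self(), catch locks:lock(A, [table], write)}
          end,
    P1 = spawn(Run),
    P2 = spawn(Run),
    R1 = receive {P1, Res1} -> Res1 end,
    R2 = receive {P2, Res2} -> Res2 end,
    %% exactly one of the two upgrades should succeed; neither should
    %% simply time out
    [_] = [R || {ok, _} = R <- [R1, R2]],
    ok.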

uwiger commented Nov 1, 2015

I've pushed some fixes to the uw-lock_upgrade3 branch. They seem to fix the problem.

Could you try to verify at your end?

ddosia commented Nov 3, 2015

It works now: the first actor obtains the write lock immediately after the second one tries to acquire its write lock.
Thanks!

uwiger commented Nov 3, 2015

Thanks! I've merged PR #20 into master.

uwiger closed this as completed Nov 3, 2015