Possible deadlock in ARC between arc_shrinker and arc_user_evicts_thread #4688

Closed
lorddoskias opened this issue May 23, 2016 · 6 comments
Labels: Component: Memory Management · Status: Inactive · Type: Defect

@lorddoskias (Contributor)

While running some zfs tests I got the following splat:

[   41.809167] random: nonblocking pool is initialized
[   61.964111]
[   61.964344] ======================================================
[   61.964853] [ INFO: possible circular locking dependency detected ]
[   61.965395] 4.6.0-rc5-nbor #8 Tainted: G           O
[   61.965817] -------------------------------------------------------
[   61.966319] arc_user_evicts/281 is trying to acquire lock:
[   61.966745]  (&buf->b_evict_lock){+.+...}, at: [<ffffffffa00c4899>] arc_do_user_evicts+0x139/0x3c0 [zfs]
[   61.967573]
[   61.967573] but task is already holding lock:
[   61.968031]  (&arc_user_evicts_lock){+.+.-.}, at: [<ffffffffa00c47aa>] arc_do_user_evicts+0x4a/0x3c0 [zfs]
[   61.968853]
[   61.968853] which lock already depends on the new lock.
[   61.968853]
[   61.969499]
[   61.969499] the existing dependency chain (in reverse order) is:
[   61.970087]
-> #1 (&arc_user_evicts_lock){+.+.-.}:
[   61.970533]        [<ffffffff810a596e>] lock_acquire+0xbe/0x1f0
[   61.971008]        [<ffffffff815fa8d3>] mutex_lock_nested+0x63/0x3b0
[   61.971524]        [<ffffffffa00ccc6e>] arc_evict_state+0x5de/0x1040 [zfs]
[   61.972089]        [<ffffffffa00cdcc6>] arc_adjust+0x5f6/0x760 [zfs]
[   61.972626]        [<ffffffffa00d1414>] arc_shrink+0x44/0x110 [zfs]
[   61.973146]        [<ffffffffa00d1aa9>] __arc_shrinker_func.isra.12+0xd9/0x1d0 [zfs]
[   61.973795]        [<ffffffffa00d1bb7>] arc_shrinker_func_scan_objects+0x17/0x30 [zfs]
[   61.974453]        [<ffffffff81157e0b>] shrink_slab.part.14+0x1eb/0x5a0
[   61.974982]        [<ffffffff8115b921>] shrink_zone+0x2c1/0x2d0
[   61.975466]        [<ffffffff8115cc41>] kswapd+0x471/0xa40
[   61.975907]        [<ffffffff81073f3e>] kthread+0xfe/0x120
[   61.976359]        [<ffffffff815fed92>] ret_from_fork+0x22/0x50
[   61.976834]
-> #0 (&buf->b_evict_lock){+.+...}:
[   61.977255]        [<ffffffff810a4e5b>] __lock_acquire+0x1cdb/0x1ee0
[   61.977766]        [<ffffffff810a596e>] lock_acquire+0xbe/0x1f0
[   61.978250]        [<ffffffff815fa8d3>] mutex_lock_nested+0x63/0x3b0
[   61.978757]        [<ffffffffa00c4899>] arc_do_user_evicts+0x139/0x3c0 [zfs]
[   61.979345]        [<ffffffffa00c6ad2>] arc_user_evicts_thread+0xc2/0x3c0 [zfs]
[   61.979943]        [<ffffffffa00048b5>] thread_generic_wrapper+0x85/0xc0 [spl]
[   61.980534]        [<ffffffff81073f3e>] kthread+0xfe/0x120
[   61.980975]        [<ffffffff815fed92>] ret_from_fork+0x22/0x50

[   61.981458] other info that might help us debug this:
[   61.981458]
[   61.982126]  Possible unsafe locking scenario:
[   61.982126]
[   61.982600]        CPU0                    CPU1
[   61.982957]        ----                    ----
[   61.983325]   lock(&arc_user_evicts_lock);
[   61.983671]                                lock(&buf->b_evict_lock);
[   61.984198]                                lock(&arc_user_evicts_lock);
[   61.984742]   lock(&buf->b_evict_lock);
[   61.985067]
[   61.985067]  *** DEADLOCK ***
[   61.985067]
[   61.985544] 1 lock held by arc_user_evicts/281:
[   61.985905]  #0:  (&arc_user_evicts_lock){+.+.-.}, at: [<ffffffffa00c47aa>] arc_do_user_evicts+0x4a/0x3c0 [zfs]
[   61.986771]
[   61.986771] stack backtrace:
[   61.987116] CPU: 1 PID: 281 Comm: arc_user_evicts Tainted: G           O    4.6.0-rc5-nbor #8
[   61.987788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   61.988533]  0000000000000000 ffff88003581bbd0 ffffffff813c2fc3 ffffffff826a2fe0
[   61.989148]  ffffffff82694f70 ffff88003581bc10 ffffffff810a17e0 ffff88003581bc60
[   61.989784]  ffff880039ee0b28 ffff880039ee0000 ffff880039ee0b00 0000000000000001
[   61.990412] Call Trace:
[   61.990612]  [<ffffffff813c2fc3>] dump_stack+0x85/0xc2
[   61.991015]  [<ffffffff810a17e0>] print_circular_bug+0x1e0/0x2e0
[   61.991496]  [<ffffffff810a4e5b>] __lock_acquire+0x1cdb/0x1ee0
[   61.991953]  [<ffffffff810a3ae9>] ? __lock_acquire+0x969/0x1ee0
[   61.992428]  [<ffffffff810a596e>] lock_acquire+0xbe/0x1f0
[   61.992867]  [<ffffffffa00c4899>] ? arc_do_user_evicts+0x139/0x3c0 [zfs]
[   61.993405]  [<ffffffff815fa8d3>] mutex_lock_nested+0x63/0x3b0
[   61.993889]  [<ffffffffa00c4899>] ? arc_do_user_evicts+0x139/0x3c0 [zfs]
[   61.994434]  [<ffffffff810a2dd4>] ? trace_hardirqs_on_caller+0x134/0x1c0
[   61.994972]  [<ffffffffa00c4899>] arc_do_user_evicts+0x139/0x3c0 [zfs]
[   61.995510]  [<ffffffffa00c6ad2>] arc_user_evicts_thread+0xc2/0x3c0 [zfs]
[   61.996055]  [<ffffffffa00c6a10>] ? hdr_recl+0x50/0x50 [zfs]
[   61.996514]  [<ffffffffa00048b5>] thread_generic_wrapper+0x85/0xc0 [spl]
[   61.997040]  [<ffffffffa0004830>] ? __thread_exit+0x20/0x20 [spl]
[   61.997528]  [<ffffffff81073f3e>] kthread+0xfe/0x120
[   61.997922]  [<ffffffff815fed92>] ret_from_fork+0x22/0x50
[   61.998359]  [<ffffffff81073e40>] ? kthread_create_on_node+0x230/0x230

If I have understood this correctly, arc_do_user_evicts acquires arc_user_evicts_lock first and then b_evict_lock while setting b_hdr to NULL. Meanwhile the ARC shrinker, which the kernel invokes under memory pressure, locks b_evict_lock and then arc_user_evicts_lock. The actual call chain inside arc_evict_state, before inlining, is: arc_evict_state -> arc_evict_state_impl -> arc_evict_hdr.
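
To make the inversion concrete, here is a minimal userspace analogue with pthreads. It is entirely illustrative: the file and symbol names are mine, and the in-kernel shrinker path only tryenters b_evict_lock, so this sketch deliberately exaggerates by using blocking locks on both sides.

    /* abba.c -- userspace analogue of the reported lock-order inversion.
     * Illustrative only; build with: cc -pthread -o abba abba.c
     * When the periodic output stops, the two threads have deadlocked. */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t evicts_lock = PTHREAD_MUTEX_INITIALIZER; /* ~arc_user_evicts_lock */
    static pthread_mutex_t evict_lock  = PTHREAD_MUTEX_INITIALIZER; /* ~buf->b_evict_lock   */

    /* Mimics arc_do_user_evicts: evicts_lock first, then evict_lock. */
    static void *user_evicts(void *arg)
    {
        (void)arg;
        for (unsigned long n = 1; ; n++) {
            pthread_mutex_lock(&evicts_lock);
            pthread_mutex_lock(&evict_lock);
            pthread_mutex_unlock(&evict_lock);
            pthread_mutex_unlock(&evicts_lock);
            if (n % 1000000 == 0)
                printf("%lu iterations and still alive\n", n);
        }
    }

    /* Mimics the shrinker path: evict_lock first, then evicts_lock --
     * the inverted order lockdep is complaining about. */
    static void *shrinker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&evict_lock);
            pthread_mutex_lock(&evicts_lock);
            pthread_mutex_unlock(&evicts_lock);
            pthread_mutex_unlock(&evict_lock);
        }
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, user_evicts, NULL);
        pthread_create(&t2, NULL, shrinker, NULL);
        pthread_join(t1, NULL); /* blocks forever; watch the output stall */
        return 0;
    }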

@dweeezil (Contributor)

How does a4eeee6 look for a fix? Pinging @prakashsurya since I think this was part of the arc multilist rework.

@lorddoskias (Contributor, Author) commented May 23, 2016

@dweeezil it's getting late here, but unless I'm half asleep I think you've introduced a subtle change in semantics, since you've also moved

    if (buf->b_data != NULL)
        bytes_evicted += hdr->b_size;

into the else branch taken when ->b_efunc is not set, meaning b_data is now accessed without b_evict_lock held. I haven't looked at the code in enough detail to see whether this member is actually protected by the evict lock; that should be checked. Also, the check for the presence of b_efunc is done outside the buffer's evict lock, which might itself lead to subtle bugs.

@dweeezil (Contributor)

I was planning on looking into both of those issues, but a quick look around earlier seemed to indicate that the exact same bytes_evicted += hdr->b_size is done elsewhere without the lock held. Also, a missed increment of bytes_evicted shouldn't (I think) cause too much trouble anyway. The main point was to restructure the code so that arc_user_evicts_lock is taken first and protects the code manipulating arc_eviction_list; roughly the shape sketched below.
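
In other words, something along these lines in arc_evict_hdr (a rough sketch of the intended ordering only, not the literal diff in a4eeee6):

        /* Inside the buffer loop: take arc_user_evicts_lock before
         * b_evict_lock, so both paths agree on the same order. */
        mutex_enter(&arc_user_evicts_lock);
        if (!mutex_tryenter(&buf->b_evict_lock)) {
            mutex_exit(&arc_user_evicts_lock);
            ARCSTAT_BUMP(arcstat_mutex_miss);
            break;
        }
        if (buf->b_efunc != NULL) {
            /* Queue buf for the user-evicts thread while both
             * locks are held. */
            buf->b_next = arc_eviction_list;
            arc_eviction_list = buf;
            cv_signal(&arc_user_evicts_cv);
        }
        mutex_exit(&buf->b_evict_lock);
        mutex_exit(&arc_user_evicts_lock);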

@lorddoskias (Contributor, Author)

@dweeezil so I took another look at arc_do_user_evicts, and b_efunc is already checked and written outside of b_evict_lock there. So based on that, LGTM.

@prakashsurya (Member)

Is the potential deadlock due to how the locks are acquired in arc_evict_hdr, i.e.

        if (!mutex_tryenter(&buf->b_evict_lock)) {   /* b_evict_lock first, but only tryentered */
            ARCSTAT_BUMP(arcstat_mutex_miss);
            break;
        }
...
        if (buf->b_efunc != NULL) {
            mutex_enter(&arc_user_evicts_lock);      /* then arc_user_evicts_lock, while b_evict_lock is held */
...
            arc_eviction_list = buf;
            cv_signal(&arc_user_evicts_cv);
            mutex_exit(&arc_user_evicts_lock);
            mutex_exit(&buf->b_evict_lock);
...

and arc_do_user_evicts, i.e.

    mutex_enter(&arc_user_evicts_lock);              /* arc_user_evicts_lock first */
    while (arc_eviction_list != NULL) {
        arc_buf_t *buf = arc_eviction_list;
        arc_eviction_list = buf->b_next;
        mutex_enter(&buf->b_evict_lock);             /* then b_evict_lock, blocking: the inverted order */
...
        mutex_enter(&arc_user_evicts_lock);          /* re-acquired for the next iteration */
    }
    mutex_exit(&arc_user_evicts_lock);

?

If so, my initial thought is that this isn't a real issue, because arc_do_user_evicts will only lock buffers that are already on arc_eviction_list, and arc_evict_hdr won't ever place a buffer on the list unless it's able to grab both locks; see the timeline sketch below.
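
Spelling that out as a timeline (my own sketch; T1 = the shrinker path in arc_evict_hdr, T2 = the arc_user_evicts thread):

    /*
     * T1: mutex_tryenter(&buf->b_evict_lock)  -- never blocks, bails on a miss
     * T1: mutex_enter(&arc_user_evicts_lock)
     * T1: arc_eviction_list = buf             -- buf becomes visible to T2
     * T1: mutex_exit(&arc_user_evicts_lock)   -- only now can T2 proceed
     * T1: mutex_exit(&buf->b_evict_lock)
     * T2: mutex_enter(&arc_user_evicts_lock)
     * T2: mutex_enter(&buf->b_evict_lock)     -- may wait briefly for T1's final
     *                                            exit, but T1 no longer holds or
     *                                            wants arc_user_evicts_lock, so
     *                                            the cycle never closes
     */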

I could easily be overlooking something, though; before I look at this in detail I want to verify the two call paths (the stacks in the bug report look truncated, probably due to inlined functions).

@STRML commented Jul 4, 2016

This looks very similar to a deadlock I'm seeing every few days on a test server.
