Sporadic deadlocks during rsync #4554

Closed
varju opened this issue Apr 24, 2016 · 8 comments · Fixed by #4571

varju commented Apr 24, 2016

For a number of months now I've been seeing deadlocks once or twice a week. Typically rsync will be running when I notice the problem, but this might not be the actual trigger.

CPUs:           1
Memory:         32GB
VM/Hypervisor:  no
ECC mem:        no
Distribution:       Ubuntu 15.10
Kernel version: Linux mars 4.2.0-35-generic #40-Ubuntu SMP Tue Mar 15 22:15:45 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
SPL/ZFS source: Pre-built packages
SPL/ZFS version:    DebianZFS-Wheezy-SCST2:/usr/src/zfs# dmesg | grep -E 'SPL:|ZFS:'
                [    2.540352] SPL: Loaded module v0.6.5.6-1~wily
                [    2.557759] ZFS: Loaded module v0.6.5.6-1~wily, ZFS pool version 5000, ZFS filesystem version 5
                [    3.780796] SPL: using hostid 0x007f0101
System services:    Home server (rsync, rtorrent, docker containers)

Pool configuration:

tank:
    version: 5000
    name: 'tank'
    state: 0
    txg: 9623613
    pool_guid: 15827881257669332996
    errata: 0
    hostid: 8323329
    hostname: 'mars'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 15827881257669332996
        children[0]:
            type: 'mirror'
            id: 0
            guid: 800403149707481874
            whole_disk: 0
            metaslab_array: 33
            metaslab_shift: 34
            ashift: 12
            asize: 4000698597376
            is_log: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 4337798707782094442
                path: '/dev/disk/by-id/usb-Seagate_Expansion_Desk_NA4KA8F8-0:0-part1'
                whole_disk: 1
                DTL: 93927
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 5447552150331130358
                path: '/dev/disk/by-id/ata-ST5000DM000-1FK178_W4J0HYAC-part1'
                whole_disk: 1
                DTL: 122910
                create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

Stack traces that include zfs:

PID 1386:
zfs snapshot -o com.sun:auto-snapshot-desc - -r tank zfs-auto-snap_hourly-2016-04-24-0117 
[<ffffffffc09d020b>] arc_buf_remove_ref+0xab/0x150 [zfs]
[<ffffffffc09d64be>] dbuf_rele_and_unlock+0xae/0x3e0 [zfs]
[<ffffffffc09d6936>] dmu_buf_rele+0x36/0x40 [zfs]
[<ffffffffc09d694e>] dbuf_rele+0xe/0x10 [zfs]
[<ffffffffc09f2af7>] dnode_rele_and_unlock+0x77/0x90 [zfs]
[<ffffffffc09f2b49>] dnode_rele+0x39/0x40 [zfs]
[<ffffffffc09d6783>] dbuf_rele_and_unlock+0x373/0x3e0 [zfs]
[<ffffffffc09d6936>] dmu_buf_rele+0x36/0x40 [zfs]
[<ffffffffc0a19ac4>] sa_handle_destroy+0x84/0xe0 [zfs]
[<ffffffffc0a75149>] zfs_zinactive+0x89/0xf0 [zfs]
[<ffffffffc0a6e3f1>] zfs_inactive+0x61/0x270 [zfs]
[<ffffffffc0a85be7>] zpl_evict_inode+0x47/0x60 [zfs]
[<ffffffff8121ba4a>] evict+0xba/0x180
[<ffffffff8121bb4a>] dispose_list+0x3a/0x50
[<ffffffff8121cc6a>] prune_icache_sb+0x5a/0x80
[<ffffffff81203a6f>] super_cache_scan+0x14f/0x1a0
[<ffffffff81197839>] shrink_slab+0x219/0x400
[<ffffffff8119c159>] shrink_zone+0x2b9/0x2d0
[<ffffffff8119c2dd>] do_try_to_free_pages+0x16d/0x400
[<ffffffff8119c63e>] try_to_free_pages+0xce/0x180
[<ffffffff8118e25f>] __alloc_pages_nodemask+0x60f/0xa30
[<ffffffff811d5f31>] alloc_pages_current+0x91/0x100
[<ffffffff811df428>] new_slab+0x368/0x450
[<ffffffff811e048f>] __slab_alloc+0x27f/0x4b0
[<ffffffff811e174f>] kmem_cache_alloc+0x1af/0x210
[<ffffffffc06b9c42>] spl_kmem_cache_alloc+0x72/0x7e0 [spl]
[<ffffffffc0a7a6d9>] zio_buf_alloc+0x59/0x60 [zfs]
[<ffffffffc09cc894>] arc_get_data_buf.isra.21+0x274/0x3a0 [zfs]
[<ffffffffc09d06b8>] arc_read+0x398/0xa30 [zfs]
[<ffffffffc09d8f81>] dbuf_prefetch+0x181/0x2d0 [zfs]
[<ffffffffc09f033e>] dmu_zfetch_dofetch.isra.6+0xee/0x160 [zfs]
[<ffffffffc09f0aca>] dmu_zfetch+0x48a/0xe30 [zfs]
[<ffffffffc09d8704>] dbuf_read+0x754/0x860 [zfs]
[<ffffffffc09f2538>] dnode_hold_impl+0xb8/0x4e0 [zfs]
[<ffffffffc09f2979>] dnode_hold+0x19/0x20 [zfs]
[<ffffffffc09e14a6>] dmu_bonus_hold+0x36/0x280 [zfs]
[<ffffffffc0a04d04>] dsl_dir_hold_obj+0x44/0x3e0 [zfs]
[<ffffffffc0a0525a>] dsl_dir_hold+0x1ba/0x2c0 [zfs]
[<ffffffffc09f9620>] dsl_dataset_hold+0x40/0x220 [zfs]
[<ffffffffc0a5a969>] zfs_secpolicy_write_perms.isra.8+0x69/0xc0 [zfs]
[<ffffffffc0a5b375>] zfs_secpolicy_snapshot+0xa5/0xc0 [zfs]
[<ffffffffc0a5c37e>] zfsdev_ioctl+0x17e/0x4b0 [zfs]
[<ffffffff81213cd5>] do_vfs_ioctl+0x295/0x480
[<ffffffff81213f39>] SyS_ioctl+0x79/0x90
[<ffffffff817f8c72>] entry_SYSCALL_64_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

PID 15852:
zfs list -H -t filesystem volume -s name -o name com.sun:auto-snapshot com.sun:auto-snapshot:frequent 
[<ffffffffc09f082a>] dmu_zfetch+0x1ea/0xe30 [zfs]
[<ffffffffc09d8704>] dbuf_read+0x754/0x860 [zfs]
[<ffffffffc09f2538>] dnode_hold_impl+0xb8/0x4e0 [zfs]
[<ffffffffc09f2979>] dnode_hold+0x19/0x20 [zfs]
[<ffffffffc09e14a6>] dmu_bonus_hold+0x36/0x280 [zfs]
[<ffffffffc0a04d04>] dsl_dir_hold_obj+0x44/0x3e0 [zfs]
[<ffffffffc0a0525a>] dsl_dir_hold+0x1ba/0x2c0 [zfs]
[<ffffffffc09f9620>] dsl_dataset_hold+0x40/0x220 [zfs]
[<ffffffffc09e43bc>] dmu_objset_hold+0x6c/0xb0 [zfs]
[<ffffffffc0a5cf32>] zfs_ioc_objset_stats+0x32/0xa0 [zfs]
[<ffffffffc0a5d1b5>] zfs_ioc_dataset_list_next+0x125/0x150 [zfs]
[<ffffffffc0a5c623>] zfsdev_ioctl+0x423/0x4b0 [zfs]
[<ffffffff81213cd5>] do_vfs_ioctl+0x295/0x480
[<ffffffff81213f39>] SyS_ioctl+0x79/0x90
[<ffffffff817f8c72>] entry_SYSCALL_64_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

PID 1857:
[<ffffffffc06bfd0b>] cv_wait_common+0x10b/0x140 [spl]
[<ffffffffc06bfd75>] __cv_wait_sig+0x15/0x20 [spl]
[<ffffffffc0a33dd7>] txg_quiesce_thread+0x3e7/0x3f0 [zfs]
[<ffffffffc06bae61>] thread_generic_wrapper+0x71/0x80 [spl]
[<ffffffff8109c2b8>] kthread+0xd8/0xf0
[<ffffffff817f909f>] ret_from_fork+0x3f/0x70
[<ffffffffffffffff>] 0xffffffffffffffff

PID 1858:
[<ffffffffc09d1855>] arc_release+0xc5/0x490 [zfs]
[<ffffffffc09d76cd>] dbuf_write.isra.14+0x9d/0x430 [zfs]
[<ffffffffc09da221>] dbuf_sync_leaf+0x131/0x3b0 [zfs]
[<ffffffffc09daa80>] dbuf_sync_list+0xf0/0x100 [zfs]
[<ffffffffc09da8e6>] dbuf_sync_indirect+0xe6/0x190 [zfs]
[<ffffffffc09daa5e>] dbuf_sync_list+0xce/0x100 [zfs]
[<ffffffffc09da8e6>] dbuf_sync_indirect+0xe6/0x190 [zfs]
[<ffffffffc09daa5e>] dbuf_sync_list+0xce/0x100 [zfs]
[<ffffffffc09da8e6>] dbuf_sync_indirect+0xe6/0x190 [zfs]
[<ffffffffc09daa5e>] dbuf_sync_list+0xce/0x100 [zfs]
[<ffffffffc09da8e6>] dbuf_sync_indirect+0xe6/0x190 [zfs]
[<ffffffffc09daa5e>] dbuf_sync_list+0xce/0x100 [zfs]
[<ffffffffc09da8e6>] dbuf_sync_indirect+0xe6/0x190 [zfs]
[<ffffffffc09daa5e>] dbuf_sync_list+0xce/0x100 [zfs]
[<ffffffffc09da8e6>] dbuf_sync_indirect+0xe6/0x190 [zfs]
[<ffffffffc09daa5e>] dbuf_sync_list+0xce/0x100 [zfs]
[<ffffffffc09f57f2>] dnode_sync+0x312/0x910 [zfs]
[<ffffffffc09e5326>] dmu_objset_sync+0x126/0x320 [zfs]
[<ffffffffc09fe052>] dsl_dataset_sync+0x52/0xa0 [zfs]
[<ffffffffc0a072df>] dsl_pool_sync+0x9f/0x430 [zfs]
[<ffffffffc0a23019>] spa_sync+0x369/0xb30 [zfs]
[<ffffffffc0a34670>] txg_sync_thread+0x3c0/0x640 [zfs]
[<ffffffffc06bae61>] thread_generic_wrapper+0x71/0x80 [spl]
[<ffffffff8109c2b8>] kthread+0xd8/0xf0
[<ffffffff817f909f>] ret_from_fork+0x3f/0x70
[<ffffffffffffffff>] 0xffffffffffffffff

PID 26396:
rsync --server -vlWHogDtprxSze.iLsfxC --delete-during --delete-excluded --ignore-errors --numeric-ids . /tank/manual-backups/io 
[<ffffffffc09ca674>] buf_hash_find+0xa4/0x160 [zfs]
[<ffffffffc09d0440>] arc_read+0x120/0xa30 [zfs]
[<ffffffffc09d8269>] dbuf_read+0x2b9/0x860 [zfs]
[<ffffffffc09e1870>] dmu_buf_hold+0x50/0x80 [zfs]
[<ffffffffc0a489d1>] zap_lockdir+0x61/0x940 [zfs]
[<ffffffffc0a4946a>] zap_cursor_retrieve+0x1ba/0x300 [zfs]
[<ffffffffc0a69816>] zfs_readdir+0x146/0x490 [zfs]
[<ffffffffc0a84876>] zpl_iterate+0x56/0x80 [zfs]
[<ffffffff81213fe2>] iterate_dir+0x92/0x120
[<ffffffff8121447d>] SyS_getdents+0x8d/0x100
[<ffffffff817f8c72>] entry_SYSCALL_64_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

PID 4195:
rtorrent 
[<ffffffffc09cfe85>] arc_buf_add_ref+0xa5/0x1d0 [zfs]
[<ffffffffc09d8881>] __dbuf_hold_impl+0x71/0x550 [zfs]
[<ffffffffc09d8dd2>] dbuf_hold_impl+0x72/0xa0 [zfs]
[<ffffffffc09d90ff>] dbuf_hold+0x2f/0x60 [zfs]
[<ffffffffc09e07a0>] dmu_buf_hold_array_by_dnode+0x100/0x480 [zfs]
[<ffffffffc09e114f>] dmu_read+0x9f/0x190 [zfs]
[<ffffffffc0a68091>] zfs_getpage+0x121/0x1e0 [zfs]
[<ffffffffc0a840ab>] zpl_readpage+0x5b/0xb0 [zfs]
[<ffffffff81186c0b>] filemap_fault+0x23b/0x3f0
[<ffffffff811b31a0>] __do_fault+0x50/0xe0
[<ffffffff811b8206>] handle_mm_fault+0xf96/0x1800
[<ffffffff81068ab7>] __do_page_fault+0x197/0x400
[<ffffffff81068d42>] do_page_fault+0x22/0x30
[<ffffffff817fabc8>] page_fault+0x28/0x30
[<ffffffffffffffff>] 0xffffffffffffffff

PID 7345:
zfs list -H -t filesystem volume -s name -o name com.sun:auto-snapshot com.sun:auto-snapshot:frequent 
[<ffffffffc09f082a>] dmu_zfetch+0x1ea/0xe30 [zfs]
[<ffffffffc09d8704>] dbuf_read+0x754/0x860 [zfs]
[<ffffffffc09f2538>] dnode_hold_impl+0xb8/0x4e0 [zfs]
[<ffffffffc09f2979>] dnode_hold+0x19/0x20 [zfs]
[<ffffffffc09e14a6>] dmu_bonus_hold+0x36/0x280 [zfs]
[<ffffffffc0a04d04>] dsl_dir_hold_obj+0x44/0x3e0 [zfs]
[<ffffffffc0a0525a>] dsl_dir_hold+0x1ba/0x2c0 [zfs]
[<ffffffffc09f9620>] dsl_dataset_hold+0x40/0x220 [zfs]
[<ffffffffc09e43bc>] dmu_objset_hold+0x6c/0xb0 [zfs]
[<ffffffffc0a5cf32>] zfs_ioc_objset_stats+0x32/0xa0 [zfs]
[<ffffffffc0a5d1b5>] zfs_ioc_dataset_list_next+0x125/0x150 [zfs]
[<ffffffffc0a5c623>] zfsdev_ioctl+0x423/0x4b0 [zfs]
[<ffffffff81213cd5>] do_vfs_ioctl+0x295/0x480
[<ffffffff81213f39>] SyS_ioctl+0x79/0x90
[<ffffffff817f8c72>] entry_SYSCALL_64_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

PID 813:
[<ffffffffc06bfb25>] __cv_timedwait_common+0xc5/0x160 [spl]
[<ffffffffc06bfbf3>] __cv_timedwait_sig+0x13/0x20 [spl]
[<ffffffffc09d118f>] arc_reclaim_thread+0x12f/0x240 [zfs]
[<ffffffffc06bae61>] thread_generic_wrapper+0x71/0x80 [spl]
[<ffffffff8109c2b8>] kthread+0xd8/0xf0
[<ffffffff817f909f>] ret_from_fork+0x3f/0x70
[<ffffffffffffffff>] 0xffffffffffffffff

PID 814:
[<ffffffffc06bfb25>] __cv_timedwait_common+0xc5/0x160 [spl]
[<ffffffffc06bfbf3>] __cv_timedwait_sig+0x13/0x20 [spl]
[<ffffffffc09cb413>] arc_user_evicts_thread+0xd3/0x150 [zfs]
[<ffffffffc06bae61>] thread_generic_wrapper+0x71/0x80 [spl]
[<ffffffff8109c2b8>] kthread+0xd8/0xf0
[<ffffffff817f909f>] ret_from_fork+0x3f/0x70
[<ffffffffffffffff>] 0xffffffffffffffff

PID 815:
[<ffffffffc06bfb25>] __cv_timedwait_common+0xc5/0x160 [spl]
[<ffffffffc06bfbf3>] __cv_timedwait_sig+0x13/0x20 [spl]
[<ffffffffc09cdfde>] l2arc_feed_thread+0x9e/0xce0 [zfs]
[<ffffffffc06bae61>] thread_generic_wrapper+0x71/0x80 [spl]
[<ffffffff8109c2b8>] kthread+0xd8/0xf0
[<ffffffff817f909f>] ret_from_fork+0x3f/0x70
[<ffffffffffffffff>] 0xffffffffffffffff
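
The trace for PID 1386 above enters ZFS through `zfsdev_ioctl`, falls into the kernel's direct-reclaim path while allocating a buffer, and then comes back into ZFS through inode eviction. A rough way to spot that shape in `/proc/<pid>/stack`-style output is sketched below. This is only an illustrative helper, not part of ZFS or any existing tool: the function names, the regular expression, and the set of reclaim symbols are assumptions chosen to match the frames shown in this report, and `SAMPLE` is an abbreviated excerpt of the PID 1386 trace.

```python
import re

# Matches frames such as "[<ffffffffc0a7a6d9>] zio_buf_alloc+0x59/0x60 [zfs]".
# The trailing "[module]" is absent for core-kernel frames.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

# Assumption: these symbols are treated as markers of direct reclaim.
RECLAIM_SYMBOLS = {
    "try_to_free_pages",
    "do_try_to_free_pages",
    "shrink_zone",
    "shrink_slab",
}

# Abbreviated excerpt of the PID 1386 trace from this report.
SAMPLE = """\
[<ffffffffc09d020b>] arc_buf_remove_ref+0xab/0x150 [zfs]
[<ffffffffc0a85be7>] zpl_evict_inode+0x47/0x60 [zfs]
[<ffffffff8121ba4a>] evict+0xba/0x180
[<ffffffff81197839>] shrink_slab+0x219/0x400
[<ffffffff8119c63e>] try_to_free_pages+0xce/0x180
[<ffffffff811e174f>] kmem_cache_alloc+0x1af/0x210
[<ffffffffc06b9c42>] spl_kmem_cache_alloc+0x72/0x7e0 [spl]
[<ffffffffc0a7a6d9>] zio_buf_alloc+0x59/0x60 [zfs]
[<ffffffffc0a5c37e>] zfsdev_ioctl+0x17e/0x4b0 [zfs]
[<ffffffff81213cd5>] do_vfs_ioctl+0x295/0x480
[<ffffffffffffffff>] 0xffffffffffffffff
"""


def parse_frames(trace_text):
    """Return (symbol, module) pairs, innermost frame first.

    module is None for frames that carry no "[module]" suffix.
    Lines that are not symbolized frames are skipped.
    """
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    return frames


def reenters_via_reclaim(trace_text, module="zfs"):
    """True if `module` frames sit on both sides of the reclaim frames."""
    frames = parse_frames(trace_text)
    reclaim = [i for i, (sym, _) in enumerate(frames) if sym in RECLAIM_SYMBOLS]
    if not reclaim:
        return False
    first, last = reclaim[0], reclaim[-1]
    inner = any(mod == module for _, mod in frames[:first])      # ran during reclaim
    outer = any(mod == module for _, mod in frames[last + 1:])   # triggered reclaim
    return inner and outer


if __name__ == "__main__":
    print(reenters_via_reclaim(SAMPLE))
```

A trace that only waits on a condition variable, like the one for PID 1857, contains no reclaim frames and is reported as not re-entering.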
$ cat /proc/spl/kstat/zfs/arcstats
6 1 0x01 91 4368 2554302254 356450570380444
name                            type data
hits                            4    100254112
misses                          4    43395126
demand_data_hits                4    26339343
demand_data_misses              4    631091
demand_metadata_hits            4    62155992
demand_metadata_misses          4    40019266
prefetch_data_hits              4    270848
prefetch_data_misses            4    2074390
prefetch_metadata_hits          4    11487929
prefetch_metadata_misses        4    670379
mru_hits                        4    15133350
mru_ghost_hits                  4    1413489
mfu_hits                        4    73381845
mfu_ghost_hits                  4    3467166
deleted                         4    5145773
mutex_miss                      4    3680
evict_skip                      4    107212294
evict_not_enough                4    6544502
evict_l2_cached                 4    0
evict_l2_eligible               4    375381034496
evict_l2_ineligible             4    219200025088
evict_l2_skip                   4    0
hash_elements                   4    560196
hash_elements_max               4    914576
hash_collisions                 4    1602690
hash_chains                     4    35060
hash_chain_max                  4    5
p                               4    7356586025
c                               4    16502760944
c_min                           4    33554432
c_max                           4    16796549120
size                            4    16486070336
hdr_size                        4    222271784
data_size                       4    8584200192
metadata_size                   4    3169592832
other_size                      4    4510005528
anon_size                       4    1782982656
anon_evictable_data             4    0
anon_evictable_metadata         4    0
mru_size                        4    3222027264
mru_evictable_data              4    1557180416
mru_evictable_metadata          4    490040320
mru_ghost_size                  4    8812756992
mru_ghost_evictable_data        4    6427667456
mru_ghost_evictable_metadata    4    2385089536
mfu_size                        4    6748783104
mfu_evictable_data              4    5238257664
mfu_evictable_metadata          4    850108928
mfu_ghost_size                  4    723440640
mfu_ghost_evictable_data        4    390201344
mfu_ghost_evictable_metadata    4    333239296
l2_hits                         4    0
l2_misses                       4    0
l2_feeds                        4    0
l2_rw_clash                     4    0
l2_read_bytes                   4    0
l2_write_bytes                  4    0
l2_writes_sent                  4    0
l2_writes_done                  4    0
l2_writes_error                 4    0
l2_writes_lock_retry            4    0
l2_evict_lock_retry             4    0
l2_evict_reading                4    0
l2_evict_l1cached               4    0
l2_free_on_write                4    0
l2_cdata_free_on_write          4    0
l2_abort_lowmem                 4    0
l2_cksum_bad                    4    0
l2_io_error                     4    0
l2_size                         4    0
l2_asize                        4    0
l2_hdr_size                     4    0
l2_compress_successes           4    0
l2_compress_zeros               4    0
l2_compress_failures            4    0
memory_throttle_count           4    0
duplicate_buffers               4    0
duplicate_buffers_size          4    0
duplicate_reads                 4    0
memory_direct_count             4    6775
memory_indirect_count           4    135338
arc_no_grow                     4    0
arc_tempreserve                 4    0
arc_loaned_bytes                4    0
arc_prune                       4    0
arc_meta_used                   4    7901870144
arc_meta_limit                  4    12597411840
arc_meta_max                    4    12488268552
arc_meta_min                    4    16777216
arc_need_free                   4    0
arc_sys_free                    4    524890112
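
For readers who want to summarize a dump like the one above, here is a minimal sketch that parses the `name type data` rows and derives a few ratios. The function names and the choice of ratios are assumptions made for illustration; they are not part of any ZFS tooling. The sample values in the usage section are copied from this report.

```python
def parse_arcstats(text):
    """Parse kstat-style 'name type data' rows into a dict of ints.

    The numeric header line and the column-title line are skipped because
    they do not have exactly three fields with numeric type and data.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def summarize(stats):
    """Return a few derived ratios from parsed arcstats values."""
    lookups = stats["hits"] + stats["misses"]
    return {
        # Fraction of ARC lookups served from cache.
        "hit_ratio": stats["hits"] / lookups if lookups else 0.0,
        # How close the ARC is to its configured maximum size.
        "size_vs_c_max": stats["size"] / stats["c_max"],
        # How close metadata usage is to its limit.
        "meta_vs_limit": stats["arc_meta_used"] / stats["arc_meta_limit"],
    }


if __name__ == "__main__":
    sample = """\
6 1 0x01 91 4368 2554302254 356450570380444
name                            type data
hits                            4    100254112
misses                          4    43395126
c_max                           4    16796549120
size                            4    16486070336
arc_meta_used                   4    7901870144
arc_meta_limit                  4    12597411840
"""
    for key, value in summarize(parse_arcstats(sample)).items():
        print(f"{key}: {value:.3f}")
```

On a live system the same functions could be fed the contents of `/proc/spl/kstat/zfs/arcstats` directly.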

$ cat  /proc/spl/kmem/slab
--------------------- cache -------------------------------------------------------  ----- slab ------  ---- object -----  --- emergency ---
name                                    flags      size     alloc slabsize  objsize  total alloc   max  total alloc   max  dlock alloc   max
spl_vn_cache                          0x00020         0         0     8192      112      0     0     5      0     0   210      0     0     0
spl_vn_file_cache                     0x00020         0         0     8192      120      0     0     5      0     0   210      0     0     0
spl_zlib_workspace_cache              0x00240         0         0  2145216   268104      0     0     0      0     0     0      0     0     0
ddt_cache                             0x00040    398592    248640   199296    24864      2     2     5     16    10    40      0     0     0
zio_buf_20480                         0x00042   9232384   7372800   200704    20480     46    45    81    368   360   648      0     0     0
zio_data_buf_20480                    0x00042   5218304   3092480   200704    20480     26    26   247    208   151  1976      0     0     0
zio_buf_24576                         0x00042   7471104   6217728   233472    24576     32    32    55    256   253   440      0     0     0
zio_data_buf_24576                    0x00042   5136384   2826240   233472    24576     22    22   146    176   115  1168      0     0     0
zio_buf_28672                         0x00042   9584640   7254016   266240    28672     36    36    80    288   253   640      0     0     0
zio_data_buf_28672                    0x00042   3993600   2895872   266240    28672     15    15   106    120   101   848      0     0     0
zio_buf_32768                         0x00042   8073216   6062080   299008    32768     27    27    42    216   185   336      0     0     0
zio_data_buf_32768                    0x00042   6279168   5308416   299008    32768     21    21   186    168   162  1488      0     0     0
zio_buf_40960                         0x00042  12029952   9502720   364544    40960     33    33    99    264   232   792      0     0     0
zio_data_buf_40960                    0x00042   6926336   5898240   364544    40960     19    18   231    152   144  1848      0     0     0
zio_buf_49152                         0x00042  12042240  10616832   430080    49152     28    27   128    224   216  1024      0     0     0
zio_data_buf_49152                    0x00042   8171520   6488064   430080    49152     19    19   152    152   132  1216      0     0     0
zio_buf_57344                         0x00042  14868480  13303808   495616    57344     30    29    53    240   232   424      0     0     0
zio_data_buf_57344                    0x00042   7434240   6422528   495616    57344     15    14   115    120   112   920      0     0     0
zio_buf_65536                         0x00042  14028800  12910592   561152    65536     25    25    65    200   197   520      0     0     0
zio_data_buf_65536                    0x00042   7856128   6881280   561152    65536     14    14    76    112   105   608      0     0     0
zio_buf_81920                         0x00042  24227840  22282240   692224    81920     35    34    55    280   272   440      0     0     0
zio_data_buf_81920                    0x00042  14536704  12943360   692224    81920     21    21   164    168   158  1312      0     0     0
zio_buf_98304                         0x00042  23875584  19562496   823296    98304     29    29    58    232   199   464      0     0     0
zio_data_buf_98304                    0x00042  15642624  13860864   823296    98304     19    19    78    152   141   624      0     0     0
zio_buf_114688                        0x00042  26722304  24772608   954368   114688     28    27    66    224   216   528      0     0     0
zio_data_buf_114688                   0x00042  20996096  15941632   954368   114688     22    22    65    176   139   520      0     0     0
zio_buf_131072                        0x00042  68382720  33161216  1085440   131072     63    63   387    504   253  3092      0     0     0
zio_data_buf_131072                   0x00042 10780590080 8550088704  1085440   131072   9932  9932 11556  79456 65232 92448      0     0     0
zio_buf_163840                        0x00042         0         0  1347584   163840      0     0     0      0     0     0      0     0     0
zio_data_buf_163840                   0x00042         0         0  1347584   163840      0     0     0      0     0     0      0     0     0
zio_buf_196608                        0x00042         0         0  1609728   196608      0     0     0      0     0     0      0     0     0
zio_data_buf_196608                   0x00042         0         0  1609728   196608      0     0     0      0     0     0      0     0     0
zio_buf_229376                        0x00042         0         0  1871872   229376      0     0     0      0     0     0      0     0     0
zio_data_buf_229376                   0x00042         0         0  1871872   229376      0     0     0      0     0     0      0     0     0
zio_buf_262144                        0x00042         0         0  2134016   262144      0     0     0      0     0     0      0     0     0
zio_data_buf_262144                   0x00042         0         0  2134016   262144      0     0     0      0     0     0      0     0     0
zio_buf_327680                        0x00042         0         0  2658304   327680      0     0     0      0     0     0      0     0     0
zio_data_buf_327680                   0x00042         0         0  2658304   327680      0     0     0      0     0     0      0     0     0
zio_buf_393216                        0x00042         0         0  3182592   393216      0     0     0      0     0     0      0     0     0
zio_data_buf_393216                   0x00042         0         0  3182592   393216      0     0     0      0     0     0      0     0     0
zio_buf_458752                        0x00042         0         0  3706880   458752      0     0     0      0     0     0      0     0     0
zio_data_buf_458752                   0x00042         0         0  3706880   458752      0     0     0      0     0     0      0     0     0
zio_buf_524288                        0x00042         0         0  4231168   524288      0     0     0      0     0     0      0     0     0
zio_data_buf_524288                   0x00042         0         0  4231168   524288      0     0     0      0     0     0      0     0     0
zio_buf_655360                        0x00042         0         0  5279744   655360      0     0     0      0     0     0      0     0     0
zio_data_buf_655360                   0x00042         0         0  5279744   655360      0     0     0      0     0     0      0     0     0
zio_buf_786432                        0x00042         0         0  6328320   786432      0     0     0      0     0     0      0     0     0
zio_data_buf_786432                   0x00042         0         0  6328320   786432      0     0     0      0     0     0      0     0     0
zio_buf_917504                        0x00042         0         0  7376896   917504      0     0     0      0     0     0      0     0     0
zio_data_buf_917504                   0x00042         0         0  7376896   917504      0     0     0      0     0     0      0     0     0
zio_buf_1048576                       0x00042         0         0  8425472  1048576      0     0     0      0     0     0      0     0     0
zio_data_buf_1048576                  0x00042         0         0  8425472  1048576      0     0     0      0     0     0      0     0     0
zio_buf_1310720                       0x00042         0         0 10522624  1310720      0     0     0      0     0     0      0     0     0
zio_data_buf_1310720                  0x00042         0         0 10522624  1310720      0     0     0      0     0     0      0     0     0
zio_buf_1572864                       0x00042         0         0 12619776  1572864      0     0     0      0     0     0      0     0     0
zio_data_buf_1572864                  0x00042         0         0 12619776  1572864      0     0     0      0     0     0      0     0     0
zio_buf_1835008                       0x00042         0         0 14716928  1835008      0     0     0      0     0     0      0     0     0
zio_data_buf_1835008                  0x00042         0         0 14716928  1835008      0     0     0      0     0     0      0     0     0
zio_buf_2097152                       0x00042         0         0 16814080  2097152      0     0     0      0     0     0      0     0     0
zio_data_buf_2097152                  0x00042         0         0 16814080  2097152      0     0     0      0     0     0      0     0     0
zio_buf_2621440                       0x00042         0         0 21008384  2621440      0     0     0      0     0     0      0     0     0
zio_data_buf_2621440                  0x00042         0         0 21008384  2621440      0     0     0      0     0     0      0     0     0
zio_buf_3145728                       0x00042         0         0 25202688  3145728      0     0     0      0     0     0      0     0     0
zio_data_buf_3145728                  0x00042         0         0 25202688  3145728      0     0     0      0     0     0      0     0     0
zio_buf_3670016                       0x00042         0         0 29396992  3670016      0     0     0      0     0     0      0     0     0
zio_data_buf_3670016                  0x00042         0         0 29396992  3670016      0     0     0      0     0     0      0     0     0
zio_buf_4194304                       0x00042         0         0 29392896  4194304      0     0     0      0     0     0      0     0     0
zio_data_buf_4194304                  0x00042         0         0 29392896  4194304      0     0     0      0     0     0      0     0     0
zio_buf_5242880                       0x00042         0         0 31485952  5242880      0     0     0      0     0     0      0     0     0
zio_data_buf_5242880                  0x00042         0         0 31485952  5242880      0     0     0      0     0     0      0     0     0
zio_buf_6291456                       0x00042         0         0 31481856  6291456      0     0     0      0     0     0      0     0     0
zio_data_buf_6291456                  0x00042         0         0 31481856  6291456      0     0     0      0     0     0      0     0     0
zio_buf_7340032                       0x00042         0         0 29380608  7340032      0     0     0      0     0     0      0     0     0
zio_data_buf_7340032                  0x00042         0         0 29380608  7340032      0     0     0      0     0     0      0     0     0
zio_buf_8388608                       0x00042         0         0 25182208  8388608      0     0     0      0     0     0      0     0     0
zio_data_buf_8388608                  0x00042         0         0 25182208  8388608      0     0     0      0     0     0      0     0     0
zio_buf_10485760                      0x00042         0         0 31473664 10485760      0     0     0      0     0     0      0     0     0
zio_data_buf_10485760                 0x00042         0         0 31473664 10485760      0     0     0      0     0     0      0     0     0
zio_buf_12582912                      0x00042         0         0 25178112 12582912      0     0     0      0     0     0      0     0     0
zio_data_buf_12582912                 0x00042         0         0 25178112 12582912      0     0     0      0     0     0      0     0     0
zio_buf_14680064                      0x00042         0         0 29372416 14680064      0     0     0      0     0     0      0     0     0
zio_data_buf_14680064                 0x00042         0         0 29372416 14680064      0     0     0      0     0     0      0     0     0
zio_buf_16777216                      0x00042         0         0 16785408 16777216      0     0     0      0     0     0      0     0     0
zio_data_buf_16777216                 0x00042         0         0 16785408 16777216      0     0     0      0     0     0      0     0     0
dweeezil added a commit to dweeezil/zfs that referenced this issue Apr 24, 2016
At the very least, the zfs_secpolicy_write_perms ioctl security policy
callback, which calls dsl_dataset_hold(), can require freeing memory and,
therefore, re-enter ZFS.  This patch enables PF_FSTRANS for all of the
security policy callbacks similarly to the manner in which it's enabled
for the actual ioctl callback.

May-fix: openzfs#4554
@dweeezil
Contributor

@varju If you can reproduce this, you ought to be able to cherry-pick 8f80c0a on top of 0.6.5.6 and see if it helps.

@varju
Author

varju commented Apr 25, 2016

Thanks for the quick feedback. I'll give that a try and let you know how things turn out. Right now I typically see a crash once or twice a week, so it shouldn't take long to know whether it's doing better.

@varju
Author

varju commented Apr 27, 2016

Two and a half days of uptime so far, and no crashes. It's too soon to say for sure whether things are fixed, but your patch doesn't appear to have made things worse. I'll keep an eye on things and report back in a few days.

Thanks again for the quick patch.

@dweeezil
Contributor

@varju Thanks for the feedback. Your stack trace makes it pretty clear we need this so I'll post it as a PR today.

@behlendorf
Contributor

@varju could you comment on whether you've seen the issue again, now that a week has passed?

@varju
Author

varju commented May 2, 2016

7 days of uptime, and no issues so far.

@behlendorf
Contributor

@varju great, thanks for the quick reply. As @dweeezil commented, the evidence here is pretty clear, but I wanted to verify with you before declaring this fixed once the patch is merged.

@behlendorf behlendorf added this to the 0.6.5.7 milestone May 2, 2016
behlendorf pushed a commit that referenced this issue May 2, 2016
At the very least, the zfs_secpolicy_write_perms ioctl security policy
callback, which calls dsl_dataset_hold(), can require freeing memory and,
therefore, re-enter ZFS.  This patch enables PF_FSTRANS for all of the
security policy callbacks similarly to the manner in which it's enabled
for the actual ioctl callback.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4554
@varju
Author

varju commented May 2, 2016

Thank you both. My server is no longer crashing and I'm a happy camper.

nedbass pushed a commit to nedbass/zfs that referenced this issue May 6, 2016
At the very least, the zfs_secpolicy_write_perms ioctl security policy
callback, which calls dsl_dataset_hold(), can require freeing memory and,
therefore, re-enter ZFS.  This patch enables PF_FSTRANS for all of the
security policy callbacks similarly to the manner in which it's enabled
for the actual ioctl callback.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4554
ryao pushed a commit to ClusterHQ/zfs that referenced this issue Jun 7, 2016
At the very least, the zfs_secpolicy_write_perms ioctl security policy
callback, which calls dsl_dataset_hold(), can require freeing memory and,
therefore, re-enter ZFS.  This patch enables PF_FSTRANS for all of the
security policy callbacks similarly to the manner in which it's enabled
for the actual ioctl callback.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4554