Qu-Wenruo/Btrf…

Commits on Mar 22, 2016

  1. btrfs: dedupe: Fix a space cache delalloc bytes underflow bug

    Dedupe has a bug that underflows block_group_cache->delalloc_bytes, so
    the counter can never return to 0.
    As a result, the free space cache for that block group is never written
    to disk.
    
    It also causes the following kernel message at umount:
    BTRFS info (device vdc): The free space cache file (1485570048) is
    invalid. skip it
    
    Reported-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
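The underflow described in this commit can be illustrated with a minimal user-space sketch. It assumes the counter is an unsigned 64-bit value and that the cache is written out only once the counter reaches 0; the struct and function names below are hypothetical stand-ins, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the block group's delalloc accounting. */
struct bg_counter {
	uint64_t delalloc_bytes;
};

/* Unchecked subtraction: wraps around if bytes > delalloc_bytes. */
static void counter_sub_unchecked(struct bg_counter *bg, uint64_t bytes)
{
	bg->delalloc_bytes -= bytes;
}

/* Checked subtraction: refuses to underflow and returns -1 instead. */
static int counter_sub_checked(struct bg_counter *bg, uint64_t bytes)
{
	if (bytes > bg->delalloc_bytes)
		return -1;
	bg->delalloc_bytes -= bytes;
	return 0;
}

/* Assumption: the cache is only written once no delalloc bytes remain. */
static int can_write_cache(const struct bg_counter *bg)
{
	return bg->delalloc_bytes == 0;
}
```

Subtracting the same 4096 bytes twice with the unchecked helper leaves a huge wrapped value, so `can_write_cache()` stays false until the accounting is fixed.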
  2. btrfs: relocation: Enhance error handling to avoid BUG_ON

    Since the introduction of the btrfs dedupe tree, it's possible for
    balance to race with dedupe disabling.
    
    When this happens, dedupe_enabled will make btrfs_get_fs_root() return
    ERR_PTR(-ENOENT).
    But due to a bug in the error handling branch, backref_cache->nr_nodes
    is increased while the node is neither added to backref_cache nor is
    nr_nodes decreased again.
    This causes the BUG_ON() in backref_cache_cleanup():
    
    [ 2611.668810] ------------[ cut here ]------------
    [ 2611.669946] kernel BUG at
    /home/sat/ktest/linux/fs/btrfs/relocation.c:243!
    [ 2611.670572] invalid opcode: 0000 [#1] SMP
    [ 2611.686797] Call Trace:
    [ 2611.687034]  [<ffffffffa01f71d3>]
    btrfs_relocate_block_group+0x1b3/0x290 [btrfs]
    [ 2611.687706]  [<ffffffffa01cc177>]
    btrfs_relocate_chunk.isra.40+0x47/0xd0 [btrfs]
    [ 2611.688385]  [<ffffffffa01cdb12>] btrfs_balance+0xb22/0x11e0 [btrfs]
    [ 2611.688966]  [<ffffffffa01d9611>] btrfs_ioctl_balance+0x391/0x3a0
    [btrfs]
    [ 2611.689587]  [<ffffffffa01ddaf0>] btrfs_ioctl+0x1650/0x2290 [btrfs]
    [ 2611.690145]  [<ffffffff81171cda>] ? lru_cache_add+0x3a/0x80
    [ 2611.690647]  [<ffffffff81171e4c>] ?
    lru_cache_add_active_or_unevictable+0x4c/0xc0
    [ 2611.691310]  [<ffffffff81193f04>] ? handle_mm_fault+0xcd4/0x17f0
    [ 2611.691842]  [<ffffffff811da423>] ? cp_new_stat+0x153/0x180
    [ 2611.692342]  [<ffffffff8119913d>] ? __vma_link_rb+0xfd/0x110
    [ 2611.692842]  [<ffffffff81199209>] ? vma_link+0xb9/0xc0
    [ 2611.693303]  [<ffffffff811e7e81>] do_vfs_ioctl+0xa1/0x5a0
    [ 2611.693781]  [<ffffffff8104e024>] ? __do_page_fault+0x1b4/0x400
    [ 2611.694310]  [<ffffffff811e83c1>] SyS_ioctl+0x41/0x70
    [ 2611.694758]  [<ffffffff816dfc6e>] entry_SYSCALL_64_fastpath+0x12/0x71
    [ 2611.695331] Code: ff 48 8b 45 bf 49 83 af a8 05 00 00 01 49 89 87 a0
    05 00 00 e9 2e fd ff ff b8 f4 ff ff ff e9 e4 fb ff ff 0f 0b 0f 0b 0f 0b
    0f 0b <0f> 0b 0f 0b 41 89 c6 e9 b8 fb ff ff e8 9e a6 e8 e0 4c 89 e7 44
    [ 2611.697870] RIP  [<ffffffffa01f6fc1>]
    relocate_block_group+0x741/0x7a0 [btrfs]
    [ 2611.698818]  RSP <ffff88002a81fb30>
    
    This patch calls remove_backref_node() in the error handling branch,
    and catches the returned -ENOENT in relocate_tree_block() so that
    balancing can continue.
    
    Reported-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
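The accounting problem in this commit can be sketched in user space as follows. This is a simplified model, not the relocation code: `struct cache`, `add_block()` and `relocate_blocks()` are hypothetical names, and the final count check stands in for the kernel's BUG_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct node { struct node *next; };
struct cache { struct node *head; int nr_nodes; };

/* Allocating a node bumps the counter immediately. */
static struct node *alloc_node(struct cache *c)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n)
		c->nr_nodes++;
	return n;
}

static void free_node(struct cache *c, struct node *n)
{
	c->nr_nodes--;
	free(n);
}

/* lookup_err simulates the root lookup result (0 or a negative errno). */
static int add_block(struct cache *c, int lookup_err)
{
	struct node *n = alloc_node(c);

	if (!n)
		return -ENOMEM;
	if (lookup_err) {
		free_node(c, n);	/* the step the buggy branch skipped */
		return lookup_err;
	}
	n->next = c->head;
	c->head = n;
	return 0;
}

/* Returns the leftover count; anything but 0 means a leaked node. */
static int cache_cleanup(struct cache *c)
{
	while (c->head) {
		struct node *n = c->head;

		c->head = n->next;
		free_node(c, n);
	}
	return c->nr_nodes;
}

/* The caller treats -ENOENT as "skip this block" and keeps going. */
static int relocate_blocks(struct cache *c, const int *errs, int count)
{
	for (int i = 0; i < count; i++) {
		int ret = add_block(c, errs[i]);

		if (ret == -ENOENT)
			continue;
		if (ret)
			return ret;
	}
	return 0;
}
```

With the `free_node()` call in the error branch, the counter matches the list again and the cleanup check passes even when one lookup fails with -ENOENT.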
  3. btrfs: dedupe: Add support for compression and dedupe

    The basic idea is to calculate the hash before compression, and to add
    the members dedupe needs to record a compressed file extent.
    
    Dedupe supports a dedupe_bs larger than 128K, which is the upper limit
    of a compressed file extent. In that case we skip dedupe and prefer
    compression, since at that size the dedupe rate is low and the benefit
    of compression is more obvious.
    
    The current implementation is far from elegant. The most elegant one
    would split every data processing method into its own independent
    function, and have a unified function to coordinate them.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
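The selection rule described in this commit could look roughly like the sketch below. Only the 128K limit comes from the commit message; the enum, the function name and the way the options are passed in are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 128K upper limit of a compressed file extent, per the commit message. */
#define SKETCH_MAX_COMPRESSED_EXTENT (128 * 1024ULL)

enum sketch_path {
	SKETCH_PLAIN,
	SKETCH_COMPRESS_ONLY,
	SKETCH_DEDUPE_ONLY,
	SKETCH_DEDUPE_THEN_COMPRESS,	/* hash first, compress on a miss */
};

/* Hypothetical helper: pick the write path for one delalloc range. */
static enum sketch_path choose_path(int dedupe_on, int compress_on,
				    uint64_t dedupe_bs)
{
	if (dedupe_on && compress_on) {
		/* Large dedupe blocks: skip dedupe, prefer compression. */
		if (dedupe_bs > SKETCH_MAX_COMPRESSED_EXTENT)
			return SKETCH_COMPRESS_ONLY;
		return SKETCH_DEDUPE_THEN_COMPRESS;
	}
	if (dedupe_on)
		return SKETCH_DEDUPE_ONLY;
	if (compress_on)
		return SKETCH_COMPRESS_ONLY;
	return SKETCH_PLAIN;
}
```

For example, with both features enabled a 64K dedupe_bs keeps dedupe in the path, while a 256K dedupe_bs falls back to compression only.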
  4. btrfs: dedupe: Preparation for compress-dedupe co-work

    For dedupe to work with compression, new members recording compression
    algorithm and on-disk extent length are needed.
    
    Add them for later compress-dedupe co-work.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  5. btrfs: dedupe: Avoid submit IO for hash hit extent

    Before this patch, even a duplicated extent still went through page
    write, meaning we didn't skip IO for it.
    
    Such a write would be skipped at the block level anyway, as the block
    level only selects the last submitted write request to the same bytenr.
    
    This patch manually skips such IO to reduce dedupe overhead.
    After this patch, dedupe performance in the all-miss case is higher
    than compression performance in the low-compress-ratio case.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  6. btrfs: dedupe: Fix metadata balance error when dedupe is enabled

    A missing branch in btrfs_get_fs_root() causes dedupe_root to be read
    from disk with the REF_COWS bit set.
    This makes btrfs balance treat dedupe_root as an fs root and reuse the
    old dedupe root bytenr to drop the tree ref, causing the following
    kernel warning after metadata balancing:
    
    BTRFS error (device sdb6): unable to find ref byte nr 29736960 parent 0
    root 11  owner 0 offset 0
    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 19113 at fs/btrfs/extent-tree.c:6636
    __btrfs_free_extent.isra.66+0xb6d/0xd20 [btrfs]()
    BTRFS: Transaction aborted (error -2)
    Modules linked in: btrfs(O) xor zlib_deflate raid6_pq xfs [last
    unloaded: btrfs]
    CPU: 1 PID: 19113 Comm: btrfs Tainted: G        W  O    4.5.0-rc5+ #2
    Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox
    12/01/2006
     0000000000000000 ffff880035b0ba18 ffffffff813771ff ffff880035b0ba60
     ffffffffa06a810a ffff880035b0ba50 ffffffff810bcb81 ffff88003c45c528
     0000000001c5c000 00000000fffffffe ffff88003dc8c520 0000000000000000
    Call Trace:
     [<ffffffff813771ff>] dump_stack+0x67/0x98
     [<ffffffff810bcb81>] warn_slowpath_common+0x81/0xc0
     [<ffffffff810bcc07>] warn_slowpath_fmt+0x47/0x50
     [<ffffffffa06028fd>] __btrfs_free_extent.isra.66+0xb6d/0xd20 [btrfs]
     [<ffffffffa0606d4d>] __btrfs_run_delayed_refs.constprop.71+0x96d/0x1560
    [btrfs]
     [<ffffffff81202ad9>] ? cmpxchg_double_slab.isra.68+0x149/0x160
     [<ffffffff81106a1d>] ? trace_hardirqs_on+0xd/0x10
     [<ffffffffa060a5ce>] btrfs_run_delayed_refs+0x8e/0x2d0 [btrfs]
     [<ffffffffa06209fe>] btrfs_commit_transaction+0x3e/0xb50 [btrfs]
     [<ffffffffa069f26e>] ? btrfs_dedupe_disable+0x28e/0x2c0 [btrfs]
     [<ffffffff812035c3>] ? kfree+0x223/0x270
     [<ffffffffa069f27a>] btrfs_dedupe_disable+0x29a/0x2c0 [btrfs]
     [<ffffffffa065e403>] btrfs_ioctl+0x2363/0x2a40 [btrfs]
     [<ffffffff8116b12a>] ? __audit_syscall_entry+0xaa/0xf0
     [<ffffffff81137ce6>] ? current_kernel_time64+0x56/0xa0
     [<ffffffff8122080e>] do_vfs_ioctl+0x8e/0x690
     [<ffffffff8116b12a>] ? __audit_syscall_entry+0xaa/0xf0
     [<ffffffff8122c181>] ? __fget_light+0x61/0x90
     [<ffffffff81220e84>] SyS_ioctl+0x74/0x80
     [<ffffffff8180ad57>] entry_SYSCALL_64_fastpath+0x12/0x6f
    ---[ end trace 618d5a5bc21d6a7c ]---
    
    Fix it by adding the corresponding branch to btrfs_get_fs_root().
    
    Reported-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  7. btrfs: Fix a memory leak in inband dedupe hash

    We allocate a dedupe hash into async_extent, but forget to free it.
    Fix it by freeing the hash before freeing async_extent.
    
    Reported-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
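The ownership rule behind this fix, a container that must release the hash it owns before being freed itself, can be sketched in user space. The struct layouts and the `live_allocs` counter are hypothetical and only serve to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the dedupe hash and its owner. */
struct dedupe_hash_sketch { unsigned char digest[32]; };
struct async_extent_sketch { struct dedupe_hash_sketch *hash; };

static int live_allocs;	/* number of allocations not yet freed */

static struct async_extent_sketch *async_extent_alloc(void)
{
	struct async_extent_sketch *ae = calloc(1, sizeof(*ae));

	if (!ae)
		return NULL;
	ae->hash = calloc(1, sizeof(*ae->hash));
	if (!ae->hash) {
		free(ae);
		return NULL;
	}
	live_allocs += 2;
	return ae;
}

static void async_extent_free(struct async_extent_sketch *ae)
{
	if (!ae)
		return;
	free(ae->hash);		/* free the owned hash first ... */
	live_allocs--;
	free(ae);		/* ... then the container itself */
	live_allocs--;
}
```

Dropping the `free(ae->hash)` line would leave `live_allocs` at 1 after every release, which is the kind of leak the commit describes.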
  8. btrfs: dedupe: Fix a bug when running inband dedupe with balance

    When running inband dedupe with balance, it's possible that inband
    dedupe still increases the ref on extents which are in an RO chunk.
    
    This may either make find_data_references() give a warning, or make
    run_delayed_refs() return -EIO and cause a transaction abort.
    
    The cause is that dedupe_del() is normally only called at
    run_delayed_ref() time, which is too late for the balance case.
    
    This patch fixes the bug by calling dedupe_del() at the extent
    searching time of balance.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  9. btrfs: try more times to alloc metadata reserve space

    In btrfs_delalloc_reserve_metadata(), the number of metadata bytes we try
    to reserve is calculated by the difference between outstanding_extents and
    reserved_extents.
    
    When reserve_metadata_bytes() fails to reserve the desired metadata
    space, it has already done some reclaim work, such as writing ordered
    extents.
    
    In that case, outstanding_extents and reserved_extents may have already
    changed, and a new attempt may be able to reserve enough metadata space.
    
    So this patch calls reserve_metadata_bytes() at most 3 times to ensure
    we have really run out of space.
    
    Such false ENOSPC is mainly caused by small file extents and time
    consuming delalloc functions, and mainly affects in-band
    de-duplication. (Compression should also be affected, but LZO/zlib is
    faster than SHA256, so it is harder to trigger than with dedupe.)
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
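The retry idea in this commit can be modelled with a small user-space sketch. Everything here is a toy: `try_reserve()` stands in for reserve_metadata_bytes() and simply assumes that a failed attempt reclaims some space as a side effect; the struct names and the per-extent size are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_RESERVE_TRIES 3	/* "at most 3 times", per the commit message */

struct space { uint64_t free_bytes; uint64_t reclaimable; int attempts; };
struct inode_acct { unsigned outstanding_extents; unsigned reserved_extents; };

/* Toy reservation: a failed attempt reclaims space before returning. */
static int try_reserve(struct space *s, uint64_t bytes)
{
	s->attempts++;
	if (bytes <= s->free_bytes) {
		s->free_bytes -= bytes;
		return 0;
	}
	s->free_bytes += s->reclaimable;
	s->reclaimable = 0;
	return -ENOSPC;
}

/* Bytes needed follow the difference between the two extent counters. */
static uint64_t bytes_needed(const struct inode_acct *a, uint64_t per_extent)
{
	if (a->outstanding_extents <= a->reserved_extents)
		return 0;
	return (uint64_t)(a->outstanding_extents - a->reserved_extents) *
	       per_extent;
}

static int reserve_with_retry(struct space *s, const struct inode_acct *a,
			      uint64_t per_extent)
{
	int ret = -ENOSPC;

	for (int i = 0; i < MAX_RESERVE_TRIES; i++) {
		/* Recompute each time: the counters may have changed. */
		ret = try_reserve(s, bytes_needed(a, per_extent));
		if (ret != -ENOSPC)
			break;
	}
	return ret;
}
```

In this model a reservation that fails once but succeeds after reclaim returns 0 on the second attempt, while a truly full filesystem still reports -ENOSPC after three tries.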
  10. btrfs: dedupe: add per-file online dedupe control

    Introduce inode_need_dedupe() to implement per-file online dedupe control.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  11. btrfs: dedupe: add a property handler for online dedupe

    We use btrfs extended attribute "btrfs.dedupe" to record per-file online
    dedupe status, so add a dedupe property handler.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  12. btrfs: dedupe: add an inode nodedupe flag

    Introduce the BTRFS_INODE_NODEDUP flag, so that we can explicitly
    disable online data deduplication for specified files.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
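A per-file check built on such a flag might look like the sketch below. The bit value, the struct layouts and the function body are assumptions for illustration; only the names BTRFS_INODE_NODEDUP and inode_need_dedupe() appear in the commit messages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position; the real flag value is not given here. */
#define SKETCH_INODE_NODEDUP (1u << 12)

struct sketch_fs { int dedupe_enabled; };
struct sketch_inode { uint32_t flags; };

/* Returns 1 if writes to this inode should go through inband dedupe. */
static int inode_need_dedupe_sketch(const struct sketch_fs *fs,
				    const struct sketch_inode *inode)
{
	if (!fs->dedupe_enabled)
		return 0;	/* dedupe is off for the whole filesystem */
	if (inode->flags & SKETCH_INODE_NODEDUP)
		return 0;	/* this file explicitly opted out */
	return 1;
}
```

The filesystem-wide switch is checked first, so the per-inode flag can only turn dedupe off for a file, never on.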
  13. btrfs: dedupe: Add ioctl for inband deduplication

    Add an ioctl interface for inband deduplication, which includes:
    1) enable
    2) disable
    3) status
    
    We will later add an ioctl to disable inband dedupe for a given file/dir.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  14. btrfs: dedupe: Add support for adding hash for on-disk backend

    The on-disk backend can now add hashes.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  15. btrfs: dedupe: Add support to delete hash for on-disk backend

    The on-disk backend can now delete hashes.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  16. btrfs: dedupe: Add support for on-disk hash search

    The on-disk backend should now be able to search for hashes.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  17. btrfs: dedupe: Introduce interfaces to resume and cleanup dedupe info

    Since we will introduce a new on-disk based dedupe method, introduce new
    interfaces to resume previous dedupe setup.
    
    And since we introduce a new tree for status, also add disable handler
    for it.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  18. btrfs: dedupe: Add basic tree structure for on-disk dedupe method

    Introduce a new tree, the dedupe tree, to record on-disk dedupe hashes
    as persistent hash storage, instead of the in-memory only
    implementation.
    
    Unlike Liu Bo's implementation, this version doesn't use a hack for
    bytenr -> hash search, but adds a new type, DEDUP_BYTENR_ITEM, for that
    search case, just like the in-memory backend.
    
    Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  19. btrfs: dedupe: Inband in-memory only de-duplication implementation

    Core implementation of inband de-duplication.
    It reuses the async_cow_start() facility to calculate the dedupe hash,
    and uses the dedupe hash to do inband de-duplication at extent level.
    
    The workflow is as below:
    1) Run delalloc range for an inode
    2) Calculate hash for the delalloc range at the unit of dedupe_bs
    3) For the hash match (duplicated) case, just increase the source
       extent ref and insert the file extent.
       For the hash mismatch case, go through the normal cow_file_range()
       fallback, and add the hash into dedupe_tree.
       Compression for the hash miss case is not supported yet.
    
    The current implementation stores all dedupe hashes in an in-memory
    rb-tree, with LRU behavior to control the limit.
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
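The workflow in this commit can be sketched as a toy user-space model. It makes several simplifying assumptions that are not from the patch: a 64-bit FNV-1a hash replaces SHA256, a small fixed array with linear search replaces the rb-tree, a trailing partial block is ignored, and all struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_LIMIT 4	/* maximum number of hashes kept in memory */

struct hash_entry {
	uint64_t hash, bytenr, last_used;
	unsigned refs;
	int in_use;
};

struct dedupe_info {
	struct hash_entry e[HASH_LIMIT];
	uint64_t tick, next_bytenr, dedupe_bs;
};

/* Toy hash (FNV-1a), used here instead of SHA256. */
static uint64_t toy_hash(const unsigned char *p, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

static struct hash_entry *search(struct dedupe_info *d, uint64_t hash)
{
	for (int i = 0; i < HASH_LIMIT; i++)
		if (d->e[i].in_use && d->e[i].hash == hash)
			return &d->e[i];
	return NULL;
}

/* Add a hash, evicting the least recently used entry when full. */
static void add(struct dedupe_info *d, uint64_t hash, uint64_t bytenr)
{
	struct hash_entry *victim = &d->e[0];

	for (int i = 0; i < HASH_LIMIT; i++) {
		if (!d->e[i].in_use) {
			victim = &d->e[i];
			break;
		}
		if (d->e[i].last_used < victim->last_used)
			victim = &d->e[i];
	}
	victim->hash = hash;
	victim->bytenr = bytenr;
	victim->refs = 1;
	victim->last_used = ++d->tick;
	victim->in_use = 1;
}

/* Process one range in dedupe_bs units; returns the number of hits. */
static unsigned write_range(struct dedupe_info *d, const unsigned char *data,
			    size_t len)
{
	unsigned hits = 0;

	for (size_t off = 0; off + d->dedupe_bs <= len; off += d->dedupe_bs) {
		uint64_t h = toy_hash(data + off, d->dedupe_bs);
		struct hash_entry *e = search(d, h);

		if (e) {		/* hit: only take another reference */
			e->refs++;
			e->last_used = ++d->tick;
			hits++;
		} else {		/* miss: "write" the block, record hash */
			add(d, h, d->next_bytenr);
			d->next_bytenr += d->dedupe_bs;
		}
	}
	return hits;
}
```

Writing the 12-byte range `AAAABBBBAAAA` with a 4-byte dedupe_bs allocates two blocks and reports one hit, leaving the first entry with two references.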
  20. btrfs: ordered-extent: Add support for dedupe

    Add ordered-extent support for dedupe.
    
    Note that the current ordered-extent support only handles
    non-compressed source extents.
    Support for compressed source extents will be added later.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  21. btrfs: dedupe: Implement btrfs_dedupe_calc_hash interface

    Unlike the dedupe method, which can be in-memory or on-disk, only the
    SHA256 hash method is supported so far, so implement the
    btrfs_dedupe_calc_hash() interface using SHA256.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  22. btrfs: dedupe: Introduce function to search for an existing hash

    Introduce static function inmem_search() to handle the job for in-memory
    hash tree.
    
    The trick is that we must ensure the delayed ref head is not being run
    at the time we search for the hash.
    
    With inmem_search(), we can implement the btrfs_dedupe_search()
    interface.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  23. btrfs: delayed-ref: Add support for increasing data ref under spinlock

    For in-band dedupe, btrfs needs to increase data ref with delayed_ref
    locked, so add a new function btrfs_add_delayed_data_ref_lock() to
    increase extent ref with delayed_refs already locked.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Mar 22, 2016
  24. btrfs: dedupe: Introduce function to remove hash from in-memory tree

    Introduce static function inmem_del() to remove hash from in-memory
    dedupe tree.
    And implement btrfs_dedupe_del() and btrfs_dedup_destroy() interfaces.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  25. btrfs: dedupe: Introduce function to add hash into in-memory tree

    Introduce static function inmem_add() to add hash into in-memory tree.
    And now we can implement the btrfs_dedupe_add() interface.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  26. btrfs: dedupe: Introduce function to initialize dedupe info

    Add generic function to initialize dedupe info.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016
  27. btrfs: dedupe: Introduce dedupe framework and its header

    Introduce the btrfs online (write time) de-duplication framework and
    the header it needs.
    
    The new de-duplication framework is going to support 2 different dedupe
    methods and 1 dedupe hash.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Mar 22, 2016

Commits on Mar 14, 2016

  1. btrfs: Fix misspellings in comments.

    Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    adambuchbinder authored and kdave committed Mar 14, 2016
  2. btrfs: Print Warning only if ENOSPC_DEBUG is enabled

    Don't print a warning for the ENOSPC error unless ENOSPC_DEBUG is
    enabled. Use btrfs_debug if it is enabled.
    
    Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
    [ preserve the WARN_ON ]
    Signed-off-by: David Sterba <dsterba@suse.com>
    Ashish Samant authored and kdave committed Mar 14, 2016

Commits on Mar 11, 2016

  1. btrfs: scrub: silence an uninitialized variable warning

    It's basically harmless if "ref_level" isn't initialized since it's only
    used for an error message, but it causes a static checker warning.
    
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    error27 authored and kdave committed Mar 11, 2016
  2. btrfs: move btrfs_compression_type to compression.h

    So that it's better organized.
    
    Signed-off-by: Anand Jain <anand.jain@oracle.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    asj authored and kdave committed Mar 11, 2016
  3. btrfs: rename btrfs_print_info to btrfs_print_mod_info

    So that it indicates what it does.
    
    Signed-off-by: Anand Jain <anand.jain@oracle.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    asj authored and kdave committed Mar 11, 2016
  4. Btrfs: Show a warning message if one of objectid reaches its highest …

    …value
    
    It's better to show a warning message for the exceptional case
    that an objectid (in most cases, an inode number) reaches its
    highest value. For example, if inode cache is off and this event
    happens, we can't create any file even if there are not so many files.
    This message makes such a problem easier to detect.
    
    Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Satoru Takeuchi authored and kdave committed Mar 11, 2016
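The check this commit describes can be sketched as below. The limit value, the struct and the message text are assumptions for illustration (the limit is modelled as the 64-bit value -256); the sketch only shows where a warning would fit in an objectid allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed upper bound for allocatable objectids in this sketch. */
#define SKETCH_LAST_FREE_OBJECTID ((uint64_t)-256)

struct sketch_root { uint64_t highest_objectid; };

/* Hand out the next objectid, warning once the range is exhausted. */
static int find_free_objectid(struct sketch_root *root, uint64_t *objectid)
{
	if (root->highest_objectid >= SKETCH_LAST_FREE_OBJECTID) {
		fprintf(stderr,
			"warning: the objectid reached its highest value\n");
		return -ENOSPC;
	}
	*objectid = ++root->highest_objectid;
	return 0;
}
```

Without the message, the caller only sees -ENOSPC, which is easy to confuse with running out of disk space.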
  5. Documentation: btrfs: remove usage specific information

    The document in the kernel sources is yet another place where the
    documentation would need to be updated, while it is not the primary
    source. We actively maintain the wiki pages.
    
    Signed-off-by: David Sterba <dsterba@suse.com>
    kdave committed Mar 11, 2016
  6. btrfs: use kbasename in btrfsic_mount

    This is more readable.
    
    Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Villemoes authored and kdave committed Mar 11, 2016
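For readers unfamiliar with the helper, kbasename() returns the part of a path after the last '/'. A user-space stand-in, named `basename_sketch()` here to make clear it is not the kernel function itself, can be written as:

```c
#include <assert.h>
#include <string.h>

/* Return the component after the last '/', or the whole string if none. */
static const char *basename_sketch(const char *path)
{
	const char *tail = strrchr(path, '/');

	return tail ? tail + 1 : path;
}
```

Called on a device path such as "/dev/sda1" it yields "sda1", which replaces the open-coded pointer arithmetic the commit removes.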