Qu-Wenruo/Btrf…

Commits on Jun 30, 2016

  1. btrfs: dedupe: Introduce new reconfigure ioctl

    Introduce new reconfigure ioctl, and new FORCE flag for in-band dedupe
    ioctls.
    
    The dedupe enable and reconfigure ioctls are now stateful.
    
    --------------------------------------------
    | Current state |   Ioctl    | Next state  |
    --------------------------------------------
    | Disabled      |  enable    | Enabled     |
    | Enabled       |  enable    | Not allowed |
    | Enabled       |  reconf    | Enabled     |
    | Enabled       |  disable   | Disabled    |
    | Disabled      |  disable   | Disabled    |
    | Disabled      |  reconf    | Not allowed |
    --------------------------------------------
    (Disable is always stateless.)
    
    For those who prefer stateless ioctls (myself, for example), a new FORCE
    flag is introduced.
    
    In FORCE mode, enable/disable is completely stateless.
    --------------------------------------------
    | Current state |   Ioctl    | Next state  |
    --------------------------------------------
    | Disabled      |  enable    | Enabled     |
    | Enabled       |  enable    | Enabled     |
    | Enabled       |  disable   | Disabled    |
    | Disabled      |  disable   | Disabled    |
    --------------------------------------------
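    
    The two tables above can be read as one small transition function. The
    following is a minimal sketch, not the kernel code: the enum names and
    dedupe_next_state() are hypothetical, and it assumes that FORCE has no
    effect on reconf, since the FORCE table does not list that ioctl.

```c
#include <stdbool.h>

/* Hypothetical names; only the transitions come from the tables above. */
enum dedupe_state { DEDUPE_DISABLED = 0, DEDUPE_ENABLED = 1 };
enum dedupe_op { OP_ENABLE, OP_DISABLE, OP_RECONF };

/* Returned for transitions listed as "Not allowed". */
#define DEDUPE_NOT_ALLOWED (-1)

/*
 * Return the next state for (cur, op), or DEDUPE_NOT_ALLOWED.
 * 'force' models the FORCE flag, which makes enable stateless.
 */
int dedupe_next_state(enum dedupe_state cur, enum dedupe_op op, bool force)
{
	switch (op) {
	case OP_ENABLE:
		if (cur == DEDUPE_ENABLED && !force)
			return DEDUPE_NOT_ALLOWED;
		return DEDUPE_ENABLED;
	case OP_DISABLE:
		/* Disable is stateless with or without FORCE. */
		return DEDUPE_DISABLED;
	case OP_RECONF:
		/* Assumption: reconf needs dedupe enabled, FORCE or not. */
		return cur == DEDUPE_ENABLED ? DEDUPE_ENABLED : DEDUPE_NOT_ALLOWED;
	}
	return DEDUPE_NOT_ALLOWED;
}
```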
    
    Also, the reconfigure ioctl only modifies the specified fields, unlike
    enable, which fills unspecified fields with their default values.
    
    For example:
     # btrfs dedupe enable --block-size 64k /mnt
     # btrfs dedupe reconfigure --limit-hash 1m /mnt
    Will lead to:
     dedupe blocksize: 64K
     dedupe hash limit nr: 1m
    
    While for enable:
     # btrfs dedupe enable --force --block-size 64k /mnt
     # btrfs dedupe enable --force --limit-hash 1m /mnt
    Will reset blocksize to default value:
     dedupe blocksize: 128K     << reset
     dedupe hash limit nr: 1m
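    
    The difference between the two command sequences above can be sketched
    as follows. This is only an illustration: struct dedupe_cfg, the SET_*
    bits and both functions are hypothetical, and the default hash limit is
    an invented placeholder (only the 128K default block size comes from the
    example above).

```c
#include <stdint.h>

/* Default block size shown in the example above. */
#define DEDUPE_DEFAULT_BLOCKSIZE (128u * 1024)
/* Hypothetical default; the commit message does not state one. */
#define DEDUPE_DEFAULT_LIMIT_NR  (32u * 1024)

/* Hypothetical bits recording which fields the caller specified. */
#define SET_BLOCKSIZE (1u << 0)
#define SET_LIMIT_NR  (1u << 1)

struct dedupe_cfg {
	uint64_t blocksize;
	uint64_t limit_nr;
};

/* enable: every unspecified field is reset to its default. */
void dedupe_enable(struct dedupe_cfg *cfg, unsigned set,
		   uint64_t blocksize, uint64_t limit_nr)
{
	cfg->blocksize = (set & SET_BLOCKSIZE) ? blocksize : DEDUPE_DEFAULT_BLOCKSIZE;
	cfg->limit_nr = (set & SET_LIMIT_NR) ? limit_nr : DEDUPE_DEFAULT_LIMIT_NR;
}

/* reconfigure: only the specified fields change. */
void dedupe_reconf(struct dedupe_cfg *cfg, unsigned set,
		   uint64_t blocksize, uint64_t limit_nr)
{
	if (set & SET_BLOCKSIZE)
		cfg->blocksize = blocksize;
	if (set & SET_LIMIT_NR)
		cfg->limit_nr = limit_nr;
}
```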
    
    Suggested-by: David Sterba <dsterba@suse.cz>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Jun 30, 2016
  2. btrfs: dedupe: fix false ENOSPC

    When testing in-band dedupe, we sometimes got ENOSPC errors even though
    the fs still had plenty of free space. After some debugging, we found that
    btrfs_delalloc_reserve_metadata() sometimes tries to reserve a large
    amount of metadata space, even for a very small data range.
    
    In btrfs_delalloc_reserve_metadata(), the number of metadata bytes we try
    to reserve is calculated by the difference between outstanding_extents and
    reserved_extents. The following case shows how ENOSPC occurs:
    
      1, Buffered write 128MB of data in units of 1MB, so finally the inode
    has outstanding_extents of 1 and reserved_extents of 128.
    Note that btrfs_merge_extent_hook() merges these 1MB units into
    one big outstanding extent, but does not change reserved_extents.
    
      2, When writing dirty pages for in-band dedupe, cow_file_range() will
    split the above big extent into units of 16KB (assuming the in-band dedupe
    blocksize is 16KB). When the first split operation finishes, we have 2
    outstanding extents and 128 reserved extents. If the ordered extent just
    generated happens to be dispatched to run and complete right then,
    btrfs_delalloc_release_metadata() (see btrfs_finish_ordered_io()) is
    called to release metadata, after which we have 1 outstanding extent
    and 1 reserved extent (also see the logic in drop_outstanding_extent()).
    Later cow_file_range() continues to handle the remaining data range
    [16KB, 128MB), and if no other ordered extent is dispatched to run, there
    will be 8191 outstanding extents and 1 reserved extent.
    
      3, Now if another buffered write to this file comes in,
    btrfs_delalloc_reserve_metadata() will try to reserve metadata for at
    least 8191 outstanding extents. For a 64K node size that is
    8191*65536*16, about 8GB of metadata, so it obviously returns ENOSPC.
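    
    The arithmetic in step 3 can be checked with a few lines of C. This is a
    sketch only: delalloc_metadata_bytes() is a hypothetical helper, and the
    per-extent cost of nodesize * 16 is taken directly from the figures in
    this commit message rather than from the kernel source.

```c
#include <stdint.h>

/*
 * Metadata bytes requested for 'nr_extents' outstanding extents, using the
 * per-extent cost quoted above (nodesize * 16).
 */
uint64_t delalloc_metadata_bytes(uint64_t nr_extents, uint64_t nodesize)
{
	return nr_extents * nodesize * 16;
}
```

    For 8191 extents and a 64K node size this gives 8588886016 bytes, just
    under 8GiB, which matches the "about 8GB" above.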
    
    In fact, when a file goes through in-band dedupe, its max extent size
    is no longer BTRFS_MAX_EXTENT_SIZE (128MB); it is limited by the in-band
    dedupe blocksize, so the current metadata reservation method in btrfs is
    not appropriate. Here we introduce btrfs_max_extent_size(), which
    returns the max extent size for files that go through in-band dedupe.
    We use this value for metadata reservation and for the extent_io
    merge, split and clear operations, which makes sure the difference between
    outstanding_extents and reserved_extents will not grow so large.
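    
    A rough sketch of the idea, under stated assumptions: the real
    btrfs_max_extent_size() signature is not shown in this message, so
    max_extent_size() and count_max_extents() below are hypothetical
    stand-ins that take the dedupe state as plain parameters.

```c
#include <stdbool.h>
#include <stdint.h>

/* BTRFS_MAX_EXTENT_SIZE as quoted above: 128MB. */
#define MAX_EXTENT_SIZE (128ull * 1024 * 1024)

/* Hypothetical stand-in: the extent size limit that applies to a file. */
uint64_t max_extent_size(bool dedupe_on, uint64_t dedupe_bs)
{
	return dedupe_on ? dedupe_bs : MAX_EXTENT_SIZE;
}

/* Number of extents to account for a range of 'len' bytes (rounded up). */
uint64_t count_max_extents(uint64_t len, bool dedupe_on, uint64_t dedupe_bs)
{
	uint64_t max = max_extent_size(dedupe_on, dedupe_bs);

	return (len + max - 1) / max;
}
```

    With a 16KB dedupe blocksize, a 1MB write is then accounted as 64
    extents at reservation time instead of 1, so later splits no longer
    open a large gap between the two counters.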
    
    Currently only buffered write will go through in-band dedupe if in-band
    dedupe is enabled.
    
    Reported-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
    Cc: Josef Bacik <jbacik@fb.com>
    Cc: Mark Fasheh <mfasheh@suse.de>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  3. btrfs: improve inode's outstanding_extents computation

    This issue was revealed by changing BTRFS_MAX_EXTENT_SIZE (128MB) to 64KB.
    With that change, the fsstress test often triggers these warnings from
    btrfs_destroy_inode():
    	WARN_ON(BTRFS_I(inode)->outstanding_extents);
    	WARN_ON(BTRFS_I(inode)->reserved_extents);
    
    The simple test program below reproduces this issue reliably.
    Note: you need to change BTRFS_MAX_EXTENT_SIZE to 64KB to run the test,
    otherwise there won't be any such WARNING.
    	#include <string.h>
    	#include <unistd.h>
    	#include <sys/types.h>
    	#include <sys/stat.h>
    	#include <fcntl.h>
    
    	int main(void)
    	{
    		int fd;
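    		char buf[68 * 1024];
    
    		memset(buf, 0, sizeof(buf));
    		/* O_CREAT requires a mode argument */
    		fd = open("testfile", O_CREAT | O_EXCL | O_RDWR, 0644);
    		if (fd < 0)
    			return 1;
    		/* 68KB buffered write at offset 64KB: range [64KB, 132KB) */
    		pwrite(fd, buf, sizeof(buf), 64 * 1024);
    		close(fd);
    		return 0;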
    	}
    
    When BTRFS_MAX_EXTENT_SIZE is 64KB, and buffered data range is:
    64KB						128KB		132KB
    |-----------------------------------------------|---------------|
                             64 + 4KB
    
    1) For the above data range, btrfs_delalloc_reserve_metadata() will reserve
    metadata and set BTRFS_I(inode)->outstanding_extents to 2.
    (68KB + 64KB - 1) / 64KB == 2
    
    Outstanding_extents: 2
    
    2) Then btrfs_dirty_pages() will be called to dirty pages and set the
    EXTENT_DELALLOC flag. In this case, btrfs_set_bit_hook() will be called
    twice.
    The 1st set_bit_hook() call will set the DELALLOC flag for the first 64KB.
    64KB						128KB
    |-----------------------------------------------|
    	64KB DELALLOC
    Outstanding_extents: 2
    
    set_bit_hook() uses the FIRST_DELALLOC flag to avoid increasing the
    outstanding_extents counter a second time.
    So the 1st set_bit_hook() call won't modify outstanding_extents,
    it's still 2.
    
    Then FIRST_DELALLOC flag is *CLEARED*.
    
    3) 2nd btrfs_set_bit_hook() call.
    Because FIRST_DELALLOC has been cleared by the previous set_bit_hook(),
    btrfs_set_bit_hook() will increase BTRFS_I(inode)->outstanding_extents by
    one, so now BTRFS_I(inode)->outstanding_extents is 3.
    64KB                                            128KB            132KB
    |-----------------------------------------------|----------------|
    	64K DELALLOC				   4K DELALLOC
    Outstanding_extents: 3
    
    But the correct outstanding_extents number should be 2, not 3.
    The 2nd btrfs_set_bit_hook() call breaks the count and leads to the
    WARN_ON().
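    
    Steps 1) to 3) can be replayed with a toy model of the counter. This is
    a sketch, not the kernel code: struct inode_acct and both functions are
    hypothetical, and the hook is reduced to the FIRST_DELALLOC behaviour
    described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_EXTENT (64u * 1024) /* BTRFS_MAX_EXTENT_SIZE lowered to 64KB */

struct inode_acct {
	unsigned outstanding_extents;
	bool first_delalloc; /* models the FIRST_DELALLOC flag */
};

/* Reservation: account ceil(len / MAX_EXTENT) extents up front. */
void reserve_metadata(struct inode_acct *in, uint64_t len)
{
	in->outstanding_extents += (unsigned)((len + MAX_EXTENT - 1) / MAX_EXTENT);
	in->first_delalloc = true;
}

/* Old hook: the first call only clears the flag, later calls add one. */
void set_bit_hook_old(struct inode_acct *in)
{
	if (in->first_delalloc)
		in->first_delalloc = false;
	else
		in->outstanding_extents++;
}
```

    Reserving 68KB and then calling the hook twice (once for the 64KB range
    and once for the 4KB range) leaves the counter at 3 instead of 2.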
    
    Normally, we could solve this by increasing outstanding_extents only in
    set_bit_hook().
    But the problem is that delalloc_reserve/release_metadata() only get
    a 'length' parameter, so they calculate an inaccurate outstanding_extents.
    If we rely only on set_bit_hook(), release_metadata() will screw things up
    because it decreases the counter by that inaccurate number.
    
    So the fix we use is:
    1) Increase outstanding_extents by the *INACCURATE* number in
       delalloc_reserve_metadata().
       This is just a placeholder.
    2) Increase outstanding_extents by the *accurate* number in set_bit_hook().
       This is the real increase.
    3) Decrease outstanding_extents by the *INACCURATE* number before returning.
       This brings outstanding_extents to the correct value.
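    
    The three steps can be sketched with the same kind of toy counter. The
    names below are hypothetical and the model ignores locking and
    reserved_extents; it only shows that the placeholder cancels out.

```c
#include <stdint.h>

#define MAX_EXTENT (64u * 1024) /* BTRFS_MAX_EXTENT_SIZE lowered to 64KB */

struct inode_acct {
	unsigned outstanding_extents;
};

/* Extents needed for 'len' bytes, rounded up. */
unsigned extents_for(uint64_t len)
{
	return (unsigned)((len + MAX_EXTENT - 1) / MAX_EXTENT);
}

/* Step 1: add the length-based placeholder and return it. */
unsigned reserve_placeholder(struct inode_acct *in, uint64_t len)
{
	unsigned n = extents_for(len);

	in->outstanding_extents += n;
	return n;
}

/* Step 2: the accurate increase, once per delalloc range. */
void set_bit_hook_new(struct inode_acct *in, uint64_t range_len)
{
	in->outstanding_extents += extents_for(range_len);
}

/* Step 3: drop the placeholder before returning to the caller. */
void drop_placeholder(struct inode_acct *in, unsigned placeholder)
{
	in->outstanding_extents -= placeholder;
}
```

    For the 68KB write above: the placeholder is 2, the two hook calls add
    1 each, and dropping the placeholder leaves 2.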
    
    For the 128MB BTRFS_MAX_EXTENT_SIZE, due to a limitation of
    __btrfs_buffered_write(), each iteration only handles about 2MB of
    data.
    So btrfs_dirty_pages() won't need to handle ranges that cross 2 extents.
    
    Cc: Mark Fasheh <mfasheh@suse.de>
    Cc: Josef Bacik <jbacik@fb.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  4. btrfs: relocation: Enhance error handling to avoid BUG_ON

    Since the introduction of the btrfs dedupe tree, it's possible for balance
    to race with dedupe disabling.
    
    When this happens, dedupe_enabled will make btrfs_get_fs_root() return
    ERR_PTR(-ENOENT).
    But due to a bug in the error handling branch, backref_cache->nr_nodes
    is increased while the node is neither added to backref_cache nor is
    nr_nodes decreased again.
    This causes a BUG_ON() in backref_cache_cleanup():
    
    [ 2611.668810] ------------[ cut here ]------------
    [ 2611.669946] kernel BUG at /home/sat/ktest/linux/fs/btrfs/relocation.c:243!
    [ 2611.670572] invalid opcode: 0000 [#1] SMP
    [ 2611.686797] Call Trace:
    [ 2611.687034]  [<ffffffffa01f71d3>] btrfs_relocate_block_group+0x1b3/0x290 [btrfs]
    [ 2611.687706]  [<ffffffffa01cc177>] btrfs_relocate_chunk.isra.40+0x47/0xd0 [btrfs]
    [ 2611.688385]  [<ffffffffa01cdb12>] btrfs_balance+0xb22/0x11e0 [btrfs]
    [ 2611.688966]  [<ffffffffa01d9611>] btrfs_ioctl_balance+0x391/0x3a0 [btrfs]
    [ 2611.689587]  [<ffffffffa01ddaf0>] btrfs_ioctl+0x1650/0x2290 [btrfs]
    [ 2611.690145]  [<ffffffff81171cda>] ? lru_cache_add+0x3a/0x80
    [ 2611.690647]  [<ffffffff81171e4c>] ? lru_cache_add_active_or_unevictable+0x4c/0xc0
    [ 2611.691310]  [<ffffffff81193f04>] ? handle_mm_fault+0xcd4/0x17f0
    [ 2611.691842]  [<ffffffff811da423>] ? cp_new_stat+0x153/0x180
    [ 2611.692342]  [<ffffffff8119913d>] ? __vma_link_rb+0xfd/0x110
    [ 2611.692842]  [<ffffffff81199209>] ? vma_link+0xb9/0xc0
    [ 2611.693303]  [<ffffffff811e7e81>] do_vfs_ioctl+0xa1/0x5a0
    [ 2611.693781]  [<ffffffff8104e024>] ? __do_page_fault+0x1b4/0x400
    [ 2611.694310]  [<ffffffff811e83c1>] SyS_ioctl+0x41/0x70
    [ 2611.694758]  [<ffffffff816dfc6e>] entry_SYSCALL_64_fastpath+0x12/0x71
    [ 2611.695331] Code: ff 48 8b 45 bf 49 83 af a8 05 00 00 01 49 89 87 a0 05 00 00 e9 2e fd ff ff b8 f4 ff ff ff e9 e4 fb ff ff 0f 0b 0f 0b 0f 0b 0f 0b <0f> 0b 0f 0b 41 89 c6 e9 b8 fb ff ff e8 9e a6 e8 e0 4c 89 e7 44
    [ 2611.697870] RIP  [<ffffffffa01f6fc1>] relocate_block_group+0x741/0x7a0 [btrfs]
    [ 2611.698818]  RSP <ffff88002a81fb30>
    
    This patch calls remove_backref_node() in the error handling branch,
    catches the returned -ENOENT in relocate_tree_block() and continues
    balancing.
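    
    The shape of the fix can be shown with a toy cache that only tracks
    nr_nodes. Everything here is a simplified, hypothetical stand-in
    (cache_alloc_node(), cache_remove_node(), build_node() and
    relocate_blocks() are not the functions in relocation.c); the point is
    that every error exit must undo the counter increase, and that the
    caller treats -ENOENT as "skip this block".

```c
#include <errno.h>
#include <stdlib.h>

struct backref_cache { int nr_nodes; };
struct backref_node { struct backref_cache *cache; };

/* Allocating a node bumps the cache's node counter. */
struct backref_node *cache_alloc_node(struct backref_cache *c)
{
	struct backref_node *n = calloc(1, sizeof(*n));

	if (n) {
		n->cache = c;
		c->nr_nodes++;
	}
	return n;
}

/* Removing a node drops the counter again and frees the node. */
void cache_remove_node(struct backref_node *n)
{
	n->cache->nr_nodes--;
	free(n);
}

/* Returns 0 on success, -ENOENT if the root lookup fails. */
int build_node(struct backref_cache *c, int root_missing)
{
	struct backref_node *n = cache_alloc_node(c);

	if (!n)
		return -ENOMEM;
	if (root_missing) {
		/* The cleanup that was missing in the buggy error branch. */
		cache_remove_node(n);
		return -ENOENT;
	}
	/* Sketch only: a real cache keeps the node until cleanup. */
	cache_remove_node(n);
	return 0;
}

/* Caller: a vanished root is not fatal, move on to the next block. */
int relocate_blocks(struct backref_cache *c, const int *root_missing, int nr)
{
	for (int i = 0; i < nr; i++) {
		int ret = build_node(c, root_missing[i]);

		if (ret == -ENOENT)
			continue;
		if (ret)
			return ret;
	}
	return 0;
}
```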
    
    Reported-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Jun 30, 2016
  5. btrfs: dedupe: Add ioctl for inband deduplication

    Add an ioctl interface for inband deduplication, which includes:
    1) enable
    2) disable
    3) status
    
    Also add a pseudo RO compat flag to indicate that btrfs now supports
    inband dedupe.
    It does not change the on-disk format; it's just a pseudo RO
    compat flag.
    
    All these ioctl interfaces are stateless, which means the caller doesn't
    need to care about the previous dedupe state before calling them, only
    about the final desired state.
    
    For example, if the user wants to enable dedupe with a specified block size
    and limit, just fill the ioctl structure and call the enable ioctl.
    There is no need to check whether dedupe is already running.
    
    These ioctls will handle things like re-configure or disable quite well.
    
    Also, for invalid parameters, the enable ioctl interface will set the field
    of the first encountered invalid parameter to (-1) to inform the caller.
    For limit_nr/limit_mem, the value will be (0) instead.
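    
    A sketch of that reporting convention, under loud assumptions: struct
    dedupe_args, its field names and both validity rules below are invented
    for illustration and are not the real ioctl structure. The limit_nr and
    limit_mem case (reported as 0) is not modelled.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical argument block; not the real ioctl structure. */
struct dedupe_args {
	int64_t blocksize;
	int64_t hash_type;
};

static int is_pow2(int64_t v)
{
	return v > 0 && (v & (v - 1)) == 0;
}

/*
 * Check the arguments in order. The first invalid field is overwritten
 * with -1 so the caller can tell which one was rejected; later fields are
 * left untouched.
 */
int dedupe_check_args(struct dedupe_args *a)
{
	/* Assumed rule: block size must be a power of two. */
	if (!is_pow2(a->blocksize)) {
		a->blocksize = -1;
		return -EINVAL;
	}
	/* Assumed rule: only hash type 0 is known. */
	if (a->hash_type != 0) {
		a->hash_type = -1;
		return -EINVAL;
	}
	return 0;
}
```

    A caller would inspect the structure after a failed call and look for
    the field that now holds -1.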
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  6. btrfs: dedupe: Inband in-memory only de-duplication implementation

    Core implementation of inband de-duplication.
    It reuses the async_cow_start() facility to calculate the dedupe hash,
    and uses the dedupe hash to do inband de-duplication at extent level.
    
    The workflow is as follows:
    1) Run delalloc range for an inode
    2) Calculate hash for the delalloc range at the unit of dedupe_bs
    3) For the hash match (duplicated) case, just increase the source extent
       ref and insert the file extent.
       For the hash mismatch case, go through the normal cow_file_range()
       fallback, and add the hash into dedupe_tree.
       Compression for the hash miss case is not supported yet.
    
    The current implementation stores all dedupe hashes in an in-memory
    rb-tree, with LRU behavior to enforce the limit.
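    
    The hit/miss flow and the LRU limit can be sketched in user space. This
    is a toy under stated assumptions: FNV-1a is swapped in for the real
    hash, a tiny fixed array replaces the rb-tree, "bytenr" is just a running
    counter, and all names are hypothetical. Extent references and delayed
    refs are not modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_LIMIT 2 /* tiny limit so that eviction is easy to observe */

struct hash_entry {
	uint64_t hash;
	uint64_t bytenr;   /* where the block was "written" */
	uint64_t last_use; /* LRU clock value */
	int in_use;
};

struct dedupe_info {
	struct hash_entry slot[HASH_LIMIT];
	uint64_t clock;
	uint64_t next_bytenr;
};

/* FNV-1a, standing in for the real hash to keep the sketch short. */
uint64_t block_hash(const unsigned char *data, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ull;

	for (size_t i = 0; i < len; i++) {
		h ^= data[i];
		h *= 0x100000001b3ull;
	}
	return h;
}

/*
 * Write one dedupe block. On a hash hit, return the existing bytenr and
 * set *deduped to 1. On a miss, take a new bytenr, remember the hash
 * (reusing a free slot, else the least recently used one) and set
 * *deduped to 0.
 */
uint64_t write_block(struct dedupe_info *di, const unsigned char *data,
		     size_t len, int *deduped)
{
	uint64_t h = block_hash(data, len);
	struct hash_entry *victim = &di->slot[0];

	for (int i = 0; i < HASH_LIMIT; i++) {
		struct hash_entry *e = &di->slot[i];

		if (e->in_use && e->hash == h) {
			e->last_use = ++di->clock;
			*deduped = 1;
			return e->bytenr;
		}
		/* Prefer a free slot, otherwise the oldest entry. */
		if (victim->in_use &&
		    (!e->in_use || e->last_use < victim->last_use))
			victim = e;
	}

	victim->hash = h;
	victim->bytenr = di->next_bytenr;
	victim->last_use = ++di->clock;
	victim->in_use = 1;
	di->next_bytenr += len;
	*deduped = 0;
	return victim->bytenr;
}
```

    Writing the same block twice reports a hit on the second write; once
    more distinct blocks than HASH_LIMIT have been written, the least
    recently used hash is dropped and a later write of that block is a miss
    again.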
    
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Qu Wenruo authored and fengguang committed Jun 30, 2016
  7. btrfs: ordered-extent: Add support for dedupe

    Add ordered-extent support for dedupe.
    
    Note, the current ordered-extent support only handles non-compressed
    source extents.
    Support for compressed source extents will be added later.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  8. btrfs: dedupe: Implement btrfs_dedupe_calc_hash interface

    Unlike the dedupe method, which can be in-memory or on-disk, only one
    hash method, SHA256, is supported so far, so implement the
    btrfs_dedupe_calc_hash() interface using SHA256.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  9. btrfs: dedupe: Introduce function to search for an existing hash

    Introduce static function inmem_search() to handle the job for in-memory
    hash tree.
    
    The trick is, we must ensure the delayed ref head is not being run at
    the time we search for the hash.
    
    With inmem_search(), we can implement the btrfs_dedupe_search()
    interface.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  10. btrfs: delayed-ref: Add support for increasing data ref under spinlock

    For in-band dedupe, btrfs needs to increase data ref with delayed_ref
    locked, so add a new function btrfs_add_delayed_data_ref_lock() to
    increase extent ref with delayed_refs already locked.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    Qu Wenruo authored and fengguang committed Jun 30, 2016
  11. btrfs: dedupe: Introduce function to remove hash from in-memory tree

    Introduce static function inmem_del() to remove hash from in-memory
    dedupe tree.
    And implement the btrfs_dedupe_del() and btrfs_dedupe_disable() interfaces.
    
    Also for btrfs_dedupe_disable(), add new functions to wait for existing
    writers and block incoming writers to eliminate all possible races.
    
    Cc: Mark Fasheh <mfasheh@suse.de>
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  12. btrfs: dedupe: Introduce function to add hash into in-memory tree

    Introduce static function inmem_add() to add hash into in-memory tree.
    And now we can implement the btrfs_dedupe_add() interface.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  13. btrfs: dedupe: Introduce function to initialize dedupe info

    Add generic function to initialize dedupe info.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016
  14. btrfs: dedupe: Introduce dedupe framework and its header

    Introduce the btrfs in-band (write time) de-duplication framework and
    the header it needs.
    
    The new de-duplication framework is going to support 2 different dedupe
    methods and 1 dedupe hash.
    
    Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
    Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
    wangxiaoguang authored and fengguang committed Jun 30, 2016

Commits on Jun 27, 2016

  1. Linux 4.7-rc5

    torvalds committed Jun 27, 2016

Commits on Jun 26, 2016

  1. Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/g…

    …it/jejb/scsi
    
    Pull SCSI fixes from James Bottomley:
     "Two straightforward fixes.
    
      One is a concurrency issue only affecting SAS connected SATA drives,
      but which could hang the storage subsystem if it triggers (because the
      outstanding command count on error never goes back to zero) and the
      other is a NO_TAG fallout from the switch to hostwide tags which
      causes the system to crash on module insertion (we've checked
      carefully and only the 53c700 family of drivers is vulnerable to this
      issue)"
    
    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
      53c700: fix BUG on untagged commands
      scsi: fix race between simultaneous decrements of ->host_failed
    torvalds committed Jun 26, 2016

Commits on Jun 25, 2016

  1. Merge branch 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/li…

    …nux/kernel/git/mason/linux-btrfs
    
    Pull btrfs fixes part 2 from Chris Mason:
     "This has one patch from Omar to bring iterate_shared back to btrfs.
    
      We have a tree of work we queue up for directory items and it doesn't
      lend itself well to shared access.  While we're cleaning it up, Omar
      has changed things to use an exclusive lock when there are delayed
      items"
    
    * 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
      Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes
    torvalds committed Jun 25, 2016
  2. Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/mason/linux-btrfs
    
    Pull btrfs fixes from Chris Mason:
     "I have a two part pull this time because one of the patches Dave
      Sterba collected needed to be against v4.7-rc2 or higher (we used
      rc4).  I try to make my for-linus-xx branch testable on top of the
      last major so we can hand fixes to people on the list more easily, so
      I've split this pull in two.
    
      This first part has some fixes and two performance improvements that
      we've been testing for some time.
    
      Josef's two performance fixes are most notable.  The transid tracking
      patch makes a big improvement on pretty much every workload"
    
    * 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
      Btrfs: Force stripesize to the value of sectorsize
      btrfs: fix disk_i_size update bug when fallocate() fails
      Btrfs: fix error handling in map_private_extent_buffer
      Btrfs: fix error return code in btrfs_init_test_fs()
      Btrfs: don't do nocow check unless we have to
      btrfs: fix deadlock in delayed_ref_async_start
      Btrfs: track transid for delayed ref flushing
    torvalds committed Jun 25, 2016
  3. Merge tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kerne…

    …l/git/tiwai/sound
    
    Pull sound fixes from Takashi Iwai:
     "Again pretty calm weeks: we've had only a few trivial / stable
      HD-audio fixes in addition to a possible race fix for snd-dummy driver
      spotted by syzkaller"
    
    * tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
      ALSA: dummy: Fix a use-after-free at closing
      ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup
      ALSA: hda - Fix the headset mic jack detection on Dell machine
      ALSA: hda/tegra: iomem fixups for sparse warnings
      ALSA: hdac_regmap - fix the register access for runtime PM
    torvalds committed Jun 25, 2016
  4. Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/tip/tip
    
    Pull x86 kprobe fix from Thomas Gleixner:
     "A single fix clearing the TF bit when a fault is single stepped"
    
    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      kprobes/x86: Clear TF bit in fault on single-stepping
    torvalds committed Jun 25, 2016
  5. Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm…

    …/linux/kernel/git/tip/tip
    
    Pull scheduler fixes from Thomas Gleixner:
     "A couple of scheduler fixes:
    
       - force watchdog reset while processing sysrq-w
    
       - fix a deadlock when enabling trace events in the scheduler
    
       - fixes to the throttled next buddy logic
    
       - fixes for the average accounting (missing serialization and
         underflow handling)
    
       - allow kernel threads for fallback to online but not active cpus"
    
    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      sched/core: Allow kthreads to fall back to online && !active cpus
      sched/fair: Do not announce throttled next buddy in dequeue_task_fair()
      sched/fair: Initialize throttle_count for new task-groups lazily
      sched/fair: Fix cfs_rq avg tracking underflow
      kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w
      sched/debug: Fix deadlock when enabling sched events
      sched/fair: Fix post_init_entity_util_avg() serialization
    torvalds committed Jun 25, 2016
  6. Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes

    Commit fe742fd ("Revert "btrfs: switch to ->iterate_shared()"")
    backed out the conversion to ->iterate_shared() for Btrfs because the
    delayed inode handling in btrfs_real_readdir() is racy. However, we can
    still do readdir in parallel if there are no delayed nodes.
    
    This is a temporary fix which upgrades the shared inode lock to an
    exclusive lock only when we have delayed items until we come up with a
    more complete solution. While we're here, rename the
    btrfs_{get,put}_delayed_items functions to make it very clear that
    they're just for readdir.
    
    Tested with xfstests and by doing a parallel kernel build:
    
    	while make tinyconfig && make -j4 && git clean -dqfx; do
    		:
    	done
    
    along with a bunch of parallel finds in another shell:
    
    	while true; do
    		for ((i=0; i<4; i++)); do
    			find . >/dev/null &
    		done
    		wait
    	done
    
    Signed-off-by: Omar Sandoval <osandov@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Chris Mason <clm@fb.com>
    osandov authored and masoncl committed Jun 25, 2016
  7. Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/s…

    …cm/linux/kernel/git/tip/tip
    
    Pull locking fix from Thomas Gleixner:
     "A single fix to address a race in the static key logic"
    
    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      locking/static_key: Fix concurrent static_key_slow_inc()
    torvalds committed Jun 25, 2016
  8. Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/l…

    …inux/kernel/git/tip/tip
    
    Pull irq fix from Thomas Gleixner:
     "A single fix for the fallout from the conversion of MIPS GIC to irq
      domains"
    
    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      irqchip/mips-gic: Fix IRQs in gic_dev_domain
    torvalds committed Jun 25, 2016
  9. Merge tag 'powerpc-4.7-4' of git://git.kernel.org/pub/scm/linux/kerne…

    …l/git/powerpc/linux
    
    Pull powerpc fixes from Michael Ellerman:
     "mm/radix (Aneesh Kumar K.V):
       - Update to tlb functions ric argument
       - Flush page walk cache when freeing page table
       - Update Radix tree size as per ISA 3.0
    
      mm/hash (Aneesh Kumar K.V):
       - Use the correct PPP mask when updating HPTE
       - Don't add memory coherence if cache inhibited is set
    
      eeh (Gavin Shan):
       - Fix invalid cached PE primary bus
    
      bpf/jit (Naveen N. Rao):
       - Disable classic BPF JIT on ppc64le
    
      .. and fix faults caused by radix patching of SLB miss handler
      (Michael Ellerman)"
    
    * tag 'powerpc-4.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/bpf/jit: Disable classic BPF JIT on ppc64le
      powerpc: Fix faults caused by radix patching of SLB miss handler
      powerpc/eeh: Fix invalid cached PE primary bus
      powerpc/mm/radix: Update Radix tree size as per ISA 3.0
      powerpc/mm/hash: Don't add memory coherence if cache inhibited is set
      powerpc/mm/hash: Use the correct PPP mask when updating HPTE
      powerpc/mm/radix: Flush page walk cache when freeing page table
      powerpc/mm/radix: Update to tlb functions ric argument
    torvalds committed Jun 25, 2016
  10. Fix build break in fork.c when THREAD_SIZE < PAGE_SIZE

    Commit b235bee ("Clarify naming of thread info/stack allocators")
    breaks the build on some powerpc configs, where THREAD_SIZE < PAGE_SIZE:
    
      kernel/fork.c:235:2: error: implicit declaration of function 'free_thread_stack'
      kernel/fork.c:355:8: error: assignment from incompatible pointer type
        stack = alloc_thread_stack_node(tsk, node);
        ^
    
    Fix it by renaming free_stack() to free_thread_stack(), and updating the
    return type of alloc_thread_stack_node().
    
    Fixes: b235bee ("Clarify naming of thread info/stack allocators")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    mpe authored and torvalds committed Jun 25, 2016
  11. Merge branch 'akpm' (patches from Andrew)

    Merge misc fixes from Andrew Morton:
     "Two weeks worth of fixes here"
    
    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (41 commits)
      init/main.c: fix initcall_blacklisted on ia64, ppc64 and parisc64
      autofs: don't get stuck in a loop if vfs_write() returns an error
      mm/page_owner: avoid null pointer dereference
      tools/vm/slabinfo: fix spelling mistake: "Ocurrences" -> "Occurrences"
      fs/nilfs2: fix potential underflow in call to crc32_le
      oom, suspend: fix oom_reaper vs. oom_killer_disable race
      ocfs2: disable BUG assertions in reading blocks
      mm, compaction: abort free scanner if split fails
      mm: prevent KASAN false positives in kmemleak
      mm/hugetlb: clear compound_mapcount when freeing gigantic pages
      mm/swap.c: flush lru pvecs on compound page arrival
      memcg: css_alloc should return an ERR_PTR value on error
      memcg: mem_cgroup_migrate() may be called with irq disabled
      hugetlb: fix nr_pmds accounting with shared page tables
      Revert "mm: disable fault around on emulated access bit architecture"
      Revert "mm: make faultaround produce old ptes"
      mailmap: add Boris Brezillon's email
      mailmap: add Antoine Tenart's email
      mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask
      mm: mempool: kasan: don't poot mempool objects in quarantine
      ...
    torvalds committed Jun 25, 2016
  12. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/dledford/rdma
    
    Pull rdma fixes from Doug Ledford:
     "This is the second batch of queued up rdma patches for this rc cycle.
    
      There isn't anything really major in here.  It's passed 0day,
      linux-next, and local testing across a wide variety of hardware.
      There are still a few known issues to be tracked down, but this should
      amount to the vast majority of the rdma RC fixes.
    
      Round two of 4.7 rc fixes:
    
       - A couple minor fixes to the rdma core
       - Multiple minor fixes to hfi1
       - Multiple minor fixes to mlx4/mlx5
       - A few minor fixes to i40iw"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (31 commits)
      IB/srpt: Reduce QP buffer size
      i40iw: Enable level-1 PBL for fast memory registration
      i40iw: Return correct max_fast_reg_page_list_len
      i40iw: Correct status check on i40iw_get_pble
      i40iw: Correct CQ arming
      IB/rdmavt: Correct qp_priv_alloc() return value test
      IB/hfi1: Don't zero out qp->s_ack_queue in rvt_reset_qp
      IB/hfi1: Fix deadlock with txreq allocation slow path
      IB/mlx4: Prevent cross page boundary allocation
      IB/mlx4: Fix memory leak if QP creation failed
      IB/mlx4: Verify port number in flow steering create flow
      IB/mlx4: Fix error flow when sending mads under SRIOV
      IB/mlx4: Fix the SQ size of an RC QP
      IB/mlx5: Fix wrong naming of port_rcv_data counter
      IB/mlx5: Fix post send fence logic
      IB/uverbs: Initialize ib_qp_init_attr with zeros
      IB/core: Fix false search of the IB_SA_WELL_KNOWN_GUID
      IB/core: Fix RoCE v1 multicast join logic issue
      IB/core: Fix no default GIDs when netdevice reregisters
      IB/hfi1: Send a pkey change event on driver pkey update
      ...
    torvalds committed Jun 25, 2016
  13. Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/jikos/hid
    
    Pull HID fix from Jiri Kosina:
     "hiddev ioctl() validation fix from Scott Bauer"
    
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
      HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands
    torvalds committed Jun 25, 2016
  14. Merge tag 'hwmon-for-linus-v4.7-rc5' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/groeck/linux-staging
    
    Pull hwmon fix from Guenter Roeck:
     "Improve fan type detection for dell-smm to prevent kernel hang"
    
    * tag 'hwmon-for-linus-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
      hwmon: (dell-smm) Cache fan_type() calls and change fan detection
    torvalds committed Jun 25, 2016
  15. Merge tag 'acpi-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/rafael/linux-pm
    
    Pull ACPI fix from Rafael Wysocki:
     "Stable-candidate fix for a deadlock in ACPICA introduced during the
      4.5 development cycle by a commit attempting to improve the handling
      of AML code that doesn't belong to any namespace objects in a given
      definition block (Lv Zheng)"
    
    * tag 'acpi-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      ACPICA: Namespace: Fix deadlock triggered by MLC support in dynamic table loading
    torvalds committed Jun 25, 2016
  16. Merge tag 'pm-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/g…

    …it/rafael/linux-pm
    
    Pull power management fixes from Rafael Wysocki:
     "Fix for a latent cpufreq driver bug uncovered by a recent ACPICA
      change and several fixes for the devfreq framework, including one fix
      for an issue introduced recently.
    
      Specifics:
    
       - Fix a latent initialization issue in the pcc-cpufreq driver
         (incorrect initial value of a structure field) that has been
         uncovered by a recent ACPICA commit (Mike Galbraith).
    
       - Add a missing notification in an update_devfreq() error code path
         forgotten by a recent devfreq commit (Chanwoo Choi).
    
       - Fix devfreq device frequency initialization (Lukasz Luba).
    
       - Fix an incorrect IS_ERR() check in the devfreq framework discovered
         by the Smatch checker (Dan Carpenter).
    
       - Drop two excessive put_device() calls from the devfreq framework
         (MyungJoo Ham, Cai Zhiyong).
    
       - Fix a possible memory leak in the devfreq framework and drop an
         unnecessary kfree() invocation from it (MyungJoo Ham)"
    
    * tag 'pm-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      PM / devfreq: Send the DEVFREQ_POSTCHANGE notification when target() is failed
      cpufreq: pcc-cpufreq: Fix doorbell.access_width
      PM / devfreq: fix initialization of current frequency in last status
      PM / devfreq: exynos-nocp: Remove incorrect IS_ERR() check
      PM / devfreq: remove double put_device
      PM / devfreq: fix double call put_device
      PM / devfreq: fix duplicated kfree on devfreq pointer
      PM / devfreq: devm_kzalloc to have dev pointer more precisely
    torvalds committed Jun 25, 2016
  17. Merge tag 'for-linus-4.7b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
    
    Pull xen bug fixes from David Vrabel:
    
     - fix x86 PV dom0 crash during early boot on some hardware
    
     - fix two pciback bugs affecting certain devices
    
     - fix potential overflow when clearing page tables in x86 PV
    
    * tag 'for-linus-4.7b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
      xen-pciback: return proper values during BAR sizing
      x86/xen: avoid m2p lookup when setting early page table entries
      xen/pciback: Fix conf_space read/write overlap check.
      x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
      xen/balloon: Fix declared-but-not-defined warning
    torvalds committed Jun 25, 2016
  18. Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
    
    Pull arm64 fixes from Will Deacon:
     "Here are a few more arm64 fixes, but things do finally appear to be
      slowing down.  The main fix is avoiding hibernation in a previously
      unanticipated situation where we have CPUs parked in the kernel, but
      it's all good stuff.
    
       - Fix icache/dcache sync for anonymous pages under migration
       - Correct the ASID limit check
       - Fix parallel builds of Image and Image.gz
       - Refuse to hibernate when we have CPUs that we can't offline"
    
    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
      arm64: hibernate: Don't hibernate on systems with stuck CPUs
      arm64: smp: Add function to determine if cpus are stuck in the kernel
      arm64: mm: remove page_mapping check in __sync_icache_dcache
      arm64: fix boot image dependencies to not generate invalid images
      arm64: update ASID limit
    torvalds committed Jun 25, 2016
  19. init/main.c: fix initcall_blacklisted on ia64, ppc64 and parisc64

    When I replaced kasprintf("%pf") with a direct call to
    sprint_symbol_no_offset I must have broken the initcall blacklisting
    feature on the arches where dereference_function_descriptor() is
    non-trivial.
    
    Fixes: c8cdd2b (init/main.c: simplify initcall_blacklisted())
    Link: http://lkml.kernel.org/r/1466027283-4065-1-git-send-email-linux@rasmusvillemoes.dk
    Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    Cc: Yang Shi <yang.shi@linaro.org>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Petr Mladek <pmladek@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Villemoes authored and torvalds committed Jun 25, 2016