Commits on Nov 5, 2016

  1. btrfs: remove old tree_root dirent processing in btrfs_real_readdir()

    Commit 3de4586 (Btrfs: Allow subvolumes and snapshots anywhere
    in the directory tree) introduced the current system of placing
    snapshots in the directory tree.  It also introduced the behavior of
    creating the snapshot and then creating the directory entries for it.
    
    We've kept this code around for compatibility reasons, but it turns
    out that no file systems with the old tree_root based snapshots can
    be mounted on newer (>= 2009) kernels anyway.  About a month after the
    above commit, commit 2a7108a (Btrfs: rev the disk format for the
    inode compat and csum selection changes) landed, changing the superblock
    magic number.
    
    As a result, we know that we'll never encounter tree_root-based dirents
    or have to deal with skipping our own snapshot dirents.  Since that
    also means that we're now only iterating over DIR_INDEX items, which only
    contain one directory entry per leaf item, we don't need to loop over
    the leaf item contents anymore either.
    
    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    jeffmahoney authored and fengguang committed Nov 5, 2016

Commits on Jul 21, 2016

  1. Btrfs: fix delalloc accounting after copy_from_user faults

    Commit 56244ef was almost but not quite enough to fix the
    reservation math after btrfs_copy_from_user returned partial copies.
    
    Some users are still seeing warnings in btrfs_destroy_inode, and with a
    long enough test run I'm able to trigger them as well.
    
    This patch fixes the accounting math again, bringing it much closer to
    the way it was before the sectorsize conversion Chandan did.  The
    problem is accounting for the offset into the page/sector when we do a
    partial copy.  This one just uses the dirty_sectors variable which
    should already be updated properly.
    
    Signed-off-by: Chris Mason <clm@fb.com>
    cc: stable@vger.kernel.org # v4.6+
    masoncl committed Jul 21, 2016

Commits on Jul 20, 2016

  1. Btrfs: avoid deadlocks during reservations in btrfs_truncate_block

    The new enospc code makes it possible to deadlock if we don't use
    FLUSH_LIMIT during reservations inside a transaction.  This enforces
    the correct flush type to avoid both deadlocks and assertions.
    
    Signed-off-by: Chris Mason <clm@fb.com>
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Josef Bacik authored and masoncl committed Jul 20, 2016

Commits on Jul 7, 2016

  1. Btrfs: use FLUSH_LIMIT for relocation in reserve_metadata_bytes

    We used to allow you to set FLUSH_ALL and then just wouldn't do things like
    commit transactions or wait on ordered extents if we noticed you were in a
    transaction.  However now that all the flushing for FLUSH_ALL is asynchronous
    we've lost the ability to tell, and we could end up deadlocking.  So instead use
    FLUSH_LIMIT in reserve_metadata_bytes in relocation and then return -EAGAIN if
    we error out to preserve the previous behavior.  I've also added an ASSERT() to
    catch anybody else who tries to do this.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  2. Btrfs: fill relocation block rsv after allocation

    Since we set the reloc control before we've reserved our space for relocation we
    could race with a root being dirtied and not actually have space to do our init
    reloc root.  So once we've allocated it and set it up go ahead and make our
    reservation before setting the relocate control, that way anybody who tries to
    do the reloc root init has space to use.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  3. Btrfs: always use trans->block_rsv for orphans

    This is the case all the time anyway except for relocation which could be doing
    a reloc root for a non ref counted root, in which case we'd end up with some
    random block rsv rather than the one we have our reservation in.  If there isn't
    enough space in the block rsv we are trying to steal from we'll BUG() because we
    expect there to be space for the orphan to make its reservation.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  4. Btrfs: change how we calculate the global block rsv

    Traditionally we've calculated the global block rsv by guessing how much of the
    metadata used amount was the extent tree, and then taking the data size and
    figuring out how large the csum tree would have to be to hold that much data.
    
    This is imprecise and falls down on MIXED file systems as we can't trust the
    data used amount.  This resulted in failures for xfstests generic/333 because it
    creates lots of clones, which explodes out the extent tree.  Our global reserve
    calculations were woefully inaccurate in this case which meant we got into a
    situation where we did not have enough reserved to do our work.
    
    We know we only use the global block rsv for the extent, csum, and root trees,
    so just get the bytes used for these trees and use that as the basis of our
    global reserve.  Since these are not reference counted trees the bytes_used
    value will be accurate.  This fixed the transaction aborts seen with
    generic/333.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  5. Btrfs: use root when checking need_async_flush

    Instead of doing fs_info->fs_root in need_async_flush, which may not be set
    during recovery when mounting, just pass the root itself in, which makes more
    sense as that's what btrfs_calc_reclaim_metadata_size takes.
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Reported-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  6. Btrfs: don't bother kicking async if there's nothing to reclaim

    We do this check when we start the async reclaimer thread, might as well check
    before we kick it off to save us some cycles.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  7. Btrfs: fix release reserved extents trace points

    We were doing trace_btrfs_release_reserved_extent() in pin_down_extent which
    isn't quite right because we will go through and free that extent later when we
    unpin, so it messes up apps that are accounting for the reservation space.  We
    were also unconditionally doing it in __btrfs_free_reserved_extent(), when we
    only actually free the reservation instead of pinning the extent.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  8. Btrfs: add fsid to some tracepoints

    When tracing enospc problems on a box with multiple file systems mounted I need
    to be able to differentiate between the two file systems.  Most of the important
    trace points I'm looking at already have an fsid, but the reserved extent trace
    points do not, so add that to make it possible to figure out which trace point
    belongs to which file system.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  9. Btrfs: add tracepoints for flush events

    We want to track when we're triggering flushing from our reservation code and
    what flushing is being done when we start flushing.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  10. Btrfs: fix delalloc reservation amount tracepoint

    We can sometimes drop the reservation we had for our inode, so we need to remove
    that amount from to_reserve so that our tracepoint reports a valid amount of
    space.
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  11. Btrfs: trace pinned extents

    Pinned extents are an important metric to keep track of for enospc.
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  12. Btrfs: introduce ticketed enospc infrastructure

    Our enospc flushing sucks.  It is born from a time where we were early
    enospc'ing constantly because multiple threads would race in for the same
    reservation and randomly starve other ones out.  So I came up with this solution
    to block any other reservations from happening while one guy tried to flush
    stuff to satisfy his reservation.  This gives us pretty good correctness, but
    completely crap latency.
    
    The solution I've come up with is ticketed reservations.  Basically we try to
    make our reservation, and if we can't we put a ticket on a list in order and
    kick off an async flusher thread.  This async flusher thread does the same old
    flushing we always did, just asynchronously.  As space is freed and added back
    to the space_info it checks and sees if we have any tickets that need
    satisfying, and adds space to the tickets and wakes up anything we've satisfied.
    
    Once the flusher thread stops making progress it wakes up all the current
    tickets and tells them to take a hike.
    
    There is a priority list for things that can't flush, since the async flusher
    could do anything we need to avoid deadlocks.  These guys get priority for
    having their reservation made, and will still do manual flushing themselves in
    case the async flusher isn't running.
    
    This patch gives us significantly better latencies.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
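
The ticketed reservation scheme described in the commit above can be sketched in a few lines of C. This is only an illustration of the idea (queue a ticket in order, hand freed space to tickets from the head of the queue, fail every waiter once flushing stops making progress). All type and function names here are hypothetical and are not the kernel's; locking, wakeups, the priority list and the async flusher itself are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; a sketch, not the btrfs implementation. */
enum ticket_state { TICKET_WAITING, TICKET_SATISFIED, TICKET_FAILED };

struct ticket {
	uint64_t orig_bytes;     /* size of the original request */
	uint64_t bytes;          /* bytes still missing */
	enum ticket_state state;
	struct ticket *next;     /* tickets are kept in arrival order */
};

struct space_info {
	uint64_t free_bytes;
	struct ticket *head, *tail;
};

/* Reserve immediately if possible, otherwise queue a ticket.
 * Returns 1 when the reservation was made right away, 0 when queued. */
int reserve_or_queue(struct space_info *si, struct ticket *t, uint64_t bytes)
{
	t->orig_bytes = bytes;
	t->bytes = bytes;
	t->next = NULL;

	/* Never jump ahead of tickets that are already waiting. */
	if (si->head == NULL && si->free_bytes >= bytes) {
		si->free_bytes -= bytes;
		t->bytes = 0;
		t->state = TICKET_SATISFIED;
		return 1;
	}

	t->state = TICKET_WAITING;
	if (si->tail)
		si->tail->next = t;
	else
		si->head = t;
	si->tail = t;
	return 0;
}

/* Called as space is freed: add it to tickets in order and mark the
 * ones that are now fully covered as satisfied. */
void add_space(struct space_info *si, uint64_t bytes)
{
	si->free_bytes += bytes;
	while (si->head && si->free_bytes > 0) {
		struct ticket *t = si->head;
		uint64_t give = si->free_bytes < t->bytes ? si->free_bytes : t->bytes;

		t->bytes -= give;
		si->free_bytes -= give;
		if (t->bytes > 0)
			break;           /* head still waiting; preserve ordering */
		t->state = TICKET_SATISFIED;
		si->head = t->next;
		if (si->head == NULL)
			si->tail = NULL;
	}
}

/* Called when flushing stops making progress: fail every waiter and
 * give back whatever had been partially granted to it. */
void fail_all(struct space_info *si)
{
	while (si->head) {
		struct ticket *t = si->head;

		si->head = t->next;
		si->free_bytes += t->orig_bytes - t->bytes;
		t->state = TICKET_FAILED;
	}
	si->tail = NULL;
}
```

In this sketch a caller that gets 0 back from `reserve_or_queue()` would sleep until its ticket leaves the `TICKET_WAITING` state; how the real code sleeps and wakes is not shown here.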
  13. Btrfs: add tracepoint for adding block groups

    I'm writing a tool to visualize the enospc system inside btrfs, and I need this
    tracepoint in order to keep track of the block groups in the system.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  14. Btrfs: warn_on for unaccounted spaces

    These were hidden behind enospc_debug, which isn't helpful as they indicate
    actual bugs, unlike the rest of the enospc_debug stuff which is really debug
    information.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  15. Btrfs: change delayed reservation fallback behavior

    We reserve space for the inode update when we first reserve space for writing to
    a file.  However there are lots of ways that we can use this reservation and not
    have it for subsequent ordered extents.  Previously we'd fall through and try to
    reserve metadata bytes for this, then we'd just steal the full reservation from
    the delalloc_block_rsv, and if that didn't have enough space we'd steal the full
    reservation from the global reserve.  The problem with this is we can easily
    just return ENOSPC and fall back to updating the inode item directly.  In the
    worst case (assuming 4k nodesize) we'd steal 64KiB from the global reserve if we
    fall all the way through, however if we just fall back and update the inode
    directly we'd only steal 4k * BTRFS_PATH_MAX in the worst case which is 32KiB.
    
    We would have also just added the extent item for the inode so we likely will
    have already cow'ed down most of the way to the leaf containing the inode item,
    so more often than not we only need one or two nodesizes' worth of
    reservations.  Given the reservation for the extent itself is also a worst case
    we will likely already have space to cover the inode update.
    
    This change will make us behave better in the theoretical worst case, and much
    better in the case that we don't have our reservation and cannot reserve more
    metadata.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
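
The two worst-case figures quoted in the commit above (64KiB versus 32KiB) can be reproduced with a little arithmetic. The sketch below assumes a maximum tree height of 8 levels and assumes that a full item reservation covers twice a root-to-leaf path; both numbers are assumptions made for illustration, chosen because they match the totals in the message, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define NODESIZE   4096u   /* 4k nodesize, as stated in the commit message */
#define MAX_LEVEL  8u      /* assumed maximum tree height */

/* Assumed cost of a full reservation for one item: two full paths. */
uint64_t full_reservation_bytes(void)
{
	return (uint64_t)NODESIZE * MAX_LEVEL * 2;
}

/* Worst case when updating the inode item directly: one full path. */
uint64_t direct_update_bytes(void)
{
	return (uint64_t)NODESIZE * MAX_LEVEL;
}
```

Under these assumptions the fallback path takes at most half as much from the global reserve as the full reservation would.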
  16. Btrfs: always reserve metadata for delalloc extents

    There are a few races in the metadata reservation stuff.  First we add the bytes
    to the block_rsv well after we've set the bit on the inode saying that we have
    space for it and after we've reserved the bytes.  So use the normal
    btrfs_block_rsv_add helper for this case.  Secondly we can flush delalloc
    extents when we try to reserve space for our write, which means that we could
    have used up the space for the inode and we wouldn't know because we only check
    before the reservation.  So instead make sure we are always reserving space for
    the inode update, and then if we don't need it release those bytes afterward.
    Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
  17. Btrfs: fix callers of btrfs_block_rsv_migrate

    So btrfs_block_rsv_migrate just unconditionally calls block_rsv_migrate_bytes.
    Not only this but it unconditionally changes the size of the block_rsv.  This
    isn't a bug strictly speaking, but it makes truncate block rsv's look funny
    because every time we migrate bytes over its size grows, even though we only
    want it to be a specific size.  So collapse this into one function that takes an
    update_size argument and make truncate and evict not update the size for
    consistency's sake.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016
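
A minimal sketch of what a migrate helper with an update_size argument might look like, based only on the description in the commit above. The struct layout and the function name are hypothetical simplifications, not the kernel's definitions, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified reservation structure. */
struct block_rsv {
	uint64_t size;      /* how much this rsv is meant to hold */
	uint64_t reserved;  /* how much it currently holds */
};

/* Move bytes from src to dst.  The destination's target size only
 * grows when update_size is non-zero, so callers that want a fixed
 * size (truncate and evict, per the commit message) pass 0. */
int rsv_migrate(struct block_rsv *src, struct block_rsv *dst,
		uint64_t bytes, int update_size)
{
	if (src->reserved < bytes)
		return -ENOSPC;

	src->reserved -= bytes;
	dst->reserved += bytes;
	if (update_size)
		dst->size += bytes;
	return 0;
}
```

With `update_size` set to 0, repeated migrations top up `reserved` without inflating `size`, which is the "looks funny" growth the commit describes avoiding.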
  18. Btrfs: add bytes_readonly to the spaceinfo at once

    For some reason we're adding bytes_readonly to the space info after we update
    the space info with the block group info.  This creates a tiny race where we
    could over-reserve space because we haven't yet taken out the bytes_readonly
    bit.  Since we already know this information at the time we call
    update_space_info, just pass it along so it can be updated all at once.  Thanks,
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Josef Bacik authored and kdave committed Jul 7, 2016

Commits on Jul 4, 2016

  1. Linux 4.7-rc6

    torvalds committed Jul 4, 2016

Commits on Jul 3, 2016

  1. Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/mszeredi/fuse
    
    Pull fuse fix from Miklos Szeredi:
     "This makes sure userspace filesystems are not broken by the parallel
      lookups and readdir feature"
    
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
      fuse: serialize dirops by default
    torvalds committed Jul 3, 2016
  2. Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/…

    …kernel/git/mszeredi/vfs
    
    Pull overlayfs fixes from Miklos Szeredi:
     "This contains fixes for a dentry leak, a regression in 4.6 noticed by
      Docker users and missing write access checking in truncate"
    
    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
      ovl: warn instead of error if d_type is not supported
      ovl: get_write_access() in truncate
      ovl: fix dentry leak for default_permissions
    torvalds committed Jul 3, 2016
  3. ovl: warn instead of error if d_type is not supported

    overlay needs underlying fs to support d_type. Recently I put in a
    patch in to detect this condition and started failing mount if
    underlying fs did not support d_type.
    
    But this breaks existing configurations over kernel upgrade. Those who
    are running docker (partially broken configuration) with xfs not
    supporting d_type, are surprised that after kernel upgrade docker does
    not run anymore.
    
    moby/moby#22937 (comment)
    
    So instead of erroring out, detect broken configuration and warn
    about it. This should allow existing docker setups to continue
    working after kernel upgrade.
    
    Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    Fixes: 45aebea ("ovl: Ensure upper filesystem supports d_type")
    Cc: <stable@vger.kernel.org> 4.6
    rhvgoyal authored and Miklos Szeredi committed Jul 3, 2016
  4. Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upst…

    …ream-linus
    
    Pull MIPS fix from Ralf Baechle:
     "Only a single fix for 4.7 pending at this point.  It fixes an issue
      that may lead to corruption of the cache mode bits in the page table"
    
    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
      MIPS: Fix possible corruption of cache mode by mprotect.
    torvalds committed Jul 3, 2016
  5. Merge tag 'powerpc-4.7-5' of git://git.kernel.org/pub/scm/linux/kerne…

    …l/git/powerpc/linux
    
    Pull powerpc fixes from Michael Ellerman:
    
     - tm: Always reclaim in start_thread() for exec() class syscalls from
       Cyril Bur
    
     - tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 from Michael
       Neuling
    
     - eeh: Fix wrong argument passed to eeh_rmv_device() from Gavin Shan
    
     - Initialise pci_io_base as early as possible from Darren Stevens
    
    * tag 'powerpc-4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc: Initialise pci_io_base as early as possible
      powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
      powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()
      powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
    torvalds committed Jul 3, 2016

Commits on Jul 2, 2016

  1. Merge tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~a…

    …irlied/linux
    
    Pull drm fixes from Dave Airlie:
     "Just some AMD and Intel fixes, the AMD ones are further production
      Polaris fixes, and the Intel ones fix some early timeouts, some PCI ID
      changes and a couple of other fixes.
    
      Still a bit Internet challenged here, hopefully end of next week will
      solve it"
    
    * tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux:
      drm/i915: Fix missing unlock on error in i915_ppgtt_info()
      drm/amd/powerplay: workaround for UVD clock issue
      drm/amdgpu: add ACLK_CNTL setting for polaris10
      drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
      drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
      drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
      drm/i915: Add more Kabylake PCI IDs.
      drm/i915: Avoid early timeout during AUX transfers
      drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
      drm/i915/lpt: Avoid early timeout during FDI PHY reset
      drm/i915/bxt: Avoid early timeout during PLL enable
      drm/i915: Refresh cached DP port register value on resume
      drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
      drm/amd/powerplay: disable FFC.
      drm/amd/powerplay: add some definition for FFC feature on polaris.
    torvalds committed Jul 2, 2016
  2. Merge tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/broonie/spi
    
    Pull spi fixes from Mark Brown:
     "A few small driver-specific fixes for SPI, all in the normal
      'important if you hit them' category, especially the rockchip driver
      fix, which addresses a race that has been exposed more frequently
      with some recent performance improvements"
    
    * tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
      spi: sunxi: fix transfer timeout
      spi: sun4i: fix FIFO limit
      spi: rockchip: Signal unfinished DMA transfers
      spi: spi-ti-qspi: Suspend the queue before removing the device
    torvalds committed Jul 2, 2016
  3. Merge tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/li…

    …nux/kernel/git/broonie/regulator
    
    Pull regulator fixes from Mark Brown:
     "Two small fixes for the regulator subsystem - one fixing a crash with
      one of the devices supported by the max77620 driver, another fixing
      startup for the anatop regulator when it starts up with the regulator
      in bypass mode"
    
    * tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
      regulator: max77620: check for valid regulator info
      regulator: anatop: allow regulator to be in bypass mode
    torvalds committed Jul 2, 2016
  4. Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux…

    …/kernel/git/clk/linux
    
    Pull clk fixes from Stephen Boyd:
     "A small fix for the newly added oxnas clk driver and a handful of
      rockchip clk driver fixes for newly added rk3399 support"
    
    * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
      clk: Fix return value check in oxnas_stdclk_probe()
      clk: rockchip: release io resource when failing to init clk on rk3399
      clk: rockchip: fix cpuclk registration error handling
      clk: rockchip: Revert "clk: rockchip: reset init state before mmc card initialization"
      clk: rockchip: fix incorrect parent for rk3399's {c,g}pll_aclk_perihp_src
      clk: rockchip: mark rk3399 GIC clocks as critical
      clk: rockchip: initialize flags of clk_init_data in mmc-phase clock
    torvalds committed Jul 2, 2016
  5. Merge tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.o…

    …rg/drm-intel into drm-fixes
    
    here's a batch of i915 fixes for 4.7.
    
    * tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel:
      drm/i915: Fix missing unlock on error in i915_ppgtt_info()
      drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
      drm/i915: Add more Kabylake PCI IDs.
      drm/i915: Avoid early timeout during AUX transfers
      drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
      drm/i915/lpt: Avoid early timeout during FDI PHY reset
      drm/i915/bxt: Avoid early timeout during PLL enable
      drm/i915: Refresh cached DP port register value on resume
    airlied committed Jul 2, 2016
  6. Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/l…

    …inux into drm-fixes
    
    Just a few more late fixes for Polaris cards.
    
    * 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux:
      drm/amd/powerplay: workaround for UVD clock issue
      drm/amdgpu: add ACLK_CNTL setting for polaris10
      drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
      drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
      drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
      drm/amd/powerplay: disable FFC.
      drm/amd/powerplay: add some definition for FFC feature on polaris.
    airlied committed Jul 2, 2016

Commits on Jul 1, 2016

  1. MIPS: Fix possible corruption of cache mode by mprotect.

    The following testcase may result in page table entries with an invalid
    CCA field being generated:
    
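    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>
    
    /* Assumed value: the original definition is not shown in this message. */
    #define BINDSTACK_SIZE	(4 * 4096)
    
    static void *bindstack;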
    
    static int sysrqfd;
    
    static void protect_low(int protect)
    {
    	mprotect(bindstack, BINDSTACK_SIZE, protect);
    }
    
    static void sigbus_handler(int signal, siginfo_t * info, void *context)
    {
    	void *addr = info->si_addr;
    
    	write(sysrqfd, "x", 1);
    
    	printf("sigbus, fault address %p (should not happen, but might)\n",
    	       addr);
    	abort();
    }
    
    static void run_bind_test(void)
    {
    	unsigned int *p = bindstack;
    
    	p[0] = 0xf001f001;
    
    	write(sysrqfd, "x", 1);
    
    	/* Set trap on access to p[0] */
    	protect_low(PROT_NONE);
    
    	write(sysrqfd, "x", 1);
    
    	/* Clear trap on access to p[0] */
    	protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);
    
    	write(sysrqfd, "x", 1);
    
    	/* Check the contents of p[0] */
    	if (p[0] != 0xf001f001) {
    		write(sysrqfd, "x", 1);
    
    		/* Reached, but shouldn't be */
    		printf("badness, shouldn't happen but does\n");
    		abort();
    	}
    }
    
    int main(void)
    {
    	struct sigaction sa;
    
    	sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);
    
    	if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
    		perror("sigprocmask");
    		return 0;
    	}
    
    	sa.sa_sigaction = sigbus_handler;
    	sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
    	if (sigaction(SIGBUS, &sa, NULL)) {
    		perror("sigaction");
    		return 0;
    	}
    
    	bindstack = mmap(NULL,
    			 BINDSTACK_SIZE,
    			 PROT_READ | PROT_WRITE | PROT_EXEC,
    			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    	if (bindstack == MAP_FAILED) {
    		perror("mmap bindstack");
    		return 0;
    	}
    
    	printf("bindstack: %p\n", bindstack);
    
    	run_bind_test();
    
    	printf("done\n");
    
    	return 0;
    }
    
    There are multiple ingredients for this:
    
     1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
        on all platforms except SB1 where it's CCA 5.
     2) _page_cachable_default must have bits set which are not set in
        _CACHE_CACHABLE_NONCOHERENT.
     3) Either the defective version of pte_modify for XPA or the standard
        version must be in use.  However, pte_modify for the 36-bit address
        space support is not affected.
    
    In that case additional bits in the final CCA mode may generate an invalid
    value for the CCA field.  On the R10000 system where this was tracked
    down for example a CCA 7 has been observed, which is Uncached Accelerated.
    
    Fixed by:
    
     1) Using the proper CCA mode for PAGE_NONE just like for all the other
        PAGE_* pte/pmd bits.
     2) Fix the two affected variants of pte_modify.
    
    Further code inspection also shows the same issue to exist in pmd_modify
    which would affect huge page systems.
    
    Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
    and pmd_modify issue found by me.
    
    The history of this goes back beyond Linus' git history.  Chris Dearman's
    commit 3513369 ("[MIPS] Allow setting of
    the cache attribute at run time.") missed the opportunity to fix this
    but it was originally introduced in lmo commit
    d523832 ("Missing from last commit.")
    and 32cc382 ("New configuration option
    CONFIG_MIPS_UNCACHED.")
    
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
    ralfbaechle committed Jul 1, 2016
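
The failure mode described in the commit above comes down to OR-ing two different cache-attribute values into the same bit field. The sketch below uses a made-up entry layout (a 3-bit CCA field at bits 5:3 and three protection bits at bits 2:0) and hypothetical function names; it is not the MIPS header code. It assumes an existing entry with CCA 5 and a new protection value carrying a hard-coded CCA 3, which is one combination that would produce the CCA 7 mentioned in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical entry layout, for illustration only. */
#define CCA_SHIFT  3
#define CCA_MASK   (7u << CCA_SHIFT)
#define CCA(n)     ((uint32_t)(n) << CCA_SHIFT)
#define PROT_MASK  0x7u
#define KEEP_MASK  (~PROT_MASK)   /* bits preserved from the old entry */

unsigned int cca_of(uint32_t pte)
{
	return (pte & CCA_MASK) >> CCA_SHIFT;
}

/* Defective pattern: the old CCA is preserved, and the CCA bits that
 * come with the new protection are OR-ed on top of it. */
uint32_t modify_buggy(uint32_t pte, uint32_t newprot)
{
	return (pte & KEEP_MASK) | newprot;
}

/* Corrected pattern: take from newprot only the bits that are not
 * preserved from the old entry, so the two CCA values never mix. */
uint32_t modify_fixed(uint32_t pte, uint32_t newprot)
{
	return (pte & KEEP_MASK) | (newprot & ~KEEP_MASK);
}
```

For example, with `pte = CCA(5) | 0x3` and `newprot = CCA(3) | 0x1`, the defective variant yields a CCA field of `5 | 3`, while the corrected variant leaves the CCA untouched and only replaces the protection bits.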