Commits on Feb 8, 2016
  1. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

    Pull KVM fixes from Paolo Bonzini:
     "KVM-ARM fixes, mostly coming from the PMU work"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
      arm64: KVM: Fix guest dead loop when register accessor returns false
      arm64: KVM: Fix comments of the CP handler
      arm64: KVM: Fix wrong use of the CPSR MODE mask for 32bit guests
      arm64: KVM: Obey RES0/1 reserved bits when setting CPTR_EL2
      arm64: KVM: Fix AArch64 guest userspace exception injection
  2. Merge tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/sc…

    …m/linux/kernel/git/broonie/regmap
    
    Pull regmap fix from Mark Brown:
     "A single revert back to v4.4 endianness handling.
    
      Commit 29bb45f ("regmap-mmio: Use native endianness for
      read/write") attempted to fix some long standing bugs in the MMIO
      implementation for big endian systems caused by duplicate byte
      swapping in both regmap and readl()/writel().  Sadly the fix makes
      things worse rather than better, so revert it for now"
    
    * tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
      regmap: mmio: Revert to v4.4 endianness handling
  3. scatterlist: fix a typo in comment block of sg_miter_stop()

    Masahiro Yamada committed with
    Fix the doubled "started" and tidy up the following sentences.
    
    Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. Merge tag 'kvm-arm-for-4.5-rc2' of git://git.kernel.org/pub/scm/linux…

    bonzini committed
    …/kernel/git/kvmarm/kvmarm into kvm-master
    
    KVM/ARM fixes for v4.5-rc2
    
    A few random fixes, mostly coming from the PMU work by Shannon:
    
    - fix for injecting faults coming from the guest's userspace
    - cleanup for our CPTR_EL2 accessors (reserved bits)
    - fix for a bug impacting perf (user/kernel discrimination)
    - fix for a 32bit sysreg handling bug
Commits on Feb 7, 2016
  1. Linux 4.5-rc3

  2. Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/arm/arm-soc
    
    Pull ARM SoC fixes from Olof Johansson:
     "The first real batch of fixes for this release cycle, so there are a
      few more than usual.
    
      Most of these are fixes and tweaks to board support (DT bugfixes,
      etc).  I've also picked up a couple of small cleanups that seemed
      innocent enough that there was little reason to wait (const/
      __initconst and Kconfig deps).
    
      Quite a bit of the changes on OMAP were due to fixes to no longer
      write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but
      there were also other fixes.
    
      Kirkwood had a bunch of gpio fixes for some boards.  OMAP had RTC
      fixes on OMAP5, and Nomadik had changes to MMC parameters in DT.
    
      All in all, mostly the usual mix of various fixes"
    
    * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
      ARM: multi_v7_defconfig: enable DW_WATCHDOG
      ARM: nomadik: fix up SD/MMC DT settings
      ARM64: tegra: Add chosen node for tegra132 norrin
      ARM: realview: use "depends on" instead of "if" after prompt
      ARM: tango: use "depends on" instead of "if" after prompt
      ARM: tango: use const and __initconst for smp_operations
      ARM: realview: use const and __initconst for smp_operations
      bus: uniphier-system-bus: revive tristate prompt
      arm64: dts: Add missing DMA Abort interrupt to Juno
      bus: vexpress-config: Add missing of_node_put
      ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
      ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
      ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
      ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
      ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
      ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
      ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
      ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
      ARM: dts: am4372: fix irq type for arm twd and global timer
      ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
      ...
  3. Merge branch 'mailbox-devel' of git://git.linaro.org/landing-teams/wo…

    …rking/fujitsu/integration
    
    Pull mailbox fixes from Jassi Brar:
    
     - fix getting element from the pcc-channels array by simply indexing
       into it
    
     - prevent building mailbox-test driver for archs that don't have IOMEM
    
    * 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
      mailbox: Fix dependencies for !HAS_IOMEM archs
      mailbox: pcc: fix channel calculation in get_pcc_channel()
  4. Merge tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/…

    …git/gregkh/usb
    
    Pull USB fixes from Greg KH:
     "Here are some USB fixes for 4.5-rc3.
    
      The usual, xhci fixes for reported issues, combined with some small
      gadget driver fixes, and a MAINTAINERS file update.  All have been in
      linux-next with no reported issues"
    
    * tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
      xhci: harden xhci_find_next_ext_cap against device removal
      xhci: Fix list corruption in urb dequeue at host removal
      usb: host: xhci-plat: fix NULL pointer in probe for device tree case
      usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling
      usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT
      usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms
      usb: xhci: set SSIC port unused only if xhci_suspend succeeds
      usb: xhci: add a quirk bit for ssic port unused
      usb: xhci: handle both SSIC ports in PME stuck quirk
      usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver.
      Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
      MAINTAINERS: fix my email address
      usb: dwc2: Fix probe problem on bcm2835
      Revert "usb: dwc2: Move reset into dwc2_get_hwparams()"
      usb: musb: ux500: Fix NULL pointer dereference at system PM
      usb: phy: mxs: declare variable with initialized value
      usb: phy: msm: fix error handling in probe.
  5. Merge tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/ker…

    …nel/git/gregkh/staging
    
    Pull staging and IIO driver fixes from Greg KH:
     "Here are some IIO and staging driver fixes for 4.5-rc3.
    
      All but one of them are for IIO drivers; the exception is a speakup
      driver fix that resolves a reported build failure caused by some
      earlier patches"
    
    * tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
      Staging: speakup: Fix allyesconfig build on mn10300
      iio: dht11: Use boottime
      iio: ade7753: avoid uninitialized data
      iio: pressure: mpl115: fix temperature offset sign
      iio: imu: Fix dependencies for !HAS_IOMEM archs
      staging: iio: Fix dependencies for !HAS_IOMEM archs
      iio: adc: Fix dependencies for !HAS_IOMEM archs
      iio: inkern: fix a NULL dereference on error
      iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.
      iio: light: acpi-als: Report data as processed
      iio: dac: mcp4725: set iio name property in sysfs
      iio: add HAS_IOMEM dependency to VF610_ADC
      iio: add IIO_TRIGGER dependency to STK8BA50
      iio: proximity: lidar: correct return value
      iio-light: Use a signed return type for ltr501_match_samp_freq()
Commits on Feb 6, 2016
  1. Merge branch 'akpm' (patches from Andrew)

    Merge fixes from Andrew Morton:
     "22 fixes"
    
    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
      epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
      radix-tree: fix oops after radix_tree_iter_retry
      MAINTAINERS: trim the file triggers for ABI/API
      dax: dirty inode only if required
      thp: make deferred_split_scan() work again
      mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
      ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
      um: asm/page.h: remove the pte_high member from struct pte_t
      mm, hugetlb: don't require CMA for runtime gigantic pages
      mm/hugetlb: fix gigantic page initialization/allocation
      mm: downgrade VM_BUG in isolate_lru_page() to warning
      mempolicy: do not try to queue pages from !vma_migratable()
      mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
      vmstat: make vmstat_update deferrable
      mm, vmstat: make quiet_vmstat lighter
      mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
      memblock: don't mark memblock_phys_mem_size() as __init
      dump_stack: avoid potential deadlocks
      mm: validate_mm browse_rb SMP race condition
      m32r: fix build failure due to SMP and MMU
      ...
  2. Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/sage/ceph-client
    
    Pull Ceph fixes from Sage Weil:
     "We have a few wire protocol compatibility fixes, ports of a few recent
      CRUSH mapping changes, and a couple error path fixes"
    
    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
      libceph: MOSDOpReply v7 encoding
      libceph: advertise support for TUNABLES5
      crush: decode and initialize chooseleaf_stable
      crush: add chooseleaf_stable tunable
      crush: ensure take bucket value is valid
      crush: ensure bucket id is valid before indexing buckets array
      ceph: fix snap context leak in error path
      ceph: checking for IS_ERR instead of NULL
  3. Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

    Pull drm fixes from Dave Airlie:
     "Fixes all over the place:
    
       - amdkfd: two static checker fixes
       - mst: a bunch of static checker and spec/hw interaction fixes
       - amdgpu: fix Iceland hw properly, and some fiji bugs, along with
         some write-combining fixes.
       - exynos: some regression fixes
       - adv7511: fix some EDID reading issues"
    
    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits)
      drm/dp/mst: deallocate payload on port destruction
      drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
      drm/dp/mst: move GUID storage from mgr, port to only mst branch
      drm/dp/mst: change MST detection scheme
      drm/dp/mst: Calculate MST PBN with 31.32 fixed point
      drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
      drm/mst: Add range check for max_payloads during init
      drm/mst: Don't ignore the MST PBN self-test result
      drm: fix missing reference counting decrease
      drm/amdgpu: disable uvd and vce clockgating on Fiji
      drm/amdgpu: remove exp hardware support from iceland
      drm/amdgpu: load MEC ucode manually on iceland
      drm/amdgpu: don't load MEC2 on topaz
      drm/amdgpu: drop topaz support from gmc8 module
      drm/amdgpu: pull topaz gmc bits into gmc_v7
      drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
      drm/amdgpu: iceland use CI based MC IP
      drm/amdgpu: move gmc7 support out of CIK dependency
      drm/amdgpu/gfx7: enable cp inst/reg error interrupts
      drm/amdgpu/gfx8: enable cp inst/reg error interrupts
      ...
  4. Merge tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/ker…

    …nel/git/rafael/linux-pm
    
    Pull power management and ACPI fixes from Rafael Wysocki:
     "These are: a fix for recently introduced false-positive warnings
      about PM domain pointers being changed inappropriately (harmless but
      annoying), an MCH size workaround quirk for one more platform, a
      compiler warning fix (generic power domains framework), an ACPI LPSS
      (Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code.
    
      Specifics:
    
       - PM core fix to avoid false-positive warnings generated when the
         pm_domain field is cleared for a device that appears to be bound to
         a driver (Rafael Wysocki).
    
       - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).
    
       - Fix for an "unused function" compiler warning in the generic power
         domains framework (Ulf Hansson).
    
       - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM
         domain pointer of a device properly in one place that was
         overlooked by a recent PM core update (Andy Shevchenko).
    
       - Removal of a redundant function declaration in the ACPI CPPC core
         code (Timur Tabi)"
    
    * tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      PM: Avoid false-positive warnings in dev_pm_domain_set()
      PM / Domains: Silence compiler warning for an unused function
      ACPI / CPPC: remove redundant mbox_send_message() declaration
      ACPI / LPSS: set PM domain via helper setter
      PNP: Add Haswell-ULT to Intel MCH size workaround
  5. epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT

    Jason Baron committed with
    In the current implementation of the EPOLLEXCLUSIVE flag (added for
    4.5-rc1), if epoll waiters create different POLL* sets and register them
    as exclusive against the same target fd, the current implementation will
    stop waking any further waiters once it finds the first idle waiter.
    This means that waiters could miss wakeups in certain cases.
    
    For example, when we wake up a pipe for reading we do:
    
      wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM);
    
    So if one epoll set or epfd is added to pipe p with POLLIN and a second
    set epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the
    wakeup since the current implementation will stop after it finds any
    intersection of events with a waiter that is blocked in epoll_wait().
    
    We could potentially address this by requiring all epoll waiters that
    are added to p to pass the same set of POLL* events.  I.e., the first
    EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set of POLL*
    flags to be used by any other epfds that are added as EPOLLEXCLUSIVE.
    However, I think it might be a somewhat confusing interface as we would
    have to reference count the number of users for that set, and so
    userspace would have to keep track of that count, or we would need a
    more involved interface.  It also adds some shared state that we'd have
    to store somewhere.  I don't think anybody will want to bloat
    __wait_queue_head for this.
    
    I think what we could do instead is to simply restrict EPOLLEXCLUSIVE
    such that it can only be specified with EPOLLIN and/or EPOLLOUT.  That
    way, if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop
    once we hit the first idle waiter that specifies the EPOLLIN bit, since
    any remaining waiters that only have 'POLLOUT' set wouldn't need to be
    woken.  Likewise, we can do the same thing if 'POLLOUT' is in the wakeup
    bit set and not 'POLLIN'.  If both 'POLLOUT' and 'POLLIN' are set in the
    wake bit set (there is at least one example of this I saw in fs/pipe.c),
    then we just wake the entire exclusive list.  Having both 'POLLOUT' and
    'POLLIN' set should not be on any performance critical path, so I
    think that's ok (in fs/pipe.c it's in pipe_release()).  We also continue
    to include EPOLLERR and EPOLLHUP by default in any exclusive set.  Thus,
    the user can specify EPOLLERR and/or EPOLLHUP but is not required to do
    so.
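
The stopping rule described above can be sketched in userspace as follows. This is only a model of the policy as the message states it, not the kernel code: the helper names are hypothetical, and the handling of a wakeup key with neither POLLIN nor POLLOUT set is an assumption, since the message does not cover that case.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the wakeup policy described in the message.
 *   key:           POLL* bits passed along with the wakeup
 *   waiter_events: events the idle exclusive waiter registered for
 * Returns true if no further exclusive waiters need to be woken.
 */
static bool exclusive_walk_may_stop(uint32_t key, uint32_t waiter_events)
{
    switch (key & (POLLIN | POLLOUT)) {
    case POLLIN:
        /* Read-side wakeup: an idle waiter with the IN bit satisfies it. */
        return (waiter_events & POLLIN) != 0;
    case POLLOUT:
        /* Write-side wakeup: an idle waiter with the OUT bit satisfies it. */
        return (waiter_events & POLLOUT) != 0;
    case POLLIN | POLLOUT:
        /* Both directions signalled: wake the entire exclusive list. */
        return false;
    default:
        /* Neither bit set: not covered by the message.
         * Assumption for this sketch: keep waking. */
        return false;
    }
}

/* Small self-check that exercises the cases named in the message. */
static void exclusive_policy_selfcheck(void)
{
    assert(exclusive_walk_may_stop(POLLIN, POLLIN));
    assert(!exclusive_walk_may_stop(POLLIN, POLLOUT));
    assert(!exclusive_walk_may_stop(POLLIN | POLLOUT, POLLIN | POLLOUT));
}
```

A waiter that registered only for POLLOUT never ends a read-side walk, which is why waiters interested only in the other direction do not cause missed wakeups under this rule.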
    
    Since epoll waiters may be interested in other events as well besides
    EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by
    doing a 'dup' call on the target fd and adding that as one normally
    would with EPOLL_CTL_ADD.  Since I think that the POLLIN and POLLOUT
    events are what we are interested in balancing, I think that the 'dup'
    thing could perhaps be added to only one of the waiter threads.
    However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be
    sufficient for the majority of use-cases.
    
    Since EPOLLEXCLUSIVE is intended to be used with a target fd shared
    among multiple epfds, where between 1 and n of the epfds may receive an
    event, it does not satisfy the semantics of EPOLLONESHOT where only 1
    epfd would get an event.  Thus, it is not allowed to be specified in
    conjunction with EPOLLEXCLUSIVE.
    
    EPOLL_CTL_MOD is also not allowed if the fd was previously added as
    EPOLLEXCLUSIVE.  With the limited number of flags, modification does
    not seem very interesting, but this could be relaxed at some later point.
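
From userspace, the restrictions above might be respected roughly as in the sketch below. The helpers exclusive_mask_allowed() and add_exclusive() are hypothetical names introduced for illustration. The pre-check only mirrors the flags listed in this message, so the kernel's own validation may accept flags that this sketch rejects. The fallback definition of EPOLLEXCLUSIVE is there only for libc headers that predate the flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* fallback for older libc headers */
#endif

/* Hypothetical pre-check mirroring the restrictions listed in the message:
 * with EPOLLEXCLUSIVE, only IN, OUT, ERR and HUP may be requested. */
static bool exclusive_mask_allowed(uint32_t events)
{
    const uint32_t ok = EPOLLEXCLUSIVE | EPOLLIN | EPOLLOUT |
                        EPOLLERR | EPOLLHUP;

    if (!(events & EPOLLEXCLUSIVE))
        return true;    /* the restriction only applies to exclusive adds */
    return (events & ~ok) == 0;
}

/*
 * Add fd to epfd as an exclusive waiter.  Returns 0 on success, or -1
 * with errno set.  Only EPOLL_CTL_ADD is used, because EPOLL_CTL_MOD is
 * not allowed on an entry that was added as EPOLLEXCLUSIVE.
 */
static int add_exclusive(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev = { .events = events | EPOLLEXCLUSIVE };

    assert(epfd >= 0 && fd >= 0);   /* negative descriptors are a caller bug */
    ev.data.fd = fd;
    if (!exclusive_mask_allowed(ev.events)) {
        errno = EINVAL;             /* e.g. EPOLLONESHOT was requested */
        return -1;
    }
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}
```

A caller that needs other events on the same file could, as the message suggests, dup() the target fd and add the duplicate with a normal, non-exclusive EPOLL_CTL_ADD.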
    
    Signed-off-by: Jason Baron <jbaron@akamai.com>
    Tested-by: Madars Vitolins <m@silodev.com>
    Cc: Michael Kerrisk <mtk.manpages@gmail.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Al Viro <viro@ftp.linux.org.uk>
    Cc: Eric Wong <normalperson@yhbt.net>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Hagen Paul Pfeifer <hagen@jauu.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. radix-tree: fix oops after radix_tree_iter_retry

    koct9i committed with
    Helper radix_tree_iter_retry() resets next_index to the current index.
    In the following radix_tree_next_slot() call, the current chunk size
    becomes zero.  This isn't checked, and it tries to dereference a null
    pointer in slot.
    
    The tagged iterator is fine because retry happens only at slot 0, where
    the tag bitmask in iter->tags is filled with a single bit.
    
    Fixes: 46437f9 ("radix-tree: fix race in gang lookup")
    Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Matthew Wilcox <willy@linux.intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ohad Ben-Cohen <ohad@wizery.com>
    Cc: Jeremiah Mahler <jmmahler@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. MAINTAINERS: trim the file triggers for ABI/API

    mkerrisk committed with
    Commit ea8f8fc ("MAINTAINERS: add linux-api for review of API/ABI
    changes") added file triggers for various paths that likely indicated
    API/ABI changes.  However, catching all changes in Documentation/ABI/
    and include/uapi/ produces a large volume of mail to linux-api, rather
    than only API/ABI changes.  Drop those two entries, but leave
    include/linux/syscalls.h and kernel/sys_ni.c to catch syscall-related
    changes.
    
    [josh@joshtriplett.org: redid changelog]
    Signed-off-by: Michael Kerrisk <mtk.man-pages@gmail.com>
    Acked-by: Shuah khan <shuahkh@osg.samsung.com>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. dax: dirty inode only if required

    dmonakhov committed with
    Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. thp: make deferred_split_scan() work again

    Kirill A. Shutemov committed with
    We need to iterate over split_queue, not the local empty list, to get
    anything split from the shrinker.
    
    Fixes: e3ae195 ("thp: limit number of object to scan on deferred_split_scan()")
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. mm: replace vma_lock_anon_vma with anon_vma_lock_read/write

    koct9i committed with
    The sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
    anon_vma appeared between lock and unlock.  We have to check anon_vma
    first or call anon_vma_prepare() to be sure that it's here.  There are
    only a few users of these legacy helpers.  Let's get rid of them.
    
    This patch fixes an anon_vma lock imbalance in validate_mm().  A write
    lock isn't required here; a read lock is enough.
    
    It also reorders expand_downwards/expand_upwards: security_mmap_addr()
    and the wrap-around check don't have to be under the anon_vma lock.
    
    Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
    Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. ocfs2/dlm: clear refmap bit of recovery lock while doing local recove…

    xuejiufei committed with
    …ry cleanup
    
    When the recovery master is down, dlm_do_local_recovery_cleanup() only
    removes the $RECOVERY lock owned by the dead node, but does not clear
    the refmap bit.  This makes the umount thread fall into an endless loop
    migrating $RECOVERY to the dead node.
    
    Signed-off-by: xuejiufei <xuejiufei@huawei.com>
    Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
    Cc: Mark Fasheh <mfasheh@suse.de>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. um: asm/page.h: remove the pte_high member from struct pte_t

    nicstange committed with
    Commit 16da306 ("um: kill pfn_t") introduced a compile warning for
    defconfig (SUBARCH=i386):
    
      arch/um/kernel/skas/mmu.c:38:206:
          warning: right shift count >= width of type [-Wshift-count-overflow]
    
    Aforementioned patch changes the definition of the phys_to_pfn() macro
    from
    
      ((pfn_t) ((p) >> PAGE_SHIFT))
    
    to
    
      ((p) >> PAGE_SHIFT)
    
    This effectively changes the phys_to_pfn() expansion's type from
    unsigned long long to unsigned long.
    
    Through the callchain init_stub_pte() => mk_pte(), the expansion of
    phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
    pte_set_val(pte, phys, prot) macro, eventually leading to
    
      (pte).pte_high = (phys) >> 32;
    
    This results in the warning from above.
    
    Since UML only deals with 32 bit addresses, the upper 32 bits from
    'phys' used to be always zero anyway.  Also, all page protection flags
    defined by UML don't use any bits beyond bit 9.  Since the contents of a
    PTE are defined within architecture scope only, the ->pte_high member
    can be safely removed.
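
The reasoning above can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the UML code: uint32_t stands in for the 32-bit unsigned long of SUBARCH=i386, PAGE_SHIFT is assumed to be 12, and both helper names are hypothetical. Shifting a 32-bit value by 32 is what the compiler warns about (and is undefined in C), so the model widens the value first to show that the upper half is always zero.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Model of phys_to_pfn() after "um: kill pfn_t": the result keeps the
 * 32-bit type of its argument instead of being widened to 64 bits. */
static uint32_t phys_to_pfn_model(uint32_t p)
{
    return p >> PAGE_SHIFT;
}

/* Upper 32 bits of a 32-bit value.  Widening before the shift keeps the
 * shift count below the width of the shifted type. */
static uint32_t upper_32_bits_model(uint32_t phys)
{
    return (uint32_t)((uint64_t)phys >> 32);
}

/* Whatever 32-bit value is stored, the would-be pte_high part is zero. */
static void pte_high_selfcheck(void)
{
    assert(phys_to_pfn_model(0x12345000u) == 0x12345u);
    assert(upper_32_bits_model(UINT32_MAX) == 0);
    assert(upper_32_bits_model(phys_to_pfn_model(0x12345000u)) == 0);
}
```

Since the upper half never carries information for a 32-bit input, dropping a member that only ever stores it loses nothing in this model.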
    
    Remove the ->pte_high member from struct pte_t.
    Rename ->pte_low to ->pte.
    Adapt the pte helper macros in arch/um/include/asm/page.h.
    
    Noteworthy is the pte_copy() macro, where an smp_wmb() gets dropped.
    This write barrier doesn't seem to be paired with any read barrier,
    though, and thus was useless anyway.
    
    Fixes: 16da306 ("um: kill pfn_t")
    Signed-off-by: Nicolai Stange <nicstange@gmail.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Richard Weinberger <richard@nod.at>
    Cc: Nicolai Stange <nicstange@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. mm, hugetlb: don't require CMA for runtime gigantic pages

    tehcaster committed with
    Commit 944d9fe ("hugetlb: add support for gigantic page allocation
    at runtime") added runtime gigantic page allocation via
    alloc_contig_range(), making this support available only when CONFIG_CMA
    is enabled.  Because it doesn't depend on MIGRATE_CMA pageblocks and the
    associated infrastructure, it is possible with a few simple adjustments
    to require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.
    
    After this patch, alloc_contig_range() and related functions are
    available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
    enabled.  Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION.  This allows
    supporting runtime gigantic pages without the CMA-specific checks in
    page allocator fastpaths.
    
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Luiz Capitulino <lcapitulino@redhat.com>
    Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. mm/hugetlb: fix gigantic page initialization/allocation

    Mike Kravetz committed with
    Attempting to preallocate 1G gigantic huge pages at boot time with
    "hugepagesz=1G hugepages=1" on the kernel command line will prevent
    booting with the following:
    
      kernel BUG at mm/hugetlb.c:1218!
    
    When mapcount accounting was reworked, the setting of
    compound_mapcount_ptr in prep_compound_gigantic_page was overlooked.  As
    a result, the validation of mapcount in free_huge_page fails.
    
    The "BUG_ON" checks in free_huge_page were also changed to
    "VM_BUG_ON_PAGE" to assist with debugging.
    
    Fixes: 53f9263 ("mm: rework mapcount accounting to enable 4k mapping of THPs")
    Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
    Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Tested-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Jerome Marchand <jmarchan@redhat.com>
    Cc: Michal Hocko <mhocko@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. mm: downgrade VM_BUG in isolate_lru_page() to warning

    Kirill A. Shutemov committed with
    Calling isolate_lru_page() on a page that is not on the LRU is wrong
    and shouldn't happen, but it is not necessarily fatal: the page just
    will not be isolated.
    
    Let's downgrade the VM_BUG_ON_PAGE() to WARN_RATELIMIT().
    
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. mempolicy: do not try to queue pages from !vma_migratable()

    Kirill A. Shutemov committed with
    Maybe I'm missing something, but I don't see a reason why we try to
    queue pages from non-migratable VMAs.
    
    This testcase steps on VM_BUG_ON_PAGE() in isolate_lru_page():
    
        #include <fcntl.h>
        #include <unistd.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <numaif.h>
    
        #define SIZE 0x2000
    
        int foo;
    
        int main()
        {
            int fd;
            char *p;
            unsigned long mask = 2;
    
            fd = open("/dev/sg0", O_RDWR);
            p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
            /* Faultin pages */
            foo = p[0] + p[0x1000];
            mbind(p, SIZE, MPOL_BIND, &mask, 4, MPOL_MF_MOVE | MPOL_MF_STRICT);
            return 0;
        }
    
    The only case when we can queue pages from such a VMA is MPOL_MF_STRICT
    plus MPOL_MF_MOVE or MPOL_MF_MOVE_ALL for a VMA which has pages on LRU,
    but whose gfp mask is not suitable for migration (see the
    mapping_gfp_mask() check in vma_migratable()).  That looks like a bug
    to me.
    
    Let's filter out non-migratable VMAs at the start of
    queue_pages_test_walk() and go to queue_pages_pte_range() only if the
    MPOL_MF_MOVE or MPOL_MF_MOVE_ALL flag is set.
    
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any p…

    Tetsuo Handa committed with
    …rogress
    
    Jan Stancek has reported that the system occasionally hangs after the
    "oom01" testcase from LTP triggers OOM.  Judging from the fact that
    there is a kworker thread doing memory allocation and that the values
    between "Node 0 Normal free:" and "Node 0 Normal:" differ when hanging,
    vmstat is not up-to-date for some reason.
    
    Commit 373ccbe ("mm, vmstat: allow WQ concurrency to discover memory
    reclaim doesn't make any progress") meant to force the kworker thread
    to take a short sleep, but it mistakenly used schedule_timeout(1).  We
    missed that schedule_timeout() in state TASK_RUNNING doesn't do
    anything.
    
    Fix it by using schedule_timeout_uninterruptible(1) which forces the
    kworker thread to take a short sleep in order to make sure that vmstat
    is up-to-date.
    
    Fixes: 373ccbe ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
    Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Reported-by: Jan Stancek <jstancek@redhat.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Cristopher Lameter <clameter@sgi.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Arkadiusz Miskiewicz <arekm@maven.pl>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  18. vmstat: make vmstat_update deferrable

    Michal Hocko committed with
    Commit 0eb77e9 ("vmstat: make vmstat_updater deferrable again and
    shut down on idle") made vmstat_shepherd deferrable.  vmstat_update
    itself is still using a standard timer which might interrupt the idle
    task.  This is possible because "mm, vmstat: make quiet_vmstat lighter"
    removed cancel_delayed_work from quiet_vmstat.
    
    Change vmstat_work to use DEFERRABLE_WORK to prevent pointless
    wakeups from the idle context.
    
    Acked-by: Christoph Lameter <cl@linux.com>
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. mm, vmstat: make quiet_vmstat lighter

    Michal Hocko committed with
    Mike has reported a considerable overhead of refresh_cpu_vm_stats from
    the idle entry during pipe test:
    
        12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
         4.75%  [kernel]       [k] __schedule
         4.70%  [kernel]       [k] mutex_unlock
         3.14%  [kernel]       [k] __switch_to
    
    This is caused by commit 0eb77e9 ("vmstat: make vmstat_updater
    deferrable again and shut down on idle") which has placed quiet_vmstat
    into cpu_idle_loop.  The main reason here seems to be that the idle
    entry has to get over all zones and perform atomic operations for each
    vmstat entry even though there might be no per cpu diffs.  This is a
    pointless overhead for _each_ idle entry.
    
    Make sure that quiet_vmstat is as light as possible.
    
    First of all it doesn't make any sense to do any local sync if the
    current cpu is already set in cpu_stat_off because vmstat_update puts
    itself there only if there is nothing to do.
    
    Then we can check need_update which should be a cheap way to check for
    potential per-cpu diffs and only then do refresh_cpu_vm_stats.
    
    The original patch also did cancel_delayed_work which we are not doing
    here.  There are two reasons for that.  Firstly cancel_delayed_work from
    idle context will blow up on RT kernels (reported by Mike):
    
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
      Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
      Call Trace:
        dump_stack+0x49/0x67
        ___might_sleep+0xf5/0x180
        rt_spin_lock+0x20/0x50
        try_to_grab_pending+0x69/0x240
        cancel_delayed_work+0x26/0xe0
        quiet_vmstat+0x75/0xa0
        cpu_idle_loop+0x38/0x3e0
        cpu_startup_entry+0x13/0x20
        start_secondary+0x114/0x140
    
    And secondly, even on !RT kernels it might add some non trivial overhead
    which is not necessary.  Even if the vmstat worker wakes up and preempts
    idle then it will be most likely a single shot noop because the stats
    were already synced and so it would end up in cpu_stat_off anyway.
    We just need to teach both vmstat_shepherd and vmstat_update to stop
    scheduling the worker if there is nothing to do.
    
    [mgalbraith@suse.de: cancel pending work of the cpu_stat_off CPU]
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
    Acked-by: Christoph Lameter <cl@linux.com>
    Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
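
    The order of checks described in this message (skip CPUs already parked
    in cpu_stat_off, then the cheap need_update test, and only then the
    expensive refresh) can be sketched as a small userspace model in C.  All
    names ending in _model, struct cpu_model, NR_ITEMS, global_stat and
    refresh_calls are hypothetical and exist only for illustration; the
    sketch does not try to reproduce how a CPU gets into cpu_stat_off, nor
    the real per-zone data structures.

```c
#include <stdbool.h>

#define NR_ITEMS 3  /* hypothetical number of vmstat counters */

struct cpu_model {
    bool in_cpu_stat_off;  /* models this CPU's bit in cpu_stat_off */
    int diff[NR_ITEMS];    /* models the pending per-cpu deltas */
};

static long global_stat[NR_ITEMS]; /* models the global counters */
static int refresh_calls;          /* counts expensive refreshes */

/* Cheap check: is there any per-cpu delta waiting to be folded? */
static bool need_update_model(const struct cpu_model *c)
{
    for (int i = 0; i < NR_ITEMS; i++)
        if (c->diff[i] != 0)
            return true;
    return false;
}

/* Expensive path: fold every delta into the global counters. */
static void refresh_model(struct cpu_model *c)
{
    for (int i = 0; i < NR_ITEMS; i++) {
        global_stat[i] += c->diff[i];
        c->diff[i] = 0;
    }
    refresh_calls++;
}

/* Idle-entry hook: bail out as early as possible. */
static void quiet_vmstat_model(struct cpu_model *c)
{
    if (c->in_cpu_stat_off)     /* already parked: nothing to do */
        return;
    if (!need_update_model(c))  /* no diffs: skip the refresh */
        return;
    refresh_model(c);
}
```

    With this ordering, an idle entry on a CPU without pending diffs costs
    one flag test and one scan of the local deltas, with no updates of the
    global counters.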
  20. mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT

    tehcaster committed with
    The description mentions kswapd threads, while the deferred struct page
    initialization is actually done by one-off "pgdatinitX" threads.
    
    Fix the description so that users are not confused about pgdatinit
    threads, instead of kswapd, using CPU after boot.
    
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. memblock: don't mark memblock_phys_mem_size() as __init

    dgibson committed with
    At the moment memblock_phys_mem_size() is marked as __init, and so is
    discarded after boot.  This is different from most of the memblock
    functions which are marked __init_memblock, and are only discarded after
    boot if memory hotplug is not configured.
    
    To allow for upcoming code which will need memblock_phys_mem_size() in
    the hotplug path, change it from __init to __init_memblock.
    
    Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. dump_stack: avoid potential deadlocks

    Eric Dumazet committed with
    Some servers experienced fatal deadlocks because of a combination of
    bugs, leading to multiple cpus calling dump_stack().
    
    The checksumming bug was fixed in commit 34ae6a1 ("ipv6: update
    skb->csum when CE mark is propagated").
    
    The second problem is faulty locking in dump_stack():
    
    CPU1 runs in process context and calls dump_stack(), grabs dump_lock.
    
       CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
       calls dump_stack() from netdev_rx_csum_fault().
    
       dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
       dump_lock is owned by CPU1
    
    While dumping its stack, CPU1 is interrupted by a softirq, and happens
    to process a packet for the TCP socket locked by CPU2.
    
    CPU1 spins forever in spin_lock() : deadlock
    
    Stack trace on CPU1 looked like :
    
        NMI backtrace for cpu 1
        RIP: _raw_spin_lock+0x25/0x30
        ...
        Call Trace:
          <IRQ>
          tcp_v6_rcv+0x243/0x620
          ip6_input_finish+0x11f/0x330
          ip6_input+0x38/0x40
          ip6_rcv_finish+0x3c/0x90
          ipv6_rcv+0x2a9/0x500
          process_backlog+0x461/0xaa0
          net_rx_action+0x147/0x430
          __do_softirq+0x167/0x2d0
          call_softirq+0x1c/0x30
          do_softirq+0x3f/0x80
          irq_exit+0x6e/0xc0
          smp_call_function_single_interrupt+0x35/0x40
          call_function_single_interrupt+0x6a/0x70
          <EOI>
          printk+0x4d/0x4f
          printk_address+0x31/0x33
          print_trace_address+0x33/0x3c
          print_context_stack+0x7f/0x119
          dump_trace+0x26b/0x28e
          show_trace_log_lvl+0x4f/0x5c
          show_stack_log_lvl+0x104/0x113
          show_stack+0x42/0x44
          dump_stack+0x46/0x58
          netdev_rx_csum_fault+0x38/0x3c
          __skb_checksum_complete_head+0x6e/0x80
          __skb_checksum_complete+0x11/0x20
          tcp_rcv_established+0x2bd5/0x2fd0
          tcp_v6_do_rcv+0x13c/0x620
          sk_backlog_rcv+0x15/0x30
          release_sock+0xd2/0x150
          tcp_recvmsg+0x1c1/0xfc0
          inet_recvmsg+0x7d/0x90
          sock_recvmsg+0xaf/0xe0
          ___sys_recvmsg+0x111/0x3b0
          SyS_recvmsg+0x5c/0xb0
          system_call_fastpath+0x16/0x1b
    
    Fixes: b58d977 ("dump_stack: serialize the output from dump_stack()")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Alex Thorlton <athorlton@sgi.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
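
    The owner-based lock that the message refers to, where dump_lock holds
    -1 when free and the owning CPU number otherwise, can be modelled in
    userspace with C11 atomics.  This is a sketch under assumptions:
    dump_lock_try(), dump_lock_release() and the result enum are
    hypothetical names, and treating a second attempt by the owning CPU as
    a nested entry is an assumption about the kernel code rather than
    something this message states.

```c
#include <stdatomic.h>

enum dump_lock_result {
    DUMP_LOCK_ACQUIRED, /* lock was free, caller now owns it */
    DUMP_LOCK_NESTED,   /* caller already owned it (assumed re-entry rule) */
    DUMP_LOCK_BUSY      /* another CPU owns it, caller would spin */
};

/* -1 means unlocked; any other value is the owning CPU number. */
static atomic_int dump_lock = -1;

/* One attempt of the compare-and-exchange that the kernel retries in a loop. */
static enum dump_lock_result dump_lock_try(int cpu)
{
    int expected = -1;

    if (atomic_compare_exchange_strong(&dump_lock, &expected, cpu))
        return DUMP_LOCK_ACQUIRED;

    /* On failure, 'expected' now holds the current owner. */
    return expected == cpu ? DUMP_LOCK_NESTED : DUMP_LOCK_BUSY;
}

/* Release by the owner: mark the lock free again. */
static void dump_lock_release(void)
{
    atomic_store(&dump_lock, -1);
}
```

    In the scenario above CPU2 keeps getting the "busy" result while it
    holds the socket spinlock, and CPU1, the owner of dump_lock, is stuck
    in softirq waiting for that same spinlock, so neither side can ever
    release what the other one needs.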
  23. mm: validate_mm browse_rb SMP race condition

    Andrea Arcangeli committed with
    The mmap_sem held for reading in validate_mm, called from expand_stack,
    is not enough to prevent the augmented rbtree rb_subtree_gap
    information from changing under us, because expand_stack may be running
    concurrently in other threads, which will hold the mmap_sem for reading
    too.
    
    The augmented rbtree is updated with vma_gap_update under the
    page_table_lock, so use it in browse_rb() too to avoid false positives.
    
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Tested-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  24. m32r: fix build failure due to SMP and MMU

    sudipm-mukherjee committed with
    One of the randconfig builds failed with the error:
    
      arch/m32r/kernel/smp.c: In function 'smp_flush_tlb_mm':
      arch/m32r/kernel/smp.c:283:20: error: subscripted value is neither array nor pointer nor vector
        mmc = &mm->context[cpu_id];
                          ^
      arch/m32r/kernel/smp.c: In function 'smp_flush_tlb_page':
      arch/m32r/kernel/smp.c:353:20: error: subscripted value is neither array nor pointer nor vector
        mmc = &mm->context[cpu_id];
                          ^
      arch/m32r/kernel/smp.c: In function 'smp_invalidate_interrupt':
      arch/m32r/kernel/smp.c:479:41: error: subscripted value is neither array nor pointer nor vector
        unsigned long *mmc = &flush_mm->context[cpu_id];
    
    It turned out that CONFIG_SMP was defined but CONFIG_MMU was not.
    However, arch/m32r/include/asm/mmu.h only defines mm_context_t as an
    array when both CONFIG_SMP and CONFIG_MMU are defined, and
    arch/m32r/kernel/smp.c always uses context as an array.  So SMP cannot
    work without MMU.
    
    Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. block: fix pfn_mkwrite() DAX fault handler

    Ross Zwisler committed with
    Previously the pfn_mkwrite() fault handler for raw block devices called
    blkdev_dax_fault() -> __dax_fault() to do a full DAX page fault.
    
    Really what the pfn_mkwrite() fault handler needs to do is call
    dax_pfn_mkwrite() to make sure that the radix tree entry for the given
    PTE is marked as dirty so that a follow-up fsync or msync call will
    flush it durably to media.
    
    Fixes: 5a023cd ("block: enable dax for raw block devices")
    Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Cc: Matthew Wilcox <willy@linux.intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  26. signals: avoid random wakeups in sigsuspend()

    sashalevin committed with
    A random wakeup can get us out of sigsuspend() without TIF_SIGPENDING
    being set.
    
    Avoid that by making sure we were signaled, like sys_pause() does.
    
    Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    Acked-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
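
    The idea of only returning once we were really signaled can be shown
    with a toy userspace model in C.  It is a sketch only: a sleeping task
    is reduced to an array with one entry per wakeup, saying whether a
    signal was pending at that wakeup, and both function names are
    hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Pre-fix model: sleep once and return at the first wakeup, whatever
 * caused it. Returns the number of wakeups consumed before returning.
 */
static size_t sleep_once(const bool *signal_pending_at, size_t n)
{
    (void)signal_pending_at; /* the cause of the wakeup is never checked */
    return n > 0 ? 1 : 0;
}

/*
 * Post-fix model: go back to sleep until a wakeup arrives with a signal
 * pending. Returns the number of wakeups consumed, or n if none of the
 * recorded wakeups carried a signal.
 */
static size_t sleep_until_signaled(const bool *signal_pending_at, size_t n)
{
    size_t wakeups = 0;

    while (wakeups < n) {
        if (signal_pending_at[wakeups++])
            break; /* a real signal: stop waiting */
    }
    return wakeups;
}
```

    For a trace of two random wakeups followed by a real signal, the first
    function returns after one wakeup and the second after three.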