
Commits on Apr 12, 2021

  1. powerpc: Move copy_from_kernel_nofault_inst()

    When probe_kernel_read_inst() was created, there was no good place to
    put it, so a file called lib/inst.c was dedicated for it.
    
    Since then, probe_kernel_read_inst() has been renamed
    copy_from_kernel_nofault_inst(). mm/maccess.c did not exist at that
    time; today it is where copy_from_kernel_nofault() lives.
    
    Move copy_from_kernel_nofault_inst() into mm/maccess.c.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    chleroy authored and intel-lab-lkp committed Apr 12, 2021
  2. powerpc: Rename probe_kernel_read_inst()

    When probe_kernel_read_inst() was created, it was meant to mimic
    the probe_kernel_read() function.
    
    Since then, probe_kernel_read() has been renamed
    copy_from_kernel_nofault().
    
    Rename probe_kernel_read_inst() into copy_from_kernel_nofault_inst().
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    chleroy authored and intel-lab-lkp committed Apr 12, 2021
  3. powerpc: Make probe_kernel_read_inst() common to PPC32 and PPC64

    We have two independent versions of probe_kernel_read_inst(), one for
    PPC32 and one for PPC64.
    
    The PPC32 version is identical to the first part of the PPC64 version.
    The remaining part of the PPC64 version is not relevant for PPC32, but
    not contradictory either, so we can easily have a common function with
    the PPC64 part opted out via an IS_ENABLED(CONFIG_PPC64) check.
    
    All that is needed is to add a version of ppc_inst_prefix() for PPC32.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    chleroy authored and intel-lab-lkp committed Apr 12, 2021
  4. powerpc: Remove probe_user_read_inst()

    Its name comes from former probe_user_read() function.
    That function is now called copy_from_user_nofault().
    
    probe_user_read_inst() uses copy_from_user_nofault() to read only
    a few bytes, which is suboptimal.
    
    It does the same as get_user_inst() but in addition disables
    page faults.
    
    But on the other hand, it is not used for the time being. So remove it
    for now. If one day it is really needed, we can give it a new name
    more in line with today's naming, and implement it using get_user_inst().
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    chleroy authored and intel-lab-lkp committed Apr 12, 2021

Commits on Apr 8, 2021

  1. powerpc/32: Remove powerpc specific definition of 'ptrdiff_t'

    For an unknown reason, old commit d27dfd388715 ("Import pre2.0.8")
    changed 'ptrdiff_t' from 'int' to 'long'.
    
    GCC expects it to be 'int', and this leads to the following
    warning when building KFENCE:
    
      CC      mm/kfence/report.o
    In file included from ./include/linux/printk.h:7,
                     from ./include/linux/kernel.h:16,
                     from mm/kfence/report.c:10:
    mm/kfence/report.c: In function 'kfence_report_error':
    ./include/linux/kern_levels.h:5:18: warning: format '%td' expects argument of type 'ptrdiff_t', but argument 6 has type 'long int' [-Wformat=]
        5 | #define KERN_SOH "\001"  /* ASCII Start Of Header */
          |                  ^~~~~~
    ./include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH'
       11 | #define KERN_ERR KERN_SOH "3" /* error conditions */
          |                  ^~~~~~~~
    ./include/linux/printk.h:343:9: note: in expansion of macro 'KERN_ERR'
      343 |  printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
          |         ^~~~~~~~
    mm/kfence/report.c:213:3: note: in expansion of macro 'pr_err'
      213 |   pr_err("Out-of-bounds %s at 0x%p (%luB %s of kfence-#%td):\n",
          |   ^~~~~~
    
    <asm-generic/uapi/posix_types.h> defines it as 'int', and
    defines 'size_t' and 'ssize_t' exactly as powerpc does, so
    remove the powerpc specific definitions and fall back on the
    generic ones.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/e43d133bf52fa19e577f64f3a3a38cedc570377d.1617616601.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 8, 2021
  2. powerpc: iommu: fix build when neither PCI nor IBMVIO is set

    When neither CONFIG_PCI nor CONFIG_IBMVIO is set/enabled, iommu.c has a
    build error. The fault injection code is not useful in that kernel config,
    so make the FAIL_IOMMU option depend on PCI || IBMVIO.
    
    Prevents this build error (warning escalated to error):
    ../arch/powerpc/kernel/iommu.c:178:30: error: 'fail_iommu_bus_notifier' defined but not used [-Werror=unused-variable]
      178 | static struct notifier_block fail_iommu_bus_notifier = {
    
    Fixes: d6b9a81 ("powerpc: IOMMU fault injection")
    Reported-by: kernel test robot <lkp@intel.com>
    Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210404192623.10697-1-rdunlap@infradead.org
    rddunlap authored and mpe committed Apr 8, 2021
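A sketch of the resulting Kconfig dependency (assumed wording; the actual option lives in the powerpc Kconfig files and its prompt and help text may differ):

```
config FAIL_IOMMU
	bool "Fault-injection capability for IOMMU"
	depends on FAULT_INJECTION
	depends on PCI || IBMVIO
```

With `depends on PCI || IBMVIO`, the fault-injection code (and its otherwise-unused notifier block) is simply not built in configurations where it could never fire.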
  3. powerpc/pseries: remove unneeded semicolon

    Eliminate the following coccicheck warning:
    ./arch/powerpc/platforms/pseries/lpar.c:1633:2-3: Unneeded semicolon
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/1617672785-81372-1-git-send-email-yang.lee@linux.alibaba.com
    Yang Li authored and mpe committed Apr 8, 2021
  4. powerpc/64s: power4 nap fixup in C

    There is no need for this to be in asm, so use the new interrupt entry wrapper.
    
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Tested-by: Andreas Schwab <schwab@linux-m68k.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210406025508.821718-1-npiggin@gmail.com
    npiggin authored and mpe committed Apr 8, 2021
  5. powerpc/perf: Fix PMU constraint check for EBB events

    The power PMU group constraints include a check for EBB events to make
    sure all events in a group agree on EBB. This prevents
    scheduling EBB and non-EBB events together. But in the existing check,
    the settings for the constraint mask and value are interchanged. This
    patch fixes that.
    
    Before the patch, PMU selftest "cpu_event_pinned_vs_ebb_test" fails with
    below in dmesg logs. This happens because EBB event gets enabled along
    with a non-EBB cpu event.
    
      [35600.453346] cpu_event_pinne[41326]: illegal instruction (4)
      at 10004a18 nip 10004a18 lr 100049f8 code 1 in
      cpu_event_pinned_vs_ebb_test[10000000+10000]
    
    Test results after the patch:
    
      $ ./pmu/ebb/cpu_event_pinned_vs_ebb_test
      test: cpu_event_pinned_vs_ebb
      tags: git_version:v5.12-rc5-93-gf28c3125acd3-dirty
      Binding to cpu 8
      EBB Handler is at 0x100050c8
      read error on event 0x7fffe6bd4040!
      PM_RUN_INST_CMPL: result 9872 running/enabled 37930432
      success: cpu_event_pinned_vs_ebb
    
    This bug was hidden by other logic until commit 1908dc9 (perf:
    Tweak perf_event_attr::exclusive semantics).
    
    Fixes: 4df4899 ("powerpc/perf: Add power8 EBB support")
    Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
    Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    [mpe: Mention commit 1908dc9]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/1617725761-1464-1-git-send-email-atrajeev@linux.vnet.ibm.com
    athira-rajeev authored and mpe committed Apr 8, 2021
  6. selftests/powerpc: Suggest memtrace instead of /dev/mem for ci memory

    The suggested alternative for getting cache-inhibited memory with 'mem='
    and /dev/mem is pretty hacky. Also, PAPR guests do not allow system
    memory to be mapped cache-inhibited, so even though /dev/mem is
    available this will not work, which can cause confusion. Instead,
    recommend using the memtrace buffers. memtrace is only available on
    powernv, so there will not be any chance of trying to do this in a guest.
    
    Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210225032108.1458352-2-jniethe5@gmail.com
    iamjpn authored and mpe committed Apr 8, 2021
  7. powerpc/powernv/memtrace: Allow mmaping trace buffers

    Allow the memory that is removed from the linear mapping for use as
    trace buffers to be mmapped. This is a useful way of providing
    cache-inhibited memory for the alignment_handler selftest.
    
    Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
    [mpe: make memtrace_mmap() static as noticed by lkp@intel.com]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210225032108.1458352-1-jniethe5@gmail.com
    iamjpn authored and mpe committed Apr 8, 2021
  8. powerpc/kexec: Don't use .machine ppc64 in trampoline_64.S

    As best as I can tell the ".machine" directive in trampoline_64.S is no
    longer, or never was, necessary.
    
    It was added in commit 0d97631 ("powerpc: Add purgatory for
    kexec_file_load() implementation."), which created the file based on
    the kexec-tools purgatory. It may be, or may have been, necessary in
    the kexec-tools version, but we have a completely different build system,
    and we already pass the desired CPU flags, e.g.:
    
      gcc ... -m64 -Wl,-a64 -mabi=elfv2 -Wa,-maltivec -Wa,-mpower4 -Wa,-many
      ... arch/powerpc/purgatory/trampoline_64.S
    
    So drop the ".machine" directive and rely on the assembler flags.
    
    Reported-by: Daniel Axtens <dja@axtens.net>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
    Link: https://lore.kernel.org/r/20210315034159.315675-1-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  9. powerpc/64: Move security code into security.c

    When the original spectre/meltdown mitigations were merged we put them
    in setup_64.c for lack of a better place.
    
    Since then we created security.c for some of the other mitigation
    related code. But it should all be in there.
    
    This sort of code movement can cause trouble for backports, but
    hopefully this code is relatively stable these days (famous last words).
    
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210326101201.1973552-1-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  10. powerpc/mm/64s: Allow STRICT_KERNEL_RWX again

    We have now fixed the known bugs in STRICT_KERNEL_RWX for Book3S
    64-bit Hash and Radix MMUs, see preceding commits, so allow the
    option to be selected again.
    
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210331003845.216246-6-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  11. powerpc/mm/64s/hash: Add real-mode change_memory_range() for hash LPAR

    When we enabled STRICT_KERNEL_RWX we received some reports of boot
    failures when using the Hash MMU and running under phyp. The crashes
    are intermittent, and often exhibit as a completely unresponsive
    system, or possibly an oops.
    
    One example, which was caught in xmon:
    
      [   14.068327][    T1] devtmpfs: mounted
      [   14.069302][    T1] Freeing unused kernel memory: 5568K
      [   14.142060][  T347] BUG: Unable to handle kernel instruction fetch
      [   14.142063][    T1] Run /sbin/init as init process
      [   14.142074][  T347] Faulting instruction address: 0xc000000000004400
      cpu 0x2: Vector: 400 (Instruction Access) at [c00000000c7475e0]
          pc: c000000000004400: exc_virt_0x4400_instruction_access+0x0/0x80
          lr: c0000000001862d4: update_rq_clock+0x44/0x110
          sp: c00000000c747880
         msr: 8000000040001031
        current = 0xc00000000c60d380
        paca    = 0xc00000001ec9de80   irqmask: 0x03   irq_happened: 0x01
          pid   = 347, comm = kworker/2:1
      ...
      enter ? for help
      [c00000000c747880] c0000000001862d4 update_rq_clock+0x44/0x110 (unreliable)
      [c00000000c7478f0] c000000000198794 update_blocked_averages+0xb4/0x6d0
      [c00000000c7479f0] c000000000198e40 update_nohz_stats+0x90/0xd0
      [c00000000c747a20] c0000000001a13b4 _nohz_idle_balance+0x164/0x390
      [c00000000c747b10] c0000000001a1af8 newidle_balance+0x478/0x610
      [c00000000c747be0] c0000000001a1d48 pick_next_task_fair+0x58/0x480
      [c00000000c747c40] c000000000eaab5c __schedule+0x12c/0x950
      [c00000000c747cd0] c000000000eab3e8 schedule+0x68/0x120
      [c00000000c747d00] c00000000016b730 worker_thread+0x130/0x640
      [c00000000c747da0] c000000000174d50 kthread+0x1a0/0x1b0
      [c00000000c747e10] c00000000000e0f0 ret_from_kernel_thread+0x5c/0x6c
    
    This shows that CPU 2, which was idle, woke up and then appears to
    randomly take an instruction fault on a completely valid area of
    kernel text.
    
    The cause turns out to be the call to hash__mark_rodata_ro(), late in
    boot. Due to the way we lay out text and rodata, that function actually
    changes the permissions for all of text and rodata to read-only plus
    execute.
    
    To do the permission change we use a hypervisor call, H_PROTECT. On
    phyp that appears to be implemented by briefly removing the mapping of
    the kernel text, before putting it back with the updated permissions.
    If any other CPU is executing during that window, it will see spurious
    faults on the kernel text and/or data, leading to crashes.
    
    To fix it we use stop_machine() to collect all other CPUs, and then have
    them drop into real mode (MMU off) while we change the mapping. That
    way they are unaffected by the mapping temporarily disappearing.
    
    We don't see this bug on KVM because KVM always uses VPM=1, where
    faults are directed to the hypervisor, and the fault will be
    serialised vs the h_protect() by HPTE_V_HVLOCK.
    
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210331003845.216246-5-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  12. powerpc/mm/64s/hash: Factor out change_memory_range()

    Pull the loop calling hpte_updateboltedpp() out of
    hash__change_memory_range() into a helper function. We need it to be a
    separate function for the next patch.
    
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210331003845.216246-4-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  13. powerpc/64s: Use htab_convert_pte_flags() in hash__mark_rodata_ro()

    In hash__mark_rodata_ro() we pass the raw PP_RXXX value to
    hash__change_memory_range(). That has the effect of setting the key to
    zero, because PP_RXXX contains no key value.
    
    Fix it by using htab_convert_pte_flags(), which knows how to convert a
    pgprot into a pp value, including the key.
    
    Fixes: d94b827 ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Reviewed-by: Daniel Axtens <dja@axtens.net>
    Link: https://lore.kernel.org/r/20210331003845.216246-3-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  14. powerpc/pseries: Add key to flags in pSeries_lpar_hpte_updateboltedpp()

    The flags argument to plpar_pte_protect() (aka. H_PROTECT), includes
    the key in bits 9-13, but currently we always set those bits to zero.
    
    In the past that hasn't been a problem because we always used key 0
    for the kernel, and updateboltedpp() is only used for kernel mappings.
    
    However since commit d94b827 ("powerpc/book3s64/kuap: Use Key 3
    for kernel mapping with hash translation") we are now inadvertently
    changing the key (to zero) when we call plpar_pte_protect().
    
    That hasn't broken anything because updateboltedpp() is only used for
    STRICT_KERNEL_RWX, which is currently disabled on 64s due to other
    bugs.
    
    But we want to fix that, so first we need to pass the key correctly to
    plpar_pte_protect(). We can't pass our newpp value directly in; we
    have to convert it into the form expected by the hcall.
    
    The hcall we're using here is H_PROTECT, which is specified in section
    14.5.4.1.6 of LoPAPR v1.1.
    
    It takes a `flags` parameter, and the description for flags says:
    
     * flags: AVPN, pp0, pp1, pp2, key0-key4, n, and for the CMO
       option: CMO Option flags as defined in Table 189
    
    If you then go to the start of the parent section, 14.5.4.1, on page
    405, it says:
    
    Register Linkage (For hcall() tokens 0x04 - 0x18)
     * On Call
       * R3 function call token
       * R4 flags (see Table 178, “Page Frame Table Access flags field
         definition,” on page 401)
    
    Then you have to go to section 14.5.3, and on page 394 there is a list
    of hcalls and their tokens (table 176), and there you can see that
    H_PROTECT == 0x18.
    
    Finally you can look at table 178, on page 401, where it specifies the
    layout of the bits for the key:
    
     Bit     Function
     -----------------
     50-54 | key0-key4
    
    Those are big-endian bit numbers, converting to normal bit numbers you
    get bits 9-13, or 0x3e00.
    
    In the kernel we have:
    
      #define HPTE_R_KEY_HI		ASM_CONST(0x3000000000000000)
      #define HPTE_R_KEY_LO		ASM_CONST(0x0000000000000e00)
    
    So the LO bits of newpp are already in the right place, and the HI
    bits need to be shifted down by 48.
    
    Fixes: d94b827 ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210331003845.216246-2-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  15. powerpc/mm/64s: Add _PAGE_KERNEL_ROX

    In the past we had a fallback definition for _PAGE_KERNEL_ROX, but we
    removed that in commit d82fd29 ("powerpc/mm: Distribute platform
    specific PAGE and PMD flags and definitions") and added definitions
    for each MMU family.
    
    However we missed adding a definition for 64s, which was not really a
    bug because it's currently not used.
    
    But we'd like to use PAGE_KERNEL_ROX in a future patch so add a
    definition now.
    
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210331003845.216246-1-mpe@ellerman.id.au
    mpe committed Apr 8, 2021
  16. selftests/powerpc: Test for spurious kernel memory faults on radix

    Previously when mapping kernel memory on radix, no ptesync was
    included which would periodically lead to unhandled spurious faults.
    Mapping kernel memory is used when code patching with Strict RWX
    enabled. As suggested by Chris Riedl, turning ftrace on and off does a
    large amount of code patching so is a convenient way to see this kind
    of fault.
    
    Add a selftest to try and trigger this kind of a spurious fault. It
    tests for 30 seconds which is usually long enough for the issue to
    show up.
    
    Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
    [mpe: Rename it to better reflect what it does, rather than the symptom]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210208032957.1232102-2-jniethe5@gmail.com
    iamjpn authored and mpe committed Apr 8, 2021
  17. powerpc/64s: Fix pte update for kernel memory on radix

    When adding a PTE, a ptesync is needed to order the update of the PTE
    with subsequent accesses; otherwise a spurious fault may be raised.
    
    radix__set_pte_at() does not do this for performance gains. For
    non-kernel memory this is not an issue as any faults of this kind are
    corrected by the page fault handler. For kernel memory these faults
    are not handled. The current solution is that there is a ptesync in
    flush_cache_vmap() which should be called when mapping from the
    vmalloc region.
    
    However, map_kernel_page() does not call flush_cache_vmap(). This is
    troublesome in particular for code patching with Strict RWX on radix.
    In do_patch_instruction() the page frame that contains the instruction
    to be patched is mapped and then immediately patched. With no ordering
    or synchronization between setting up the PTE and writing to the page,
    faults are possible.
    
    As the code patching is done using __put_user_asm_goto() the resulting
    fault is obscured - but using a normal store instead it can be seen:
    
      BUG: Unable to handle kernel data access on write at 0xc008000008f24a3c
      Faulting instruction address: 0xc00000000008bd74
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
      Modules linked in: nop_module(PO+) [last unloaded: nop_module]
      CPU: 4 PID: 757 Comm: sh Tainted: P           O      5.10.0-rc5-01361-ge3c1b78c8440-dirty #43
      NIP:  c00000000008bd74 LR: c00000000008bd50 CTR: c000000000025810
      REGS: c000000016f634a0 TRAP: 0300   Tainted: P           O       (5.10.0-rc5-01361-ge3c1b78c8440-dirty)
      MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44002884  XER: 00000000
      CFAR: c00000000007c68c DAR: c008000008f24a3c DSISR: 42000000 IRQMASK: 1
    
    This results in the kind of issue reported here:
      https://lore.kernel.org/linuxppc-dev/15AC5B0E-A221-4B8C-9039-FA96B8EF7C88@lca.pw/
    
    Chris Riedl suggested a reliable way to reproduce the issue:
      $ mount -t debugfs none /sys/kernel/debug
      $ (while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done) &
    
    Turning ftrace on and off does a large amount of code patching, which
    will usually crash in less than 5 minutes, giving a trace like:
    
       ftrace-powerpc: (____ptrval____): replaced (4b473b11) != old (60000000)
       ------------[ ftrace bug ]------------
       ftrace failed to modify
       [<c000000000bf8e5c>] napi_busy_loop+0xc/0x390
        actual:   11:3b:47:4b
       Setting ftrace call site to call ftrace function
       ftrace record flags: 80000001
        (1)
        expected tramp: c00000000006c96c
       ------------[ cut here ]------------
       WARNING: CPU: 4 PID: 809 at kernel/trace/ftrace.c:2065 ftrace_bug+0x28c/0x2e8
       Modules linked in: nop_module(PO-) [last unloaded: nop_module]
       CPU: 4 PID: 809 Comm: sh Tainted: P           O      5.10.0-rc5-01360-gf878ccaf250a #1
       NIP:  c00000000024f334 LR: c00000000024f330 CTR: c0000000001a5af0
       REGS: c000000004c8b760 TRAP: 0700   Tainted: P           O       (5.10.0-rc5-01360-gf878ccaf250a)
       MSR:  900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28008848  XER: 20040000
       CFAR: c0000000001a9c98 IRQMASK: 0
       GPR00: c00000000024f330 c000000004c8b9f0 c000000002770600 0000000000000022
       GPR04: 00000000ffff7fff c000000004c8b6d0 0000000000000027 c0000007fe9bcdd8
       GPR08: 0000000000000023 ffffffffffffffd8 0000000000000027 c000000002613118
       GPR12: 0000000000008000 c0000007fffdca00 0000000000000000 0000000000000000
       GPR16: 0000000023ec37c5 0000000000000000 0000000000000000 0000000000000008
       GPR20: c000000004c8bc90 c0000000027a2d20 c000000004c8bcd0 c000000002612fe8
       GPR24: 0000000000000038 0000000000000030 0000000000000028 0000000000000020
       GPR28: c000000000ff1b68 c000000000bf8e5c c00000000312f700 c000000000fbb9b0
       NIP ftrace_bug+0x28c/0x2e8
       LR  ftrace_bug+0x288/0x2e8
       Call Trace:
         ftrace_bug+0x288/0x2e8 (unreliable)
         ftrace_modify_all_code+0x168/0x210
         arch_ftrace_update_code+0x18/0x30
         ftrace_run_update_code+0x44/0xc0
         ftrace_startup+0xf8/0x1c0
         register_ftrace_function+0x4c/0xc0
         function_trace_init+0x80/0xb0
         tracing_set_tracer+0x2a4/0x4f0
         tracing_set_trace_write+0xd4/0x130
         vfs_write+0xf0/0x330
         ksys_write+0x84/0x140
         system_call_exception+0x14c/0x230
         system_call_common+0xf0/0x27c
    
    Fix this by using a ptesync when updating kernel memory PTEs.
    
    Fixes: f1cb8f9 ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags")
    Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
    Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
    [mpe: Tidy up change log slightly]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210208032957.1232102-1-jniethe5@gmail.com
    iamjpn authored and mpe committed Apr 8, 2021
  18. powerpc: Spelling/typo fixes

    Various spelling/typo fixes.
    
    Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
    Acked-by: Randy Dunlap <rdunlap@infradead.org>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    unixbhaskar authored and mpe committed Apr 8, 2021

Commits on Apr 3, 2021

  1. powerpc: Switch to relative jump labels

    Convert powerpc to relative jump labels.
    
    Before the patch, pseries_defconfig vmlinux.o has:
    9074 __jump_table  0003f2a0  0000000000000000  0000000000000000  01321fa8  2**0
    
    With the patch, the same config gets:
    9074 __jump_table  0002a0e0  0000000000000000  0000000000000000  01321fb4  2**0
    
    Size is 258720 without the patch, 172256 with the patch.
    That's a 33% size reduction.
    
    Largely copied from commit c296146 ("arm64/kernel: jump_label:
    Switch to relative references")
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/828348da7868eda953ce023994404dfc49603b64.1616514473.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  2. powerpc/bpf: Reallocate BPF registers to volatile registers when possible on PPC32

    
    When the BPF routine doesn't call any function, the non-volatile
    registers can be reallocated to volatile registers in order to
    avoid having to save/restore them on the stack.
    
    Before this patch, the test #359 ADD default X is:
    
       0:	7c 64 1b 78 	mr      r4,r3
       4:	38 60 00 00 	li      r3,0
       8:	94 21 ff b0 	stwu    r1,-80(r1)
       c:	60 00 00 00 	nop
      10:	92 e1 00 2c 	stw     r23,44(r1)
      14:	93 01 00 30 	stw     r24,48(r1)
      18:	93 21 00 34 	stw     r25,52(r1)
      1c:	93 41 00 38 	stw     r26,56(r1)
      20:	39 80 00 00 	li      r12,0
      24:	39 60 00 00 	li      r11,0
      28:	3b 40 00 00 	li      r26,0
      2c:	3b 20 00 00 	li      r25,0
      30:	7c 98 23 78 	mr      r24,r4
      34:	7c 77 1b 78 	mr      r23,r3
      38:	39 80 00 42 	li      r12,66
      3c:	39 60 00 00 	li      r11,0
      40:	7d 8c d2 14 	add     r12,r12,r26
      44:	39 60 00 00 	li      r11,0
      48:	7d 83 63 78 	mr      r3,r12
      4c:	82 e1 00 2c 	lwz     r23,44(r1)
      50:	83 01 00 30 	lwz     r24,48(r1)
      54:	83 21 00 34 	lwz     r25,52(r1)
      58:	83 41 00 38 	lwz     r26,56(r1)
      5c:	38 21 00 50 	addi    r1,r1,80
      60:	4e 80 00 20 	blr
    
    After this patch, the same test has become:
    
       0:	7c 64 1b 78 	mr      r4,r3
       4:	38 60 00 00 	li      r3,0
       8:	94 21 ff b0 	stwu    r1,-80(r1)
       c:	60 00 00 00 	nop
      10:	39 80 00 00 	li      r12,0
      14:	39 60 00 00 	li      r11,0
      18:	39 00 00 00 	li      r8,0
      1c:	38 e0 00 00 	li      r7,0
      20:	7c 86 23 78 	mr      r6,r4
      24:	7c 65 1b 78 	mr      r5,r3
      28:	39 80 00 42 	li      r12,66
      2c:	39 60 00 00 	li      r11,0
      30:	7d 8c 42 14 	add     r12,r12,r8
      34:	39 60 00 00 	li      r11,0
      38:	7d 83 63 78 	mr      r3,r12
      3c:	38 21 00 50 	addi    r1,r1,80
      40:	4e 80 00 20 	blr
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/b94562d7d2bb21aec89de0c40bb3cd91054b65a2.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  3. powerpc/bpf: Implement extended BPF on PPC32

    Implement extended Berkeley Packet Filter (eBPF) on PPC32.
    
    Test result with test_bpf module:
    
    	test_bpf: Summary: 378 PASSED, 0 FAILED, [354/366 JIT'ed]
    
    Register mapping:
    
    	[BPF_REG_0] = r11-r12
    	/* function arguments */
    	[BPF_REG_1] = r3-r4
    	[BPF_REG_2] = r5-r6
    	[BPF_REG_3] = r7-r8
    	[BPF_REG_4] = r9-r10
    	[BPF_REG_5] = r21-r22 (Args 9 and 10 come in via the stack)
    	/* non volatile registers */
    	[BPF_REG_6] = r23-r24
    	[BPF_REG_7] = r25-r26
    	[BPF_REG_8] = r27-r28
    	[BPF_REG_9] = r29-r30
    	/* frame pointer aka BPF_REG_10 */
    	[BPF_REG_FP] = r17-r18
    	/* eBPF jit internal registers */
    	[BPF_REG_AX] = r19-r20
    	[TMP_REG] = r31
    
    As PPC32 doesn't have a redzone in the stack, a stack frame must always
    be set up in order to host at least the tail call counter.
    
    The stack frame remains for tail calls; it is set by the first callee
    and freed by the last callee.
    
    r0 is used as a temporary register as much as possible. It is referenced
    directly in the code in order to avoid misusing it, because some
    instructions interpret it as the value 0 instead of register r0
    (e.g. addi, addis, stw, lwz, ...).
    
    The following operations are not implemented:
    
    		case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */
    		case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */
    		case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */
    
    The following operations are only implemented for power of two constants:
    
    		case BPF_ALU64 | BPF_MOD | BPF_K: /* dst %= imm */
    		case BPF_ALU64 | BPF_DIV | BPF_K: /* dst /= imm */
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/61d8b149176ddf99e7d5cef0b6dc1598583ca202.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  4. powerpc/asm: Add some opcodes in asm/ppc-opcode.h for PPC32 eBPF

    The following opcodes will be needed for the implementation
    of eBPF for PPC32. Add them to asm/ppc-opcode.h:
    
    PPC_RAW_ADDE
    PPC_RAW_ADDZE
    PPC_RAW_ADDME
    PPC_RAW_MFLR
    PPC_RAW_ADDIC
    PPC_RAW_ADDIC_DOT
    PPC_RAW_SUBFC
    PPC_RAW_SUBFE
    PPC_RAW_SUBFIC
    PPC_RAW_SUBFZE
    PPC_RAW_ANDIS
    PPC_RAW_NOR
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/f7bd573a368edd78006f8a5af508c726e7ce1ed2.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  5. powerpc/bpf: Change values of SEEN_ flags

    Because PPC32 will use more non-volatile registers,
    move the SEEN_ flags to positions 0-2, which correspond to special
    registers.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/608faa1dc3ecfead649e15392abd07b00313d2ba.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  6. powerpc/bpf: Move common functions into bpf_jit_comp.c

    Move into bpf_jit_comp.c the functions that will remain common to
    PPC64 and PPC32 when we add support for eBPF on PPC32.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/2c339d77fb168ef12b213ccddfee3cb6c8ce8ae1.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  7. powerpc/bpf: Move common helpers into bpf_jit.h

    Move the functions bpf_flush_icache(), bpf_is_seen_register() and
    bpf_set_seen_register() into bpf_jit.h in order to reuse them in the
    future bpf_jit_comp32.c.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/28e8d5a75e64807d7e9d39a4b52658755e259f8c.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  8. powerpc/bpf: Change register numbering for bpf_set/is_seen_register()

    Instead of using the BPF register number as input to
    bpf_set_seen_register() and bpf_is_seen_register(), use the
    CPU register number directly.
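
The "seen" tracking is just a bitmap over CPU registers. A standalone sketch of the pattern (the function names mirror the ones in the commit, but the context struct and bit layout here are illustrative assumptions):

```c
#include <stdint.h>

struct codegen_context {
    uint32_t seen;   /* one bit per CPU register used by the program */
};

/* Mark/query a CPU register number (0..31) directly, with no
 * BPF-to-CPU register translation at the call site. */
static void bpf_set_seen_register(struct codegen_context *ctx, int reg)
{
    ctx->seen |= 1u << (31 - reg);
}

static int bpf_is_seen_register(struct codegen_context *ctx, int reg)
{
    return !!(ctx->seen & (1u << (31 - reg)));
}
```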
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/0cd2506f598e7095ea43e62dca1f472de5474a0d.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  9. powerpc/bpf: Remove classical BPF support for PPC32

    At present, PPC32 has classical BPF support.
    
    The test_bpf module exhibits some failures:
    
    	test_bpf: #298 LD_IND byte frag jited:1 ret 202 != 66 FAIL (1 times)
    	test_bpf: #299 LD_IND halfword frag jited:1 ret 51958 != 17220 FAIL (1 times)
    	test_bpf: #301 LD_IND halfword mixed head/frag jited:1 ret 51958 != 1305 FAIL (1 times)
    	test_bpf: #303 LD_ABS byte frag jited:1 ret 202 != 66 FAIL (1 times)
    	test_bpf: #304 LD_ABS halfword frag jited:1 ret 51958 != 17220 FAIL (1 times)
    	test_bpf: #306 LD_ABS halfword mixed head/frag jited:1 ret 51958 != 1305 FAIL (1 times)
    
    	test_bpf: Summary: 371 PASSED, 7 FAILED, [119/366 JIT'ed]
    
    Fixing this is not worth the effort. Instead, remove classical BPF
    support and prepare for adding extended BPF support.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/fbc3e4fcc9c8f6131d6c705212530b2aa50149ee.1616430991.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  10. powerpc/signal32: Simplify logging in sigreturn()

    In the same spirit as commit debf122 ("powerpc/signal32: Simplify logging
    in handle_rt_signal32()"), remove the intermediate 'addr' local variable.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/638fa99530beb29f82f94370057d110e91272acc.1616151715.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  11. powerpc/signal32: Convert do_setcontext[_tm]() to user access block

    Add unsafe_get_user_sigset() and transform PPC32 get_sigset_t()
    into an unsafe version unsafe_get_sigset_t().
    
    Then convert do_setcontext() and do_setcontext_tm() to use
    user_read_access_begin/end.
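
The user-access-block pattern opens the access window once, performs all reads with unsafe_* accessors that jump to a label on fault, and closes the window on both the success and failure paths. A standalone model with stubbed primitives (the real ones are kernel macros in linux/uaccess.h; everything below is an illustrative sketch, not the kernel's code):

```c
/* Stubs standing in for the kernel's user-access primitives. */
static int window_open;
#define user_read_access_begin(ptr, len) (window_open = 1)
#define user_read_access_end()           (window_open = 0)
#define unsafe_get_user(x, ptr, label)                            \
    do { if (!window_open) goto label; (x) = *(ptr); } while (0)

typedef struct { unsigned long sig; } sigset_t;

/* Unsafe variant of get_sigset_t(): only valid inside an open block. */
#define unsafe_get_sigset_t(dst, src, label) \
    unsafe_get_user((dst)->sig, &(src)->sig, label)

/* Shape of do_setcontext() after the conversion (sketch). */
static int read_sigset(sigset_t *set, const sigset_t *user_mask)
{
    if (!user_read_access_begin(user_mask, sizeof(*user_mask)))
        return -1;
    unsafe_get_sigset_t(set, user_mask, failed);
    user_read_access_end();
    return 0;
failed:
    user_read_access_end();
    return -1;
}
```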
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/9273ba664db769b8d9c7540ae91395e346e4945e.1616151715.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  12. powerpc/signal32: Convert restore_[tm]_user_regs() to user access block

    Convert restore_user_regs() and restore_tm_user_regs()
    to use user_read_access_begin/end blocks.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/181adf15a6f644efcd1aeafb355f3578ff1b6bc5.1616151715.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021
  13. powerpc/signal32: Reorder user reads in restore_tm_user_regs()

    In restore_tm_user_regs(), regroup the reads from 'sr' and the ones
    from 'tm_sr' together, in order to allow two blocks of user accesses
    in a following patch.
    
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/7c518b9a4c8e5ae9a3bfb647bc8b20bf820233af.1616151715.git.christophe.leroy@csgroup.eu
    chleroy authored and mpe committed Apr 3, 2021