Commits

Yang-Shi/Intro…

Commits on Oct 5, 2022

  1. md: dm-crypt: use mempool page bulk allocator

    When dm-crypt is used for full disk encryption, it allocates an out bio
    with the same number of pages as the in bio for encryption.  It currently
    allocates those pages one at a time in a loop, which is inefficient.  Use
    the mempool page bulk allocator instead.
    
    The mempool page bulk allocator improves IOPS with 1M I/O by
    approximately 6%.  The test was run on a VM with 80 vCPUs and 64GB of
    memory against an encrypted ram device, which minimizes the impact of
    storage hardware so that the dm-crypt layer can be benchmarked more
    accurately.
    
    Before the patch:
    Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=402MiB/s][r=0,w=402 IOPS][eta 00m:00s]
    crypt: (groupid=0, jobs=1): err= 0: pid=233950: Thu Sep 15 16:23:10 2022
      write: IOPS=402, BW=403MiB/s (423MB/s)(23.6GiB/60002msec)
        slat (usec): min=2425, max=3819, avg=2480.84, stdev=34.00
        clat (usec): min=7, max=165751, avg=156398.72, stdev=4691.03
         lat (msec): min=2, max=168, avg=158.88, stdev= 4.69
        clat percentiles (msec):
         |  1.00th=[  157],  5.00th=[  157], 10.00th=[  157], 20.00th=[  157],
         | 30.00th=[  157], 40.00th=[  157], 50.00th=[  157], 60.00th=[  157],
         | 70.00th=[  157], 80.00th=[  157], 90.00th=[  157], 95.00th=[  157],
         | 99.00th=[  159], 99.50th=[  159], 99.90th=[  165], 99.95th=[  165],
         | 99.99th=[  167]
       bw (  KiB/s): min=405504, max=413696, per=99.71%, avg=411845.53, stdev=1155.04, samples=120
       iops        : min=  396, max=  404, avg=402.17, stdev= 1.15, samples=120
      lat (usec)   : 10=0.01%
      lat (msec)   : 4=0.01%, 10=0.01%, 20=0.02%, 50=0.05%, 100=0.08%
      lat (msec)   : 250=100.09%
      cpu          : usr=3.74%, sys=95.66%, ctx=27, majf=0, minf=4
      IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=103.1%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
         issued rwts: total=0,24138,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=64
    
    Run status group 0 (all jobs):
      WRITE: bw=403MiB/s (423MB/s), 403MiB/s-403MiB/s (423MB/s-423MB/s), io=23.6GiB (25.4GB), run=60002-60002msec
    
    After the patch:
    Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=430MiB/s][r=0,w=430 IOPS][eta 00m:00s]
    crypt: (groupid=0, jobs=1): err= 0: pid=288730: Thu Sep 15 16:25:39 2022
      write: IOPS=430, BW=431MiB/s (452MB/s)(25.3GiB/60002msec)
        slat (usec): min=2253, max=3213, avg=2319.49, stdev=34.29
        clat (usec): min=6, max=149337, avg=146257.68, stdev=4239.52
         lat (msec): min=2, max=151, avg=148.58, stdev= 4.24
        clat percentiles (msec):
         |  1.00th=[  146],  5.00th=[  146], 10.00th=[  146], 20.00th=[  146],
         | 30.00th=[  146], 40.00th=[  146], 50.00th=[  146], 60.00th=[  146],
         | 70.00th=[  146], 80.00th=[  146], 90.00th=[  148], 95.00th=[  148],
         | 99.00th=[  148], 99.50th=[  148], 99.90th=[  150], 99.95th=[  150],
         | 99.99th=[  150]
       bw (  KiB/s): min=438272, max=442368, per=99.73%, avg=440463.57, stdev=1305.60, samples=120
       iops        : min=  428, max=  432, avg=430.12, stdev= 1.28, samples=120
      lat (usec)   : 10=0.01%
      lat (msec)   : 4=0.01%, 10=0.01%, 20=0.02%, 50=0.05%, 100=0.09%
      lat (msec)   : 250=100.07%
      cpu          : usr=3.78%, sys=95.37%, ctx=12778, majf=0, minf=4
      IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=103.1%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
         issued rwts: total=0,25814,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=64
    
    Run status group 0 (all jobs):
      WRITE: bw=431MiB/s (452MB/s), 431MiB/s-431MiB/s (452MB/s-452MB/s), io=25.3GiB (27.1GB), run=60002-60002msec
    
    The function tracing also shows the time consumed by page allocations is
    reduced significantly.  The test allocated 1M (256 pages) bio in the same
    environment.
    
    Before the patch:
    Excluding the bio_add_page() calls, it took approximately 600us.
    2720.630754 |   56)  xfs_io-38859  |   2.571 us    |    mempool_alloc();
    2720.630757 |   56)  xfs_io-38859  |   0.937 us    |    bio_add_page();
    2720.630758 |   56)  xfs_io-38859  |   1.772 us    |    mempool_alloc();
    2720.630760 |   56)  xfs_io-38859  |   0.852 us    |    bio_add_page();
    ….
    2720.631559 |   56)  xfs_io-38859  |   2.058 us    |    mempool_alloc();
    2720.631561 |   56)  xfs_io-38859  |   0.717 us    |    bio_add_page();
    2720.631562 |   56)  xfs_io-38859  |   2.014 us    |    mempool_alloc();
    2720.631564 |   56)  xfs_io-38859  |   0.620 us    |    bio_add_page();
    
    After the patch:
    It took approximately 30us.
    11564.266385 |   22) xfs_io-136183  | + 30.551 us   |    __alloc_pages_bulk();
    
    The function trace shows that page allocation overhead is around 6%
    (600us/9853us) in the dm-crypt layer, which matches the IOPS data
    shown by fio.
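
As a rough cross-check, the percentages quoted in this message can be recomputed from the fio output and the function trace. This is only a sketch in Python: the helper name `pct` is made up for illustration, and every input is copied from the numbers above.

```python
# Cross-check of the percentages using numbers quoted in this commit message.
# The helper name below is illustrative only.

def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

# Function trace: ~600us of page allocation out of ~9853us in the dm-crypt layer.
alloc_overhead = pct(600, 9853)

# fio: average IOPS before and after the patch.
iops_before, iops_after = 402.17, 430.12
iops_gain = pct(iops_after - iops_before, iops_before)

# fio: average total latency (msec) before and after the patch.
lat_before, lat_after = 158.88, 148.58
lat_drop = pct(lat_before - lat_after, lat_before)

print(f"allocation overhead: {alloc_overhead:.1f}%")
print(f"IOPS gain:           {iops_gain:.1f}%")
print(f"latency reduction:   {lat_drop:.1f}%")
```

All three values fall between 6% and 7%, which is in line with the approximate figure given in the message.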
    
    The benchmark with 4K I/O shows no measurable regression.
    
    Signed-off-by: Yang Shi <shy828301@gmail.com>
    yang-shi authored and intel-lab-lkp committed Oct 5, 2022
    2e43952
  2. md: dm-crypt: move crypt_free_buffer_pages ahead

    Move crypt_free_buffer_pages() before crypt_alloc_buffer() so that an
    extra declaration is no longer needed.
    
    Signed-off-by: Yang Shi <shy828301@gmail.com>
    yang-shi authored and intel-lab-lkp committed Oct 5, 2022
    cca58c1
  3. mm: mempool: introduce page bulk allocator

    The page bulk allocator, which allocates order-0 pages in bulk, was
    introduced in v5.13.  A few mempool callers do order-0 page allocation
    in a loop, for example dm-crypt and f2fs compress, so a mempool page
    bulk allocator seems useful.  Introduce the mempool page bulk allocator.
    
    It introduces the below APIs:
      - mempool_init_pages_bulk()
      - mempool_create_pages_bulk()
    They initialize the mempool for page bulk allocator.  The pool is filled
    by alloc_page() in a loop.
    
      - mempool_alloc_pages_bulk_list()
      - mempool_alloc_pages_bulk_array()
    They do bulk allocation from mempool.
    Conceptually they do the following:
      1. Call the bulk page allocator.
      2. If the allocation is fulfilled, return; otherwise try to allocate
         the remaining pages from the mempool.
      3. If it is fulfilled, return; otherwise retry from #1 with a
         sleepable gfp.
      4. If it still fails, sleep for a while to wait for the mempool to be
         refilled, then retry from #1.
    The populated pages stay on the list or array until the caller
    consumes or frees them.
    The mempool allocator is guaranteed to succeed in sleepable context, so
    the two APIs return true on success and false on failure.  It is the
    caller's responsibility to handle the failure case (partial allocation),
    just like with the page bulk allocator.
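
The four conceptual steps can be modelled in a few lines of Python. This is a toy sketch, not the kernel implementation: `PoolModel`, `alloc_pages_bulk_model`, the `bulk_alloc` callback and the `max_rounds` bound are all invented for illustration, and "sleeping" is reduced to simply going around the loop again.

```python
from typing import Callable, List, Tuple

Page = object  # stand-in for a struct page; purely illustrative


class PoolModel:
    """Toy mempool: a reserve of pre-allocated pages."""

    def __init__(self, min_nr: int) -> None:
        # The commit says the pool is filled by alloc_page() in a loop.
        self.reserve: List[Page] = [Page() for _ in range(min_nr)]

    def take(self, nr: int) -> List[Page]:
        """Hand out up to nr pages from the reserve."""
        taken, self.reserve = self.reserve[:nr], self.reserve[nr:]
        return taken


def alloc_pages_bulk_model(
    pool: PoolModel,
    nr: int,
    bulk_alloc: Callable[[int], List[Page]],
    can_sleep: bool,
    max_rounds: int = 4,  # hypothetical bound so the model always terminates
) -> Tuple[bool, List[Page]]:
    """Return (fulfilled, pages); pages may be a partial allocation."""
    pages: List[Page] = []
    for _ in range(max_rounds):
        # Step 1: ask the bulk page allocator for whatever is still missing.
        pages += bulk_alloc(nr - len(pages))
        if len(pages) == nr:
            return True, pages
        # Step 2: top up from the mempool reserve.
        pages += pool.take(nr - len(pages))
        if len(pages) == nr:
            return True, pages
        # Steps 3 and 4: only a sleepable caller goes around again. Real code
        # would wait for the pool to be refilled; the model just retries.
        if not can_sleep:
            break
    return False, pages
```

With a bulk allocator that yields at most two pages per call and a reserve of four, a request for eight pages ends with six pages and a false return for a non-sleepable caller, while a sleepable caller is fulfilled on the second round. In both cases the pages gathered so far are handed back to the caller, which mirrors the partial-allocation rule described above.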
    
    A mempool is typically an object-agnostic allocator, but bulk allocation
    is only supported for pages, so the mempool bulk allocator is for page
    allocation only as well.
    
    Signed-off-by: Yang Shi <shy828301@gmail.com>
    yang-shi authored and intel-lab-lkp committed Oct 5, 2022
    439333b
  4. mm: mempool: extract common initialization code

    Extract the common initialization code to __mempool_init() and
    __mempool_create().  This will make adding mempool bulk init
    code easier.
    
    Signed-off-by: Yang Shi <shy828301@gmail.com>
    yang-shi authored and intel-lab-lkp committed Oct 5, 2022
    d431ec3

Commits on Oct 4, 2022

  1. dm clone: Fix typo in block_device format specifier

    Use %pg for printing the block device name, instead of %pd.
    
    Fixes: 385411f ("dm: stop using bdevname")
    Cc: stable@vger.kernel.org # v5.18+
    Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    ntsiron authored and Mike Snitzer committed Oct 4, 2022
    a871fb2
  2. dm: remove unnecessary assignment statement in alloc_dev()

    Fixes: 74fe6ba ("dm: convert to blk_alloc_disk/blk_cleanup_disk")
    Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Genjian Zhang authored and Mike Snitzer committed Oct 4, 2022
    460fde1
  3. dm verity: Add documentation for try_verify_in_tasklet option

    Add documentation that was missing from commit 5721d4e ("dm
    verity: Add optional "try_verify_in_tasklet" feature").
    
    Signed-off-by: Milan Broz <gmazyland@gmail.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    mbroz authored and Mike Snitzer committed Oct 4, 2022
    7e05e0d
  4. dm: support allocating error strings to enhance errors returned to userspace
    
    Previously, ti->error and ti_error strings were pointing to statically
    allocated memory (the .rodata section) and it was not possible to add
    parameters to the error strings.
    
    This commit makes it possible to allocate error strings dynamically
    using the "kasprintf" function, so we can add arbitrary parameters to
    the strings.
    
    We need to free the error string only if it was allocated with
    kasprintf.  So, we introduce a function "dm_free_error" that tests if
    the string points to the module area (and doesn't free the string if it
    does), then calls kfree_const.  kfree_const detects if the string points
    to the kernel .rodata section and frees the string if it is not in
    .rodata.
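
The two-stage decision can be pictured with a small Python model. It is a sketch only: the `Region` type, the address ranges and the function name `error_string_is_freed` are made up, and real kernel address checks are of course not done this way.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    end: int  # exclusive

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


# Made-up address ranges, chosen only so the example is easy to read.
RODATA = Region(0x1000, 0x2000)       # kernel .rodata
MODULE_AREA = Region(0x8000, 0x9000)  # module area (static strings in modules)


def error_string_is_freed(addr: int) -> bool:
    """Model of the decision described above: True if the string is freed."""
    if MODULE_AREA.contains(addr):
        return False  # static string inside a module: dm_free_error skips it
    if RODATA.contains(addr):
        return False  # static string in the kernel image: kfree_const skips it
    return True       # anything else is assumed to come from kasprintf
```

The order matters in this model for the same reason it does in the description: the module-area test has to come first because, by itself, the .rodata test would not recognise a static string that lives in a module.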
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Oct 4, 2022
    a336ab1

Commits on Sep 7, 2022

  1. mm: export is_vmalloc_or_module_addr

    Export is_vmalloc_or_module_addr - device mapper needs it to determine
    if the error string is statically or dynamically allocated.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Sep 7, 2022
    fb4a181
  2. dm cache: delete the redundant word 'each' in comment

    Signed-off-by: Shaomin Deng <dengshaomin@cdjrlc.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Shaomin Deng authored and Mike Snitzer committed Sep 7, 2022
    47e6357
  3. dm raid: fix typo in analyse_superblocks code comment

    Reported-by: k2ci <kernel-bot@kylinos.cn>
    Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Jiangshan Yi authored and Mike Snitzer committed Sep 7, 2022
    f924fd1

Commits on Aug 31, 2022

  1. dm verity: enable WQ_HIGHPRI on verify_wq

    WQ_HIGHPRI increases throughput and decreases disk latency when using
    dm-verity. This is important in Android for camera startup speed.
    
    The following tests were run by doing 60 seconds of random reads using
    a dm-verity device backed by two ramdisks.
    
    Without WQ_HIGHPRI:
    lat (usec): min=13, max=3947, avg=69.53, stdev=50.55
    READ: bw=51.1MiB/s (53.6MB/s), 51.1MiB/s-51.1MiB/s (53.6MB/s-53.6MB/s)
    
    With WQ_HIGHPRI:
    lat (usec): min=13, max=7854, avg=31.15, stdev=30.42
    READ: bw=116MiB/s (121MB/s), 116MiB/s-116MiB/s (121MB/s-121MB/s)
    
    Further testing was done by measuring how long it takes to open a
    camera on an Android device.
    
    Without WQ_HIGHPRI:
    Total verity work queue wait times (ms):
    880.960, 789.517, 898.852
    
    With WQ_HIGHPRI:
    Total verity work queue wait times (ms):
    528.824, 439.191, 433.300
    
    The average time to open the camera is reduced by 350ms (or 40-50%).
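
The relative changes in the numbers above can be recomputed with a short Python sketch. The variable names are illustrative. Note that it averages the work queue wait times, which is a different quantity from the camera open time, so it does not reproduce the 350ms figure.

```python
from statistics import mean

# Total verity work queue wait times (ms), copied from the message above.
wait_without = [880.960, 789.517, 898.852]  # without WQ_HIGHPRI
wait_with = [528.824, 439.191, 433.300]     # with WQ_HIGHPRI

avg_without = mean(wait_without)
avg_with = mean(wait_with)
saved_ms = avg_without - avg_with
saved_pct = 100.0 * saved_ms / avg_without

# Random-read results from the ramdisk test.
bw_ratio = 116 / 51.1       # bandwidth (MiB/s), with vs without
lat_ratio = 69.53 / 31.15   # average latency (usec), without vs with

print(f"average wait: {avg_without:.1f} ms -> {avg_with:.1f} ms")
print(f"wait time saved: {saved_ms:.1f} ms ({saved_pct:.1f}%)")
print(f"bandwidth ratio: {bw_ratio:.2f}x, latency ratio: {lat_ratio:.2f}x")
```

The average wait time drops by roughly 45%, which sits inside the 40-50% range quoted, and both bandwidth and average latency improve by a bit more than a factor of two.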
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    nhukc authored and Mike Snitzer committed Aug 31, 2022
    c8e2f5c
  2. dm raid: delete the redundant word 'that' in comment

    Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    prepcy authored and Mike Snitzer committed Aug 31, 2022
    8093333
  3. dm ioctl: add an option to return an error string to userspace

    Introduce a new flag DM_RETURN_ERROR_FLAG. This flag should only be
    set on table load ioctl. When this flag is present and table load
    fails, the error string is returned in the "name" field of the ioctl
    and the error code in the "error" field. The flag DM_RETURN_ERROR_FLAG
    will always be cleared and the table load ioctl always returns 0.
    
    The reason for always returning 0 is there are some error paths where
    an error string cannot be set (e.g. -EFAULT). So if userspace sets
    DM_RETURN_ERROR_FLAG, and the table load ioctl returns an error,
    userspace should not parse the dm ioctl structure.
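
How a caller might act on these rules is sketched below in Python. It is a model under stated assumptions, not a real interface: `TableLoadResult` and `describe_failure` are invented names, the structure carries only the two fields mentioned above, and treating a non-zero "error" field as the failure marker is an assumption rather than something this message states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableLoadResult:
    """Illustrative stand-in for what userspace sees after a table load."""
    ioctl_ret: int   # return value of the ioctl() call itself
    error: int = 0   # "error" field of the dm ioctl structure
    name: str = ""   # "name" field, reused for the error string


def describe_failure(res: TableLoadResult) -> Optional[str]:
    """Return a message for a failed load, or None if the load succeeded."""
    if res.ioctl_ret != 0:
        # The ioctl itself failed (e.g. -EFAULT): do not parse the structure.
        return f"table load failed (ioctl returned {res.ioctl_ret})"
    if res.error != 0:
        # Assumption: a non-zero "error" field marks the failure case.
        return f"table load failed ({res.error}): {res.name}"
    return None
```

The first branch never touches `error` or `name`, which matches the rule that the structure must not be parsed when the ioctl itself returns an error.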
    
    Requested-by: Milan Broz <gmazyland@gmail.com>
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Aug 31, 2022
    0abd2cd
  4. dm: change from DMWARN to DMERR or DMCRIT for fatal errors

    Change DMWARN to DMERR in cases when there is an unrecoverable error.
    Change DMWARN to DMCRIT when handling of a case is unimplemented.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Aug 31, 2022
    e59e8d2

Commits on Aug 22, 2022

  1. Linux 6.0-rc2

    torvalds committed Aug 22, 2022
    1c23f9e

Commits on Aug 21, 2022

  1. Merge tag 'irq-urgent-2022-08-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull irq fixes from Ingo Molnar:
     "Misc irqchip fixes: LoongArch driver fixes and a Hyper-V IOMMU fix"
    
    * tag 'irq-urgent-2022-08-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      irqchip/loongson-liointc: Fix an error handling path in liointc_init()
      irqchip/loongarch: Fix irq_domain_alloc_fwnode() abuse
      irqchip/loongson-pch-pic: Move find_pch_pic() into CONFIG_ACPI
      irqchip/loongson-eiointc: Fix a build warning
      irqchip/loongson-eiointc: Fix irq affinity setting
      iommu/hyper-v: Use helper instead of directly accessing affinity
    torvalds committed Aug 21, 2022
    4daa6a8
  2. Merge tag 'perf-urgent-2022-08-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
    
    Pull x86 kprobes fix from Ingo Molnar:
     "Fix a kprobes bug in JNG/JNLE emulation when a kprobe is installed at
      such instructions, possibly resulting in incorrect execution (the
      wrong branch taken)"
    
    * tag 'perf-urgent-2022-08-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      x86/kprobes: Fix JNG/JNLE emulation
    torvalds committed Aug 21, 2022
    4f61f84
  3. Merge tag 'trace-v6.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
    
    Pull tracing fixes from Steven Rostedt:
     "Various fixes for tracing:
    
       - Fix a return value of traceprobe_parse_event_name()
    
       - Fix NULL pointer dereference from failed ftrace enabling
    
       - Fix NULL pointer dereference when asking for registers from eprobes
    
       - Make eprobes consistent with kprobes/uprobes, filters and
         histograms"
    
    * tag 'trace-v6.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
      tracing: Have filter accept "common_cpu" to be consistent
      tracing/probes: Have kprobes and uprobes use $COMM too
      tracing/eprobes: Have event probes be consistent with kprobes and uprobes
      tracing/eprobes: Fix reading of string fields
      tracing/eprobes: Do not hardcode $comm as a string
      tracing/eprobes: Do not allow eprobes to use $stack, or % for regs
      ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead
      tracing/perf: Fix double put of trace event when init fails
      tracing: React to error return from traceprobe_parse_event_name()
    torvalds committed Aug 21, 2022
    7fb312d
  4. tracing: Have filter accept "common_cpu" to be consistent

    Make filtering consistent with histograms. As "cpu" can be a field of an
    event, allow for "common_cpu" to keep it from being confused with the
    "cpu" field of the event.
    
    Link: https://lkml.kernel.org/r/20220820134401.513062765@goodmis.org
    Link: https://lore.kernel.org/all/20220820220920.e42fa32b70505b1904f0a0ad@kernel.org/
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Fixes: 1e3bac7 ("tracing/histogram: Rename "cpu" to "common_cpu"")
    Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    b238057
  5. tracing/probes: Have kprobes and uprobes use $COMM too

    Both $comm and $COMM can be used to get current->comm in eprobes and the
    filtering and histogram logic. Make kprobes and uprobes consistent in this
    regard and allow both $comm and $COMM as well. Currently kprobes and
    uprobes only handle $comm, which is inconsistent with the other utilities,
    and can be confusing to users.
    
    Link: https://lkml.kernel.org/r/20220820134401.317014913@goodmis.org
    Link: https://lore.kernel.org/all/20220820220442.776e1ddaf8836e82edb34d01@kernel.org/
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Fixes: 5330592 ("tracing: probeevent: Introduce new argument fetching code")
    Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    ab83844
  6. tracing/eprobes: Have event probes be consistent with kprobes and uprobes
    
    Currently, attempting to use a symbol "@" with an event probe (eprobe)
    causes a NULL pointer dereference crash.
    
    Both kprobes and uprobes can reference data other than the main
    registers, such as immediate addresses, symbols and the current task
    name.  Have eprobes do the same thing.
    
    For "comm", if "comm" is used and the event being attached to does not
    have a "comm" field, then make it the "$comm" that kprobes has.  This is
    consistent with the way histograms and filters work.
    
    Link: https://lkml.kernel.org/r/20220820134401.136924220@goodmis.org
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Fixes: 7491e2c ("tracing: Add a probe that attaches to trace events")
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    6a832ec
  7. tracing/eprobes: Fix reading of string fields

    Currently when an event probe (eprobe) hooks to a string field, it does
    not display it as a string, but instead as a number. This makes the field
    rather useless. Handle the different kinds of strings, dynamic, static,
    relational/dynamic etc.
    
    Now when a string field is used, the ":string" type can be used to display
    it:
    
      echo "e:sw sched/sched_switch comm=$next_comm:string" > dynamic_events
    
    Link: https://lkml.kernel.org/r/20220820134400.959640191@goodmis.org
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Fixes: 7491e2c ("tracing: Add a probe that attaches to trace events")
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    f04dec9
  8. tracing/eprobes: Do not hardcode $comm as a string

    The variable $comm is hard coded as a string, which is true for both
    kprobes and uprobes, but for event probes (eprobes) it is a field name. In
    most cases the "comm" field would be a string, but there's no guarantee of
    that fact.
    
    Do not assume that comm is a string. Not to mention, it currently forces
    comm fields to fault, as string processing for event probes is currently
    broken.
    
    Link: https://lkml.kernel.org/r/20220820134400.756152112@goodmis.org
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Fixes: 7491e2c ("tracing: Add a probe that attaches to trace events")
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    02333de
  9. tracing/eprobes: Do not allow eprobes to use $stack, or % for regs

    While playing with event probes (eprobes), I tried to see what would
    happen if I attempted to retrieve the instruction pointer (%rip) knowing
    that event probes do not use pt_regs. The result was:
    
     BUG: kernel NULL pointer dereference, address: 0000000000000024
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 0 P4D 0
     Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 1847 Comm: trace-cmd Not tainted 5.19.0-rc5-test+ #309
     Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01
    v03.03 07/14/2016
     RIP: 0010:get_event_field.isra.0+0x0/0x50
     Code: ff 48 c7 c7 c0 8f 74 a1 e8 3d 8b f5 ff e8 88 09 f6 ff 4c 89 e7 e8
    50 6a 13 00 48 89 ef 5b 5d 41 5c 41 5d e9 42 6a 13 00 66 90 <48> 63 47 24
    8b 57 2c 48 01 c6 8b 47 28 83 f8 02 74 0e 83 f8 04 74
     RSP: 0018:ffff916c394bbaf0 EFLAGS: 00010086
     RAX: ffff916c854041d8 RBX: ffff916c8d9fbf50 RCX: ffff916c255d2000
     RDX: 0000000000000000 RSI: ffff916c255d2008 RDI: 0000000000000000
     RBP: 0000000000000000 R08: ffff916c3a2a0c08 R09: ffff916c394bbda8
     R10: 0000000000000000 R11: 0000000000000000 R12: ffff916c854041d8
     R13: ffff916c854041b0 R14: 0000000000000000 R15: 0000000000000000
     FS:  0000000000000000(0000) GS:ffff916c9ea40000(0000)
    knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 0000000000000024 CR3: 000000011b60a002 CR4: 00000000001706e0
     Call Trace:
      <TASK>
      get_eprobe_size+0xb4/0x640
      ? __mod_node_page_state+0x72/0xc0
      __eprobe_trace_func+0x59/0x1a0
      ? __mod_lruvec_page_state+0xaa/0x1b0
      ? page_remove_file_rmap+0x14/0x230
      ? page_remove_rmap+0xda/0x170
      event_triggers_call+0x52/0xe0
      trace_event_buffer_commit+0x18f/0x240
      trace_event_raw_event_sched_wakeup_template+0x7a/0xb0
      try_to_wake_up+0x260/0x4c0
      __wake_up_common+0x80/0x180
      __wake_up_common_lock+0x7c/0xc0
      do_notify_parent+0x1c9/0x2a0
      exit_notify+0x1a9/0x220
      do_exit+0x2ba/0x450
      do_group_exit+0x2d/0x90
      __x64_sys_exit_group+0x14/0x20
      do_syscall_64+0x3b/0x90
      entry_SYSCALL_64_after_hwframe+0x46/0xb0
    
    Obviously this is not the desired result.
    
    Move the testing for TPARG_FL_TPOINT which is only used for event probes
    to the top of the "$" variable check, as all the other variables are not
    used for event probes. Also add a check in the register parsing "%" to
    fail if an event probe is used.
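
The ordering of the checks can be illustrated with a toy argument classifier in Python. This is a simplified model and not the kernel's parser: `classify_fetch_arg`, its return strings and the short list of kprobe variables are invented for the example, and the event-probe case is reduced to a lookup in a set of field names.

```python
from typing import Optional, Set


def classify_fetch_arg(arg: str, event_fields: Optional[Set[str]] = None) -> str:
    """Toy classifier for one probe argument; not the kernel's parser.

    event_fields is None for a kprobe/uprobe, or the set of field names
    of the attached event for an event probe.
    """
    is_event_probe = event_fields is not None
    if arg.startswith("$"):
        name = arg[1:]
        if is_event_probe:
            # Tested first: "$name" can only be a field of the event here,
            # so "$stack" and friends never reach the pt_regs-based code.
            if name not in event_fields:
                raise ValueError(f"event has no field {name!r}")
            return f"event field {name}"
        if name in ("stack", "retval", "comm"):
            return f"probe variable {name}"
        raise ValueError(f"unknown variable {arg!r}")
    if arg.startswith("%"):
        if is_event_probe:
            # Event probes do not use pt_regs, so registers must be rejected.
            raise ValueError(f"{arg!r}: no registers in an event probe")
        return f"register {arg[1:]}"
    raise ValueError(f"unsupported argument {arg!r}")
```

In this model a register such as "%ip" is accepted for a kprobe but raises an error for an event probe, instead of reaching code that would dereference missing registers.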
    
    Link: https://lkml.kernel.org/r/20220820134400.564426983@goodmis.org
    
    Cc: stable@vger.kernel.org
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Fixes: 7491e2c ("tracing: Add a probe that attaches to trace events")
    Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    2673c60
  10. ftrace: Fix NULL pointer dereference in is_ftrace_trampoline when ftrace is dead
    
    ftrace_startup does not remove ops from ftrace_ops_list when
    ftrace_startup_enable fails:
    
    register_ftrace_function
      ftrace_startup
        __register_ftrace_function
          ...
          add_ftrace_ops(&ftrace_ops_list, ops)
          ...
        ...
        ftrace_startup_enable // if ftrace failed to modify, ftrace_disabled is set to 1
        ...
      return 0 // ops is in the ftrace_ops_list.
    
    When ftrace_disabled = 1, unregister_ftrace_function simply returns without doing anything:
    unregister_ftrace_function
      ftrace_shutdown
        if (unlikely(ftrace_disabled))
                return -ENODEV;  // return here, __unregister_ftrace_function is not executed,
                                 // as a result, ops is still in the ftrace_ops_list
        __unregister_ftrace_function
        ...
    
    If ops is dynamically allocated, it will be freed later.  In that case,
    is_ftrace_trampoline accesses a NULL pointer:
    
    is_ftrace_trampoline
      ftrace_ops_trampoline
        do_for_each_ftrace_op(op, ftrace_ops_list) // OOPS! op may be NULL!
    
    Syzkaller reports as follows:
    [ 1203.506103] BUG: kernel NULL pointer dereference, address: 000000000000010b
    [ 1203.508039] #PF: supervisor read access in kernel mode
    [ 1203.508798] #PF: error_code(0x0000) - not-present page
    [ 1203.509558] PGD 800000011660b067 P4D 800000011660b067 PUD 130fb8067 PMD 0
    [ 1203.510560] Oops: 0000 [#1] SMP KASAN PTI
    [ 1203.511189] CPU: 6 PID: 29532 Comm: syz-executor.2 Tainted: G    B   W         5.10.0 #8
    [ 1203.512324] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    [ 1203.513895] RIP: 0010:is_ftrace_trampoline+0x26/0xb0
    [ 1203.514644] Code: ff eb d3 90 41 55 41 54 49 89 fc 55 53 e8 f2 00 fd ff 48 8b 1d 3b 35 5d 03 e8 e6 00 fd ff 48 8d bb 90 00 00 00 e8 2a 81 26 00 <48> 8b ab 90 00 00 00 48 85 ed 74 1d e8 c9 00 fd ff 48 8d bb 98 00
    [ 1203.518838] RSP: 0018:ffffc900012cf960 EFLAGS: 00010246
    [ 1203.520092] RAX: 0000000000000000 RBX: 000000000000007b RCX: ffffffff8a331866
    [ 1203.521469] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 000000000000010b
    [ 1203.522583] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8df18b07
    [ 1203.523550] R10: fffffbfff1be3160 R11: 0000000000000001 R12: 0000000000478399
    [ 1203.524596] R13: 0000000000000000 R14: ffff888145088000 R15: 0000000000000008
    [ 1203.525634] FS:  00007f429f5f4700(0000) GS:ffff8881daf00000(0000) knlGS:0000000000000000
    [ 1203.526801] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1203.527626] CR2: 000000000000010b CR3: 0000000170e1e001 CR4: 00000000003706e0
    [ 1203.528611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 1203.529605] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    
    Therefore, when ftrace_startup_enable fails, we need to roll back the
    registration process and remove ops from ftrace_ops_list.
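
The stale-entry problem and the rollback can be reproduced with a small Python model. It is a sketch only: `FtraceModel` and its methods are invented to mirror the call chains above, and the -19 (-ENODEV) returned by the rolled-back registration is an assumption about the fix, not something this message states.

```python
ENODEV = 19


class FtraceModel:
    """Toy model of the ops list; method names mirror the call chains above."""

    def __init__(self, enable_ok: bool) -> None:
        self.ops_list: list = []
        self.disabled = False
        self._enable_ok = enable_ok

    def _startup_enable(self) -> bool:
        if not self._enable_ok:
            self.disabled = True  # models ftrace_disabled being set to 1
        return self._enable_ok

    def register(self, ops: object, rollback: bool) -> int:
        self.ops_list.append(ops)        # add_ftrace_ops()
        if self._startup_enable():
            return 0
        if rollback:
            self.ops_list.remove(ops)    # the fix: undo the registration
            return -ENODEV               # assumed error value
        return 0                         # old behaviour: ops stays listed

    def unregister(self, ops: object) -> int:
        if self.disabled:
            return -ENODEV               # returns early, list left untouched
        self.ops_list.remove(ops)
        return 0
```

Without the rollback, a failed enable leaves `ops` on the list and `unregister()` can no longer remove it, which is the state in which a later walk of the list trips over freed memory. With the rollback, the list is already clean when `register()` returns.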
    
    Link: https://lkml.kernel.org/r/20220818032659.56209-1-yangjihong1@huawei.com
    
    Suggested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Yang Jihong authored and rostedt committed Aug 21, 2022
    c3b0f72
  11. tracing/perf: Fix double put of trace event when init fails

    If in perf_trace_event_init(), the perf_trace_event_open() fails, then it
    will call perf_trace_event_unreg() which will not only unregister the perf
    trace event, but will also call the put() function of the tp_event.
    
    The problem here is that the trace_event_try_get_ref() is called by the
    caller of perf_trace_event_init() and if perf_trace_event_init() returns a
    failure, it will then call trace_event_put(). But since the
    perf_trace_event_unreg() already called the trace_event_put() function, it
    triggers a WARN_ON().
    
     WARNING: CPU: 1 PID: 30309 at kernel/trace/trace_dynevent.c:46 trace_event_dyn_put_ref+0x15/0x20
    
    If perf_trace_event_reg() does not call trace_event_try_get_ref(), then
    perf_trace_event_unreg() should not call trace_event_put(). Doing so
    breaks symmetry and causes bugs like this one.
    
    Pull out the trace_event_put() from perf_trace_event_unreg() and call it
    in the locations that perf_trace_event_unreg() is called. This not only
    fixes this bug, but also brings back the proper symmetry of the reg/unreg
    vs get/put logic.
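
As a rough illustration of the symmetry argument, here is a small userspace sketch. All names (`demo_event`, `demo_get()`, `demo_put()`, `demo_unreg()`, `demo_attach()`, `demo_detach()`) are hypothetical and do not correspond to the kernel functions. The point it tries to show is that the unregister step does not drop a reference itself, so every get is paired with exactly one put at the same level.

```c
#include <assert.h>

/* Hypothetical event with a reference count and a registration flag. */
struct demo_event {
    int refs;
    int registered;
    int fail_open;              /* test knob: make the open step fail */
};

static void demo_get(struct demo_event *ev)
{
    ev->refs++;
}

static void demo_put(struct demo_event *ev)
{
    assert(ev->refs > 0);       /* a double put would trip this */
    ev->refs--;
}

/* Unregister only undoes registration; it does not touch the refcount. */
static void demo_unreg(struct demo_event *ev)
{
    ev->registered = 0;
}

/* Init registers, then opens; on failure it unregisters but does not put. */
static int demo_init(struct demo_event *ev)
{
    ev->registered = 1;
    if (ev->fail_open) {
        demo_unreg(ev);
        return -1;
    }
    return 0;
}

/* The caller took the reference, so the caller drops it on failure. */
int demo_attach(struct demo_event *ev)
{
    int ret;

    demo_get(ev);
    ret = demo_init(ev);
    if (ret)
        demo_put(ev);
    return ret;
}

/* Teardown pairs the unregister with an explicit put at the call site. */
void demo_detach(struct demo_event *ev)
{
    demo_unreg(ev);
    demo_put(ev);
}
```

If `demo_unreg()` also called `demo_put()`, the failure path in `demo_attach()` would drop the reference twice, which is the kind of imbalance the WARN_ON() above reports.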
    
    Link: https://lore.kernel.org/all/cover.1660347763.git.kjlx@templeofstupid.com/
    Link: https://lkml.kernel.org/r/20220816192817.43d5e17f@gandalf.local.home
    
    Cc: stable@vger.kernel.org
    Fixes: 1d18538 ("tracing: Have dynamic events have a ref counter")
    Reported-by: Krister Johansen <kjlx@templeofstupid.com>
    Reviewed-by: Krister Johansen <kjlx@templeofstupid.com>
    Tested-by: Krister Johansen <kjlx@templeofstupid.com>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    rostedt committed Aug 21, 2022
    Commit 7249921
  12. tracing: React to error return from traceprobe_parse_event_name()

    The function traceprobe_parse_event_name() may set the first two function
    arguments to a non-null value and still return -EINVAL to indicate an
    unsuccessful completion of the function. Hence, it is not sufficient to
    just check the result of the two function arguments for being not null,
    but the return value also needs to be checked.
    
    Commit 95c104c ("tracing: Auto generate event name when creating a
    group of events") changed the error-return-value checking of the second
    traceprobe_parse_event_name() invocation in __trace_eprobe_create() and
    removed checking the return value to jump to the error handling case.
    
    Reinstate using the return value in the error-return-value checking.
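
The pitfall can be shown with a small userspace sketch. `demo_parse_name()` and `demo_create()` are hypothetical and are not the tracing code; the parser is written so that it fills in its output pointers before validating the name, which is the situation the message describes. The caller therefore has to look at the return value, not only at whether the outputs are non-NULL.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser for "group/event". It writes the outputs first and
 * validates afterwards, so outputs can be non-NULL even on -EINVAL.
 * The input buffer is modified in place.
 */
static int demo_parse_name(char *spec, const char **group, const char **event)
{
    char *slash = strchr(spec, '/');
    const char *c;

    if (slash) {
        *slash = '\0';
        *group = spec;
        *event = slash + 1;
    } else {
        *event = spec;
    }

    if (**event == '\0')
        return -EINVAL;
    for (c = *event; *c; c++)
        if (!isalnum((unsigned char)*c) && *c != '_')
            return -EINVAL;     /* outputs were already set above */
    return 0;
}

/* Caller that checks the return value in addition to the outputs. */
int demo_create(char *spec)
{
    const char *group = NULL, *event = NULL;
    int ret = demo_parse_name(spec, &group, &event);

    if (ret)
        return ret;             /* do not trust the outputs on error */
    if (!event)
        return -EINVAL;
    return 0;
}
```

A caller that tested only `event != NULL` would accept the invalid name `bad-name` in this sketch, because the pointer was assigned before validation failed.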
    
    Link: https://lkml.kernel.org/r/20220811071734.20700-1-lukas.bulwahn@gmail.com
    
    Fixes: 95c104c ("tracing: Auto generate event name when creating a group of events")
    Acked-by: Linyu Yuan <quic_linyyuan@quicinc.com>
    Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    bulwahn authored and rostedt committed Aug 21, 2022
    Commit d8a6431
  13. Merge tag 'i2c-for-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
    
    Pull i2c fixes from Wolfram Sang:
     "A revert to fix a regression introduced this merge window and a fix
      for proper error handling in the remove path of the iMX driver"
    
    * tag 'i2c-for-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
      i2c: imx: Make sure to unregister adapter on remove()
      Revert "i2c: scmi: Replace open coded device_get_match_data()"
    torvalds committed Aug 21, 2022
    Commit e3f259d
  14. Merge tag '6.0-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
    
    Pull cifs client fixes from Steve French:
    
     - memory leak fix
    
     - two small cleanups
    
     - trivial strlcpy removal
    
     - update missing entry for cifs headers in MAINTAINERS file
    
    * tag '6.0-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
      cifs: move from strlcpy with unused retval to strscpy
      cifs: Fix memory leak on the deferred close
      cifs: remove useless parameter 'is_fsctl' from SMB2_ioctl()
      cifs: remove unused server parameter from calc_smb_size()
      cifs: missing directory in MAINTAINERS file
    torvalds committed Aug 21, 2022
    Commit 367bcbc
  15. asm goto: eradicate CC_HAS_ASM_GOTO

    GCC has supported asm goto since 4.5, and Clang has supported it since 9.0.0.
    The minimum supported versions of these tools for the build according to
    Documentation/process/changes.rst are 5.1 and 11.0.0 respectively.
    
    Remove the feature detection script, Kconfig option, and clean up some
    fallback code that is no longer supported.
    
    The removed script was also testing for a GCC specific bug that was
    fixed in the 4.7 release.
    
    Also remove workarounds for bpftrace using clang older than 9.0.0, since
    other BPF backend fixes are required at this point.
    
    Link: https://lore.kernel.org/lkml/CAK7LNATSr=BXKfkdW8f-H5VT_w=xBpT2ZQcZ7rm6JfkdE+QnmA@mail.gmail.com/
    Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48637
    Acked-by: Borislav Petkov <bp@suse.de>
    Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
    Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
    Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Reviewed-by: Nathan Chancellor <nathan@kernel.org>
    Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    nickdesaulniers authored and torvalds committed Aug 21, 2022
    Commit a0a12c3
  16. i2c: imx: Make sure to unregister adapter on remove()

    If pm_runtime_resume_and_get() fails for whatever reason and .remove()
    exits early, the i2c adapter stays around and the irq still calls its
    handler, while the driver data and the register mapping go away. If the
    i2c adapter is accessed or the irq triggers later, this results in havoc:
    accesses to freed memory and unmapped registers.
    
    So unregister the software resources even if resume failed, and only skip
    the hardware access in that case.
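
The resulting shape of the remove path can be sketched as follows. This is a userspace illustration under assumptions, not the i2c-imx driver: `demo_dev`, `demo_resume()` and the counters are hypothetical. It shows software teardown running unconditionally, with only the hardware access guarded by the resume result.

```c
#include <assert.h>

/* Hypothetical device state tracked by the sketch. */
struct demo_dev {
    int adapter_registered;     /* software resource */
    int irq_requested;          /* software resource */
    int hw_writes;              /* counts register accesses during remove */
    int resume_fails;           /* test knob: make resume fail */
};

/* Hypothetical resume step; returns 0 on success, negative on failure. */
static int demo_resume(struct demo_dev *d)
{
    return d->resume_fails ? -1 : 0;
}

void demo_remove(struct demo_dev *d)
{
    int ret = demo_resume(d);

    /* Always release software resources, even if resume failed. */
    d->adapter_registered = 0;

    /* Touch the hardware only when the device is known to be powered. */
    if (ret == 0)
        d->hw_writes++;

    d->irq_requested = 0;
}
```

Returning early on a failed resume would skip both assignments and leave the adapter and irq registered against state that is about to be freed.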
    
    Fixes: 588eb93 ("i2c: imx: add runtime pm support to improve the performance")
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
    Signed-off-by: Wolfram Sang <wsa@kernel.org>
    ukleinek authored and Wolfram Sang committed Aug 21, 2022
    Commit d98bdd3
  17. Revert "i2c: scmi: Replace open coded device_get_match_data()"

    This reverts commit 9ae551d. We got a
    regression report, so ensure this machine boots again. We will come back
    with a better version hopefully.
    
    Reported-by: Josef Johansson <josef@oderland.se>
    Link: https://lore.kernel.org/r/4d2d5b04-0b6c-1cb1-a63f-dc06dfe1b5da@oderland.se
    Signed-off-by: Wolfram Sang <wsa@kernel.org>
    Wolfram Sang committed Aug 21, 2022
    Commit 3df71d7

Commits on Aug 20, 2022

  1. Merge tag 'kbuild-fixes-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
    
    Pull Kbuild fixes from Masahiro Yamada:
    
     - Fix module versioning broken on some architectures
    
     - Make dummy-tools enable CONFIG_PPC_LONG_DOUBLE_128
    
     - Remove -Wformat-zero-length, which has no warning instance
    
     - Fix the order between drivers and libs in modules.order
    
     - Fix false-positive warnings in clang-analyzer
    
    * tag 'kbuild-fixes-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
      scripts/clang-tools: Remove DeprecatedOrUnsafeBufferHandling check
      kbuild: fix the modules order between drivers and libs
      scripts/Makefile.extrawarn: Do not disable clang's -Wformat-zero-length
      kbuild: dummy-tools: pretend we understand __LONG_DOUBLE_128__
      modpost: fix module versioning when a symbol lacks valid CRC
    torvalds committed Aug 20, 2022
    Commit 15b3f48
  2. Merge tag 'perf-tools-fixes-for-v6.0-2022-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
    
    Pull perf tools fixes from Arnaldo Carvalho de Melo:
    
     - Fix alignment for cpu map masks in event encoding.
    
     - Support reading PERF_FORMAT_LOST, perf tool counterpart for a feature
       that was added in this merge window.
    
     - Sync perf tools copies of kernel headers: socket, msr-index, fscrypt,
       cpufeatures, i915_drm, kvm, vhost, perf_event.
    
    * tag 'perf-tools-fixes-for-v6.0-2022-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
      perf tools: Support reading PERF_FORMAT_LOST
      libperf: Add a test case for read formats
      libperf: Handle read format in perf_evsel__read()
      tools headers UAPI: Sync linux/perf_event.h with the kernel sources
      tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
      tools headers UAPI: Sync KVM's vmx.h header with the kernel sources
      tools include UAPI: Sync linux/vhost.h with the kernel sources
      tools headers kvm s390: Sync headers with the kernel sources
      tools headers UAPI: Sync linux/kvm.h with the kernel sources
      tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
      tools headers cpufeatures: Sync with the kernel sources
      tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
      tools arch x86: Sync the msr-index.h copy with the kernel sources
      perf beauty: Update copy of linux/socket.h with the kernel sources
      perf cpumap: Fix alignment for masks in event encoding
      perf cpumap: Compute mask size in constant time
      perf cpumap: Synthetic events and const/static
      perf cpumap: Const map for max()
    torvalds committed Aug 20, 2022
    Commit 16b3d85