Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reeeeee #10

Closed
wants to merge 10,000 commits into from
Closed

reeeeee #10

wants to merge 10,000 commits into from

Conversation

Kiciuk
Copy link
Owner

@Kiciuk Kiciuk commented Feb 6, 2021

No description provided.

Likun Gao and others added 30 commits January 14, 2021 14:01
Add function to set power brake sequence.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit f01afd1.

Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Cc: Bindu R <Bindu.R@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
…ger"

This reverts commit 6ae09fa.

Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Cc: Bindu R <Bindu.R@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit c920888.

Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Roman Li <Roman.Li@amd.com>
Cc: Bindu R <Bindu.R@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Find out when we try to disable CRC calculation, crc generation is still
enabled. Main reason is that dc_stream_configure_crc() will never get
called when the source is AMDGPU_DM_PIPE_CRC_SOURCE_NONE.

[How]
Add checking condition that when source is
AMDGPU_DM_PIPE_CRC_SOURCE_NONE, we should also call
dc_stream_configure_crc() to disable crc calculation.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
…/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - memory leak fix for Wacom driver (Ping Cheng)

 - various trivial small fixes, cleanups and device ID additions

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add product ID for MX Ergo in Bluetooth mode
  HID: Ignore battery for Elan touchscreen on ASUS UX550
  HID: logitech-dj: add the G602 receiver
  HID: wiimote: remove h from printk format specifier
  HID: uclogic: remove h from printk format specifier
  HID: sony: select CONFIG_CRC32
  HID: sfh: fix address space confusion
  HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad device
  HID: wacom: Fix memory leakage caused by kfifo_alloc
tcp_disconnect() expects the caller acquires the sock lock,
but mptcp_disconnect() is not doing that. Add the missing
required lock.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 76e2a55 ("mptcp: better msk-level shutdown.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/f818e82b58a556feeb71dcccc8bf1c87aafc6175.1610638176.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When setting port traddr to INADDR_ANY, the listening cm_id->device
is NULL. The associate IB device is known only when a connect request
event arrives, so checking T10-PI device capability should be done
at this stage.

Fixes: b09160c ("nvmet-rdma: add metadata/T10-PI support")
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We shouldn't call smp_processor_id() in a preemptible
context, but this is advisory at best, so instead
call __smp_processor_id().

Fixes: db5ad6b ("nvme-tcp: try to send request in queue_rq context")
Reported-by: Or Gerlitz <gerlitz.or@gmail.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a bio merges, we can get a request that spans multiple
bios, and the overall request payload size is the sum of
all bios. When we calculate how much we need to send
from the existing bio (and bvec), we did not take into
account the iov_iter byte count cap.

Since multipage bvecs support, bvecs can split in the middle
which means that when we account for the last bvec send we
should also take the iov_iter byte count cap as it might be
lower than the last bvec size.

Reported-by: Hao Wang <pkuwangh@gmail.com>
Fixes: 3f2304f ("nvme-tcp: add NVMe over TCP host driver")
Tested-by: Hao Wang <pkuwangh@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Discovery controllers usually don't support smart log page command.
So when we connect to the discovery controller we see this warning:
nvme nvme0: Failed to read smart log (error 24577)
nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.123.1:8009
nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

Introduce a new helper to understand if the controller is a discovery
controller and use this helper to skip nvme_init_hwmon (also use it in
other places that we check if the controller is a discovery controller).

Fixes: 400b6a7 ("nvme: Add hardware monitoring support")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
…/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "We have a few fixes for long standing issues, in particular Eric's fix
  to not underestimate the skb sizes, and my fix for brokenness of
  register_netdevice() error path. They may uncover other bugs so we
  will keep an eye on them. Also included are Willem's fixes for
  kmap(_atomic).

  Looking at the "current release" fixes, it seems we are about one rc
  behind a normal cycle. We've previously seen an uptick of "people had
  run their test suites" / "humans actually tried to use new features"
  fixes between rc2 and rc3.

  Summary:

  Current release - regressions:

   - fix feature enforcement to allow NETIF_F_HW_TLS_TX if IP_CSUM &&
     IPV6_CSUM

   - dcb: accept RTM_GETDCB messages carrying set-like DCB commands if
     user is admin for backward-compatibility

   - selftests/tls: fix selftests build after adding ChaCha20-Poly1305

  Current release - always broken:

   - ppp: fix refcount underflow on channel unbridge

   - bnxt_en: clear DEFRAG flag in firmware message when retry flashing

   - smc: fix out of bound access in the new netlink interface

  Previous releases - regressions:

   - fix use-after-free with UDP GRO by frags

   - mptcp: better msk-level shutdown

   - rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM
     request

   - i40e: xsk: fix potential NULL pointer dereferencing

  Previous releases - always broken:

   - skb frag: kmap_atomic fixes

   - avoid 32 x truesize under-estimation for tiny skbs

   - fix issues around register_netdevice() failures

   - udp: prevent reuseport_select_sock from reading uninitialized socks

   - dsa: unbind all switches from tree when DSA master unbinds

   - dsa: clear devlink port type before unregistering slave netdevs

   - can: isotp: isotp_getname(): fix kernel information leak

   - mlxsw: core: Thermal control fixes

   - ipv6: validate GSO SKB against MTU before finish IPv6 processing

   - stmmac: use __napi_schedule() for PREEMPT_RT

   - net: mvpp2: remove Pause and Asym_Pause support

  Misc:

   - remove from MAINTAINERS folks who had been inactive for >5yrs"

* tag 'net-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
  mptcp: fix locking in mptcp_disconnect()
  net: Allow NETIF_F_HW_TLS_TX if IP_CSUM && IPV6_CSUM
  MAINTAINERS: dccp: move Gerrit Renker to CREDITS
  MAINTAINERS: ipvs: move Wensong Zhang to CREDITS
  MAINTAINERS: tls: move Aviad to CREDITS
  MAINTAINERS: ena: remove Zorik Machulsky from reviewers
  MAINTAINERS: vrf: move Shrijeet to CREDITS
  MAINTAINERS: net: move Alexey Kuznetsov to CREDITS
  MAINTAINERS: altx: move Jay Cliburn to CREDITS
  net: avoid 32 x truesize under-estimation for tiny skbs
  nt: usb: USB_RTL8153_ECM should not default to y
  net: stmmac: fix taprio configuration when base_time is in the past
  net: stmmac: fix taprio schedule configuration
  net: tip: fix a couple kernel-doc markups
  net: sit: unregister_netdevice on newlink's error path
  net: stmmac: Fixed mtu channged by cache aligned
  cxgb4/chtls: Fix tid stuck due to wrong update of qid
  i40e: fix potential NULL pointer dereferencing
  net: stmmac: use __napi_schedule() for PREEMPT_RT
  can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check
  ...
…b/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "One single fix to skip BPF selftests by default.

  BPF selftests have a hard dependency on cutting edge versions of tools
  in the BPF ecosystem including LLVM.

  Skipping BPF allows by default will make it easier for users
  interested in running kselftest as a whole. Users can include BPF in
  Kselftest build by via SKIP_TARGETS variable"

* tag 'linux-kselftest-fixes-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: Skip BPF seftests by default
… block-5.11

Pull NVMe fixes from Christoph:

"nvme fixes for 5.11:

 - don't initialize hwmon for discover controllers (Sagi Grimberg)
 - fix iov_iter handling in nvme-tcp (Sagi Grimberg)
 - fix a preempt warning in nvme-tcp (Sagi Grimberg)
 - fix a possible NULL pointer dereference in nvme (Israel Rukshin)"

* tag 'nvme-5.11-2021-01-14' of git://git.infradead.org/nvme:
  nvme: don't intialize hwmon for discovery controllers
  nvme-tcp: fix possible data corruption with bio merges
  nvme-tcp: Fix warning with CONFIG_DEBUG_PREEMPT
  nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY
…g/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * dma-buf: Fix a memory leak in CMAV heap
 * drm: Fix format check for legacy pageflips
 * ttm: Pass correct address to dma_mapping_error(); Use mutex in pool
   shrinker

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/X/2iXO4ofFSZ39/v@linux-uq9g
This issue has generally been covered up by the presence of additional
expansion ROMs after the ones we're interested in, with header fetches
of subsequent images loading enough of the ROM to hide the issue.

Noticed on GA102, which lacks a type 0x70 image compared to TU102,.

[  906.364197] nouveau 0000:09:00.0: bios: 00000000: type 00, 65024 bytes
[  906.381205] nouveau 0000:09:00.0: bios: 0000fe00: type 03, 91648 bytes
[  906.405213] nouveau 0000:09:00.0: bios: 00026400: type e0, 22016 bytes
[  906.410984] nouveau 0000:09:00.0: bios: 0002ba00: type e0, 366080 bytes

vs

[   22.961901] nouveau 0000:09:00.0: bios: 00000000: type 00, 60416 bytes
[   22.984174] nouveau 0000:09:00.0: bios: 0000ec00: type 03, 71168 bytes
[   23.010446] nouveau 0000:09:00.0: bios: 00020200: type e0, 48128 bytes
[   23.028220] nouveau 0000:09:00.0: bios: 0002be00: type e0, 140800 bytes
[   23.080196] nouveau 0000:09:00.0: bios: 0004e400: type 70, 7168 bytes

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Whatever it is that we were doing before doesn't work on Ampere.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No functional changes here yet.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
RM does this around transactions, and it seemed to help while debugging
AUXCH issues on GA102.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Noticed while debugging GA102.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
GA100 hidden behind a module option, as it's not been as well verified
since initial bring-up and may need additional changes.

There's no display anyway, so this can wait for a bit.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
VRAM offset 0 is a valid address, triggered on GA102.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Appears to be compatible with GP100 code.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Forcing PRAMIN-shadowing off for GA100, as it requires display, and we don't
know if/where the fuse register for detecting its presence is.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
VPLL regs changed a bit.  There's more stuff to do around these, but it's
less invasive to stick those changes into disp for now.

None of that belongs here anymore anyhow - fix that someday.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fortunately, all the interrupts we need to bring up basic display support
are contained in a single leaf register, allowing this basic (but hackish)
implementation.

There's a bunch more invasive patches to come implementing all this in a
better/more complete way, but trying to get a minimal series out first.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Appears to be compatible with GM200 code.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Appears to be compatible with NV50 code.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No VPR scrub.  GA102 and GA104 have a new VRAM size detection method.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
@Kiciuk Kiciuk closed this Feb 6, 2021
@Kiciuk Kiciuk deleted the master_rebase branch February 6, 2021 11:35
Kiciuk pushed a commit that referenced this pull request May 15, 2021
I got several memory leak reports from Asan with a simple command.  It
was because VDSO is not released due to the refcount.  Like in
__dsos_addnew_id(), it should put the refcount after adding to the list.

  $ perf record true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]

  =================================================================
  ==692599==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 439 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256
    #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132
    #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347
    #4 0x559bce50826c in map__new util/map.c:175
    #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #6 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #7 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #8 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #10 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #11 0x559bce519bea in perf_session__process_events util/session.c:2297
    msm8953-mainline#12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    msm8953-mainline#13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    msm8953-mainline#14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    msm8953-mainline#15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    msm8953-mainline#16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    msm8953-mainline#17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    msm8953-mainline#18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    msm8953-mainline#19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    msm8953-mainline#20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169
    #2 0x559bce50821b in map__new util/map.c:168
    #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #4 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #8 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #9 0x559bce519bea in perf_session__process_events util/session.c:2297
    #10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    msm8953-mainline#12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    msm8953-mainline#13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    msm8953-mainline#14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    msm8953-mainline#15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    msm8953-mainline#16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    msm8953-mainline#17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    msm8953-mainline#18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kiciuk pushed a commit that referenced this pull request May 15, 2021
The following deadlock is detected:

  truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).

  PID: 14827  TASK: ffff881686a9af80  CPU: 20  COMMAND: "ora_p005_hrltd9"
   #0  __schedule at ffffffff818667cc
   #1  schedule at ffffffff81866de6
   #2  inode_dio_wait at ffffffff812a2d04
   #3  ocfs2_setattr at ffffffffc05f322e [ocfs2]
   #4  notify_change at ffffffff812a5a09
   #5  do_truncate at ffffffff812808f5
   #6  do_sys_ftruncate.constprop.18 at ffffffff81280cf2
   #7  sys_ftruncate at ffffffff81280d8e
   #8  do_syscall_64 at ffffffff81003949
   #9  entry_SYSCALL_64_after_hwframe at ffffffff81a001ad

dio completion path is going to complete one direct IO (decrement
inode->i_dio_count), but before that it hung at locking inode->i_rwsem:

   #0  __schedule+700 at ffffffff818667cc
   #1  schedule+54 at ffffffff81866de6
   #2  rwsem_down_write_failed+536 at ffffffff8186aa28
   #3  call_rwsem_down_write_failed+23 at ffffffff8185a1b7
   #4  down_write+45 at ffffffff81869c9d
   #5  ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
   #6  ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
   #7  dio_complete+140 at ffffffff812c873c
   #8  dio_aio_complete_work+25 at ffffffff812c89f9
   #9  process_one_work+361 at ffffffff810b1889
  #10  worker_thread+77 at ffffffff810b233d
  #11  kthread+261 at ffffffff810b7fd5
  msm8953-mainline#12  ret_from_fork+62 at ffffffff81a0035e

Thus above forms ABBA deadlock.  The same deadlock was mentioned in
upstream commit 28f5a8a ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()").  It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.

End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.

This is to fix the deadlock itself.  It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.

[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested]
  Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com

Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kiciuk pushed a commit that referenced this pull request May 22, 2021
commit 90bd070 upstream.

The following deadlock is detected:

  truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).

  PID: 14827  TASK: ffff881686a9af80  CPU: 20  COMMAND: "ora_p005_hrltd9"
   #0  __schedule at ffffffff818667cc
   #1  schedule at ffffffff81866de6
   #2  inode_dio_wait at ffffffff812a2d04
   #3  ocfs2_setattr at ffffffffc05f322e [ocfs2]
   #4  notify_change at ffffffff812a5a09
   #5  do_truncate at ffffffff812808f5
   #6  do_sys_ftruncate.constprop.18 at ffffffff81280cf2
   #7  sys_ftruncate at ffffffff81280d8e
   #8  do_syscall_64 at ffffffff81003949
   #9  entry_SYSCALL_64_after_hwframe at ffffffff81a001ad

dio completion path is going to complete one direct IO (decrement
inode->i_dio_count), but before that it hung at locking inode->i_rwsem:

   #0  __schedule+700 at ffffffff818667cc
   #1  schedule+54 at ffffffff81866de6
   #2  rwsem_down_write_failed+536 at ffffffff8186aa28
   #3  call_rwsem_down_write_failed+23 at ffffffff8185a1b7
   #4  down_write+45 at ffffffff81869c9d
   #5  ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
   #6  ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
   #7  dio_complete+140 at ffffffff812c873c
   #8  dio_aio_complete_work+25 at ffffffff812c89f9
   #9  process_one_work+361 at ffffffff810b1889
  #10  worker_thread+77 at ffffffff810b233d
  #11  kthread+261 at ffffffff810b7fd5
  msm8953-mainline#12  ret_from_fork+62 at ffffffff81a0035e

Thus above forms ABBA deadlock.  The same deadlock was mentioned in
upstream commit 28f5a8a ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()").  It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.

End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.

This is to fix the deadlock itself.  It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.

[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested]
  Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com

Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kiciuk pushed a commit that referenced this pull request Oct 9, 2021
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  msm8953-mainline#12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  msm8953-mainline#13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  msm8953-mainline#14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  msm8953-mainline#15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  msm8953-mainline#16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  msm8953-mainline#17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  msm8953-mainline#18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kiciuk pushed a commit that referenced this pull request Oct 9, 2021
FD uses xyarray__entry that may return NULL if an index is out of
bounds. If NULL is returned then a segv happens as FD unconditionally
dereferences the pointer. This was happening in a case of with perf
iostat as shown below. The fix is to make FD an "int*" rather than an
int and handle the NULL case as either invalid input or a closed fd.

  $ sudo gdb --args perf stat --iostat  list
  ...
  Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
  50      {
  (gdb) bt
   #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
   #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
      threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
   #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
      at util/evsel.c:2045
   #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
      at util/evsel.c:2065
   #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
      config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
   #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:833
   #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:1048
   #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
   #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
      argv=0x7fffffffe4a0) at perf.c:313
   #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
   #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
   #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
  ...
  (gdb) c
  Continuing.
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
  /bin/dmesg | grep -i perf may provide additional information.

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
  166                     if (FD(evsel, cpu, thread) >= 0)

v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
    backward.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kiciuk pushed a commit that referenced this pull request Jul 24, 2023
Noticed with:

  make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

Direct leak of 45 byte(s) in 1 object(s) allocated from:
    #0 0x7f213f87243b in strdup (/lib64/libasan.so.8+0x7243b)
    #1 0x63d15f in evsel__set_filter util/evsel.c:1371
    #2 0x63d15f in evsel__append_filter util/evsel.c:1387
    #3 0x63d15f in evsel__append_tp_filter util/evsel.c:1400
    #4 0x62cd52 in evlist__append_tp_filter util/evlist.c:1145
    #5 0x62cd52 in evlist__append_tp_filter_pids util/evlist.c:1196
    #6 0x541e49 in trace__set_filter_loop_pids /home/acme/git/perf-tools/tools/perf/builtin-trace.c:3646
    #7 0x541e49 in trace__set_filter_pids /home/acme/git/perf-tools/tools/perf/builtin-trace.c:3670
    #8 0x541e49 in trace__run /home/acme/git/perf-tools/tools/perf/builtin-trace.c:3970
    #9 0x541e49 in cmd_trace /home/acme/git/perf-tools/tools/perf/builtin-trace.c:5141
    #10 0x5ef1a2 in run_builtin /home/acme/git/perf-tools/tools/perf/perf.c:323
    #11 0x4196da in handle_internal_command /home/acme/git/perf-tools/tools/perf/perf.c:377
    msm8953-mainline#12 0x4196da in run_argv /home/acme/git/perf-tools/tools/perf/perf.c:421
    msm8953-mainline#13 0x4196da in main /home/acme/git/perf-tools/tools/perf/perf.c:537
    msm8953-mainline#14 0x7f213e84a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

Free it on evsel__exit().

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20230719202951.534582-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kiciuk pushed a commit that referenced this pull request Jul 24, 2023
To plug these leaks detected with:

  $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

  =================================================================
  ==473890==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7fdf19aba097 in calloc (/lib64/libasan.so.8+0xba097)
    #1 0x987836 in zalloc (/home/acme/bin/perf+0x987836)
    #2 0x5367ae in thread_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1289
    #3 0x5367ae in thread__trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1307
    #4 0x5367ae in trace__sys_exit /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:2468
    #5 0x52bf34 in trace__handle_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3177
    #6 0x52bf34 in __trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3685
    #7 0x542927 in trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3712
    #8 0x542927 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:4055
    #9 0x542927 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5141
    #10 0x5ef1a2 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
    #11 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
    msm8953-mainline#12 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
    msm8953-mainline#13 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
    msm8953-mainline#14 0x7fdf18a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x7f788fcba6af in __interceptor_malloc (/lib64/libasan.so.8+0xba6af)
    #1 0x5337c0 in trace__sys_enter /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:2342
    #2 0x52bfb4 in trace__handle_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3191
    #3 0x52bfb4 in __trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3699
    #4 0x542883 in trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3726
    #5 0x542883 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:4069
    #6 0x542883 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5155
    #7 0x5ef232 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
    #8 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
    #9 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
    #10 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
    #11 0x7f788ec4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Indirect leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7fdf19aba6af in __interceptor_malloc (/lib64/libasan.so.8+0xba6af)
    #1 0x77b335 in intlist__new util/intlist.c:116
    #2 0x5367fd in thread_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1293
    #3 0x5367fd in thread__trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1307
    #4 0x5367fd in trace__sys_exit /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:2468
    #5 0x52bf34 in trace__handle_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3177
    #6 0x52bf34 in __trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3685
    #7 0x542927 in trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3712
    #8 0x542927 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:4055
    #9 0x542927 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5141
    #10 0x5ef1a2 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
    #11 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
    msm8953-mainline#12 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
    msm8953-mainline#13 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
    msm8953-mainline#14 0x7fdf18a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20230719202951.534582-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kiciuk pushed a commit that referenced this pull request Jul 24, 2023
In 3cb4d5e ("perf trace: Free syscall tp fields in
evsel->priv") it only was freeing if strcmp(evsel->tp_format->system,
"syscalls") returned zero, while the corresponding initialization of
evsel->priv was being performed if it was _not_ zero, i.e. if the tp
system wasn't 'syscalls'.

Just stop looking for that and free it if evsel->priv was set, which
should be equivalent.

Also use the pre-existing evsel_trace__delete() function.

This resolves these leaks, detected with:

  $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

  =================================================================
  ==481565==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212
      #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      msm8953-mainline#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      msm8953-mainline#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205
      #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      msm8953-mainline#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      msm8953-mainline#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
  [root@quaco ~]#

With this we plug all leaks with "perf trace sleep 1".

Fixes: 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/lkml/20230719202951.534582-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kiciuk pushed a commit that referenced this pull request Jul 24, 2023
Petr Machata says:

====================
mlxsw: Permit enslavement to netdevices with uppers

The mlxsw driver currently makes the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices need to be added to
the bridge before IP addresses are configured on that bridge or SVI added
on top of it. Enslaving a netdevice to another netdevice that already has
uppers is in fact forbidden by mlxsw for this reason. Despite this safety,
it is rather easy to get into situations where the offloaded configuration
is just plain wrong.

As an example, take a front panel port, configure an IP address: it gets a
RIF. Now enslave the port to the bridge, and the RIF is gone. Remove the
port from the bridge again, but the RIF never comes back. There is a number
of similar situations, where changing the configuration there and back
utterly breaks the offload.

Similarly, detaching a front panel port from a configured topology means
unoffloading of this whole topology -- VLAN uppers, next hops, etc.
Attaching the port back is then not permitted at all. If it were, it would
not result in a working configuration, because much of mlxsw is written to
react to changes in immediate configuration. There is nothing that would go
visit netdevices in the attached-to topology and offload existing routes
and VLAN memberships, for example.

In this patchset, introduce a number of replays to be invoked so that this
sort of post-hoc offload is supported. Then remove the vetoes that
disallowed enslavement of front panel ports to other netdevices with
uppers.

The patchset progresses as follows:

- In patch #1, fix an issue in the bridge driver. To my knowledge, the
  issue could not have resulted in a buggy behavior previously, and thus is
  packaged with this patchset instead of being sent separately to net.

- In patch #2, add a new helper to the switchdev code.

- In patch #3, drop mlxsw selftests that will not be relevant after this
  patchset anymore.

- Patches #4, #5, #6, #7 and #8 prepare the codebase for smoother
  introduction of the rest of the code.

- Patches #9, #10, #11, msm8953-mainline#12, msm8953-mainline#13 and msm8953-mainline#14 replay various aspects of upper
  configuration when a front panel port is introduced into a topology.
  Individual patches take care of bridge and LAG RIF memberships, switchdev
  replay, nexthop and neighbors replay, and MACVLAN offload.

- Patches msm8953-mainline#15 and msm8953-mainline#16 introduce RIFs for newly-relevant netdevices when a
  front panel port is enslaved (in which case all uppers are newly
  relevant), or, respectively, deslaved (in which case the newly-relevant
  netdevice is the one being deslaved).

- Up until this point, the introduced scaffolding was not really used,
  because mlxsw still forbids enslavement of mlxsw netdevices to uppers
  with uppers. In patch msm8953-mainline#17, this condition is finally relaxed.

A sizable selftest suite is available to test all this new code. That will
be sent in a separate patchset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Kiciuk pushed a commit that referenced this pull request Jul 28, 2023
Merge series from Richard Fitzgerald <rf@opensource.cirrus.com>:

This set of patches adds support for using the CS35L56 boosted smart
amplifier on HDA systems. In these systems the CS35L56 audio is
routed through a HDA-to-I2S bridge codec.

This doesn't include the changes to the Realtek driver to actually hook
up the CS35L56 driver, because we don't yet have the QUIRK IDs to
associate it with. But we want to publish the driver now so that it is
available for bringing up hardware with the CS35L56.

The first 9 patches are moving code out of the ASoC driver and into the
shared library so that it can be shared with the HDA driver.

Patch #10 fixes missing #includes in the HDA headers so that the CS35L56
driver doesn't have to #include headers that it doesn't use.
Kiciuk pushed a commit that referenced this pull request Nov 4, 2023
Ido Schimmel says:

====================
Add MDB get support

This patchset adds MDB get support, allowing user space to request a
single MDB entry to be retrieved instead of dumping the entire MDB.
Support is added in both the bridge and VXLAN drivers.

Patches #1-#6 are small preparations in both drivers.

Patches #7-#8 add the required uAPI attributes for the new functionality
and the MDB get net device operation (NDO), respectively.

Patches #9-#10 implement the MDB get NDO in both drivers.

Patch #11 registers a handler for RTM_GETMDB messages in rtnetlink core.
The handler derives the net device from the ifindex specified in the
ancillary header and invokes its MDB get NDO.

Patches msm8953-mainline#12-msm8953-mainline#13 add selftests by converting tests that use MDB dump with
grep to the new MDB get functionality.

iproute2 changes can be found here [1].

v2:
* Patch #7: Add a comment to describe attributes structure.
* Patch #9: Add a comment above spin_lock_bh().

[1] https://github.com/idosch/iproute2/tree/submit/mdb_get_v1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Kiciuk pushed a commit that referenced this pull request Dec 3, 2023
Petr Machata says:

====================
mlxsw: Support CFF flood mode

The registers to configure to initialize a flood table differ between the
controlled and CFF flood modes. In therefore needs to be an op. Add it,
hook up the current init to the existing families, and invoke the op.

PGT is an in-HW table that maps addresses to sets of ports. Then when some
HW process needs a set of ports as an argument, instead of embedding the
actual set in the dynamic configuration, what gets configured is the
address referencing the set. The HW then works with the appropriate PGT
entry.

Among other allocations, the PGT currently contains two large blocks for
bridge flooding: one for 802.1q and one for 802.1d. Within each of these
blocks are three tables, for unknown-unicast, multicast and broadcast
flooding:

      . . . |    802.1q    |    802.1d    | . . .
            | UC | MC | BC | UC | MC | BC |
             \______ _____/ \_____ ______/
                    v             v
                   FID flood vectors

Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an
802.1q bridge) uses three flood vectors spread across a fairly large region
of PGT.

This way of organizing the flood table (called "controlled") is not very
flexible. E.g. to decrease a bridge scale and store more IP MC vectors, one
would need to completely rewrite the bridge PGT blocks, or resort to hacks
such as storing individual MC flood vectors into unused part of the bridge
table.

In order to address these shortcomings, Spectrum-2 and above support what
is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode,
each FID has a little table of its own, with three entries adjacent to each
other, one for unknown-UC, one for MC, one for BC. This allows for a much
more fine-grained approach to PGT management, where bits of it are
allocated on demand.

      . . . | FID | FID | FID | FID | FID | . . .
            |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B|
             \_____________ _____________/
                           v
                   FID flood vectors

Besides the FID table organization, the CFF flood mode also impacts Router
Subport (RSP) table. This table contains flood vectors for rFIDs, which are
FIDs that reference front panel ports or LAGs. The RSP table contains two
entries per front panel port and LAG, one for unknown-UC traffic, and one
for everything else. Currently, the FW allocates and manages the table in
its own part of PGT. rFIDs are marked with flood_rsp bit and managed
specially. In CFF mode, rFIDs are managed as all other FIDs. The driver
therefore has to allocate and maintain the flood vectors. Like with bridge
FIDs, this is more work, but increases flexibility of the system.

The FW currently supports both the controlled and CFF flood modes. To shed
complexity, in the future it should only support CFF flood mode. Hence this
patchset, which adds CFF flood mode support to mlxsw.

Since mlxsw needs to maintain both the controlled mode as well as CFF mode
support, we will keep the layout as compatible as possible. The bridge
tables will stay in the same overall shape, just their inner organization
will change from flood mode -> FID to FID -> flood mode. Likewise will RSP
be kept as a contiguous block of PGT memory, as was the case when the FW
maintained it.

- The way FIDs get configured under the CFF flood mode differs from the
  currently used controlled mode. The simple approach of having several
  globally visible arrays for spectrum.c to statically choose from no
  longer works.

  Patch #1 thus privatizes all FID initialization and finalization logic,
  and exposes it as ops instead.

- Patch #2 renames the ops that are specific to the controlled mode, to
  make room in the namespace for the CFF variants.

  Patch #3 extracts a helper to compute flood table base out of
  mlxsw_sp_fid_flood_table_mid().

- The op fid_setup configured fid_offset, i.e. the number of this FID
  within its family. For rFIDs in CFF mode, to determine this number, the
  driver will need to do fallible queries.

  Thus in patch #4, make the FID setup operation fallible as well.

- Flood mode initialization routine differs between the controlled and CFF
  flood modes. The controlled mode needs to configure flood table layout,
  which the CFF mode does not need to do.

  In patch #5, move mlxsw_sp_fid_flood_table_init() up so that the
  following patch can make use of it.

  In patch #6, add an op to be invoked per table (if defined).

- The current way of determining PGT allocation size depends on the number
  of FIDs and number of flood tables. RFIDs however have PGT footprint
  depending not on number of FIDs, but on number of ports and LAGs, because
  which ports an rFID should flood to does not depend on the FID itself,
  but on the port or LAG that it references.

  Therefore in patch #7, add FID family ops for determining PGT allocation
  size.

- As elaborated above, layout of PGT will differ between controlled and CFF
  flood modes. In CFF mode, it will further differ between rFIDs and other
  FIDs (as described at previous patch). The way to pack the SFMR register
  to configure a FID will likewise differ from controlled to CFF.

  Thus in patches #8 and #9 add FID family ops to determine PGT base
  address for a FID and to pack SFMR.

- Patches #10 and #11 add more bits for RSP support. In patch #10, add a
  new traffic type enumerator, for non-UC traffic. This is a combination of
  BC and MC traffic, but the way that mlxsw maps these mnemonic names to
  actual traffic type configurations requires that we have a new name to
  describe this class of traffic.

  Patch #11 then adds hooks necessary for RSP table maintenance. As ports
  come and go, and join and leave LAGs, it is necessary to update flood
  vectors that the rFIDs use. These new hooks will make that possible.

- Patches msm8953-mainline#12, msm8953-mainline#13 and msm8953-mainline#14 introduce flood profiles. These have been
  implicit so far, but the way that CFF flood mode works with profile IDs
  requires that we make them explicit.

  Thus in patch msm8953-mainline#12, introduce flood profile objects as a set of flood
  tables that FID families then refer to. The FID code currently only
  uses a single flood profile.

  In patch msm8953-mainline#13, add a flood profile ID to flood profile objects.

  In patch msm8953-mainline#14, when in CFF mode, configure SFFP according to the existing
  flood profiles (or the one that exists as of that point).

- Patches msm8953-mainline#15 and msm8953-mainline#16 add code to implement, respectively, bridge FIDs and
  RSP FIDs in CFF mode.

- In patch msm8953-mainline#17, toggle flood_mode_prefer_cff on Spectrum-2 and above, which
  makes the newly-added code live.
====================

Link: https://lore.kernel.org/r/cover.1701183891.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet