forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 155
Nanopim2 dts #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Nanopim2 dts #3
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Update dwc2_hsotg structure to add descriptor chains for EP 0: two DMA descriptors for Setup phase, per one for IN/OUT data and status phases. Add their allocation function dwc2_gadget_alloc_ctrl_desc_chains() and its call during gadget probe. Change-Id: I4d5cf7d36947c1d13090b644e1fd527378069e05 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Rename MAX_DMA_DESC_SIZE to HOST_DMA_NBYTES_LIMIT as it stores value of host DMA descriptor transfer bytes' limit. Values are different in case of gadget DMA descriptors. Change-Id: I97ad1541a4959cfa31d4f849f20bf56e9203fbc9 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Add dwc2_gadget_get_chain_limit() function and its call in transfer start routine to correctly estimate one go on transfer size if descriptor DMA mode is selected. Change-Id: Ic20e678df741611d0fe398f48ad1428d2639108a Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Add DMA descriptor members to the dwc2_hsotg_ep structure. Change-Id: I0a71229cbabcfa432b03df60856039d93eef2077 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This property was mistakenly added, then removed, so don't read it in. Change-Id: I0f6d69cf525239e64c005a71ea0ed4e931335f4d Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Add dwc2_gadget_set_ep0_desc_chain() function to switch between EP0 DDMA chains depend on the stage of control transfer. Include EP0 DDMA chain selection during ep_queue called from dwc2_hsotg_enqueue_setup() for setup stage. Selecting and filling DDMA chain for status phase as well - add calls of dwc2_gadget_set_ep0_desc_chain() and dwc2_gadget_config_nonisoc_xfer_ddma() functions. Change-Id: I44b0a142732716ad8ed5219cabfc53372ba5288d Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Update transfer starting dwc2_hsotg_start_req() routine with call of function dwc2_gadget_config_nonisoc_xfer_ddma() to fill descriptor chain. Add call of dwc2_gadget_get_xfersize_ddma() in dwc2_hsotg_handle_outdone() and dwc2_hsotg_complete_in() interrupt handlers for DDMA mode to get information on transferred data from descriptors instead of DXEPTSIZ. Change-Id: Iacb778b90e9d01c7d7cbae196f0b2bd59d7c160b Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The StsPhseRcvd interrupt should not be enabled in slave mode. Also move the StsPhsRcvd interrupt checking in the endpoint interrupt handler to the correct order according to the databook. The interrupt itself will be implemented in a later commit. Change-Id: I312c63c9c8b873d5721bb965cd0f327a5721a4f8 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In DDMA mode of operation IN status phase of control write transfer should start after getting StsPhseRcvd interrupt. This interrupt is issued by HW once host starts to send IN tokens after data stage. Change-Id: Ic65f960e477b07bebbc8d4b6836f3ea5883ae46b Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Add DCFG register field macro for descriptor DMA mode and update core initialization routine to set that bit accordingly. Change-Id: Ie487371935cbb660a6fc965ce07ce41718cdd8ac Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Preparing for isochronous transfers support adding in DDMA mode. In DDMA isochronous transfers are handled differently compared to Slave and BDMA modes. This is caused by fact that isoc requests contain data for one frame/microframe. HW descriptor should contain data of one frame/microframe as well. Hence each DMA descriptor in the chain will correspond to one usb request. Decided to divide endpoints descriptor chain to two halves - while one will be processed by HW, other one will be under SW control. First part will be passed to HW once ISOC traffic needs to be started. In parallel to HW's work SW will keep creating new entries in the other half of chain if new requests arrive in ep_queue routine. This will allow passing of already pre-prepared descriptors to HW immediately after endpoint gets disabled. The endpoint should be disabled once HW closes descriptor with "L" bit set. Afterwards SW will switch to use first part of chain if more requests are arriving. Add two members to the dwc2_hsotg_ep structure to be used in isochronous transfers' handling in DDMA mode: -isoc_chain_num - indicates which half of EP descriptor chain can be used by SW to add new queued requests while HW is processing other half. -next_desc - index which points to next not yet programmed descriptor in the half of descriptor chain which is under SW control. Also add initialization of these fields in function dwc2_hsotg_ep_enable(). Change-Id: Iedd1e695a3bc009f9efdc310a2fa7500832631b8 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In DDMA mode incompISOOUT should be masked, similar as Bulk Out - XferCompl and BNA should be handled. incompISOIN is not valid in DDMA and is not getting asserted. Change-Id: Id136ea2f2f62041473bfee0f187620fbebf2f910 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
For DDMA mode in case of isochronous transfers completion performed differently than other transfer types. This is because each usb request was mapped to one descriptor in the chain and SW gets xfercomplete interrupt on all descriptors. The endpoint remains enabled until HW processes last descriptor with "L" bit set or BNA interrupt gets asserted for IN and OUT endpoints correspondingly. Add function dwc2_gadget_complete_isoc_request_ddma() - completes one isochronous request taken from endpoint's queue. Add function dwc2_gadget_start_next_isoc_ddma() - tries to restart isochronous endpoint if requests are pending. Check for EPENA. If the endpoint was disabled, try to restart it after programming descriptor chain prepared by SW earlier, switch SW to fill the other half of chain. Add function dwc2_gadget_fill_isoc_desc() - initializes DMA descriptor for isochronous transfer based on the received request data and endpoint characteristics. Added function dwc2_gadget_start_isoc_ddma() - prepare DMA chain for isochronous transfer in DDMA, programs corresponding DMA address to DEPDMA, enables the endpoint. This function is called once SW decides to start isochronous IN or OUT transfer depend on reception of NAK or OUTTknEPdis interrupts indicating first isochronous token arrival from host. Change-Id: I01559de55de43fdbfc4156dd92f616d914c71e8d Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Enable the BNA (Buffer Not Available) interrupt in descriptor DMA mode. Change-Id: I8a5c438af9746dad064420d40a794835072a2ca8 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
In DDMA mode if programmed ISOC OUT transfer length is not DWORD aligned, after closing descriptor HW leaves value of 4 - (ureq->length % 4) in the RX bytes. This is caused because DMA works using 4B chunks. Example: if length = 9 and all 9 bytes were received from the bus, after xfercomplete rx_bytes value is 3. Hence add this value to the amount of transferred bytes. Change-Id: I4d814168ff1f5c96905a11316c097691f1105965 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Tests with various hosts show that depend on time difference between host sending SETUP packet and IN/OUT token SW could get Xfercomplete interrupt without SetUp interrupt. On the other hand, SW should parse received SETUP packet only after ensuring that Host has moved to either Data or Status stage of control transfer. For this purpose added checking in the dwc2_hsotg_epint() function to not handle xfercomplete and postpone SETUP packet analysis till SW's getting of setup phase done interrupt. Change-Id: I68c59727179915ed31a8e89ba715523cc4671448 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Correct dwc2_hsotg_ep_stop_xfr() function to follow dwc2 programming guide for setting NAK on specific endpoint, disabling it and flushing corresponding FIFO. Current code does not take into account whether core acts in shared or dedicated FIFO mode, current endpoint is periodic or not. It does not clear EPDISBLD interrupt after programming of DXEPCTL_EPDIS, does not flush shared TX FIFO and tries to clear global out NAK in wrong manner instead of setting DCTL_CGOUTNAK. Change-Id: I1de9b083749ed201ca1eca2146ffbb985bc3f32b Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Check if endpoint is enabled during dwc2_hsotg_ep_disable() function processing and call dwc2_hsotg_ep_stop_xfr() to disable it and flush associated FIFO. Move dwc2_hsotg_ep_stop_xfr() and dwc2_hsotg_wait_bit_set() functions upper before dwc2_hsotg_ep_enable and dwc2_hsotg_ep_disable function definitions. Change-Id: I40f96f32c449b1f1b22f1b62283322f4ebfcfacb Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Do modifications in dwc2_hsotg_core_init_disconnected() function to enable USB 1.1 Full-Speed Serial Transceiver Select in GUSBCFG register if corresponding speed and PHY types are chosen. Adjust device speed selection in DCFG register as well. Change-Id: I708bdeb0bcce2b1d90efdd4740e975704219a1d6 Signed-off-by: Vahram Aharonyan <vahrama@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Add new device IDs for IOT gadget. Done changes in probe to configure core accordingly depending on device ID value. Change-Id: If6824bd263f598e1ed69ef2fa989358f3bf20a40 Signed-off-by: Vardan Mikayelyan <mvardan@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
I had seen some odd behavior with HiKey's usb-gadget interface that I finally seemed to have chased down. Basically every other time I plugged in the OTG port, the gadget interface would properly initialize. The other times, I'd get a big WARN_ON in dwc2_hsotg_init_fifo() about the fifo_map not being clear. Ends up if we don't disconnect the gadget state, the fifo-map doesn't get cleared properly, which causes WARN_ON messages and also results in the device not properly being setup as a gadget every other time the OTG port is connected. So this patch adds a call to dwc2_hsotg_disconnect() in the reset path so the state is properly cleared. With it, the gadget interface initializes properly on every plug in. Change-Id: I97cdfa0af680780077f79da290f8d5f8c587e868 Cc: Wei Xu <xuwei5@hisilicon.com> Cc: Guodong Xu <guodong.xu@linaro.org> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: John Youn <johnyoun@synopsys.com> Cc: Douglas Anderson <dianders@chromium.org> Cc: Chen Yu <chenyu56@huawei.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Acked-by: John Youn <johnyoun@synopsys.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
As of commit bb47523 ("reset: make optional functions really optional"), the reset framework API calls use NULL pointers to describe optional, non-present reset controls. This allows to return errors from devm_reset_control_get_optional and to call reset_control_(de)assert unconditionally. Change-Id: Idba38a1dae920d830a47c6f65af7ea61b929d541 Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
When device is enumerated in LS we should program ep0_mps accordingly. USB2 spec says that in LS mode, control ep mps must be 8. Change-Id: Iac9919252ac4c988830685b238d19bfca7ee1ae2 Signed-off-by: Vardan Mikayelyan <mvardan@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Added new core param for low speed, which can be used only when SNPSID is equal to DWC2_CORE_FS_IOT. When LS mode is enabled, we are restricting ep types and providing to upper layer only INTR and CTRL endpoints. Change-Id: I18b6359ddc7c45dc1edf1089a207a09e9e0bfc20 Signed-off-by: Vardan Mikayelyan <mvardan@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch moves the the host_dma initialization before dwc2_set_param_dma_desc_enable() and dwc2_set_param_dma_desc_fs_enable(). The reason being that both function need it. Fixes: 1205489cee75bf39 ("usb: dwc2: Get host DMA device properties") Change-Id: Ia7c3392503ecaa1fe645f5f1c351c69407ac5408 Acked-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This patch adds support for the "amcc,usb-otg" device which is found in the PowerPC Canyonlands' dts. The device definition was added by: commit c89b345 ("powerpc/44x: Add USB DWC DTS entry to Canyonlands board") but without any driver support as the dwc2 driver wasn't available at that time. Note: The system can't use the generic "snps,dwc2" compatible because of the special ahbcfg configuration. The default GAHBCFG_HBSTLEN_INCR4 of snps,dwc2 can cause a system hang when the USB and SATA is used concurrently. Change-Id: I6100f7d78c87dc43ec82ab02fbd8b6157dd40fd3 Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Stop sending complete split requests in case of ISOC IN split transfers after getting data with PID0. Otherwise we will get a NYET for each additional IN token. Change-Id: I1e74aedfe40f3e8b3090cacd2eed77e07df00d7c Signed-off-by: Sevak Arakelyan <sevaka@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Check the request length in dwc2_hsotg_start_req() function. If length == 0, do not write DMA address to control register. Change-Id: I77fc95dc7f6c9c3458e0cde620599f2f7150f9c7 Signed-off-by: Razmik Karapetyan <razmik@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
While sending zlp for DWC2_EP0_STATUS_IN EP direction was changed to IN. Change it back to complete OUT transfer request. This affects only on DMA mode, because DMA buffer map/unmap function is direction sensitive. Change-Id: I704c1f06a3443dc2974e9daf589a9ad8eed30e10 Signed-off-by: Razmik Karapetyan <razmik@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Remove request length checking from dwc2_hsotg_unmap_dma() and dwc2_hsotg_map_dma(). That checking is done in functions called from those functions, usb_gadget_unmap_request_by_dev() and usb_gadget_map_request_by_dev() respectively, so it's unnecessary. Change-Id: Id3f6dcdef7927a576e9769453ddf8aa092c75be8 Signed-off-by: Razmik Karapetyan <razmik@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
This patch reverts commit ce5eddb.
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
commit 7c3eaad upstream. Suggested-by: Noralf Trønnes <noralf@tronnes.org> Acked-by: Noralf Trønnes <noralf@tronnes.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: hmz007 <hmz007@gmail.com>
Signed-off-by: perfectnewer <weixipeng@outlook.com>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit e40b801 ] There is a certain chance to trigger the following panic: PID: 5900 TASK: ffff88c1c8af4100 CPU: 1 COMMAND: "kworker/1:48" #0 [ffff9456c1cc79a0] machine_kexec at ffffffff870665b7 friendlyarm#1 [ffff9456c1cc79f0] __crash_kexec at ffffffff871b4c7a friendlyarm#2 [ffff9456c1cc7ab0] crash_kexec at ffffffff871b5b60 friendlyarm#3 [ffff9456c1cc7ac0] oops_end at ffffffff87026ce7 friendlyarm#4 [ffff9456c1cc7ae0] page_fault_oops at ffffffff87075715 friendlyarm#5 [ffff9456c1cc7b58] exc_page_fault at ffffffff87ad0654 friendlyarm#6 [ffff9456c1cc7b80] asm_exc_page_fault at ffffffff87c00b62 [exception RIP: ib_alloc_mr+19] RIP: ffffffffc0c9cce3 RSP: ffff9456c1cc7c38 RFLAGS: 00010202 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000004 RDX: 0000000000000010 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88c1ea281d00 R8: 000000020a34ffff R9: ffff88c1350bbb20 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000010 R14: ffff88c1ab040a50 R15: ffff88c1ea281d00 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 friendlyarm#7 [ffff9456c1cc7c60] smc_ib_get_memory_region at ffffffffc0aff6df [smc] friendlyarm#8 [ffff9456c1cc7c88] smcr_buf_map_link at ffffffffc0b0278c [smc] friendlyarm#9 [ffff9456c1cc7ce0] __smc_buf_create at ffffffffc0b03586 [smc] The reason here is that when the server tries to create a second link, smc_llc_srv_add_link() has no protection and may add a new link to link group. This breaks the security environment protected by llc_conf_mutex. Fixes: 2d2209f ("net/smc: first part of add link processing as SMC server") Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 3a70e0d ] When doing timestamping in lan966x and having PROVE_LOCKING enabled the following warning is shown. ======================================================== WARNING: possible irq lock inversion dependency detected 6.2.0-rc7-01749-gc54e1f7f7e36 #2786 Tainted: G N -------------------------------------------------------- swapper/0/0 just changed the state of lock: c2609f5 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x16c/0x2e8 but this lock took another, SOFTIRQ-unsafe lock in the past: (&lan966x->ptp_ts_id_lock){+.+.}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&lan966x->ptp_ts_id_lock); local_irq_disable(); lock(_xmit_ETHER#2); lock(&lan966x->ptp_ts_id_lock); <Interrupt> lock(_xmit_ETHER#2); *** DEADLOCK *** 5 locks held by swapper/0/0: #0: c1001e18 ((&ndev->rs_timer)){+.-.}-{0:0}, at: call_timer_fn+0x0/0x33c friendlyarm#1: c105e7c4 (rcu_read_lock){....}-{1:2}, at: ndisc_send_skb+0x134/0x81c friendlyarm#2: c105e7d8 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2+0x17c/0xc64 friendlyarm#3: c105e7d8 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x4c/0x1224 friendlyarm#4: c3056174 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x354/0x1224 the shortest dependencies between 2nd lock and 1st lock: -> (&lan966x->ptp_ts_id_lock){+.+.}-{2:2} { HARDIRQ-ON-W at: lock_acquire.part.0+0xb0/0x248 _raw_spin_lock+0x38/0x48 lan966x_ptp_irq_handler+0x164/0x2a8 irq_thread_fn+0x1c/0x78 irq_thread+0x130/0x278 kthread+0xec/0x110 ret_from_fork+0x14/0x28 SOFTIRQ-ON-W at: lock_acquire.part.0+0xb0/0x248 _raw_spin_lock+0x38/0x48 lan966x_ptp_irq_handler+0x164/0x2a8 irq_thread_fn+0x1c/0x78 irq_thread+0x130/0x278 kthread+0xec/0x110 ret_from_fork+0x14/0x28 INITIAL USE at: lock_acquire.part.0+0xb0/0x248 _raw_spin_lock_irqsave+0x4c/0x68 lan966x_ptp_txtstamp_request+0x128/0x1cc lan966x_port_xmit+0x224/0x43c dev_hard_start_xmit+0xa8/0x2f0 sch_direct_xmit+0x108/0x2e8 __dev_queue_xmit+0x41c/0x1224 packet_sendmsg+0xdb4/0x134c __sys_sendto+0xd0/0x154 sys_send+0x18/0x20 ret_fast_syscall+0x0/0x1c } ... key at: [<c174ba0c>] __key.2+0x0/0x8 ... acquired at: _raw_spin_lock_irqsave+0x4c/0x68 lan966x_ptp_txtstamp_request+0x128/0x1cc lan966x_port_xmit+0x224/0x43c dev_hard_start_xmit+0xa8/0x2f0 sch_direct_xmit+0x108/0x2e8 __dev_queue_xmit+0x41c/0x1224 packet_sendmsg+0xdb4/0x134c __sys_sendto+0xd0/0x154 sys_send+0x18/0x20 ret_fast_syscall+0x0/0x1c -> (_xmit_ETHER#2){+.-.}-{2:2} { HARDIRQ-ON-W at: lock_acquire.part.0+0xb0/0x248 _raw_spin_lock+0x38/0x48 netif_freeze_queues+0x38/0x68 dev_deactivate_many+0xac/0x388 dev_deactivate+0x38/0x6c linkwatch_do_dev+0x70/0x8c __linkwatch_run_queue+0xd4/0x1e8 linkwatch_event+0x24/0x34 process_one_work+0x284/0x744 worker_thread+0x28/0x4bc kthread+0xec/0x110 ret_from_fork+0x14/0x28 IN-SOFTIRQ-W at: lock_acquire.part.0+0xb0/0x248 _raw_spin_lock+0x38/0x48 sch_direct_xmit+0x16c/0x2e8 __dev_queue_xmit+0x41c/0x1224 ip6_finish_output2+0x5f4/0xc64 ndisc_send_skb+0x4cc/0x81c addrconf_rs_timer+0xb0/0x2f8 call_timer_fn+0xb4/0x33c expire_timers+0xb4/0x10c run_timer_softirq+0xf8/0x2a8 __do_softirq+0xd4/0x5fc __irq_exit_rcu+0x138/0x17c irq_exit+0x8/0x28 __irq_svc+0x90/0xbc arch_cpu_idle+0x30/0x3c default_idle_call+0x44/0xac do_idle+0xc8/0x138 cpu_startup_entry+0x18/0x1c rest_init+0xcc/0x168 arch_post_acpi_subsys_init+0x0/0x8 INITIAL USE at: lock_acquire.part.0+0xb0/0x248 _raw_spin_lock+0x38/0x48 netif_freeze_queues+0x38/0x68 dev_deactivate_many+0xac/0x388 dev_deactivate+0x38/0x6c linkwatch_do_dev+0x70/0x8c __linkwatch_run_queue+0xd4/0x1e8 linkwatch_event+0x24/0x34 process_one_work+0x284/0x744 worker_thread+0x28/0x4bc kthread+0xec/0x110 ret_from_fork+0x14/0x28 } ... key at: [<c175974c>] netdev_xmit_lock_key+0x8/0x1c8 ... acquired at: __lock_acquire+0x978/0x2978 lock_acquire.part.0+0xb0/0x248 _raw_spin_lock+0x38/0x48 sch_direct_xmit+0x16c/0x2e8 __dev_queue_xmit+0x41c/0x1224 ip6_finish_output2+0x5f4/0xc64 ndisc_send_skb+0x4cc/0x81c addrconf_rs_timer+0xb0/0x2f8 call_timer_fn+0xb4/0x33c expire_timers+0xb4/0x10c run_timer_softirq+0xf8/0x2a8 __do_softirq+0xd4/0x5fc __irq_exit_rcu+0x138/0x17c irq_exit+0x8/0x28 __irq_svc+0x90/0xbc arch_cpu_idle+0x30/0x3c default_idle_call+0x44/0xac do_idle+0xc8/0x138 cpu_startup_entry+0x18/0x1c rest_init+0xcc/0x168 arch_post_acpi_subsys_init+0x0/0x8 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G N 6.2.0-rc7-01749-gc54e1f7f7e36 #2786 Hardware name: Generic DT based system unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from mark_lock.part.0+0x59c/0x93c mark_lock.part.0 from __lock_acquire+0x978/0x2978 __lock_acquire from lock_acquire.part.0+0xb0/0x248 lock_acquire.part.0 from _raw_spin_lock+0x38/0x48 _raw_spin_lock from sch_direct_xmit+0x16c/0x2e8 sch_direct_xmit from __dev_queue_xmit+0x41c/0x1224 __dev_queue_xmit from ip6_finish_output2+0x5f4/0xc64 ip6_finish_output2 from ndisc_send_skb+0x4cc/0x81c ndisc_send_skb from addrconf_rs_timer+0xb0/0x2f8 addrconf_rs_timer from call_timer_fn+0xb4/0x33c call_timer_fn from expire_timers+0xb4/0x10c expire_timers from run_timer_softirq+0xf8/0x2a8 run_timer_softirq from __do_softirq+0xd4/0x5fc __do_softirq from __irq_exit_rcu+0x138/0x17c __irq_exit_rcu from irq_exit+0x8/0x28 irq_exit from __irq_svc+0x90/0xbc Exception stack(0xc1001f20 to 0xc1001f68) 1f20: ffffffff ffffffff 00000001 c011f840 c100e000 c100e000 c1009314 c1009370 1f40: c10f0c1a c0d5e564 c0f5da8c 00000000 00000000 c1001f70 c010f0bc c010f0c0 1f60: 600f0013 ffffffff __irq_svc from arch_cpu_idle+0x30/0x3c arch_cpu_idle from default_idle_call+0x44/0xac default_idle_call from do_idle+0xc8/0x138 do_idle from cpu_startup_entry+0x18/0x1c cpu_startup_entry from rest_init+0xcc/0x168 rest_init from arch_post_acpi_subsys_init+0x0/0x8 Fix this by using spin_lock_irqsave/spin_lock_irqrestore also inside lan966x_ptp_irq_handler. Fixes: e85a96e ("net: lan966x: Add support for ptp interrupts") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20230217210917.2649365-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 479d4f0 ] If vc4_hdmi_reset_link() returns -EDEADLK, it means that a deadlock happened in the locking context. This situation should be addressed by dropping all currently held locks and block until the contended lock becomes available. Currently, vc4 is not dealing with the deadlock properly, producing the following output when PROVE_LOCKING is enabled: [ 825.612809] ------------[ cut here ]------------ [ 825.612852] WARNING: CPU: 1 PID: 116 at drivers/gpu/drm/drm_modeset_lock.c:276 drm_modeset_drop_locks+0x60/0x68 [drm] [ 825.613458] Modules linked in: 8021q mrp garp stp llc raspberrypi_cpufreq brcmfmac brcmutil crct10dif_ce hci_uart cfg80211 btqca btbcm bluetooth vc4 raspberrypi_hwmon snd_soc_hdmi_codec cec clk_raspberrypi ecdh_generic drm_display_helper ecc rfkill drm_dma_helper drm_kms_helper pwm_bcm2835 bcm2835_thermal bcm2835_rng rng_core i2c_bcm2835 drm fuse ip_tables x_tables ipv6 [ 825.613735] CPU: 1 PID: 116 Comm: kworker/1:2 Tainted: G W 6.1.0-rc6-01399-g941aae326315 friendlyarm#3 [ 825.613759] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT) [ 825.613777] Workqueue: events output_poll_execute [drm_kms_helper] [ 825.614038] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 825.614063] pc : drm_modeset_drop_locks+0x60/0x68 [drm] [ 825.614603] lr : drm_helper_probe_detect+0x120/0x1b4 [drm_kms_helper] [ 825.614829] sp : ffff800008313bf0 [ 825.614844] x29: ffff800008313bf0 x28: ffffcd7778b8b000 x27: 0000000000000000 [ 825.614883] x26: 0000000000000001 x25: 0000000000000001 x24: ffff677cc35c2758 [ 825.614920] x23: ffffcd7707d01430 x22: ffffcd7707c3edc7 x21: 0000000000000001 [ 825.614958] x20: 0000000000000000 x19: ffff800008313c10 x18: 000000000000b6d3 [ 825.614995] x17: ffffcd777835e214 x16: ffffcd7777cef870 x15: fffff81000000000 [ 825.615033] x14: 0000000000000000 x13: 0000000000000099 x12: 0000000000000002 [ 825.615070] x11: 72917988020af800 x10: 72917988020af800 x9 : 72917988020af800 [ 825.615108] x8 : ffff677cc665e0a8 x7 : d00a8c180000110c x6 : ffffcd77774c0054 [ 825.615145] x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000 [ 825.615181] x2 : ffff677cc55e1880 x1 : ffffcd7777cef8ec x0 : ffff800008313c10 [ 825.615219] Call trace: [ 825.615232] drm_modeset_drop_locks+0x60/0x68 [drm] [ 825.615773] drm_helper_probe_detect+0x120/0x1b4 [drm_kms_helper] [ 825.616003] output_poll_execute+0xe4/0x224 [drm_kms_helper] [ 825.616233] process_one_work+0x2b4/0x618 [ 825.616264] worker_thread+0x24c/0x464 [ 825.616288] kthread+0xec/0x110 [ 825.616310] ret_from_fork+0x10/0x20 [ 825.616335] irq event stamp: 7634 [ 825.616349] hardirqs last enabled at (7633): [<ffffcd777831ee90>] _raw_spin_unlock_irq+0x3c/0x78 [ 825.616384] hardirqs last disabled at (7634): [<ffffcd7778315a78>] __schedule+0x134/0x9f0 [ 825.616411] softirqs last enabled at (7630): [<ffffcd7707aacea0>] local_bh_enable+0x4/0x30 [ipv6] [ 825.617019] softirqs last disabled at (7618): [<ffffcd7707aace70>] local_bh_disable+0x4/0x30 [ipv6] [ 825.617586] ---[ end trace 0000000000000000 ]--- Therefore, deal with the deadlock as suggested by [1], using the function drm_modeset_backoff(). [1] https://docs.kernel.org/gpu/drm-kms.html?highlight=kms#kms-locking Fixes: 6bed2ea ("drm/vc4: hdmi: Reset link on hotplug") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20221229194638.178712-1-mcanal@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 91621be ] When --overwrite and --max-size options of perf record are used together, a segmentation fault occurs. The following is an example: # perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1 [ perf record: Woken up 1 times to write data ] perf: Segmentation fault Obtained 12 stack frames. ./perf/perf(+0x197673) [0x55f99710b673] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f] ./perf/perf(+0x8eb40) [0x55f997002b40] ./perf/perf(+0x1f6882) [0x55f99716a882] ./perf/perf(+0x794c2) [0x55f996fed4c2] ./perf/perf(+0x7b7c7) [0x55f996fef7c7] ./perf/perf(+0x9074b) [0x55f99700474b] ./perf/perf(+0x12e23c) [0x55f9970a223c] ./perf/perf(+0x12e54a) [0x55f9970a254a] ./perf/perf(+0x7db60) [0x55f996ff1b60] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86] ./perf/perf(+0x7dfe9) [0x55f996ff1fe9] Segmentation fault (core dumped) backtrace of the core file is as follows: (gdb) bt #0 record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234 friendlyarm#1 record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242 friendlyarm#2 record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263 friendlyarm#3 process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618 friendlyarm#4 0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658, from=from@entry=0) at util/synthetic-events.c:1895 friendlyarm#5 0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658) at util/synthetic-events.c:1905 friendlyarm#6 0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997 friendlyarm#7 0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802 friendlyarm#8 0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258 friendlyarm#9 0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330 torvalds#10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384 torvalds#11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428 torvalds#12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562 The reason is that record__bytes_written accesses the freed memory rec->thread_data, The process is as follows: __cmd_record -> record__free_thread_data -> zfree(&rec->thread_data) // free rec->thread_data -> record__synthesize -> perf_event__synthesize_id_index -> process_synthesized_event -> record__write -> record__bytes_written // access rec->thread_data We add a member variable "thread_bytes_written" in the struct "record" to save the data size written by the threads. Fixes: 6d57581 ("perf record: Add support for limit perf output file size") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
commit 60eed1e upstream. code path: ocfs2_ioctl_move_extents ocfs2_move_extents ocfs2_defrag_extent __ocfs2_move_extent + ocfs2_journal_access_di + ocfs2_split_extent //sub-paths call jbd2_journal_restart + ocfs2_journal_dirty //crash by jbs2 ASSERT crash stacks: PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2" #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01 friendlyarm#1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d friendlyarm#2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d friendlyarm#3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f friendlyarm#4 [ffffb25d8dad3a58] do_trap at ffffffff83833205 friendlyarm#5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6 friendlyarm#6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18 [exception RIP: jbd2_journal_dirty_metadata+0x2ba] RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207 RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250 RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000 R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28 R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 friendlyarm#7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2] friendlyarm#8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2] friendlyarm#9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2] Analysis This bug has the same root cause of 'commit 7f27ec9 ("ocfs2: call ocfs2_journal_access_di() before ocfs2_journal_dirty() in ocfs2_write_end_nolock()")'. For this bug, jbd2_journal_restart() is called by ocfs2_split_extent() during defragmenting. How to fix For ocfs2_split_extent() can handle journal operations totally by itself. Caller doesn't need to call journal access/dirty pair, and caller only needs to call journal start/stop pair. The fix method is to remove journal access/dirty from __ocfs2_move_extent(). The discussion for this patch: https://oss.oracle.com/pipermail/ocfs2-devel/2023-February/000647.html Link: https://lkml.kernel.org/r/20230217003717.32469-1-heming.zhao@suse.com Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
commit 951df98 upstream. UBSAN complains about invalid value for bool: [ 101.165172] [drm] Initialized gud 1.0.0 20200422 for 2-3.2:1.0 on minor 1 [ 101.213360] gud 2-3.2:1.0: [drm] fb1: guddrmfb frame buffer device [ 101.213426] usbcore: registered new interface driver gud [ 101.989431] ================================================================================ [ 101.989441] UBSAN: invalid-load in linux/include/linux/iosys-map.h:253:9 [ 101.989447] load of value 121 is not a valid value for type '_Bool' [ 101.989451] CPU: 1 PID: 455 Comm: kworker/1:6 Not tainted 5.18.0-rc5-gud-5.18-rc5 friendlyarm#3 [ 101.989456] Hardware name: Hewlett-Packard HP EliteBook 820 G1/1991, BIOS L71 Ver. 01.44 04/12/2018 [ 101.989459] Workqueue: events_long gud_flush_work [gud] [ 101.989471] Call Trace: [ 101.989474] <TASK> [ 101.989479] dump_stack_lvl+0x49/0x5f [ 101.989488] dump_stack+0x10/0x12 [ 101.989493] ubsan_epilogue+0x9/0x3b [ 101.989498] __ubsan_handle_load_invalid_value.cold+0x44/0x49 [ 101.989504] dma_buf_vmap.cold+0x38/0x3d [ 101.989511] ? find_busiest_group+0x48/0x300 [ 101.989520] drm_gem_shmem_vmap+0x76/0x1b0 [drm_shmem_helper] [ 101.989528] drm_gem_shmem_object_vmap+0x9/0xb [drm_shmem_helper] [ 101.989535] drm_gem_vmap+0x26/0x60 [drm] [ 101.989594] drm_gem_fb_vmap+0x47/0x150 [drm_kms_helper] [ 101.989630] gud_prep_flush+0xc1/0x710 [gud] [ 101.989639] ? _raw_spin_lock+0x17/0x40 [ 101.989648] gud_flush_work+0x1e0/0x430 [gud] [ 101.989653] ? __switch_to+0x11d/0x470 [ 101.989664] process_one_work+0x21f/0x3f0 [ 101.989673] worker_thread+0x200/0x3e0 [ 101.989679] ? rescuer_thread+0x390/0x390 [ 101.989684] kthread+0xfd/0x130 [ 101.989690] ? kthread_complete_and_exit+0x20/0x20 [ 101.989696] ret_from_fork+0x22/0x30 [ 101.989706] </TASK> [ 101.989708] ================================================================================ The source of this warning is in iosys_map_clear() called from dma_buf_vmap(). It conditionally sets values based on map->is_iomem. The iosys_map variables are allocated uninitialized on the stack leading to ->is_iomem having all kinds of values and not only 0/1. Fix this by zeroing the iosys_map variables. Fixes: 40e1a70 ("drm: Add GUD USB Display driver") Cc: <stable@vger.kernel.org> # v5.18+ Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221122-gud-shadow-plane-v2-1-435037990a83@tronnes.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 3dca1f8 ] Don't hold sdw_dev_lock while calling the peripheral driver probe() and remove() callbacks. Holding sdw_dev_lock around the probe() and remove() calls causes a theoretical mutex inversion which lockdep will assert on. During probe() the sdw_dev_lock mutex is taken first and then ASoC/ALSA locks are taken by the probe() implementation. During normal operation ASoC can take its locks and then trigger a runtime resume of the component. The SoundWire resume will then take sdw_dev_lock. This is the reverse order compared to probe(). It's not necessary to hold sdw_dev_lock when calling the probe() and remove(), it is only used to prevent the bus core calling the driver callbacks if there isn't a driver or the driver is removing. All calls to the driver callbacks are guarded by the 'probed' flag. So if sdw_dev_lock is held while setting and clearing the 'probed' flag this is sufficient to guarantee the safety of callback functions. Removing the mutex from around the call to probe() means that it is now possible for a bus event (PING response) to be handled in parallel with the probe(). But sdw_bus_probe() already has handling for this by calling the device update_status() after the probe() has completed. Example lockdep assert: [ 46.098514] ====================================================== [ 46.104736] WARNING: possible circular locking dependency detected [ 46.110961] 6.1.0-rc4-jamerson friendlyarm#1 Tainted: G E [ 46.116842] ------------------------------------------------------ [ 46.123063] mpg123/1130 is trying to acquire lock: [ 46.127883] ffff8b445031fb80 (&slave->sdw_dev_lock){+.+.}-{3:3}, at: sdw_update_slave_status+0x26/0x70 [ 46.137225] but task is already holding lock: [ 46.143074] ffffffffc1455310 (&card->pcm_mutex){+.+.}-{3:3}, at: dpcm_fe_dai_open+0x49/0x830 [ 46.151536] which lock already depends on the new lock.[ 46.159732] the existing dependency chain (in reverse order) is: [ 46.167231] -> friendlyarm#4 (&card->pcm_mutex){+.+.}-{3:3}: [ 46.173428] __mutex_lock+0x94/0x920 [ 46.177542] snd_soc_dpcm_runtime_update+0x2e/0x100 [ 46.182958] snd_soc_dapm_put_enum_double+0x1c2/0x200 [ 46.188548] snd_ctl_elem_write+0x10c/0x1d0 [ 46.193268] snd_ctl_ioctl+0x126/0x850 [ 46.197556] __x64_sys_ioctl+0x87/0xc0 [ 46.201845] do_syscall_64+0x38/0x90 [ 46.205959] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 46.211553] -> friendlyarm#3 (&card->controls_rwsem){++++}-{3:3}: [ 46.218188] down_write+0x2b/0xd0 [ 46.222038] snd_ctl_add_replace+0x39/0xb0 [ 46.226672] snd_soc_add_controls+0x53/0x80 [ 46.231393] soc_probe_component+0x1e4/0x2a0 [ 46.236202] snd_soc_bind_card+0x51a/0xc80 [ 46.240836] devm_snd_soc_register_card+0x43/0x90 [ 46.246079] mc_probe+0x982/0xfe0 [snd_soc_sof_sdw] [ 46.251500] platform_probe+0x3c/0xa0 [ 46.255700] really_probe+0xde/0x390 [ 46.259814] __driver_probe_device+0x78/0x180 [ 46.264710] driver_probe_device+0x1e/0x90 [ 46.269347] __driver_attach+0x9f/0x1f0 [ 46.273721] bus_for_each_dev+0x78/0xc0 [ 46.278098] bus_add_driver+0x1ac/0x200 [ 46.282473] driver_register+0x8f/0xf0 [ 46.286759] do_one_initcall+0x58/0x310 [ 46.291136] do_init_module+0x4c/0x1f0 [ 46.295422] __do_sys_finit_module+0xb4/0x130 [ 46.300321] do_syscall_64+0x38/0x90 [ 46.304434] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 46.310027] -> friendlyarm#2 (&card->mutex){+.+.}-{3:3}: [ 46.315883] __mutex_lock+0x94/0x920 [ 46.320000] snd_soc_bind_card+0x3e/0xc80 [ 46.324551] devm_snd_soc_register_card+0x43/0x90 [ 46.329798] mc_probe+0x982/0xfe0 [snd_soc_sof_sdw] [ 46.335219] platform_probe+0x3c/0xa0 [ 46.339420] really_probe+0xde/0x390 [ 46.343532] __driver_probe_device+0x78/0x180 [ 46.348430] driver_probe_device+0x1e/0x90 [ 46.353065] __driver_attach+0x9f/0x1f0 [ 46.357437] bus_for_each_dev+0x78/0xc0 [ 46.361812] bus_add_driver+0x1ac/0x200 [ 46.366716] driver_register+0x8f/0xf0 [ 46.371528] do_one_initcall+0x58/0x310 [ 46.376424] do_init_module+0x4c/0x1f0 [ 46.381239] __do_sys_finit_module+0xb4/0x130 [ 46.386665] do_syscall_64+0x38/0x90 [ 46.391299] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 46.397416] -> friendlyarm#1 (client_mutex){+.+.}-{3:3}: [ 46.404307] __mutex_lock+0x94/0x920 [ 46.408941] snd_soc_add_component+0x24/0x2c0 [ 46.414345] devm_snd_soc_register_component+0x54/0xa0 [ 46.420522] cs35l56_common_probe+0x280/0x370 [snd_soc_cs35l56] [ 46.427487] cs35l56_sdw_probe+0xf4/0x170 [snd_soc_cs35l56_sdw] [ 46.434442] sdw_drv_probe+0x80/0x1a0 [ 46.439136] really_probe+0xde/0x390 [ 46.443738] __driver_probe_device+0x78/0x180 [ 46.449120] driver_probe_device+0x1e/0x90 [ 46.454247] __driver_attach+0x9f/0x1f0 [ 46.459106] bus_for_each_dev+0x78/0xc0 [ 46.463971] bus_add_driver+0x1ac/0x200 [ 46.468825] driver_register+0x8f/0xf0 [ 46.473592] do_one_initcall+0x58/0x310 [ 46.478441] do_init_module+0x4c/0x1f0 [ 46.483202] __do_sys_finit_module+0xb4/0x130 [ 46.488572] do_syscall_64+0x38/0x90 [ 46.493158] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 46.499229] -> #0 (&slave->sdw_dev_lock){+.+.}-{3:3}: [ 46.506737] __lock_acquire+0x1121/0x1df0 [ 46.511765] lock_acquire+0xd5/0x300 [ 46.516360] __mutex_lock+0x94/0x920 [ 46.520949] sdw_update_slave_status+0x26/0x70 [ 46.526409] sdw_clear_slave_status+0xd8/0xe0 [ 46.531783] intel_resume_runtime+0x139/0x2a0 [ 46.537155] __rpm_callback+0x41/0x120 [ 46.541919] rpm_callback+0x5d/0x70 [ 46.546422] rpm_resume+0x531/0x7e0 [ 46.550920] __pm_runtime_resume+0x4a/0x80 [ 46.556024] snd_soc_pcm_component_pm_runtime_get+0x2f/0xc0 [ 46.562611] __soc_pcm_open+0x62/0x520 [ 46.567375] dpcm_be_dai_startup+0x116/0x210 [ 46.572661] dpcm_fe_dai_open+0xf7/0x830 [ 46.577597] snd_pcm_open_substream+0x54a/0x8b0 [ 46.583145] snd_pcm_open.part.0+0xdc/0x200 [ 46.588341] snd_pcm_playback_open+0x51/0x80 [ 46.593625] chrdev_open+0xc0/0x250 [ 46.598129] do_dentry_open+0x15f/0x430 [ 46.602981] path_openat+0x75e/0xa80 [ 46.607575] do_filp_open+0xb2/0x160 [ 46.612162] do_sys_openat2+0x9a/0x160 [ 46.616922] __x64_sys_openat+0x53/0xa0 [ 46.621767] do_syscall_64+0x38/0x90 [ 46.626352] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 46.632414] other info that might help us debug this:[ 46.641862] Chain exists of: &slave->sdw_dev_lock --> &card->controls_rwsem --> &card->pcm_mutex[ 46.655145] Possible unsafe locking scenario:[ 46.662048] CPU0 CPU1 [ 46.667080] ---- ---- [ 46.672108] lock(&card->pcm_mutex); [ 46.676267] lock(&card->controls_rwsem); [ 46.683382] lock(&card->pcm_mutex); [ 46.690063] lock(&slave->sdw_dev_lock); [ 46.694574] *** DEADLOCK ***[ 46.701942] 2 locks held by mpg123/1130: [ 46.706356] #0: ffff8b4457b22b90 (&pcm->open_mutex){+.+.}-{3:3}, at: snd_pcm_open.part.0+0xc9/0x200 [ 46.715999] friendlyarm#1: ffffffffc1455310 (&card->pcm_mutex){+.+.}-{3:3}, at: dpcm_fe_dai_open+0x49/0x830 [ 46.725390] stack backtrace: [ 46.730752] CPU: 0 PID: 1130 Comm: mpg123 Tainted: G E 6.1.0-rc4-jamerson friendlyarm#1 [ 46.739703] Hardware name: AAEON UP-WHL01/UP-WHL01, BIOS UPW1AM19 11/10/2020 [ 46.747270] Call Trace: [ 46.750239] <TASK> [ 46.752857] dump_stack_lvl+0x56/0x73 [ 46.757045] check_noncircular+0x102/0x120 [ 46.761664] __lock_acquire+0x1121/0x1df0 [ 46.766197] lock_acquire+0xd5/0x300 [ 46.770292] ? sdw_update_slave_status+0x26/0x70 [ 46.775432] ? lock_is_held_type+0xe2/0x140 [ 46.780143] __mutex_lock+0x94/0x920 [ 46.784241] ? sdw_update_slave_status+0x26/0x70 [ 46.789387] ? find_held_lock+0x2b/0x80 [ 46.793750] ? sdw_update_slave_status+0x26/0x70 [ 46.798894] ? lock_release+0x147/0x2f0 [ 46.803262] ? lockdep_init_map_type+0x47/0x250 [ 46.808315] ? sdw_update_slave_status+0x26/0x70 [ 46.813456] sdw_update_slave_status+0x26/0x70 [ 46.818422] sdw_clear_slave_status+0xd8/0xe0 [ 46.823302] ? pm_generic_runtime_suspend+0x30/0x30 [ 46.828706] intel_resume_runtime+0x139/0x2a0 [ 46.833583] ? _raw_spin_unlock_irq+0x24/0x50 [ 46.838462] ? pm_generic_runtime_suspend+0x30/0x30 [ 46.843866] __rpm_callback+0x41/0x120 [ 46.848142] ? pm_generic_runtime_suspend+0x30/0x30 [ 46.853550] rpm_callback+0x5d/0x70 [ 46.857568] rpm_resume+0x531/0x7e0 [ 46.861578] ? _raw_spin_lock_irqsave+0x62/0x70 [ 46.866634] __pm_runtime_resume+0x4a/0x80 [ 46.871258] snd_soc_pcm_component_pm_runtime_get+0x2f/0xc0 [ 46.877358] __soc_pcm_open+0x62/0x520 [ 46.881634] ? dpcm_add_paths.isra.0+0x35d/0x4c0 [ 46.886784] dpcm_be_dai_startup+0x116/0x210 [ 46.891592] dpcm_fe_dai_open+0xf7/0x830 [ 46.896046] ? debug_mutex_init+0x33/0x50 [ 46.900591] snd_pcm_open_substream+0x54a/0x8b0 [ 46.905658] snd_pcm_open.part.0+0xdc/0x200 [ 46.910376] ? wake_up_q+0x90/0x90 [ 46.914312] snd_pcm_playback_open+0x51/0x80 [ 46.919118] chrdev_open+0xc0/0x250 [ 46.923147] ? cdev_device_add+0x90/0x90 [ 46.927608] do_dentry_open+0x15f/0x430 [ 46.931976] path_openat+0x75e/0xa80 [ 46.936086] do_filp_open+0xb2/0x160 [ 46.940194] ? lock_release+0x147/0x2f0 [ 46.944563] ? _raw_spin_unlock+0x29/0x50 [ 46.949101] do_sys_openat2+0x9a/0x160 [ 46.953377] __x64_sys_openat+0x53/0xa0 [ 46.957733] do_syscall_64+0x38/0x90 [ 46.961829] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 46.967402] RIP: 0033:0x7fa6397ccd3b [ 46.971506] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 [ 46.991413] RSP: 002b:00007fff838e8990 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 46.999580] RAX: ffffffffffffffda RBX: 0000000000080802 RCX: 00007fa6397ccd3b [ 47.007311] RDX: 0000000000080802 RSI: 00007fff838e8b50 RDI: 00000000ffffff9c [ 47.015047] RBP: 00007fff838e8b50 R08: 0000000000000000 R09: 0000000000000011 [ 47.022787] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080802 [ 47.030539] R13: 0000000000000004 R14: 0000000000000000 R15: 00007fff838e8b50 [ 47.038289] </TASK> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20230123172520.339367-1-rf@opensource.cirrus.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 13085e1 ] The following LOCKDEP was detected: Workqueue: events smc_lgr_free_work [smc] WARNING: possible circular locking dependency detected 6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug friendlyarm#1 Not tainted ------------------------------------------------------ kworker/3:0/176251 is trying to acquire lock: 00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}, at: __flush_workqueue+0x7a/0x4f0 but task is already holding lock: 0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> friendlyarm#4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __flush_work+0x76/0xf0 __cancel_work_timer+0x170/0x220 __smc_lgr_terminate.part.0+0x34/0x1c0 [smc] smc_connect_rdma+0x15e/0x418 [smc] __smc_connect+0x234/0x480 [smc] smc_connect+0x1d6/0x230 [smc] __sys_connect+0x90/0xc0 __do_sys_socketcall+0x186/0x370 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> friendlyarm#3 (smc_client_lgr_pending){+.+.}-{3:3}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __mutex_lock+0x96/0x8e8 mutex_lock_nested+0x32/0x40 smc_connect_rdma+0xa4/0x418 [smc] __smc_connect+0x234/0x480 [smc] smc_connect+0x1d6/0x230 [smc] __sys_connect+0x90/0xc0 __do_sys_socketcall+0x186/0x370 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> friendlyarm#2 (sk_lock-AF_SMC){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 lock_sock_nested+0x46/0xa8 smc_tx_work+0x34/0x50 [smc] process_one_work+0x30c/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 -> friendlyarm#1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}: __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 process_one_work+0x2bc/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 -> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}: check_prev_add+0xd8/0xe88 validate_chain+0x70c/0xb20 __lock_acquire+0x58e/0xbd8 lock_acquire.part.0+0xe2/0x248 lock_acquire+0xac/0x1c8 __flush_workqueue+0xaa/0x4f0 drain_workqueue+0xaa/0x158 destroy_workqueue+0x44/0x2d8 smc_lgr_free+0x9e/0xf8 [smc] process_one_work+0x30c/0x730 worker_thread+0x62/0x420 kthread+0x138/0x150 __ret_from_fork+0x3c/0x58 ret_from_fork+0xa/0x40 other info that might help us debug this: Chain exists of: (wq_completion)smc_tx_wq-00000000#2 --> smc_client_lgr_pending --> (work_completion)(&(&lgr->free_work)->work) Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((work_completion)(&(&lgr->free_work)->work)); lock(smc_client_lgr_pending); lock((work_completion) (&(&lgr->free_work)->work)); lock((wq_completion)smc_tx_wq-00000000#2); *** DEADLOCK *** 2 locks held by kworker/3:0/176251: #0: 0000000080183548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x232/0x730 friendlyarm#1: 0000037fffe97dc8 ((work_completion) (&(&lgr->free_work)->work)){+.+.}-{0:0}, at: process_one_work+0x232/0x730 stack backtrace: CPU: 3 PID: 176251 Comm: kworker/3:0 Not tainted Hardware name: IBM 8561 T01 701 (z/VM 7.2.0) Call Trace: [<000000002983c3e4>] dump_stack_lvl+0xac/0x100 [<0000000028b477ae>] check_noncircular+0x13e/0x160 [<0000000028b48808>] check_prev_add+0xd8/0xe88 [<0000000028b49cc4>] validate_chain+0x70c/0xb20 [<0000000028b4bd26>] __lock_acquire+0x58e/0xbd8 [<0000000028b4cf6a>] lock_acquire.part.0+0xe2/0x248 [<0000000028b4d17c>] lock_acquire+0xac/0x1c8 [<0000000028addaaa>] __flush_workqueue+0xaa/0x4f0 [<0000000028addf9a>] drain_workqueue+0xaa/0x158 [<0000000028ae303c>] destroy_workqueue+0x44/0x2d8 [<000003ff8029af26>] smc_lgr_free+0x9e/0xf8 [smc] [<0000000028adf3d4>] process_one_work+0x30c/0x730 [<0000000028adf85a>] worker_thread+0x62/0x420 [<0000000028aeac50>] kthread+0x138/0x150 [<0000000028a63914>] __ret_from_fork+0x3c/0x58 [<00000000298503da>] ret_from_fork+0xa/0x40 INFO: lockdep is turned off. =================================================================== This deadlock occurs because cancel_delayed_work_sync() waits for the work(&lgr->free_work) to finish, while the &lgr->free_work waits for the work(lgr->tx_wq), which needs the sk_lock-AF_SMC, that is already used under the mutex_lock. The solution is to use cancel_delayed_work() instead, which kills off a pending work. Fixes: a52bcc9 ("net/smc: improve termination processing") Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Jan Karcher <jaka@linux.ibm.com> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
commit e021343 upstream. Histogram values can not be strings, stacktraces, graphs, symbols, syscalls, or grouped in buckets or log. Give an error if a value is set to do so. Note, the histogram code was not prepared to handle these modifiers for histograms and caused a bug. Mark Rutland reported: # echo 'p:copy_to_user __arch_copy_to_user n=$arg2' >> /sys/kernel/tracing/kprobe_events # echo 'hist:keys=n:vals=hitcount.buckets=8:sort=hitcount' > /sys/kernel/tracing/events/kprobes/copy_to_user/trigger # cat /sys/kernel/tracing/events/kprobes/copy_to_user/hist [ 143.694628] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 143.695190] Mem abort info: [ 143.695362] ESR = 0x0000000096000004 [ 143.695604] EC = 0x25: DABT (current EL), IL = 32 bits [ 143.695889] SET = 0, FnV = 0 [ 143.696077] EA = 0, S1PTW = 0 [ 143.696302] FSC = 0x04: level 0 translation fault [ 143.702381] Data abort info: [ 143.702614] ISV = 0, ISS = 0x00000004 [ 143.702832] CM = 0, WnR = 0 [ 143.703087] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000448f9000 [ 143.703407] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 143.704137] Internal error: Oops: 0000000096000004 [friendlyarm#1] PREEMPT SMP [ 143.704714] Modules linked in: [ 143.705273] CPU: 0 PID: 133 Comm: cat Not tainted 6.2.0-00003-g6fc512c10a7c friendlyarm#3 [ 143.706138] Hardware name: linux,dummy-virt (DT) [ 143.706723] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 143.707120] pc : hist_field_name.part.0+0x14/0x140 [ 143.707504] lr : hist_field_name.part.0+0x104/0x140 [ 143.707774] sp : ffff800008333a30 [ 143.707952] x29: ffff800008333a30 x28: 0000000000000001 x27: 0000000000400cc0 [ 143.708429] x26: ffffd7a653b20260 x25: 0000000000000000 x24: ffff10d303ee5800 [ 143.708776] x23: ffffd7a6539b27b0 x22: ffff10d303fb8c00 x21: 0000000000000001 [ 143.709127] x20: ffff10d303ec2000 x19: 0000000000000000 x18: 0000000000000000 [ 143.709478] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 143.709824] x14: 0000000000000000 x13: 203a6f666e692072 x12: 6567676972742023 [ 143.710179] x11: 0a230a6d6172676f x10: 000000000000002c x9 : ffffd7a6521e018c [ 143.710584] x8 : 000000000000002c x7 : 7f7f7f7f7f7f7f7f x6 : 000000000000002c [ 143.710915] x5 : ffff10d303b0103e x4 : ffffd7a653b20261 x3 : 000000000000003d [ 143.711239] x2 : 0000000000020001 x1 : 0000000000000001 x0 : 0000000000000000 [ 143.711746] Call trace: [ 143.712115] hist_field_name.part.0+0x14/0x140 [ 143.712642] hist_field_name.part.0+0x104/0x140 [ 143.712925] hist_field_print+0x28/0x140 [ 143.713125] event_hist_trigger_print+0x174/0x4d0 [ 143.713348] hist_show+0xf8/0x980 [ 143.713521] seq_read_iter+0x1bc/0x4b0 [ 143.713711] seq_read+0x8c/0xc4 [ 143.713876] vfs_read+0xc8/0x2a4 [ 143.714043] ksys_read+0x70/0xfc [ 143.714218] __arm64_sys_read+0x24/0x30 [ 143.714400] invoke_syscall+0x50/0x120 [ 143.714587] el0_svc_common.constprop.0+0x4c/0x100 [ 143.714807] do_el0_svc+0x44/0xd0 [ 143.714970] el0_svc+0x2c/0x84 [ 143.715134] el0t_64_sync_handler+0xbc/0x140 [ 143.715334] el0t_64_sync+0x190/0x194 [ 143.715742] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (f9400000) [ 143.716510] ---[ end trace 0000000000000000 ]--- Segmentation fault Link: https://lkml.kernel.org/r/20230302020810.559462599@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Fixes: c6afad4 ("tracing: Add hist trigger 'sym' and 'sym-offset' modifiers") Reported-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 2b4cc3d ] The check introduced in the commit a5fd394 ("igc: Lift TAPRIO schedule restriction") can detect a false positive error in some corner case. For instance, tc qdisc replace ... taprio num_tc 4 ... sched-entry S 0x01 100000 # slot#1 sched-entry S 0x03 100000 # slot#2 sched-entry S 0x04 100000 # slot#3 sched-entry S 0x08 200000 # slot#4 flags 0x02 # hardware offload Here the queue#0 (the first queue) is on at the slot#1 and friendlyarm#2, and off at the slot#3 and friendlyarm#4. Under the current logic, when the slot#4 is examined, validate_schedule() returns *false* since the enablement count for the queue#0 is two and it is already off at the previous slot (i.e. friendlyarm#3). But this definition is truely correct. Let's fix the logic to enforce a strict validation for consecutively-opened slots. Fixes: a5fd394 ("igc: Lift TAPRIO schedule restriction") Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb friendlyarm#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d friendlyarm#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc friendlyarm#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 friendlyarm#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] friendlyarm#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 friendlyarm#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa friendlyarm#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc friendlyarm#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e friendlyarm#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 torvalds#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 torvalds#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] torvalds#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] torvalds#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] torvalds#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 torvalds#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 torvalds#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 torvalds#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 torvalds#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 torvalds#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc torvalds#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d torvalds#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 torvalds#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <mcornea@redhat.com> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit bce5640 ] To loop a variable-length array, hci_init_stage_sync(stage) considers that stage[i] is valid as long as stage[i-1].func is valid. Thus, the last element of stage[].func should be intentionally invalid as hci_init0[], le_init2[], and others did. However, amp_init1[] and amp_init2[] have no invalid element, letting hci_init_stage_sync() keep accessing amp_init1[] over its valid range. This patch fixes this by adding {} in the last of amp_init1[] and amp_init2[]. ================================================================== BUG: KASAN: global-out-of-bounds in hci_dev_open_sync ( /v6.2-bzimage/net/bluetooth/hci_sync.c:3154 /v6.2-bzimage/net/bluetooth/hci_sync.c:3343 /v6.2-bzimage/net/bluetooth/hci_sync.c:4418 /v6.2-bzimage/net/bluetooth/hci_sync.c:4609 /v6.2-bzimage/net/bluetooth/hci_sync.c:4689) Read of size 8 at addr ffffffffaed1ab70 by task kworker/u5:0/1032 CPU: 0 PID: 1032 Comm: kworker/u5:0 Not tainted 6.2.0 friendlyarm#3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04 Workqueue: hci1 hci_power_on Call Trace: <TASK> dump_stack_lvl (/v6.2-bzimage/lib/dump_stack.c:107 (discriminator 1)) print_report (/v6.2-bzimage/mm/kasan/report.c:307 /v6.2-bzimage/mm/kasan/report.c:417) ? hci_dev_open_sync (/v6.2-bzimage/net/bluetooth/hci_sync.c:3154 /v6.2-bzimage/net/bluetooth/hci_sync.c:3343 /v6.2-bzimage/net/bluetooth/hci_sync.c:4418 /v6.2-bzimage/net/bluetooth/hci_sync.c:4609 /v6.2-bzimage/net/bluetooth/hci_sync.c:4689) kasan_report (/v6.2-bzimage/mm/kasan/report.c:184 /v6.2-bzimage/mm/kasan/report.c:519) ? hci_dev_open_sync (/v6.2-bzimage/net/bluetooth/hci_sync.c:3154 /v6.2-bzimage/net/bluetooth/hci_sync.c:3343 /v6.2-bzimage/net/bluetooth/hci_sync.c:4418 /v6.2-bzimage/net/bluetooth/hci_sync.c:4609 /v6.2-bzimage/net/bluetooth/hci_sync.c:4689) hci_dev_open_sync (/v6.2-bzimage/net/bluetooth/hci_sync.c:3154 /v6.2-bzimage/net/bluetooth/hci_sync.c:3343 /v6.2-bzimage/net/bluetooth/hci_sync.c:4418 /v6.2-bzimage/net/bluetooth/hci_sync.c:4609 /v6.2-bzimage/net/bluetooth/hci_sync.c:4689) ? __pfx_hci_dev_open_sync (/v6.2-bzimage/net/bluetooth/hci_sync.c:4635) ? mutex_lock (/v6.2-bzimage/./arch/x86/include/asm/atomic64_64.h:190 /v6.2-bzimage/./include/linux/atomic/atomic-long.h:443 /v6.2-bzimage/./include/linux/atomic/atomic-instrumented.h:1781 /v6.2-bzimage/kernel/locking/mutex.c:171 /v6.2-bzimage/kernel/locking/mutex.c:285) ? __pfx_mutex_lock (/v6.2-bzimage/kernel/locking/mutex.c:282) hci_power_on (/v6.2-bzimage/net/bluetooth/hci_core.c:485 /v6.2-bzimage/net/bluetooth/hci_core.c:984) ? __pfx_hci_power_on (/v6.2-bzimage/net/bluetooth/hci_core.c:969) ? read_word_at_a_time (/v6.2-bzimage/./include/asm-generic/rwonce.h:85) ? strscpy (/v6.2-bzimage/./arch/x86/include/asm/word-at-a-time.h:62 /v6.2-bzimage/lib/string.c:161) process_one_work (/v6.2-bzimage/kernel/workqueue.c:2294) worker_thread (/v6.2-bzimage/./include/linux/list.h:292 /v6.2-bzimage/kernel/workqueue.c:2437) ? __pfx_worker_thread (/v6.2-bzimage/kernel/workqueue.c:2379) kthread (/v6.2-bzimage/kernel/kthread.c:376) ? __pfx_kthread (/v6.2-bzimage/kernel/kthread.c:331) ret_from_fork (/v6.2-bzimage/arch/x86/entry/entry_64.S:314) </TASK> The buggy address belongs to the variable: amp_init1+0x30/0x60 The buggy address belongs to the physical page: page:000000003a157ec6 refcount:1 mapcount:0 mapping:0000000000000000 ia flags: 0x200000000001000(reserved|node=0|zone=2) raw: 0200000000001000 ffffea0005054688 ffffea0005054688 000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffffffaed1aa00: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 00 00 00 00 ffffffffaed1aa80: 00 00 00 00 f9 f9 f9 f9 00 00 00 00 00 00 00 00 >ffffffffaed1ab00: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 f9 f9 ^ ffffffffaed1ab80: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 00 00 00 f9 ffffffffaed1ac00: f9 f9 f9 f9 00 06 f9 f9 f9 f9 f9 f9 00 00 02 f9 This bug is found by FuzzBT, a modified version of Syzkaller. Other contributors for this bug are Ruoyu Wu and Peng Hui. Fixes: d0b1370 ("Bluetooth: hci_sync: Rework init stages") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pikatenor
pushed a commit
to pikatenor/linux
that referenced
this pull request
Apr 11, 2023
[ Upstream commit fa2068d ] The naming of space_info->active_total_bytes is misleading. It counts not only active block groups but also full ones which are previously active but now inactive. That confusion results in a bug not counting the full BGs into active_total_bytes on mount time. For a background, there are three kinds of block groups in terms of activation. 1. Block groups never activated 2. Block groups currently active 3. Block groups previously active and currently inactive (due to fully written or zone finish) What we really wanted to exclude from "total_bytes" is the total size of BGs friendlyarm#1. They seem empty and allocatable but since they are not activated, we cannot rely on them to do the space reservation. And, since BGs friendlyarm#1 never get activated, they should have no "used", "reserved" and "pinned" bytes. OTOH, BGs friendlyarm#3 can be counted in the "total", since they are already full we cannot allocate from them anyway. For them, "total_bytes == used + reserved + pinned + zone_unusable" should hold. Tracking friendlyarm#2 and friendlyarm#3 as "active_total_bytes" (current implementation) is confusing. And, tracking friendlyarm#1 and subtract that properly from "total_bytes" every time you need space reservation is cumbersome. Instead, we can count the whole region of a newly allocated block group as zone_unusable. Then, once that block group is activated, release [0 .. zone_capacity] from the zone_unusable counters. With this, we can eliminate the confusing ->active_total_bytes and the code will be common among regular and the zoned mode. Also, no additional counter is needed with this approach. Fixes: 6a921de ("btrfs: zoned: introduce space_info->active_total_bytes") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: e15acc2 ("btrfs: zoned: drop space_info->active_total_bytes") Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.