missing .dts file for firefly rk3288 reload board #1

Closed
bschiett opened this issue Oct 10, 2016 · 18 comments

@bschiett

I compiled the kernel from branch release-20160920, hoping to use it with my
Firefly RK3288 Reload board (see http://en.t-firefly.com/en/firenow/firefly_rk3288/).

I configured the kernel with rockchip_linux_defconfig and I'm using rk3288-firefly.dts. The kernel boots, but the hardware on my Reload board doesn't seem to work properly. I get graphics over HDMI and can see the Linux desktop, but a mouse does nothing when I plug it in, probably because the Reload board needs a different .dts from the one for the original Firefly board.

I also get the kernel messages below looping endlessly on the serial console, which I suppose is also caused by the .dts not matching my board.

I know that Firefly has a kernel config and a .dts file for the Reload board in their kernel repository. I thought I could copy them over and use them with the above branch of Rockchip's kernel, but I'm not sure how. Is it just a matter of putting their kernel config in ./arch/arm/configs/ and their .dts file in ./arch/arm/boot/dts/, or is there something else I need to do? I would appreciate it if a kernel config and .dts file for the Reload board were added to Rockchip's kernel repo.

[ 690.278519] dhd_txglom_enable: enable 0
[ 690.282359] dhd_conf_set_txglom_params: swtxglom=0, txglom_ext=0
[ 690.288406] dhd_conf_set_txglom_params: txglom_bucket_size=0
[ 690.294061] dhd_conf_set_txglom_params: txglomsize=0, deferred_tx_len=0, bus_txglom=0
[ 690.301906] dhd_conf_set_txglom_params: tx_in_rx=1, tx_max_offset=0
[ 690.308190] dhd_bus_devreset: WLAN OFF DONE
[ 690.313696] wifi_platform_set_power = 0
[ 690.317576] ======== PULL WL_REG_ON LOW! ========
[ 690.322278] [WLAN_RFKILL]: rockchip_wifi_power: 0
[ 690.327626] [WLAN_RFKILL]: wifi shut off power.
[ 690.332155] wl_android_wifi_off out
[ 690.467023] dhd_stop: Exit
[ 690.480667] dhd_set_mac_addr_handler: interface info not available/down
[ 690.590926] dhd_open: Enter eebb7800
[ 690.594506]
[ 690.594506] Dongle Host Driver, version 1.363.59.144.1 (r)
[ 690.601499] wl_android_wifi_on in 1
[ 690.605414] wl_android_wifi_on in 2: g_wifi_on=0
[ 690.610055] wifi_platform_set_power = 1
[ 690.613885] ======== PULL WL_REG_ON HIGH! ========
[ 690.618684] [WLAN_RFKILL]: rockchip_wifi_power: 1
[ 690.623381] [WLAN_RFKILL]: wifi turn on power. -1
[ 690.837146] sdio_reset_comm():
[ 690.843390] dwmmc_rockchip ff0d0000.dwmmc: could not set regulator OCR (-22)
[ 690.850729] dwmmc_rockchip ff0d0000.dwmmc: failed to enable vmmc regulator
[ 690.887384] mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 690.910541] mmc_host mmc1: Bus speed (slot 0) = 200000Hz (slot req 200000Hz, actual 200000HZ div = 0)
[ 690.958322] mmc1: queuing unknown CIS tuple 0x80 (2 bytes)
[ 690.966781] mmc1: queuing unknown CIS tuple 0x80 (3 bytes)
[ 690.975271] mmc1: queuing unknown CIS tuple 0x80 (3 bytes)
[ 690.986141] mmc1: queuing unknown CIS tuple 0x80 (7 bytes)
[ 691.098590] mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 50000000Hz, actual 50000000HZ div = 0)
[ 692.167059] dwmmc_rockchip ff0d0000.dwmmc: Successfully tuned phase to 117
[ 692.174157] sdioh_start: set sd_f2_blocksize 128
[ 692.183550]
[ 692.183550]
[ 692.183550] dhd_bus_devreset: == WLAN ON ==
[ 692.192021] F1 signature read @0x18000000=0x16224335
[ 692.207003] F1 signature OK, socitype:0x1 chip:0x4339 rev:0x1 pkg:0x2
[ 692.214942] DHD: dongle ram size is set to 786432(orig 786432) at 0x180000
[ 692.224124] dhd_os_open_image: /system/etc/firmware/config.txt (204 bytes) open success
[ 692.233653] dhd_conf_read_config: dhd_doflow = 1
[ 692.238549] dhd_conf_read_config: srl = 15
[ 692.242687] dhd_conf_read_config: lrl = 15
[ 692.247548] dhd_conf_set_fw_name_by_chip: firmware_path=/system/etc/firmware/fw_bcm4339a0_ag.bin
[ 692.256352] dhdsdio_download_firmware: set use_rxchain 0
[ 692.261742] dhdsdio_download_firmware: set txglomsize 40
[ 692.267139] sdioh_set_mode: set txglom_mode to copy
[ 692.272035] Final fw_path=/system/etc/firmware/fw_bcm4339a0_ag.bin
[ 692.278286] Final nv_path=/system/etc/firmware/nvram_AP6335.txt
[ 692.284222] Final conf_path=/system/etc/firmware/config.txt
[ 692.297615] dhd_os_open_image: /system/etc/firmware/fw_bcm4339a0_ag.bin (517576 bytes) open success
[ 692.360876] dhd_os_open_image: /system/etc/firmware/nvram_AP6335.txt (2375 bytes) open success
[ 692.369518] NVRAM version: AP6335_NVRAM_V1.2_01092014
[ 692.375557] dhdsdio_write_vars: Download, Upload and compare of NVRAM succeeded.
[ 692.561339] dhd_bus_init: enable 0x06, ready 0x06 (waited 0us)
[ 692.567307] bcmsdh_oob_intr_register: Enter
[ 692.571483] bcmsdh_oob_intr_register: HW_OOB enabled
[ 692.576438] bcmsdh_oob_intr_register OOB irq=223 flags=4
[ 692.582201] bcmsdh_oob_intr_register: enable_irq_wake
[ 692.589367] dhd_conf_set_fw_int_cmd: set WLC_SET_BAND 142 0
[ 692.604561] Firmware up: op_mode=0x0005, MAC=94:a1:a2:4b:4b:46
[ 692.610399] dhd_conf_set_country: set country ALL, revision 0
[ 692.616134] dhd_conf_set_fw_string_struct_cmd: set country
[ 692.622243] CONFIG-ERROR) dhd_conf_set_fw_string_struct_cmd: country setting failed -2
[ 692.630959] Country code: US (US/0)
[ 692.634635] dhd_conf_set_fw_string_cmd: set roam_off 1
[ 692.641241] dhd_conf_set_fw_int_cmd: set WLC_SET_SRL 32 15
[ 692.646994] dhd_conf_set_fw_int_cmd: set WLC_SET_LRL 34 15
[ 692.652914] dhd_conf_set_fw_string_cmd: set txbf 1
[ 692.666123] dhd_preinit_ioctls event_msgs_ext not support or version mismatch -23
[ 692.677594] Firmware version = wl0: Jul 25 2013 00:04:02 version 6.37.32.5 (r414538)
[ 692.685325] Driver: 1.363.59.144.1 (r)
[ 692.685325] Firmware: wl0: Jul 25 2013 00:04:02 version 6.37.32.5 (r414538)
[ 692.697207] dhd_txglom_enable: enable 1
[ 692.701037] dhd_conf_set_txglom_params: swtxglom=0, txglom_ext=0
[ 692.707049] dhd_conf_set_txglom_params: txglom_bucket_size=0
[ 692.712700] dhd_conf_set_txglom_params: txglomsize=40, deferred_tx_len=40, bus_txglom=0
[ 692.720699] dhd_conf_set_txglom_params: tx_in_rx=1, tx_max_offset=0
[ 692.726969] dhd_conf_set_disable_proptx: set disable_proptx 0
[ 692.733889] dhd_wlfc_hostreorder_init(): successful bdcv2 tlv signaling, 64
[ 692.741297] wl_android_wifi_on: Success
[ 692.745595] wl_create_event_handler(): thread:wl_event_handler:6b4 started
[ 692.745601] tsk Enter, tsk = 0xedf415c8
[ 692.825591] dhd_open: Exit ret=0
[ 692.832518] dhd_stop: Enter eebb7800
[ 692.838009] wl_event_handler: was terminated
[ 692.842282] wl_destroy_event_handler(): thread:wl_event_handler:6b4 terminated OK
[ 692.849966] CFGP2P-ERROR) wl_cfgp2p_disable_discovery : do nothing, not initialized
[ 692.857800] CFGP2P-ERROR) wl_cfgp2p_deinit_priv : In
[ 692.863390] CFG80211-ERROR) wl_cfg80211_event : Event handler is not created
[ 692.870864] wl_android_wifi_off in 1
[ 692.874561] wl_android_wifi_off in 2: g_wifi_on=1, on_failure=1
[ 692.881115] dhdsdio_isr : bus is down. we have nothing to do
[ 692.886795] bcmsdh_oob_intr_unregister: Enter
[ 692.891203] ------------[ cut here ]------------
[ 692.895833] WARNING: CPU: 0 PID: 423 at drivers/clk/clk.c:680 clk_core_disable+0x34/0x16c()
[ 692.904173] Modules linked in:
[ 692.907245] CPU: 0 PID: 423 Comm: NetworkManager Tainted: G W 4.4.16 #2
[ 692.914979] Hardware name: Rockchip (Device Tree)
[ 692.919701] from
[ 692.927449] from
[ 692.934676] from
[ 692.942765] from
[ 692.951546] from
[ 692.960070] from
[ 692.967901] from
[ 692.976774] from

[ 692.985731] from
[ 692.993300] from
[ 693.000352] from
[ 693.008789] from

[ 693.018007] from
[ 693.026707] from
[ 693.035578] from
[ 693.043496] from
[ 693.051068] from
[ 693.058899] from
[ 693.066988] from
[ 693.075423] from
[ 693.083338] from
[ 693.090907] from
[ 693.099084] from
[ 693.107347] from
[ 693.115263] from
[ 693.123353] from
[ 693.131616] from
[ 693.139445] from
[ 693.147360] from
[ 693.155188] from
[ 693.162758] from
[ 693.170494] ---[ end trace 94868236ff675595 ]---

@bschiett
Author

I can also add that in the mainline there is no kernel config for rockchip but there is a .dts file for the reload board, see

https://github.com/torvalds/linux/tree/master/arch/arm/configs
https://github.com/torvalds/linux/blob/master/arch/arm/boot/dts/rk3288-firefly-reload.dts

@hizukiayaka
Contributor

hizukiayaka commented Oct 11, 2016

Yes, that is the file I submitted to mainline. For the Firefly RK3288 Reload, I suggest you use the mainline kernel.
You should not use the Firefly release dts for the Firefly Reload, because the hardware connections are a little different.

@bschiett
Author

@hizukiayaka thanks. Yesterday I checked out the mainline stable tag 4.8.1 (the latest) and noticed that your rk3288-firefly-reload.dts is not in it yet, so I downloaded it manually together with its core board .dtsi file, put both in the dts folder and modified the Makefile to compile your .dts file. For the defconfig I used the one from the Firefly kernel tree (the original v3 kernel, not mainline), since I was not sure which defconfig I should use. I can now boot the kernel, but I don't get any graphics output via HDMI. I assume this is because you haven't added HDMI out yet? I haven't installed any modules in the rootfs after compiling the kernel and haven't added any drivers manually. Is it just a matter of adding HDMI support to your .dts file for the Reload board to get basic graphics output working with mainline, or is there more to be done?
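
For readers following along, the manual steps described above amount to roughly the following. This is only a sketch: the core board file name rk3288-firefly-reload-core.dtsi is an assumption based on the mainline naming, and the board's real nodes are omitted.

```dts
/* arch/arm/boot/dts/rk3288-firefly-reload.dts, heavily trimmed sketch */
/dts-v1/;

/* Core board description shared by the carrier board file
 * (file name assumed; check the downloaded copy). */
#include "rk3288-firefly-reload-core.dtsi"

/ {
	/* board-level nodes (regulators, connectors, ...) go here */
};

/*
 * Copying the file into the directory is not enough on its own.
 * The Makefile in the same directory must also list the blob, e.g.:
 *
 *   dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3288-firefly-reload.dtb
 */
```

The defconfig is independent of the .dts: a device tree node only does something if the matching driver is also enabled in the kernel configuration.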

@hizukiayaka
Contributor

Those dts files for the Firefly Reload have been merged into linux-next. However, that dts is designed for mainline, not the Rockchip develop branch; on this branch you need an extra property, "supports-sd;", in the sdmmc node to enable SDMMC support.
You may run into that problem with this branch. Also, I think HDMI should work in mainline, but I have not confirmed it yet; you could give it a try.
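
As an illustration of the property mentioned above, a board file for the Rockchip branch might override the sdmmc node like this. It is a sketch that assumes the node carries the `sdmmc` label; any other properties the node needs are left out.

```dts
/* Rockchip develop branch only: "supports-sd" is the vendor-kernel
 * property named in the comment above, not a mainline binding. */
&sdmmc {
	supports-sd;
	status = "okay";
};
```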

@bschiett
Author

Besides, I think the HDMI would work in mainline, but I have not confirmed it yet, you may have a try.

Thanks, I tried enabling it by setting the status of the hdmi and i2c5 nodes to okay in your reload .dts file, using the 4.8.1 stable tag of mainline, but that was not enough. Is there anything else I should check?
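
One thing that may be worth checking here: in mainline the HDMI encoder also needs a display controller (VOP) and its IOMMU enabled, and the kernel config needs the Rockchip DRM drivers (CONFIG_DRM_ROCKCHIP and CONFIG_ROCKCHIP_DW_HDMI), which a defconfig taken from a v3 vendor tree is unlikely to select. A hedged sketch, assuming the Reload board routes DDC to i2c5 as described above:

```dts
/* Sketch only; labels as used in mainline rk3288.dtsi. */
&i2c5 {
	status = "okay";
};

&hdmi {
	ddc-i2c-bus = <&i2c5>;	/* assumption: DDC is wired to i2c5 */
	status = "okay";
};

/* The encoder needs a VOP feeding it, plus that VOP's IOMMU. */
&vopb {
	status = "okay";
};

&vopb_mmu {
	status = "okay";
};
```

Whether this is sufficient on the Reload board is not confirmed anywhere in this thread.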

@bschiett
Author

In the rockchip kernel repo, "supports-sd;" is already in the sdmmc node,
in the file rk3288-firefly.dtsi. It's just that as soon as I enable the
lvds node and try to boot from SD, the kernel can't find the SD card
anymore so it can't find the root FS. Of course, this is for the classic
firefly board and not my reload board, maybe on the reload board the pins
used for lvds are the ones for the sd card so the moment I turn on the lvds
I disable the sd automatically?

In the mainline, I had no problem booting using your reload .dts but there
I haven't switched on lvds yet because some driver code seems to be missing
and I'm not sure what I need to import.


@wzyy2
Contributor

wzyy2 commented Oct 12, 2016

hi
try this patch
https://patchwork.kernel.org/patch/9348571/

@wzyy2
Contributor

wzyy2 commented Oct 12, 2016

The endless looping is caused by a NetworkManager bug.
I have fixed it in this commit:
https://github.com/rockchip-linux/rk-rootfs-build/commit/9d54eeca9470143cea79615927f5c7c9edae0cae

@bschiett
Author

I tried this patch but nothing changed. For some reason, now that I've
enabled hdmi and lvds in my .dts file, the SD card is no longer found by the
kernel. I now have:

[ 2.638645] mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 2.658531] dwmmc_rockchip ff0d0000.dwmmc: 1 slots initialized
[ 2.665639] dwmmc_rockchip ff0f0000.dwmmc: IDMAC supports 32-bit address mode.
[ 2.674916] dwmmc_rockchip ff0f0000.dwmmc: Using internal DMA controller.
[ 2.681716] dwmmc_rockchip ff0f0000.dwmmc: Version ID is 270a
[ 2.687495] dwmmc_rockchip ff0f0000.dwmmc: DW MMC controller at irq 31,32 bit host data width,256 deep fifo
[ 2.708513] mmc_host mmc2: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 2.729527] dwmmc_rockchip ff0f0000.dwmmc: 1 slots initialized
[ 2.737451] sdhci-pltfm: SDHCI platform and OF driver helper

previously it was this:

[ 2.198200] mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 2.218083] dwmmc_rockchip ff0d0000.dwmmc: 1 slots initialized
[ 2.218333] dwmmc_rockchip ff0f0000.dwmmc: IDMAC supports 32-bit address mode.
[ 2.218342] dwmmc_rockchip ff0f0000.dwmmc: Using internal DMA controller.
[ 2.218348] dwmmc_rockchip ff0f0000.dwmmc: Version ID is 270a
[ 2.218367] dwmmc_rockchip ff0f0000.dwmmc: DW MMC controller at irq 31,32 bit host data width,256 deep fifo
[ 2.248080] mmc_host mmc2: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 2.259091] mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 50000000Hz, actual 50000000HZ div = 0)
[ 2.259126] mmc0: new high speed SDHC card at address aaaa
[ 2.259347] mmcblk0: mmc0:aaaa SL08G 7.40 GiB
[ 2.264320] mmcblk0: p1 p2 p3 p4 p5 p6 p7
[ 2.268097] dwmmc_rockchip ff0f0000.dwmmc: 1 slots initialized
[ 2.268198] sdhci-pltfm: SDHCI platform and OF driver helper


@wzyy2
Contributor

wzyy2 commented Oct 13, 2016

LVDS shouldn't affect sd-card.
Could you show me your LVDS dts?

@wzyy2
Contributor

wzyy2 commented Oct 14, 2016

The DTS you sent looks fine.
Try adding rootwait to the kernel command line (bootargs).
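
If the command line comes from the device tree rather than the bootloader, rootwait can be added in the chosen node. A minimal sketch; the console and root device below are placeholders, not values confirmed in this thread:

```dts
/ {
	chosen {
		/* Only "rootwait" is the point here: it makes the kernel wait
		 * for the root device to appear instead of failing at once.
		 * console= and root= are hypothetical examples. */
		bootargs = "console=ttyS2,115200 root=/dev/mmcblk0p7 rootwait";
	};
};
```

Note that many bootloaders overwrite /chosen/bootargs at boot, in which case the option has to be added to the bootloader's own bootargs instead.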

@bschiett
Author

I think the problem with the SD card not being found might have to do with
the regulators for vcc_sd and vccio_sd, I read that vccio_sd has to be able
to switch between 1.8v and 3.3v depending on the type of SD card (high
speed)? Maybe the .dts is missing some definitions to make this possible?


@wzyy2
Contributor

wzyy2 commented Oct 15, 2016

The Firefly doesn't support UHS; the board has a hardware bug there.
You should remove any sd-uhs-* properties from the SD card dts node so that vccio_sd does not switch between 1.8v and 3.3v.
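
If the sd-uhs-* properties come from an included .dtsi rather than the board file itself, they can be dropped with an override instead of editing the include. A sketch, assuming the SD card controller node is labelled `sdmmc`:

```dts
/* Remove every UHS mode so the core never asks for 1.8 V signalling.
 * Deleting a property that is not present is harmless. */
&sdmmc {
	/delete-property/ sd-uhs-sdr12;
	/delete-property/ sd-uhs-sdr25;
	/delete-property/ sd-uhs-sdr50;
	/delete-property/ sd-uhs-sdr104;
	/delete-property/ sd-uhs-ddr50;
};
```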

@bschiett
Author

Thanks, I will try it.


@bschiett
Author

I restored the sdmmc settings and I still have the same problem: the kernel
doesn't find the SD card and thus can't mount the root fs. For some reason
turning on lvds stops the SD card from working, and I don't understand why.
I'll try again with Randy's original reload .dts without my changes to see
if it can boot from SD (it did when I first tried it).


@wzyy2
Contributor

wzyy2 commented Oct 15, 2016

LVDS and the SD card work well together on our other boards.
I think there might be a hardware bug... =_=

@wzyy2
Contributor

wzyy2 commented Oct 15, 2016

I think you can ask firefly folks.

@bschiett
Author

You are probably correct. It seems that pulling current from vcc_io on the Firefly board is tricky, since the SD card power comes from that supply. So, for example, enabling lcd_en immediately to connect vcc_lcd to the SoC without any kind of startup delay is not a great idea. The same goes for enabling the GPU: for some reason, if no ramp is set when it gets enabled, the SD card disappears. So the ramp property should be set on vdd_gpu for the reload board, just as in the original Firefly board's dts.
Thanks for helping, I appreciate it.
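
For anyone hitting the same problem, the ramp setting described above could look roughly like this. It is a sketch that assumes the GPU supply is exposed under a `vdd_gpu` label in the board's regulator definitions; the value is a placeholder and should be taken from the original Firefly dts or the regulator's datasheet.

```dts
&vdd_gpu {
	/* Limit how fast the output voltage changes, in uV/us.
	 * 6000 is a placeholder, not a value confirmed in this thread. */
	regulator-ramp-delay = <6000>;
};
```

For a fixed regulator switched by a GPIO such as lcd_en, the generic `startup-delay-us` property of the regulator-fixed binding serves a similar purpose.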


@wzyy2 wzyy2 closed this as completed Oct 17, 2016
wzyy2 pushed a commit that referenced this issue Nov 3, 2016
The xhci hcd uses the port polling timer (rh_timer) to poll the
roothub for port events. But we can't allow the USB core to poll
the port events during shutdown because the xhci controller may
be disabled and can't be accessed in shutdown. If we access xhci
port registers with the port polling timer after xhci shutdown, it
may cause a kernel crash or unexpected behavior. So we need to stop
the port polling timer while shutting down xhci.

I can easily reproduce the issue on the rk3399 platform: plug in a
Type-C hub and a USB ethernet dongle, then do a reboot test,
which results in a crash with the following backtrace.

Unhandled fault: synchronous external abort (0x96000010) at 0xffffff80002f0430
Internal error: : 96000010 [#1] PREEMPT SMP
task: ffffffc001092cb0 ti: ffffffc00107c000 task.ti: ffffffc00107c000
PC is at xhci_hub_status_data+0xec/0x1e4
LR is at xhci_hub_status_data+0xb0/0x1e4
[<ffffffc00067a210>] xhci_hub_status_data+0xec/0x1e4
[<ffffffc00063ebf8>] usb_hcd_poll_rh_status+0x54/0x148
[<ffffffc00063ed0c>] rh_timer_func+0x20/0x2c
[<ffffffc0002845cc>] call_timer_fn+0xa4/0x1c8
[<ffffffc000284964>] run_timer_softirq+0x248/0x2cc
[<ffffffc000200928>] __do_softirq+0x178/0x338
[<ffffffc000224b08>] irq_exit+0x78/0xc0
[<ffffffc00027089c>] __handle_domain_irq+0x9c/0xbc
[<ffffffc0002006f4>] gic_handle_irq+0xcc/0x188

BUG=chrome-os-partner:59111
TEST=Plug in a Type-C hub, then do reboot test, check if
kernel crash during shutdown.

Change-Id: I3ca3d12d101241cd78138ea5d995708a2893d1a0
Signed-off-by: William wu <wulf@rock-chips.com>
Reviewed-on: https://chromium-review.googlesource.com/401121
Commit-Ready: Guenter Roeck <groeck@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
commit 368301f upstream.

With this command sequence:

  modprobe plip
  modprobe pps_parport
  rmmod pps_parport

the pps_parport module causes this crash:

  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: parport_detach+0x1d/0x60 [pps_parport]
  Oops: 0000 [#1] SMP
  ...
  Call Trace:
    parport_unregister_driver+0x65/0xc0 [parport]
    SyS_delete_module+0x187/0x210

The sequence that builds up to this is:

 1) plip is loaded and takes the parport device for exclusive use:

    plip0: Parallel port at 0x378, using IRQ 7.

 2) pps_parport then fails to grab the device:

    pps_parport: parallel port PPS client
    parport0: cannot grant exclusive access for device pps_parport
    pps_parport: couldn't register with parport0

 3) rmmod of pps_parport is then killed because it tries to access
    pardev->name, but pardev (taken from port->cad) is NULL.

So add a check for NULL in the test there too.

Link: http://lkml.kernel.org/r/20160714115245.12651-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
[ Upstream commit d3e6952 ]

I ran into this:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 2 PID: 2012 Comm: trinity-c3 Not tainted 4.7.0-rc7+ #19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    task: ffff8800b745f2c0 ti: ffff880111740000 task.ti: ffff880111740000
    RIP: 0010:[<ffffffff82bbf066>]  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
    RSP: 0018:ffff880111747bb8  EFLAGS: 00010286
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000069dd8358
    RDX: 0000000000000009 RSI: 0000000000000027 RDI: 0000000000000048
    RBP: ffff880111747c00 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000069dd8358 R11: 1ffffffff0759723 R12: 0000000000000000
    R13: ffff88011a7e4780 R14: 0000000000000027 R15: 0000000000000000
    FS:  00007fc738404700(0000) GS:ffff88011af00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fc737fdfb10 CR3: 0000000118087000 CR4: 00000000000006e0
    Stack:
     0000000000000200 ffff880111747bd8 ffffffff810ee611 ffff880119f1f220
     ffff880119f1f4f8 ffff880119f1f4f0 ffff88011a7e4780 ffff880119f1f232
     ffff880119f1f220 ffff880111747d58 ffffffff82bca542 0000000000000000
    Call Trace:
     [<ffffffff82bca542>] irda_connect+0x562/0x1190
     [<ffffffff825ae582>] SYSC_connect+0x202/0x2a0
     [<ffffffff825b4489>] SyS_connect+0x9/0x10
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 41 89 ca 48 89 e5 41 57 41 56 41 55 41 54 41 89 d7 53 48 89 fb 48 83 c7 48 48 89 fa 41 89 f6 48 c1 ea 03 48 83 ec 20 4c 8b 65 10 <0f> b6 04 02 84 c0 74 08 84 c0 0f 8e 4c 04 00 00 80 7b 48 00 74
    RIP  [<ffffffff82bbf066>] irttp_connect_request+0x36/0x710
     RSP <ffff880111747bb8>
    ---[ end trace 4cda2588bc055b30 ]---

The problem is that irda_open_tsap() can fail and leave self->tsap = NULL,
and then irttp_connect_request() almost immediately dereferences it.

Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
commit ed1f0ee upstream.

Add device ID 0x0a04 for Haswell-ULT to the list of devices with MCH
problems.

From a Lenovo ThinkPad T440S:
[    0.188604] pnp: PnP ACPI init
[    0.189044] system 00:00: [mem 0x00000000-0x0009ffff] could not be reserved
[    0.189048] system 00:00: [mem 0x000c0000-0x000c3fff] could not be reserved
[    0.189050] system 00:00: [mem 0x000c4000-0x000c7fff] could not be reserved
[    0.189052] system 00:00: [mem 0x000c8000-0x000cbfff] could not be reserved
[    0.189054] system 00:00: [mem 0x000cc000-0x000cffff] could not be reserved
[    0.189056] system 00:00: [mem 0x000d0000-0x000d3fff] has been reserved
[    0.189058] system 00:00: [mem 0x000d4000-0x000d7fff] has been reserved
[    0.189060] system 00:00: [mem 0x000d8000-0x000dbfff] has been reserved
[    0.189061] system 00:00: [mem 0x000dc000-0x000dffff] has been reserved
[    0.189063] system 00:00: [mem 0x000e0000-0x000e3fff] could not be reserved
[    0.189065] system 00:00: [mem 0x000e4000-0x000e7fff] could not be reserved
[    0.189067] system 00:00: [mem 0x000e8000-0x000ebfff] could not be reserved
[    0.189069] system 00:00: [mem 0x000ec000-0x000effff] could not be reserved
[    0.189071] system 00:00: [mem 0x000f0000-0x000fffff] could not be reserved
[    0.189073] system 00:00: [mem 0x00100000-0xdf9fffff] could not be reserved
[    0.189075] system 00:00: [mem 0xfec00000-0xfed3ffff] could not be reserved
[    0.189078] system 00:00: [mem 0xfed4c000-0xffffffff] could not be reserved
[    0.189082] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[    0.189216] system 00:01: [io  0x1800-0x189f] could not be reserved
[    0.189220] system 00:01: [io  0x0800-0x087f] has been reserved
[    0.189222] system 00:01: [io  0x0880-0x08ff] has been reserved
[    0.189224] system 00:01: [io  0x0900-0x097f] has been reserved
[    0.189226] system 00:01: [io  0x0980-0x09ff] has been reserved
[    0.189229] system 00:01: [io  0x0a00-0x0a7f] has been reserved
[    0.189231] system 00:01: [io  0x0a80-0x0aff] has been reserved
[    0.189233] system 00:01: [io  0x0b00-0x0b7f] has been reserved
[    0.189235] system 00:01: [io  0x0b80-0x0bff] has been reserved
[    0.189238] system 00:01: [io  0x15e0-0x15ef] has been reserved
[    0.189240] system 00:01: [io  0x1600-0x167f] has been reserved
[    0.189242] system 00:01: [io  0x1640-0x165f] has been reserved
[    0.189246] system 00:01: [mem 0xf8000000-0xfbffffff] could not be reserved
[    0.189249] system 00:01: [mem 0x00000000-0x00000fff] could not be reserved
[    0.189251] system 00:01: [mem 0xfed1c000-0xfed1ffff] has been reserved
[    0.189254] system 00:01: [mem 0xfed10000-0xfed13fff] has been reserved
[    0.189256] system 00:01: [mem 0xfed18000-0xfed18fff] has been reserved
[    0.189258] system 00:01: [mem 0xfed19000-0xfed19fff] has been reserved
[    0.189261] system 00:01: [mem 0xfed45000-0xfed4bfff] has been reserved
[    0.189264] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
[....]
[    0.583653] resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
[    0.583654] ------------[ cut here ]------------
[    0.583660] WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2c5/0x380()
[    0.583661] Info: mapping multiple BARs. Your kernel is fine.
[    0.583662] Modules linked in:

[    0.583666] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.3.3-303.fc23.x86_64 #1
[    0.583668] Hardware name: LENOVO 20AR001GXS/20AR001GXS, BIOS GJET86WW (2.36 ) 12/04/2015
[    0.583670]  0000000000000000 0000000014cf7e59 ffff880214a1baf8 ffffffff813a625f
[    0.583673]  ffff880214a1bb40 ffff880214a1bb30 ffffffff810a07c2 00000000fed10000
[    0.583675]  ffffc90000cb8000 0000000000006000 0000000000000000 ffff8800d6381040
[    0.583678] Call Trace:
[    0.583683]  [<ffffffff813a625f>] dump_stack+0x44/0x55
[    0.583686]  [<ffffffff810a07c2>] warn_slowpath_common+0x82/0xc0
[    0.583688]  [<ffffffff810a085c>] warn_slowpath_fmt+0x5c/0x80
[    0.583692]  [<ffffffff810a6fba>] ? iomem_map_sanity_check+0xba/0xd0
[    0.583695]  [<ffffffff81065835>] __ioremap_caller+0x2c5/0x380
[    0.583698]  [<ffffffff81065907>] ioremap_nocache+0x17/0x20
[    0.583701]  [<ffffffff8103a119>] snb_uncore_imc_init_box+0x79/0xb0
[    0.583705]  [<ffffffff81038900>] uncore_pci_probe+0xd0/0x1b0
[    0.583707]  [<ffffffff813efda5>] local_pci_probe+0x45/0xa0
[    0.583710]  [<ffffffff813f118d>] pci_device_probe+0xfd/0x140
[    0.583713]  [<ffffffff814d9b52>] driver_probe_device+0x222/0x480
[    0.583715]  [<ffffffff814d9e34>] __driver_attach+0x84/0x90
[    0.583717]  [<ffffffff814d9db0>] ? driver_probe_device+0x480/0x480
[    0.583720]  [<ffffffff814d762c>] bus_for_each_dev+0x6c/0xc0
[    0.583722]  [<ffffffff814d930e>] driver_attach+0x1e/0x20
[    0.583724]  [<ffffffff814d8e4b>] bus_add_driver+0x1eb/0x280
[    0.583727]  [<ffffffff81d6af1a>] ? uncore_cpu_setup+0x12/0x12
[    0.583729]  [<ffffffff814da680>] driver_register+0x60/0xe0
[    0.583733]  [<ffffffff813ef78c>] __pci_register_driver+0x4c/0x50
[    0.583736]  [<ffffffff81d6affc>] intel_uncore_init+0xe2/0x2e6
[    0.583738]  [<ffffffff81d6af1a>] ? uncore_cpu_setup+0x12/0x12
[    0.583741]  [<ffffffff81002123>] do_one_initcall+0xb3/0x200
[    0.583745]  [<ffffffff810be500>] ? parse_args+0x1a0/0x4a0
[    0.583749]  [<ffffffff81d5c1c8>] kernel_init_freeable+0x189/0x223
[    0.583752]  [<ffffffff81775c40>] ? rest_init+0x80/0x80
[    0.583754]  [<ffffffff81775c4e>] kernel_init+0xe/0xe0
[    0.583758]  [<ffffffff81781adf>] ret_from_fork+0x3f/0x70
[    0.583760]  [<ffffffff81775c40>] ? rest_init+0x80/0x80
[    0.583765] ---[ end trace 077c426a39e018aa ]---

00:00.0 Host bridge [0600]: Intel Corporation Haswell-ULT DRAM Controller [8086:0a04] (rev 0b)
	Subsystem: Lenovo Device [17aa:220c]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: <access denied>
	Kernel driver in use: hsw_uncore

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1300955
Tested-by: <robo@tcp.sk>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
commit a77060f upstream.

Add device ID 0x1604 for Broadwell to commit cb171f7 ("PNP:
Work around BIOS defects in Intel MCH area reporting").

From a Lenovo ThinkPad T550:

  system 00:01: [io  0x1800-0x189f] could not be reserved
  system 00:01: [io  0x0800-0x087f] has been reserved
  system 00:01: [io  0x0880-0x08ff] has been reserved
  system 00:01: [io  0x0900-0x097f] has been reserved
  system 00:01: [io  0x0980-0x09ff] has been reserved
  system 00:01: [io  0x0a00-0x0a7f] has been reserved
  system 00:01: [io  0x0a80-0x0aff] has been reserved
  system 00:01: [io  0x0b00-0x0b7f] has been reserved
  system 00:01: [io  0x0b80-0x0bff] has been reserved
  system 00:01: [io  0x15e0-0x15ef] has been reserved
  system 00:01: [io  0x1600-0x167f] has been reserved
  system 00:01: [io  0x1640-0x165f] has been reserved
  system 00:01: [mem 0xf8000000-0xfbffffff] could not be reserved
  system 00:01: [mem 0xfed1c000-0xfed1ffff] has been reserved
  system 00:01: [mem 0xfed10000-0xfed13fff] has been reserved
  system 00:01: [mem 0xfed18000-0xfed18fff] has been reserved
  system 00:01: [mem 0xfed19000-0xfed19fff] has been reserved
  system 00:01: [mem 0xfed45000-0xfed4bfff] has been reserved
  system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
  [...]
  resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 1 at /build/linux-CrHvZ_/linux-4.2.6/arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2ee/0x360()
  Info: mapping multiple BARs. Your kernel is fine.
  Modules linked in:
  CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.2.0-1-amd64 #1 Debian 4.2.6-1
  Hardware name: LENOVO 20CKCTO1WW/20CKCTO1WW, BIOS N11ET34W (1.10 ) 08/20/2015
   0000000000000000 ffffffff817e6868 ffffffff8154e2f6 ffff8802241efbf8
   ffffffff8106e5b1 ffffc90000e98000 0000000000006000 ffffc90000e98000
   0000000000006000 0000000000000000 ffffffff8106e62a ffffffff817e68c8
  Call Trace:
   [<ffffffff8154e2f6>] ? dump_stack+0x40/0x50
   [<ffffffff8106e5b1>] ? warn_slowpath_common+0x81/0xb0
   [<ffffffff8106e62a>] ? warn_slowpath_fmt+0x4a/0x50
   [<ffffffff810742a3>] ? iomem_map_sanity_check+0xb3/0xc0
   [<ffffffff8105dade>] ? __ioremap_caller+0x2ee/0x360
   [<ffffffff81036ae6>] ? snb_uncore_imc_init_box+0x66/0x90
   [<ffffffff810351a8>] ? uncore_pci_probe+0xc8/0x1a0
   [<ffffffff81302d7f>] ? local_pci_probe+0x3f/0xa0
   [<ffffffff81303ea4>] ? pci_device_probe+0xc4/0x110
   [<ffffffff813d9b1e>] ? driver_probe_device+0x1ee/0x450
   [<ffffffff813d9dfb>] ? __driver_attach+0x7b/0x80
   [<ffffffff813d9d80>] ? driver_probe_device+0x450/0x450
   [<ffffffff813d796a>] ? bus_for_each_dev+0x5a/0x90
   [<ffffffff813d9091>] ? bus_add_driver+0x1f1/0x290
   [<ffffffff81b37fa8>] ? uncore_cpu_setup+0xc/0xc
   [<ffffffff813da73f>] ? driver_register+0x5f/0xe0
   [<ffffffff81b38074>] ? intel_uncore_init+0xcc/0x2b0
   [<ffffffff81b37fa8>] ? uncore_cpu_setup+0xc/0xc
   [<ffffffff8100213e>] ? do_one_initcall+0xce/0x200
   [<ffffffff8108a100>] ? parse_args+0x140/0x4e0
   [<ffffffff81b2b0cb>] ? kernel_init_freeable+0x162/0x1e8
   [<ffffffff815443f0>] ? rest_init+0x80/0x80
   [<ffffffff815443fe>] ? kernel_init+0xe/0xf0
   [<ffffffff81553e5f>] ? ret_from_fork+0x3f/0x70
   [<ffffffff815443f0>] ? rest_init+0x80/0x80
  ---[ end trace 472e7959536abf12 ]---

  00:00.0 Host bridge: Intel Corporation Broadwell-U Host Bridge -OPI (rev 09)
          Subsystem: Lenovo Device 2223
          Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
          Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
          Latency: 0
          Capabilities: [e0] Vendor Specific Information: Len=0c <?>
          Kernel driver in use: bdw_uncore
  00: 86 80 04 16 06 00 90 20 09 00 00 06 00 00 00 00
  10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 23 22
  30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00

Signed-off-by: Christophe Le Roy <christophe.fish@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
commit edfe63e upstream.

A Xorg failure on qemu32 was reported as a regression [1] caused by
commit 9cd25aa ("x86/mm/pat: Emulate PAT when it is disabled").

This patch fixes the Xorg crash.

Negative effects of this regression were the following two failures [2]
in Xorg on QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were
triggered by the fact that its virtual CPU does not support MTRRs.

 #1. copy_process() failed in the check in reserve_pfn_range()

    copy_process
     copy_mm
      dup_mm
       dup_mmap
        copy_page_range
         track_pfn_copy
          reserve_pfn_range

 A WC map request was tracked as WC in memtype, which set a PTE as
 UC (pgprot) per __cachemode2pte_tbl[].  This led to this error in
 reserve_pfn_range() called from track_pfn_copy(), which obtained
 a pgprot from a PTE.  It converts pgprot to page_cache_mode, which
 does not necessarily result in the original page_cache_mode since
 __cachemode2pte_tbl[] redirects multiple types to UC.

 #2. error path in copy_process() then hit WARN_ON_ONCE in
     untrack_pfn().

     x86/PAT: Xorg:509 map pfn expected mapping type uncached-
     minus for [mem 0xfd000000-0xfdffffff], got write-combining
      Call Trace:
     dump_stack
     warn_slowpath_common
     ? untrack_pfn
     ? untrack_pfn
     warn_slowpath_null
     untrack_pfn
     ? __kunmap_atomic
     unmap_single_vma
     ? pagevec_move_tail_fn
     unmap_vmas
     exit_mmap
     mmput
     copy_process.part.47
     _do_fork
     SyS_clone
     do_syscall_32_irqs_on
     entry_INT80_32

These negative effects are caused by two separate bugs, but they
can be addressed in separate patches.  Fixing the pat_init() issue
described below addresses the root cause, and keeps Xorg from hitting
these cases.

When the CPU does not support MTRRs, MTRR does not call pat_init(),
which leaves PAT enabled without initializing PAT.  This pat_init()
issue is a long-standing issue, but manifested as issue #1 (and then
hit issue #2) with the above-mentioned commit because the memtype
now tracks cache attribute with 'page_cache_mode'.

This pat_init() issue existed before the commit, but we used pgprot
in memtype.  Hence, we did not have issue #1 before.  But a WC request
resulted in WT in effect because WC pgprot is actually WT when PAT
is not initialized.  This is not how it was designed to work.  When
PAT is properly disabled, WC is converted to UC.  The use of
WT can result in a system crash if the target range does not support
WT.  Fortunately, nobody ran into such an issue before.

To fix this pat_init() issue, PAT code has been enhanced to provide
a pat_disable() interface.  Call this interface when MTRRs are disabled.
By properly disabling PAT, PAT bypasses the memtype check,
and avoids issue #1.
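
The lossy conversion behind issue #1 can be sketched in plain C. This is a toy model only: the enums, tables, and the `memtype_check_ok()` helper are hypothetical stand-ins, not the kernel's `__cachemode2pte_tbl[]` or memtype code, and the `pat_enabled` flag merely illustrates the bypass that a properly disabled PAT provides.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: hypothetical names, not kernel definitions. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_COUNT };
enum pte_attr { PTE_WB, PTE_UC_MINUS, PTE_UC, PTE_COUNT };

/* Forward table: WC has no encoding of its own here and is redirected to UC. */
static const enum pte_attr mode_to_pte[CM_COUNT] = {
    [CM_WB]       = PTE_WB,
    [CM_WC]       = PTE_UC,
    [CM_UC_MINUS] = PTE_UC_MINUS,
    [CM_UC]       = PTE_UC,
};

/* Reverse table: each PTE attribute maps back to exactly one mode. */
static const enum cache_mode pte_to_mode[PTE_COUNT] = {
    [PTE_WB]       = CM_WB,
    [PTE_UC_MINUS] = CM_UC_MINUS,
    [PTE_UC]       = CM_UC,
};

/* Convert a requested mode to a PTE attribute and back again. */
static enum cache_mode round_trip(enum cache_mode requested)
{
    assert(requested < CM_COUNT);
    return pte_to_mode[mode_to_pte[requested]];
}

/* Compare the tracked mode with the one recovered from the PTE;
 * the comparison is skipped entirely when PAT is disabled. */
static bool memtype_check_ok(enum cache_mode tracked, bool pat_enabled)
{
    if (!pat_enabled)
        return true;
    return round_trip(tracked) == tracked;
}
```

In this model `round_trip(CM_WC)` yields `CM_UC`, so a mapping tracked as WC no longer matches what is read back from the PTE, while `memtype_check_ok(CM_WC, false)` passes because the check is bypassed.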

  [1]: https://lkml.org/lkml/2016/3/3/828
  [2]: https://lkml.org/lkml/2016/3/4/775

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: elliott@hpe.com
Cc: konrad.wilk@oracle.com
Cc: paul.gortmaker@windriver.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1458769323-24491-5-git-send-email-toshi.kani@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
1. "dm: optimize use SRCU and RCU" removes the use of dm_table_put.
2. "dm: remove request-based logic from make_request_fn wrapper" necessitates
    calling dm_setup_md_queue or else the request_queue's make_request_fn
    pointer ends up being unset.

[    7.711600] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[    7.717519] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.15-02273-gb057d16-dirty #33
[    7.726559] Hardware name: HiKey Development Board (DT)
[    7.731779] task: ffffffc005f8acc0 ti: ffffffc005f8c000 task.ti: ffffffc005f8c000
[    7.739257] PC is at 0x0
[    7.741787] LR is at generic_make_request+0x8c/0x108
....
[    9.082931] Call trace:
[    9.085372] [<          (null)>]           (null)
[    9.090074] [<ffffffc0003f4ac0>] submit_bio+0x98/0x1e0
[    9.095212] [<ffffffc0001e2618>] _submit_bh+0x120/0x1f0
[    9.096165] cfg80211: Calling CRDA to update world regulatory domain
[    9.106781] [<ffffffc0001e5450>] __bread_gfp+0x94/0x114
[    9.112004] [<ffffffc00024a748>] ext4_fill_super+0x18c/0x2d64
[    9.117750] [<ffffffc0001b275c>] mount_bdev+0x194/0x1c0
[    9.122973] [<ffffffc0002450dc>] ext4_mount+0x14/0x1c
[    9.128021] [<ffffffc0001b29a0>] mount_fs+0x3c/0x194
[    9.132985] [<ffffffc0001d059c>] vfs_kern_mount+0x4c/0x134
[    9.138467] [<ffffffc0001d2168>] do_mount+0x204/0xbbc
[    9.143514] [<ffffffc0001d2ec4>] SyS_mount+0x94/0xe8
[    9.148479] [<ffffffc000c54074>] mount_block_root+0x120/0x24c
[    9.154222] [<ffffffc000c543e8>] mount_root+0x110/0x12c
[    9.159443] [<ffffffc000c54574>] prepare_namespace+0x170/0x1b8
[    9.165273] [<ffffffc000c53d98>] kernel_init_freeable+0x23c/0x260
[    9.171365] [<ffffffc0009b1748>] kernel_init+0x10/0x118
[    9.176589] Code: bad PC value
[    9.179807] ---[ end trace 75e1bc52ba364d13 ]---

Bug: 27175947

Signed-off-by: Badhri Jagan Sridharan <Badhri@google.com>
Change-Id: I952d86fd1475f0825f9be1386e3497b36127abd0
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
There may be a race condition if f_fs calls unregister_gadget_item in
ffs_closed() when unregister_gadget is called by UDC store at the same time.
This leads to a kernel NULL pointer dereference:

[  310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  310.645053] init: Service 'adbd' is being killed...
[  310.658938] pgd = c9528000
[  310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000
[  310.669702] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[  310.675211] Modules linked in:
[  310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c #2
[  310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000
[  310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0
[  310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c
[  310.708648] pc : [<c075efc0>]    lr : [<c0bfb0bc>]    psr: 600f0113
<snip..>
[  311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34)
[  311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c)
[  311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0)
[  311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0)
[  311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14)
[  311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0)
[  311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8)
[  311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4)
[  311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20)

For functions using FunctionFS, e.g. Android adbd, the daemon closes
/dev/usb-ffs/adb/ep0 when its USB I/O thread fails, but switching adb from on
to off also triggers a write of "none" to UDC. These two operations both call
unregister_gadget, which leads to the panic above.

Add a mutex before calling unregister_gadget for the API used in f_fs.
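
A minimal userspace sketch of that idea, assuming a pthread mutex in place of the kernel mutex: the bound/unbound check and the teardown happen under one lock, so two callers cannot both tear the gadget down. The struct layout and the names `gadget_item_init()` and `unregister_gadget_item_locked()` are hypothetical and do not reproduce the configfs code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the gadget state shared by both paths. */
struct gadget_item {
    pthread_mutex_t lock;
    const char *udc_name;   /* NULL once the gadget is unbound */
    int teardown_calls;     /* counts how often teardown really ran */
};

static void gadget_item_init(struct gadget_item *gi, const char *udc_name)
{
    pthread_mutex_init(&gi->lock, NULL);
    gi->udc_name = udc_name;
    gi->teardown_calls = 0;
}

/* Returns 0 if this call performed the teardown, -1 if the gadget
 * was already unbound by another caller. */
static int unregister_gadget_item_locked(struct gadget_item *gi)
{
    int ret = -1;

    assert(gi != NULL);
    pthread_mutex_lock(&gi->lock);
    if (gi->udc_name != NULL) {
        gi->teardown_calls++;   /* stands in for the real unregister work */
        gi->udc_name = NULL;
        ret = 0;
    }
    pthread_mutex_unlock(&gi->lock);
    return ret;
}
```

Whichever path arrives second (the ep0 release or the UDC write) sees `udc_name == NULL` and returns without touching the already released state.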

Signed-off-by: Winter Wang <wente.wang@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
commit bd975d1 upstream.

The secmech hmac(md5) structures are present in the TCP_Server_Info
struct and can be shared among multiple CIFS sessions.  However, the
server mutex is not currently held when these structures are allocated
and used, which can lead to kernel crashes, as in the scenario below:

mount.cifs(8) #1				mount.cifs(8) #2

Is secmech.sdeschmaccmd5 allocated?
// false

						Is secmech.sdeschmaccmd5 allocated?
						// false

secmech.hmacmd = crypto_alloc_shash..
secmech.sdeschmaccmd5 = kzalloc..
sdeschmaccmd5->shash.tfm = &secmech.hmacmd;

						secmech.sdeschmaccmd5 = kzalloc
						// sdeschmaccmd5->shash.tfm
						// not yet assigned

crypto_shash_update()
 deref NULL sdeschmaccmd5->shash.tfm

 Unable to handle kernel paging request at virtual address 00000030
 epc   : 8027ba34 crypto_shash_update+0x38/0x158
 ra    : 8020f2e8 setup_ntlmv2_rsp+0x4bc/0xa84
 Call Trace:
  crypto_shash_update+0x38/0x158
  setup_ntlmv2_rsp+0x4bc/0xa84
  build_ntlmssp_auth_blob+0xbc/0x34c
  sess_auth_rawntlmssp_authenticate+0xac/0x248
  CIFS_SessSetup+0xf0/0x178
  cifs_setup_session+0x4c/0x84
  cifs_get_smb_ses+0x2c8/0x314
  cifs_mount+0x38c/0x76c
  cifs_do_mount+0x98/0x440
  mount_fs+0x20/0xc0
  vfs_kern_mount+0x58/0x138
  do_mount+0x1e8/0xccc
  SyS_mount+0x88/0xd4
  syscall_common+0x30/0x54

Fix this by locking the srv_mutex around the code which uses these
hmac(md5) structures.  All the other secmech algos already have similar
locking.

Fixes: 95dc8dd ("Limit allocation of crypto mechanisms to dialect which requires")
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
…mgmt()

commit 05a0587 upstream.

The lpfc_sli4_scmd_to_wqidx_distr() function expects the scsi_cmnd
'lpfc_cmd->pCmd' not to be null, and point to the midlayer command.

That's not true in the .eh_(device|target|bus)_reset_handler path,
because lpfc_send_taskmgmt() sends commands not from the midlayer, so
does not set 'lpfc_cmd->pCmd'.

That is true in the .queuecommand path because lpfc_queuecommand()
stores the scsi_cmnd from midlayer in lpfc_cmd->pCmd; and lpfc_cmd is
stored by lpfc_scsi_prep_cmnd() in piocbq->context1 -- which is passed
to lpfc_sli4_scmd_to_wqidx_distr() as lpfc_cmd parameter.
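
The distinction between the two paths can be sketched as a queue-selection helper that tolerates a missing midlayer command. This is a toy userspace model under stated assumptions: `struct midlayer_cmd`, `struct driver_cmd`, `pick_queue()`, and the round-robin fallback are hypothetical and are not taken from the lpfc driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the midlayer command (scsi_cmnd). */
struct midlayer_cmd {
    int uses_mq;            /* non-zero if a hardware queue was chosen */
    unsigned int hw_queue;
};

/* Hypothetical stand-in for the driver command (lpfc_cmd). */
struct driver_cmd {
    struct midlayer_cmd *pcmd;  /* NULL for driver-internal commands */
};

static unsigned int next_queue;  /* round-robin cursor for the fallback */

/* Pick a work queue index without assuming cmd->pcmd is set. */
static unsigned int pick_queue(const struct driver_cmd *cmd,
                               unsigned int nr_queues)
{
    assert(cmd != NULL && nr_queues > 0);
    if (cmd->pcmd != NULL && cmd->pcmd->uses_mq)
        return cmd->pcmd->hw_queue % nr_queues;
    /* No midlayer command (task management path): fall back. */
    return next_queue++ % nr_queues;
}
```

A command built on the queuecommand path carries a midlayer pointer and uses its queue; a task management command has `pcmd == NULL` and takes the fallback instead of dereferencing it.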

This problem can be hit on SCSI EH, and immediately with sg_reset.
These 2 test-cases demonstrate the problem/fix with next-20160601.

Test-case 1) sg_reset

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffde6d0994 <unfinished ...>
    +++ killed by SIGSEGV +++
    Segmentation fault

    # dmesg
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd00000001c88442c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 104 PID: 16333 Comm: sg_reset Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d00000001c88442c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d00000001c826fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000003c9ec876f0] [c000003c9ec87770] 0xc000003c9ec87770 (unreliable)
    [c000003c9ec87720] [d00000001c82e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000003c9ec87780] [d00000001c831a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000003c9ec87880] [d00000001c87f27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000003c9ec87950] [d00000001c87fd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000003c9ec87a10] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000003c9ec87a40] [c0000000006113e8] scsi_ioctl_reset+0x198/0x2c0
    [c000003c9ec87bf0] [c00000000060fe5c] scsi_ioctl+0x13c/0x4b0
    [c000003c9ec87c80] [c0000000006629b0] sd_ioctl+0xf0/0x120
    [c000003c9ec87cd0] [c00000000046e4f8] blkdev_ioctl+0x248/0xb70
    [c000003c9ec87d30] [c0000000002a1f60] block_ioctl+0x70/0x90
    [c000003c9ec87d50] [c00000000026d334] do_vfs_ioctl+0xc4/0x890
    [c000003c9ec87de0] [c00000000026db60] SyS_ioctl+0x60/0xc0
    [c000003c9ec87e30] [c000000000009120] system_call+0x38/0x108
    Instruction dump:
    <...>

    With fix:

    # strace sg_reset --device /dev/sdm
    <...>
    open("/dev/sdm", O_RDWR|O_NONBLOCK)     = 3
    ioctl(3, SG_SCSI_RESET, 0x3fffe103c554) = 0
    close(3)                                = 0
    exit_group(0)                           = ?
    +++ exited with 0 +++

    # dmesg
    [  424.658649] lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (1, 0) return x2002

Test-case 2) SCSI EH

    Using this debug patch to wire an SCSI EH trigger, for lpfc_scsi_cmd_iocb_cmpl():
    -       cmd->scsi_done(cmd);
    +       if ((phba->pport ? phba->pport->cfg_log_verbose : phba->cfg_log_verbose) == 0x32100000)
    +               printk(KERN_ALERT "lpfc: skip scsi_done()\n");
    +       else
    +               cmd->scsi_done(cmd);

    # echo 0x32100000 > /sys/class/scsi_host/host11/lpfc_log_verbose

    # dd if=/dev/sdm of=/dev/null iflag=direct &
    <...>

    After a while:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xd0000000199e448c
    Oops: Kernel access of bad area, sig: 11 [#1]
    <...>
    CPU: 96 PID: 28556 Comm: scsi_eh_11 Tainted: G        W       4.7.0-rc1-next-20160601-00004-g95b89dc #6
    <...>
    NIP [d0000000199e448c] lpfc_sli4_scmd_to_wqidx_distr+0xc/0xd0 [lpfc]
    LR [d000000019986fe8] lpfc_sli_calc_ring.part.27+0x98/0xd0 [lpfc]
    Call Trace:
    [c000000ff0d0b890] [c000000ff0d0b900] 0xc000000ff0d0b900 (unreliable)
    [c000000ff0d0b8c0] [d00000001998e004] lpfc_sli_issue_iocb+0xd4/0x260 [lpfc]
    [c000000ff0d0b920] [d000000019991a3c] lpfc_sli_issue_iocb_wait+0x15c/0x5b0 [lpfc]
    [c000000ff0d0ba20] [d0000000199df27c] lpfc_send_taskmgmt+0x24c/0x650 [lpfc]
    [c000000ff0d0baf0] [d0000000199dfd7c] lpfc_device_reset_handler+0x10c/0x200 [lpfc]
    [c000000ff0d0bbb0] [c000000000610694] scsi_try_bus_device_reset+0x44/0xc0
    [c000000ff0d0bbe0] [c0000000006126cc] scsi_eh_ready_devs+0x49c/0x9c0
    [c000000ff0d0bcb0] [c000000000614160] scsi_error_handler+0x580/0x680
    [c000000ff0d0bd80] [c0000000000ae848] kthread+0x108/0x130
    [c000000ff0d0be30] [c0000000000094a8] ret_from_kernel_thread+0x5c/0xb4
    Instruction dump:
    <...>

    With fix:

    # dmesg
    lpfc 0006:01:00.4: 4:(0):3053 lpfc_log_verbose changed from 0 (x0) to 839909376 (x32100000)
    lpfc: skip scsi_done()
    <...>
    lpfc 0006:01:00.4: 4:(0):0713 SCSI layer issued Device Reset (0, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0723 SCSI layer issued Target Reset (1, 0) return x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):0714 SCSI layer issued Bus Reset Data: x2002
    <...>
    lpfc 0006:01:00.4: 4:(0):3172 SCSI layer issued Host Reset Data:
    <...>

Fixes: 8b0dff1 ("lpfc: Add support for using block multi-queue")
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wzyy2 pushed a commit that referenced this issue Nov 8, 2016
commit 88716a9 upstream.

After a device is disconnected, xhci_stop_device() will be invoked
in xhci_bus_suspend().
The ISR for the "disconnect" IRQ will also invoke
xhci_free_virt_device() through this sequence:
xhci_irq -> xhci_handle_event -> handle_cmd_completion ->
xhci_handle_cmd_disable_slot -> xhci_free_virt_device

If xhci->devs[slot_id] has been assigned to NULL in
xhci_free_virt_device(), then virt_dev->eps[i].ring in
xhci_stop_device() may point to an invalid address and cause a kernel
panic.

virt_dev = xhci->devs[slot_id];
:
if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue)

[] Unable to handle kernel paging request at virtual address 00001a68
[] pgd=ffffffc001430000
[] [00001a68] *pgd=000000013c807003, *pud=000000013c807003, *pmd=000000013c808003, *pte=0000000000000000
[] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[] CPU: 0 PID: 39 Comm: kworker/0:1 Tainted: G     U
[] Workqueue: pm pm_runtime_work
[] task: ffffffc0bc0e0bc0 ti: ffffffc0bc0ec000 task.ti: ffffffc0bc0ec000
[] PC is at xhci_stop_device.constprop.11+0xb4/0x1a4

This issue was found when running with a Realtek Ethernet device
(0bda:8153).

Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:768
|in_atomic(): 1, irqs_disabled(): 0, pid: 26, name: rcuos/2
|2 locks held by rcuos/2/26:
| #0:  (rcu_callback){.+.+..}, at: [<ffffffff810b1a12>] rcu_nocb_kthread+0x1e2/0x380
| #1:  (rcu_read_lock_sched){.+.+..}, at: [<ffffffff812acd26>] percpu_ref_kill_rcu+0xa6/0x1c0
|Preemption disabled at:[<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
|Call Trace:
| [<ffffffff81582e9e>] dump_stack+0x4e/0x9c
| [<ffffffff81077aeb>] __might_sleep+0xfb/0x170
| [<ffffffff81589304>] rt_spin_lock+0x24/0x70
| [<ffffffff811c5790>] free_ioctx_users+0x30/0x130
| [<ffffffff812ace34>] percpu_ref_kill_rcu+0x1b4/0x1c0
| [<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
| [<ffffffff8106e046>] kthread+0xd6/0xf0
| [<ffffffff81591eec>] ret_from_fork+0x7c/0xb0

Replace this with a preempt_disable()-friendly swork.

Reported-By: Mike Galbraith <umgwanakikbuti@gmail.com>
Suggested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
With RT_FULL we get the below wreckage:

[  126.060484] =======================================================
[  126.060486] [ INFO: possible circular locking dependency detected ]
[  126.060489] 3.0.1-rt10+ #30
[  126.060490] -------------------------------------------------------
[  126.060492] irq/24-eth0/1235 is trying to acquire lock:
[  126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060503]
[  126.060504] but task is already holding lock:
[  126.060506]  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060511]
[  126.060511] which lock already depends on the new lock.
[  126.060513]
[  126.060514]
[  126.060514] the existing dependency chain (in reverse order) is:
[  126.060516]
[  126.060516] -> #1 (&p->pi_lock){-...-.}:
[  126.060519]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060524]        [<ffffffff8150291e>] _raw_spin_lock_irqsave+0x4b/0x85
[  126.060527]        [<ffffffff810b5aa4>] task_blocks_on_rt_mutex+0x36/0x20f
[  126.060531]        [<ffffffff815019bb>] rt_mutex_slowlock+0xd1/0x15a
[  126.060534]        [<ffffffff81501ae3>] rt_mutex_lock+0x2d/0x2f
[  126.060537]        [<ffffffff810d9020>] rcu_boost+0xad/0xde
[  126.060541]        [<ffffffff810d90ce>] rcu_boost_kthread+0x7d/0x9b
[  126.060544]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060547]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060551]
[  126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
[  126.060555]        [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060558]        [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060561]        [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060564]        [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060566]        [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060569]        [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060573]        [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060576]        [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060580]        [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060583]        [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060585]        [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060590]        [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060593]        [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060597]        [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060600]        [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060603]        [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060606]        [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060608]        [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060611]
[  126.060612] other info that might help us debug this:
[  126.060614]
[  126.060615]  Possible unsafe locking scenario:
[  126.060616]
[  126.060617]        CPU0                    CPU1
[  126.060619]        ----                    ----
[  126.060620]   lock(&p->pi_lock);
[  126.060623]                                lock(&(lock)->wait_lock);
[  126.060625]                                lock(&p->pi_lock);
[  126.060627]   lock(&(lock)->wait_lock);
[  126.060629]
[  126.060629]  *** DEADLOCK ***
[  126.060630]
[  126.060632] 1 lock held by irq/24-eth0/1235:
[  126.060633]  #0:  (&p->pi_lock){-...-.}, at: [<ffffffff81074fdc>] try_to_wake_up+0x35/0x429
[  126.060638]
[  126.060638] stack backtrace:
[  126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
[  126.060643] Call Trace:
[  126.060644]  <IRQ>  [<ffffffff810acbde>] print_circular_bug+0x289/0x29a
[  126.060651]  [<ffffffff810af1b8>] __lock_acquire+0x1157/0x1816
[  126.060655]  [<ffffffff810ab3aa>] ? trace_hardirqs_off_caller+0x1f/0x99
[  126.060658]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060661]  [<ffffffff810afe9e>] lock_acquire+0x145/0x18a
[  126.060664]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060668]  [<ffffffff8150279e>] _raw_spin_lock+0x40/0x73
[  126.060671]  [<ffffffff81501c81>] ? rt_mutex_slowunlock+0x16/0x55
[  126.060674]  [<ffffffff810d9655>] ? rcu_report_qs_rsp+0x87/0x8c
[  126.060677]  [<ffffffff81501c81>] rt_mutex_slowunlock+0x16/0x55
[  126.060680]  [<ffffffff810d9ea3>] ? rcu_read_unlock_special+0x9b/0x1c4
[  126.060683]  [<ffffffff81501ce7>] rt_mutex_unlock+0x27/0x29
[  126.060687]  [<ffffffff810d9f86>] rcu_read_unlock_special+0x17e/0x1c4
[  126.060690]  [<ffffffff810da014>] __rcu_read_unlock+0x48/0x89
[  126.060693]  [<ffffffff8106847a>] select_task_rq_rt+0xc7/0xd5
[  126.060696]  [<ffffffff810683da>] ? select_task_rq_rt+0x27/0xd5
[  126.060701]  [<ffffffff810a852a>] ? clockevents_program_event+0x8e/0x90
[  126.060704]  [<ffffffff8107511c>] try_to_wake_up+0x175/0x429
[  126.060708]  [<ffffffff810a95dc>] ? tick_program_event+0x1f/0x21
[  126.060711]  [<ffffffff81075425>] wake_up_process+0x15/0x17
[  126.060715]  [<ffffffff81080a51>] wakeup_softirqd+0x24/0x26
[  126.060718]  [<ffffffff81081df9>] irq_exit+0x49/0x55
[  126.060721]  [<ffffffff8150a3bd>] smp_apic_timer_interrupt+0x8a/0x98
[  126.060724]  [<ffffffff81509793>] apic_timer_interrupt+0x13/0x20
[  126.060726]  <EOI>  [<ffffffff81072855>] ? migrate_disable+0x75/0x12d
[  126.060733]  [<ffffffff81080a61>] ? local_bh_disable+0xe/0x1f
[  126.060736]  [<ffffffff81080a70>] ? local_bh_disable+0x1d/0x1f
[  126.060739]  [<ffffffff810d5952>] irq_forced_thread_fn+0x1b/0x44
[  126.060742]  [<ffffffff81502ac0>] ? _raw_spin_unlock_irq+0x3b/0x59
[  126.060745]  [<ffffffff810d582c>] irq_thread+0xde/0x1af
[  126.060748]  [<ffffffff810d5937>] ? irq_thread_fn+0x3a/0x3a
[  126.060751]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060754]  [<ffffffff810d574e>] ? irq_finalize_oneshot+0xd1/0xd1
[  126.060757]  [<ffffffff8109a760>] kthread+0x99/0xa1
[  126.060761]  [<ffffffff81509b14>] kernel_thread_helper+0x4/0x10
[  126.060764]  [<ffffffff81069ed7>] ? finish_task_switch+0x87/0x10a
[  126.060768]  [<ffffffff81502ec4>] ? retint_restore_args+0xe/0xe
[  126.060771]  [<ffffffff8109a6c7>] ? __init_kthread_worker+0x8c/0x8c
[  126.060774]  [<ffffffff81509b10>] ? gs_change+0xb/0xb

Because irq_exit() does:

void irq_exit(void)
{
	account_system_vtime(current);
	trace_hardirq_exit();
	sub_preempt_count(IRQ_EXIT_OFFSET);
	if (!in_interrupt() && local_softirq_pending())
		invoke_softirq();

	...
}

Which triggers a wakeup, which uses RCU. Now if the interrupted task has
t->rcu_read_unlock_special set, the RCU usage from the wakeup will end
up in rcu_read_unlock_special(). rcu_read_unlock_special() will test
for in_irq(), which will fail as we just decremented preempt_count
with IRQ_EXIT_OFFSET, and in_serving_softirq(), which for
PREEMPT_RT_FULL reads:

int in_serving_softirq(void)
{
	int res;

	preempt_disable();
	res = __get_cpu_var(local_softirq_runner) == current;
	preempt_enable();
	return res;
}

Which will thus also fail, resulting in the above wreckage.

The 'somewhat' ugly solution is to open-code the preempt_count() test
in rcu_read_unlock_special().
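
One plausible shape for such an open-coded test is a direct mask on the counter value. The bit layout below is a toy with assumed field widths, and the exact mask used by the patch is not shown in this message, so treat both as assumptions; `in_irq_or_serving_softirq()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy preempt_count layout; the field widths are assumptions. */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS   4

#define SOFTIRQ_SHIFT  PREEMPT_BITS
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define SOFTIRQ_OFFSET (1u << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET (1u << HARDIRQ_SHIFT)
#define HARDIRQ_MASK   (((1u << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)

/* Disabling bottom halves adds twice SOFTIRQ_OFFSET in this layout,
 * so the lowest softirq bit alone marks "serving a softirq". */
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)

static_assert((HARDIRQ_MASK & SOFTIRQ_OFFSET) == 0,
              "hardirq and softirq fields must not overlap");

/* Test the raw counter instead of calling per-context helpers. */
static bool in_irq_or_serving_softirq(unsigned int preempt_count)
{
    return (preempt_count & (HARDIRQ_MASK | SOFTIRQ_OFFSET)) != 0;
}
```

With this layout a counter of `SOFTIRQ_DISABLE_OFFSET` (bottom halves disabled but no softirq running) does not match, while `HARDIRQ_OFFSET` or `SOFTIRQ_OFFSET` does.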

Also, we're not at all sure how ->rcu_read_unlock_special gets set
here... so this is very likely a bandaid and more thought is required.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
…ntext

| BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
| in_atomic(): 1, irqs_disabled(): 0, pid: 255, name: kworker/u257:6
| 5 locks held by kworker/u257:6/255:
|  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff8108edf1>] process_one_work+0x171/0x5e0
|  #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff8108edf1>] process_one_work+0x171/0x5e0
|  #2:  (&shost->scan_mutex){+.+.+.}, at: [<ffffffffa000faa3>] __scsi_add_device+0xa3/0x130 [scsi_mod]
|  #3:  (&set->tag_list_lock){+.+...}, at: [<ffffffff812f09fa>] blk_mq_init_queue+0x96a/0xa50
|  #4:  (rcu_read_lock_sched){......}, at: [<ffffffff8132887d>] percpu_ref_kill_and_confirm+0x1d/0x120
| Preemption disabled at:[<ffffffff812eff76>] blk_mq_freeze_queue_start+0x56/0x70
|
| CPU: 2 PID: 255 Comm: kworker/u257:6 Not tainted 3.18.7-rt0+ #1
| Workqueue: events_unbound async_run_entry_fn
|  0000000000000003 ffff8800bc29f998 ffffffff815b3a12 0000000000000000
|  0000000000000000 ffff8800bc29f9b8 ffffffff8109aa16 ffff8800bc29fa28
|  ffff8800bc5d1bc8 ffff8800bc29f9e8 ffffffff815b8dd4 ffff880000000000
| Call Trace:
|  [<ffffffff815b3a12>] dump_stack+0x4f/0x7c
|  [<ffffffff8109aa16>] __might_sleep+0x116/0x190
|  [<ffffffff815b8dd4>] rt_spin_lock+0x24/0x60
|  [<ffffffff810b6089>] __wake_up+0x29/0x60
|  [<ffffffff812ee06e>] blk_mq_usage_counter_release+0x1e/0x20
|  [<ffffffff81328966>] percpu_ref_kill_and_confirm+0x106/0x120
|  [<ffffffff812eff76>] blk_mq_freeze_queue_start+0x56/0x70
|  [<ffffffff812f0000>] blk_mq_update_tag_set_depth+0x40/0xd0
|  [<ffffffff812f0a1c>] blk_mq_init_queue+0x98c/0xa50
|  [<ffffffffa000dcf0>] scsi_mq_alloc_queue+0x20/0x60 [scsi_mod]
|  [<ffffffffa000ea35>] scsi_alloc_sdev+0x2f5/0x370 [scsi_mod]
|  [<ffffffffa000f494>] scsi_probe_and_add_lun+0x9e4/0xdd0 [scsi_mod]
|  [<ffffffffa000fb26>] __scsi_add_device+0x126/0x130 [scsi_mod]
|  [<ffffffffa013033f>] ata_scsi_scan_host+0xaf/0x200 [libata]
|  [<ffffffffa012b5b6>] async_port_probe+0x46/0x60 [libata]
|  [<ffffffff810978fb>] async_run_entry_fn+0x3b/0xf0
|  [<ffffffff8108ee81>] process_one_work+0x201/0x5e0

percpu_ref_kill_and_confirm() invokes blk_mq_usage_counter_release() in
a rcu-sched region. swait based wake queue can't be used due to
wake_up_all() usage and disabled interrupts in !RT configs (as reported
by Corey Minyard).
The wq_has_sleeper() check has been suggested by Peter Zijlstra.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
…local lock

| BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974
| in_atomic(): 1, irqs_disabled(): 0, pid: 1401, name: cryptomgr_test
| Preemption disabled at:
| [<ffff00000849941c>] scomp_acomp_comp_decomp+0x34/0x1a0
| CPU: 21 PID: 1401 Comm: cryptomgr_test Tainted: G        W        4.16.18-rt9-rt #1
| Hardware name: www.cavium.com crb-1s/crb-1s, BIOS 0.3 Apr 25 2017
| Call trace:
|  dump_backtrace+0x0/0x1c8
|  show_stack+0x24/0x30
|  dump_stack+0xac/0xe8
|  ___might_sleep+0x124/0x188
|  rt_spin_lock+0x40/0x88
|  zip_load_instr+0x44/0x170 [thunderx_zip]
|  zip_deflate+0x184/0x378 [thunderx_zip]
|  zip_compress+0xb0/0x130 [thunderx_zip]
|  zip_scomp_compress+0x48/0x60 [thunderx_zip]
|  scomp_acomp_comp_decomp+0xd8/0x1a0
|  scomp_acomp_compress+0x24/0x30
|  test_acomp+0x15c/0x558
|  alg_test_comp+0xc0/0x128
|  alg_test.part.6+0x120/0x2c0
|  alg_test+0x6c/0xa0
|  cryptomgr_test+0x50/0x58
|  kthread+0x134/0x138
|  ret_from_fork+0x10/0x18

Mainline disables preemption to serialize percpu scratch buffer access,
causing the splat above.  Serialize with a local lock for RT instead.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
To avoid:
|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
|in_atomic(): 1, irqs_disabled(): 0, pid: 92, name: rcuc/11
|2 locks held by rcuc/11/92:
| #0:  (rcu_callback){......}, at: [<ffffffff810e037e>] rcu_cpu_kthread+0x3de/0x940
| #1:  (rcu_read_lock_sched){......}, at: [<ffffffff81328390>] percpu_ref_call_confirm_rcu+0x0/0xd0
|Preemption disabled at:[<ffffffff813284e2>] percpu_ref_switch_to_atomic_rcu+0x82/0xc0
|CPU: 11 PID: 92 Comm: rcuc/11 Not tainted 3.18.7-rt0+ #1
| ffff8802398cdf80 ffff880235f0bc28 ffffffff815b3a12 0000000000000000
| 0000000000000000 ffff880235f0bc48 ffffffff8109aa16 0000000000000000
| ffff8802398cdf80 ffff880235f0bc78 ffffffff815b8dd4 000000000000df80
|Call Trace:
| [<ffffffff815b3a12>] dump_stack+0x4f/0x7c
| [<ffffffff8109aa16>] __might_sleep+0x116/0x190
| [<ffffffff815b8dd4>] rt_spin_lock+0x24/0x60
| [<ffffffff8108d2cd>] queue_work_on+0x6d/0x1d0
| [<ffffffff8110c881>] css_release+0x81/0x90
| [<ffffffff8132844e>] percpu_ref_call_confirm_rcu+0xbe/0xd0
| [<ffffffff813284e2>] percpu_ref_switch_to_atomic_rcu+0x82/0xc0
| [<ffffffff810e03e5>] rcu_cpu_kthread+0x445/0x940
| [<ffffffff81098a2d>] smpboot_thread_fn+0x18d/0x2d0
| [<ffffffff810948d8>] kthread+0xe8/0x100
| [<ffffffff815b9c3c>] ret_from_fork+0x7c/0xb0

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
[ Upstream commit 4e4cf4be79635e67144632d9135286381acbc95a ]

Zswap causes "BUG: scheduling while atomic" by blocking on a rt_spin_lock() with
preemption disabled. The preemption is disabled by get_cpu_var() in
zswap_frontswap_store() to protect the access of the zswap_dstmem percpu variable.

Use get_locked_var() to protect the percpu zswap_dstmem variable, making the
code preemptive.

As get_cpu_ptr() also disables preemption, replace it by this_cpu_ptr() and
remove the counterpart put_cpu_ptr().

Steps to Reproduce:

    1. # grubby --args "zswap.enabled=1" --update-kernel DEFAULT
    2. # reboot
    3. Calculate the amount of memory to be used by the test:
       ---> grep MemAvailable /proc/meminfo
       ---> Add 25% ~ 50% to that value
    4. # stress --vm 1 --vm-bytes ${MemAvailable+25%} --timeout 240s
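
The arithmetic in step 3 can be sketched as a small C helper. The name `stress_vm_kib` is hypothetical and is not part of stress or the kernel; it assumes the MemAvailable value is taken in kB, as /proc/meminfo prints it.

```c
#include <stdint.h>

/* Hypothetical helper: MemAvailable (in kB, as printed by /proc/meminfo)
 * plus extra_percent percent on top, e.g. 25 to 50 as in step 3. */
static uint64_t stress_vm_kib(uint64_t mem_available_kib,
                              unsigned int extra_percent)
{
    return mem_available_kib + (mem_available_kib * extra_percent) / 100;
}
```

The result would replace the `${MemAvailable+25%}` placeholder in step 4, presumably with a size suffix that matches the unit (an assumption about how stress is invoked, not something the commit message states).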

Usually, in less than 5 minutes the backtrace listed below appears, followed
by a kernel panic:

| BUG: scheduling while atomic: kswapd1/181/0x00000002
|
| Preemption disabled at:
| [<ffffffff8b2a6cda>] zswap_frontswap_store+0x21a/0x6e1
|
| Kernel panic - not syncing: scheduling while atomic
| CPU: 14 PID: 181 Comm: kswapd1 Kdump: loaded Not tainted 5.0.14-rt9 #1
| Hardware name: AMD Pence/Pence, BIOS WPN2321X_Weekly_12_03_21 03/19/2012
| Call Trace:
|  panic+0x106/0x2a7
|  __schedule_bug.cold+0x3f/0x51
|  __schedule+0x5cb/0x6f0
|  schedule+0x43/0xd0
|  rt_spin_lock_slowlock_locked+0x114/0x2b0
|  rt_spin_lock_slowlock+0x51/0x80
|  zbud_alloc+0x1da/0x2d0
|  zswap_frontswap_store+0x31a/0x6e1
|  __frontswap_store+0xab/0x130
|  swap_writepage+0x39/0x70
|  pageout.isra.0+0xe3/0x320
|  shrink_page_list+0xa8e/0xd10
|  shrink_inactive_list+0x251/0x840
|  shrink_node_memcg+0x213/0x770
|  shrink_node+0xd9/0x450
|  balance_pgdat+0x2d5/0x510
|  kswapd+0x218/0x470
|  kthread+0xfb/0x130
|  ret_from_fork+0x27/0x50

Cc: stable-rt@vger.kernel.org
Reported-by: Ping Fang <pifang@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
[ Upstream commit 1a142116f6435ef070ecebb66d2d599507c10601 ]

|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:768
|in_atomic(): 1, irqs_disabled(): 0, pid: 26, name: rcuos/2
|2 locks held by rcuos/2/26:
| #0:  (rcu_callback){.+.+..}, at: [<ffffffff810b1a12>] rcu_nocb_kthread+0x1e2/0x380
| #1:  (rcu_read_lock_sched){.+.+..}, at: [<ffffffff812acd26>] percpu_ref_kill_rcu+0xa6/0x1c0
|Preemption disabled at:[<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
|Call Trace:
| [<ffffffff81582e9e>] dump_stack+0x4e/0x9c
| [<ffffffff81077aeb>] __might_sleep+0xfb/0x170
| [<ffffffff81589304>] rt_spin_lock+0x24/0x70
| [<ffffffff811c5790>] free_ioctx_users+0x30/0x130
| [<ffffffff812ace34>] percpu_ref_kill_rcu+0x1b4/0x1c0
| [<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
| [<ffffffff8106e046>] kthread+0xd6/0xf0
| [<ffffffff81591eec>] ret_from_fork+0x7c/0xb0

Replace this with a preempt_disable() friendly swork.

Reported-By: Mike Galbraith <umgwanakikbuti@gmail.com>
Suggested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
hejiawencc referenced this issue in LubanCat/kernel Dec 14, 2022
…ntext

[ Upstream commit 61c928ecf4fe200bda9b49a0813b5ba0f43995b5 ]

| BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
| in_atomic(): 1, irqs_disabled(): 0, pid: 255, name: kworker/u257:6
| 5 locks held by kworker/u257:6/255:
|  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff8108edf1>] process_one_work+0x171/0x5e0
|  #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff8108edf1>] process_one_work+0x171/0x5e0
|  #2:  (&shost->scan_mutex){+.+.+.}, at: [<ffffffffa000faa3>] __scsi_add_device+0xa3/0x130 [scsi_mod]
|  #3:  (&set->tag_list_lock){+.+...}, at: [<ffffffff812f09fa>] blk_mq_init_queue+0x96a/0xa50
|  #4:  (rcu_read_lock_sched){......}, at: [<ffffffff8132887d>] percpu_ref_kill_and_confirm+0x1d/0x120
| Preemption disabled at:[<ffffffff812eff76>] blk_mq_freeze_queue_start+0x56/0x70
|
| CPU: 2 PID: 255 Comm: kworker/u257:6 Not tainted 3.18.7-rt0+ #1
| Workqueue: events_unbound async_run_entry_fn
|  0000000000000003 ffff8800bc29f998 ffffffff815b3a12 0000000000000000
|  0000000000000000 ffff8800bc29f9b8 ffffffff8109aa16 ffff8800bc29fa28
|  ffff8800bc5d1bc8 ffff8800bc29f9e8 ffffffff815b8dd4 ffff880000000000
| Call Trace:
|  [<ffffffff815b3a12>] dump_stack+0x4f/0x7c
|  [<ffffffff8109aa16>] __might_sleep+0x116/0x190
|  [<ffffffff815b8dd4>] rt_spin_lock+0x24/0x60
|  [<ffffffff810b6089>] __wake_up+0x29/0x60
|  [<ffffffff812ee06e>] blk_mq_usage_counter_release+0x1e/0x20
|  [<ffffffff81328966>] percpu_ref_kill_and_confirm+0x106/0x120
|  [<ffffffff812eff76>] blk_mq_freeze_queue_start+0x56/0x70
|  [<ffffffff812f0000>] blk_mq_update_tag_set_depth+0x40/0xd0
|  [<ffffffff812f0a1c>] blk_mq_init_queue+0x98c/0xa50
|  [<ffffffffa000dcf0>] scsi_mq_alloc_queue+0x20/0x60 [scsi_mod]
|  [<ffffffffa000ea35>] scsi_alloc_sdev+0x2f5/0x370 [scsi_mod]
|  [<ffffffffa000f494>] scsi_probe_and_add_lun+0x9e4/0xdd0 [scsi_mod]
|  [<ffffffffa000fb26>] __scsi_add_device+0x126/0x130 [scsi_mod]
|  [<ffffffffa013033f>] ata_scsi_scan_host+0xaf/0x200 [libata]
|  [<ffffffffa012b5b6>] async_port_probe+0x46/0x60 [libata]
|  [<ffffffff810978fb>] async_run_entry_fn+0x3b/0xf0
|  [<ffffffff8108ee81>] process_one_work+0x201/0x5e0

percpu_ref_kill_and_confirm() invokes blk_mq_usage_counter_release() in
a rcu-sched region. swait based wake queue can't be used due to
wake_up_all() usage and disabled interrupts in !RT configs (as reported
by Corey Minyard).
The wq_has_sleeper() check has been suggested by Peter Zijlstra.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
FanX-Tek pushed a commit to FanX-Tek/kernel that referenced this issue Jan 3, 2023
Add "mem=0x40000000" to COMMAND LINE, kernel will crash:

[   18.379633][   T35] Unable to handle kernel paging request at virtual address ffffffff01d7c000
[   18.380412][   T35] Mem abort info:
[   18.380726][   T35]   ESR = 0x96000006
[   18.381060][   T35]   EC = 0x25: DABT (current EL), IL = 32 bits
[   18.381587][   T35]   SET = 0, FnV = 0
[   18.381920][   T35]   EA = 0, S1PTW = 0
[   18.382264][   T35] Data abort info:
[   18.382585][   T35]   ISV = 0, ISS = 0x00000006
[   18.383029][   T35]   CM = 0, WnR = 0
[   18.383360][   T35] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000001c47000
[   18.384014][   T35] [ffffffff01d7c000] pgd=0000000040f2f003, p4d=0000000040f2f003, pud=0000000040f2f003, pmd=0000000000000000
[   18.385299][   T35] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[   18.385862][   T35] Modules linked in:
[   18.386205][   T35] CPU: 2 PID: 35 Comm: kworker/2:1 Not tainted 5.10.110 #324
[   18.386843][   T35] Hardware name: Rockchip RK3568 EVB1 DDR4 V10 Board (DT)
[   18.387470][   T35] Workqueue: events rockchip_drm_fb_destroy_work
[   18.388026][   T35] pstate: 80c00009 (Nzcv daif +PAN +UAO -TCO BTYPE=--)
[   18.388626][   T35] pc : rockchip_free_loader_memory+0x100/0x1a0
[   18.389156][   T35] lr : rockchip_free_loader_memory+0xc8/0x1a0
[   18.389683][   T35] sp : ffffffc0125abd00
[   18.390039][   T35] x29: ffffffc0125abd00 x28: 0000000000001000
[   18.390570][   T35] x27: ffffff807df00000 x26: 0000008000000000
[   18.391099][   T35] x25: fffffffeffe00000 x24: 0000000000000001
[   18.391628][   T35] x23: ffffff80043d6300 x22: ffffff807df00000
[   18.392157][   T35] x21: ffffff807dfb7000 x20: ffffff80043a8480
[   18.392686][   T35] x19: ffffffff01d7c000 x18: 0000000000000000
[   18.393215][   T35] x17: 0000000000000000 x16: 0000000000000000
[   18.393744][   T35] x15: 00000072e03fbb78 x14: 0000000000000000
[   18.394273][   T35] x13: 000000000000003d x12: 0000000000000000
[   18.394802][   T35] x11: 00000000fffffff6 x10: ffffffc0122f8000
[   18.395330][   T35] x9 : 0000000000010000 x8 : 0000000000000008
[   18.395859][   T35] x7 : 0000000000818000 x6 : ffffff801ad909f0
[   18.396387][   T35] x5 : 0000000000000039 x4 : ffffffc0122d01b8
[   18.396917][   T35] x3 : 0000000000000000 x2 : 0000000000000000
[   18.397445][   T35] x1 : ffffff80032622d8 x0 : 0000000000001000
[   18.397975][   T35] Call trace:
[   18.398255][   T35]  rockchip_free_loader_memory+0x100/0x1a0
[   18.398762][   T35]  __rockchip_drm_fb_destroy+0xa4/0xc8
[   18.399233][   T35]  rockchip_drm_fb_destroy_work+0x14/0x20
[   18.399728][   T35]  process_one_work+0x1b0/0x490
[   18.400142][   T35]  worker_thread+0x4c/0x3f8
[   18.400534][   T35]  kthread+0x140/0x160
[   18.400882][   T35]  ret_from_fork+0x10/0x30

The logo buffer is allocated by U-Boot and may have been removed from
kernel memory for some purpose such as "debug". So, before freeing it to
the buddy system, we must check that the buffer is valid.
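
The kind of validity check described above can be sketched in plain C. This is only an illustration under stated assumptions: `range_below_limit` is a hypothetical name, RAM is assumed to start at physical address 0, and `mem_limit` stands for the cap set by something like "mem=0x40000000". The commit message does not show the actual check, and real driver code would more likely rely on existing kernel helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true only if [start, start + size) lies entirely
 * below mem_limit, the highest address the kernel still manages. */
static bool range_below_limit(uint64_t start, uint64_t size,
                              uint64_t mem_limit)
{
    if (size == 0 || start >= mem_limit)
        return false;
    /* Written as a subtraction so that start + size cannot overflow. */
    return size <= mem_limit - start;
}
```

A caller would free the buffer only when the check passes and skip it otherwise, so a logo buffer that lies outside the memory the kernel manages is never handed to the buddy system.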

Change-Id: I9a27c4552a3c7f450b034d2a6c6c3a321bb2fc54
Signed-off-by: Simon Xue <xxm@rock-chips.com>
FanX-Tek pushed a commit to FanX-Tek/kernel that referenced this issue Jan 3, 2023
Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000830
Mem abort info:
  ESR = 0x96000005
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000005
  CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000aaff8145
[0000000000000830] pgd=000000007a5c9003, pud=000000007a5c9003, pmd=0000000000000000
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
Process init (pid: 1, stack limit = 0x00000000722656e6)
CPU: 3 PID: 1 Comm: init Not tainted 4.19.232 #72
Hardware name: Rockchip RK3528 EVB1 DDR4 V10 Board (DT)
pstate: 60400005 (nZCv daif +PAN -UAO)
pc : __ll_sc___cmpxchg_case_acq_8+0x4/0x20
lr : mutex_lock+0x18/0x40
sp : ffffff800802bc60
x29: ffffff800802bc60 x28: ffffffc07c8e0000
x27: 0000000000000000 x26: 0000000000000000
x25: ffffff8009510d78 x24: ffffff8009cde040
x23: ffffffc07c9e3890 x22: ffffff8009d89000
x21: ffffff8009b73b30 x20: 0000000000000000
x19: 0000000000000830 x18: ffffffffffffffff
x17: 0000000000000000 x16: 0000000000000002
x15: ffffff80095ed278 x14: ffffff8089ce4196
x13: ffffff8009ce41b0 x12: ffffff8009ce45d0
x11: ffffff8009adf3b8 x10: 0000000005f5e0ff
x9 : 00000000ffffffd0 x8 : 6873203a73756c70
x7 : 0000000000000058 x6 : 0000000000000001
x5 : 0000000000000005 x4 : 0000000000000004
x3 : 0000000000000830 x2 : ffffffc07c8e0000
x1 : 0000000000000000 x0 : 0000000000000830
Call trace:
 __ll_sc___cmpxchg_case_acq_8+0x4/0x20
 rockchip_tve_shutdown+0x1c/0x38
 platform_drv_shutdown+0x20/0x30
 device_shutdown+0x118/0x200
 kernel_restart_prepare+0x34/0x40
 kernel_restart+0x14/0x78
 __se_sys_reboot+0x1cc/0x208
 __arm64_sys_reboot+0x18/0x20
 el0_svc_common.constprop.0+0x64/0x178
 el0_svc_handler+0x28/0x78
 el0_svc+0x8/0xc

Signed-off-by: Tao Huang <huangtao@rock-chips.com>
Change-Id: Ic1e85b963548e5e66ef78fae5af8dcc12e928fa2
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
Unable to handle kernel NULL pointer dereference at virtual address 00000080
pgd = 5be93016
[00000080] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 3 PID: 58 Comm: kworker/3:1 Not tainted 4.19.111 #9
Hardware name: Generic DT based system
PC is at snd_soc_add_dai_controls+0x24/0x40
LR is at   (null)
pc : [<b0692590>]    lr : [<00000000>]    psr: 20000053
sp : ee117d58  ip : 00000000  fp : ddb2d540
r10: ddbd1c40  r9 : 00000000  r8 : ddb35b80
r7 : ddb35940  r6 : ddb65080  r5 : 00000002  r4 : eeb39410
r3 : 00000001  r2 : b0d464e4  r1 : eeb39410  r0 : ddb65080
Flags: nzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment user

Change-Id: I0571e1a0554f11af62fab3572fcb11f299626be6
Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
The virtual address of allocated __GFP_HIGHMEM pages is not mapped
on large-memory aarch32 systems. Use kmap to map them, or it will
fail like:

    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    pgd = 50ec00c8
    [00000000] *pgd=00000000
    Internal error: Oops: 817 [#1] PREEMPT SMP ARM
    Modules linked in:
    CPU: 3 PID: 67 Comm: rkflash Not tainted 4.19.111 #67
    Hardware name: Generic DT based system
    PC is at memcpy+0x50/0x330
    LR is at 0x61640a68

Change-Id: I0cde9012d29e49d9ba751cb019ccfa784c01b7c7
Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000830
Mem abort info:
  ESR = 0x96000005
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000005
  CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000aaff8145
[0000000000000830] pgd=000000007a5c9003, pud=000000007a5c9003, pmd=0000000000000000
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
Process init (pid: 1, stack limit = 0x00000000722656e6)
CPU: 3 PID: 1 Comm: init Not tainted 4.19.232 #72
Hardware name: Rockchip RK3528 EVB1 DDR4 V10 Board (DT)
pstate: 60400005 (nZCv daif +PAN -UAO)
pc : __ll_sc___cmpxchg_case_acq_8+0x4/0x20
lr : mutex_lock+0x18/0x40
sp : ffffff800802bc60
x29: ffffff800802bc60 x28: ffffffc07c8e0000
x27: 0000000000000000 x26: 0000000000000000
x25: ffffff8009510d78 x24: ffffff8009cde040
x23: ffffffc07c9e3890 x22: ffffff8009d89000
x21: ffffff8009b73b30 x20: 0000000000000000
x19: 0000000000000830 x18: ffffffffffffffff
x17: 0000000000000000 x16: 0000000000000002
x15: ffffff80095ed278 x14: ffffff8089ce4196
x13: ffffff8009ce41b0 x12: ffffff8009ce45d0
x11: ffffff8009adf3b8 x10: 0000000005f5e0ff
x9 : 00000000ffffffd0 x8 : 6873203a73756c70
x7 : 0000000000000058 x6 : 0000000000000001
x5 : 0000000000000005 x4 : 0000000000000004
x3 : 0000000000000830 x2 : ffffffc07c8e0000
x1 : 0000000000000000 x0 : 0000000000000830
Call trace:
 __ll_sc___cmpxchg_case_acq_8+0x4/0x20
 rockchip_tve_shutdown+0x1c/0x38
 platform_drv_shutdown+0x20/0x30
 device_shutdown+0x118/0x200
 kernel_restart_prepare+0x34/0x40
 kernel_restart+0x14/0x78
 __se_sys_reboot+0x1cc/0x208
 __arm64_sys_reboot+0x18/0x20
 el0_svc_common.constprop.0+0x64/0x178
 el0_svc_handler+0x28/0x78
 el0_svc+0x8/0xc

Signed-off-by: Tao Huang <huangtao@rock-chips.com>
Change-Id: Ic1e85b963548e5e66ef78fae5af8dcc12e928fa2
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
[    0.279514] Unable to handle kernel read from unreadable memory at
virtual address 0000000000000030
[    0.279659] device-mapper: uevent: version 1.0.3
[    0.296703] Mem abort info:
[    0.296704]   ESR = 0x96000005
[    0.296707]   Exception class = DABT (current EL), IL = 32 bits
[    0.296708]   SET = 0, FnV = 0
[    0.296710]   EA = 0, S1PTW = 0
[    0.296711] Data abort info:
[    0.296712]   ISV = 0, ISS = 0x00000005
[    0.296713]   CM = 0, WnR = 0
[    0.296715] [0000000000000030] user address but active_mm is swapper
[    0.296719] Internal error: Oops: 96000005 [#1] PREEMPT SMP
...
[    0.320746] Call trace:
[    0.320973]  drm_mode_object_get+0x0/0x50
[    0.321329]  try_to_bring_up_master+0x144/0x1b0
[    0.321737]  component_master_add_with_match+0xd4/0x130
[    0.322200]  rockchip_drm_platform_probe+0x178/0x318
[    0.322641]  platform_drv_probe+0x50/0xa8

Signed-off-by: Sandy Huang <hjc@rock-chips.com>
Change-Id: I958df008f9e2c4f427f9cd56b6f02c832637b7c2
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
Unable to handle kernel read from unreadable memory at virtual address 0000000000000000
Mem abort info:
  ESR = 0x96000005
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000005
  CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp =00000000c9324dfa
[0000000000000000] pgd=0000000000000000, pud=0000000000000000
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in: rk_vcodec bcmdhd [last unloaded: rk_vcodec]
Process queue_work0 (pid: 3128, stack limit = 0x00000000044997c1)
CPU: 2 PID: 3128 Comm: queue_work0 Not tainted 4.19.232 #439
Hardware name: Rockchip RK3528 EVB1 DDR4 V10 Board (DT)
pstate: a0400085 (NzCv daIf +PAN -UAO)
pc : __wake_up_common+0x58/0x170
lr : __wake_up_common_lock+0x90/0xe0
sp : ffffff801101bc50
x29: ffffff801101bc50 x28: 0000000000000000
x27: 0000000000000000 x26: 0000000000000003
x25: ffffffc043c62d70 x24: 0000000000000001
x23: ffffffc043c63908 x22: ffffff801101bcf0
x21: 0000000000000003 x20: ffffffffffffffe8
x19: 0000000000000000 x18: 0000000000000030
x17: 0000000000000000 x16: 0000000000000000
x15: ffffffffffffffff x14: 766365725f727369
x13: 5f6b6e696c5f6365 x12: 00000000000001d8
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : 0000000000000000
x7 : ffffffc07f75f340 x6 : 0000000000000000
x5 : ffffff801101bcf0 x4 : 0000000000000000
x3 : 0000000000000000 x2 : 0000000000000001
x1 : 0000000000000003 x0 : 0000000000000000
Call trace:
 __wake_up_common+0x58/0x170
 __wake_up_common_lock+0x90/0xe0
 __wake_up+0x14/0x20
 rkvdec2_link_worker+0xda4/0x12dc [rk_vcodec]
 kthread_worker_fn+0xbc/0x180
 kthread+0x12c/0x160
 ret_from_fork+0x10/0x20

Signed-off-by: Yandong Lin <yandong.lin@rock-chips.com>
Change-Id: I5129a1a52ffcf71775410c446ece4b5e85b812cc
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
The otg wakelock should be destroyed when the device probe fails or
the device is removed; otherwise it may cause the following kernel
errors.

  list_add corruption. next->prev should be prev (ffffffc01209d3c8), but
  was 0000000000000000. (next=ffffff800350faf8).
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:25!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.110 #83
  Hardware name: Rockchip RK3562 EVB1 LP4X V10 Board (DT)
  pstate: 60400085 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
  pc : __list_add_valid+0x6c/0x88
  lr : __list_add_valid+0x6c/0x88
  [...]
  Call trace:
   __list_add_valid+0x7c/0x98
   wakeup_source_register+0x120/0x160
   wakeup_source_register+0x120/0x160
   device_init_wakeup+0x60/0xf4
   [...]

So add a devm action to fix it.

Signed-off-by: Frank Wang <frank.wang@rock-chips.com>
Change-Id: If54a299a694414ad759002e4f6c4187448ccdb15
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
…rate

If EP-IN is disabled (p_chmask = 0), agdev->in_ep isn't
initialized, so it must not be accessed when setting or getting
the sample rate; otherwise it will cause a panic with the
following log:

Unable to handle kernel NULL pointer dereference at virtual address 00000021
pgd = 18db8ecb
[00000021] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 503 Comm: irq/60-dwc3 Not tainted 4.19.111 #166
Hardware name: Generic DT based system
PC is at uac_cs_attr_sample_rate+0x54/0x9c
LR is at 0xbb
...
[<b0652280>] (uac_cs_attr_sample_rate) from [<b060ffec>] (dwc3_gadget_giveback+0x34/0x40)
[<b060ffec>] (dwc3_gadget_giveback) from [<b06143f0>] (dwc3_ep0_interrupt+0x2d0/0x7f4)
[<b06143f0>] (dwc3_ep0_interrupt) from [<b0611b60>] (dwc3_thread_interrupt+0x860/0xfa8)
[<b0611b60>] (dwc3_thread_interrupt) from [<b016dfd8>] (irq_thread_fn+0x1c/0x7c)
[<b016dfd8>] (irq_thread_fn) from [<b016e2dc>] (irq_thread+0x124/0x200)
[<b016e2dc>] (irq_thread) from [<b014290c>] (kthread+0x140/0x170)
[<b014290c>] (kthread) from [<b01010d8>] (ret_from_fork+0x14/0x3c)
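
The guard this commit message calls for, not touching the IN endpoint when it was never set up, might look roughly like the sketch below. The structures and the function `in_ep_maxpacket` are simplified, hypothetical stand-ins and are not the gadget driver's real types; only the field name `in_ep` comes from the message above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the gadget structures. */
struct ep {
    int maxpacket;
};

struct audio_dev {
    struct ep *in_ep;   /* stays NULL when EP-IN is disabled */
    struct ep *out_ep;
};

/* Reads a field from the IN endpoint only if that endpoint exists.
 * Returns 0 on success, or -EINVAL when in_ep was never initialized. */
static int in_ep_maxpacket(const struct audio_dev *agdev, int *out)
{
    if (agdev == NULL || agdev->in_ep == NULL)
        return -EINVAL;

    *out = agdev->in_ep->maxpacket;
    return 0;
}
```

The point of the pattern is that the error path returns before any dereference, so a configuration with the IN endpoint disabled yields an error code instead of an oops.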

Signed-off-by: William Wu <william.wu@rock-chips.com>
Change-Id: Ib23128a5b3f9027e711b352a4befcc8be1a97721
hejiawencc referenced this issue in LubanCat/kernel Sep 5, 2023
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in: bcmdhd dhd_static_buf r8168
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.110 #470
Hardware name: Rockchip RK3588 EVB1 LP4 V10 Board (DT)
pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
pc : dmaengine_pcm_dma_complete+0x44/0xdc
lr : pl330_tasklet+0xd4/0x2d0
sp : ffffffc01249bdf0
x29: ffffffc01249bdf0 x28: 0000000000000000
x27: ffffff8102c8c608 x26: 0000000000000002
x25: ffffff8102c8c608 x24: ffffff8102d4c600
x23: ffffff81018868c0 x22: ffffff8102c8c618
x21: 0000000000000000 x20: ffffffc010f238ec
x19: ffffff8102c8c560 x18: ffffffc012491048
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 000000000007fcc4
x13: 0000000000000004 x12: 0000000a7e22d2ff
x11: 0000000000004007 x10: 0000000000000000
x9 : 0000000000003000 x8 : ffffff8030b7d480
x7 : 000000b2b5593519 x6 : 00000000003033ff
x5 : 0000000000000000 x4 : 0000000000bb82b6
x3 : ffffff8102c90b28 x2 : 0000000000000001
x1 : 0000000000000000 x0 : ffffff8102d4c600
Call trace:
 dmaengine_pcm_dma_complete+0x44/0xdc
 pl330_tasklet+0xd4/0x2d0
 tasklet_action_common+0x11c/0x414
 tasklet_action+0x28/0x38
 _stext+0x108/0x408
 __irq_exit_rcu+0xc0/0xc4
 irq_exit+0x14/0x28
 __handle_domain_irq+0x84/0xd0
 gic_handle_irq+0x78/0x154
 el1_irq+0xe4/0x1c0
 cpuidle_enter_state+0x200/0x3b8
 cpuidle_enter+0x3c/0x58
 cpuidle_idle_call+0x158/0x238
 do_idle+0xac/0xfc
 cpu_startup_entry+0x28/0x2c
 rest_init+0xd8/0xec
 arch_call_rest_init+0x14/0x24
 start_kernel+0x3d8/0x500

Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: Ib3d3ac842d3cd249c8cc6f516178031400dbc307
Joshua-Riek pushed a commit to Joshua-Riek/rockchip-kernel that referenced this issue Oct 21, 2023
The scmi node is required for px30s/rk3326s, but it causes an error
on px30/rk3326, so do not probe arm_scmi for px30/rk3326.

error log:
[   33.476288] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:1]
[   33.476942] Modules linked in:
[   33.477246]
[   33.477406] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.189 #1
[   33.477953] Hardware name: Rockchip rk3326 evb lpddr3 v10 board for linux (DT)
[   33.478604] task: ffffffc0089e0000 task.stack: ffffffc0089e8000
[   33.479153] PC is at shmem_tx_prepare+0x24/0xb0
[   33.479569] LR is at smc_send_message+0x40/0xac
[   33.479981] pc : [<ffffff8008595df8>] lr : [<ffffff80085963a0>] pstate: 60000045
[   33.480639] sp : ffffffc0089eb9d0
[   33.480949] x29: ffffffc0089eb9d0 x28: ffffffc01ff7d800
[   33.481456] x27: 00000000024080c0 x26: ffffff800899679e
[   33.481964] x25: ffffff8008593cd8 x24: ffffffc01e0cc810
[   33.482471] x23: ffffffc01e24e268 x22: ffffffc01e239e28
[   33.482974] x21: ffffffc00896f028 x20: ffffff8008ee2000
[   33.483476] x19: ffffffc00896f028 x18: ffffffc01e239dca
[   33.483980] x17: 0000000000007cb0 x16: 0000000000001000
[   33.484488] x15: 0000000000000000 x14: 0000000000000000
[   33.484998] x13: 000000000000000a x12: 0000000000000000
[   33.485508] x11: 0000000000000008 x10: 0101010101010101
[   33.486018] x9 : ffffffffffffffff x8 : ffffffc01e239f00
[   33.486526] x7 : 0000000000000000 x6 : 0000000000000000
[   33.487035] x5 : 0000000000000000 x4 : 0000000000000001
[   33.487541] x3 : ffffffc00896f028 x2 : 0000000000000000
[   33.488047] x1 : 0000000000000000 x0 : ffffff8008ee2004
[   33.488556]

Change-Id: I0b804cda660625f27583d525c9a3ed364929d207
Signed-off-by: Liang Chen <cl@rock-chips.com>
Joshua-Riek pushed a commit to Joshua-Riek/rockchip-kernel that referenced this issue Oct 21, 2023
The otg wakelock should be destroyed when the device probe fails or
the device is removed; otherwise it may cause the following kernel
errors.

  list_add corruption. next->prev should be prev (ffffffc01209d3c8), but
  was 0000000000000000. (next=ffffff800350faf8).
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:25!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.110 #83
  Hardware name: Rockchip RK3562 EVB1 LP4X V10 Board (DT)
  pstate: 60400085 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
  pc : __list_add_valid+0x6c/0x88
  lr : __list_add_valid+0x6c/0x88
  [...]
  Call trace:
   __list_add_valid+0x7c/0x98
   wakeup_source_register+0x120/0x160
   wakeup_source_register+0x120/0x160
   device_init_wakeup+0x60/0xf4
   [...]

So add a devm action to fix it.

Signed-off-by: Frank Wang <frank.wang@rock-chips.com>
Change-Id: If54a299a694414ad759002e4f6c4187448ccdb15
Joshua-Riek pushed a commit to Joshua-Riek/rockchip-kernel that referenced this issue Oct 21, 2023
…rate

If EP-IN is disabled (p_chmask = 0), agdev->in_ep isn't
initialized, so it must not be accessed when setting or getting
the sample rate; otherwise it will cause a panic with the
following log:

Unable to handle kernel NULL pointer dereference at virtual address 00000021
pgd = 18db8ecb
[00000021] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 503 Comm: irq/60-dwc3 Not tainted 4.19.111 #166
Hardware name: Generic DT based system
PC is at uac_cs_attr_sample_rate+0x54/0x9c
LR is at 0xbb
...
[<b0652280>] (uac_cs_attr_sample_rate) from [<b060ffec>] (dwc3_gadget_giveback+0x34/0x40)
[<b060ffec>] (dwc3_gadget_giveback) from [<b06143f0>] (dwc3_ep0_interrupt+0x2d0/0x7f4)
[<b06143f0>] (dwc3_ep0_interrupt) from [<b0611b60>] (dwc3_thread_interrupt+0x860/0xfa8)
[<b0611b60>] (dwc3_thread_interrupt) from [<b016dfd8>] (irq_thread_fn+0x1c/0x7c)
[<b016dfd8>] (irq_thread_fn) from [<b016e2dc>] (irq_thread+0x124/0x200)
[<b016e2dc>] (irq_thread) from [<b014290c>] (kthread+0x140/0x170)
[<b014290c>] (kthread) from [<b01010d8>] (ret_from_fork+0x14/0x3c)

Signed-off-by: William Wu <william.wu@rock-chips.com>
Change-Id: Ib23128a5b3f9027e711b352a4befcc8be1a97721
Joshua-Riek pushed a commit to Joshua-Riek/rockchip-kernel that referenced this issue Oct 31, 2023
Using memcpy() causes an error, so use a for(;;) loop instead; the
root cause still needs to be found later.

For upstream reference: 76f1416 ("drm/rockchip: Do not use memcpy for
MMIO addresses")

For px30, the reg len is 0x1fc, which triggers an access to a
non-8-byte-aligned address in __memcpy.
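
A word-by-word copy of the kind described above can be sketched in portable C. The function name `copy_regs32` is hypothetical, and this is a userspace illustration only: kernel code would normally go through the MMIO accessors (for example readl()/writel()) rather than plain pointer dereferences.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies len_bytes of register contents one 32-bit word at a time, so
 * every access is a naturally aligned 4-byte load or store. Trailing
 * bytes that do not fill a whole word are not copied. */
static void copy_regs32(volatile uint32_t *dst,
                        const volatile uint32_t *src,
                        size_t len_bytes)
{
    size_t i;

    for (i = 0; i < len_bytes / sizeof(uint32_t); i++)
        dst[i] = src[i];
}
```

With a length of 0x1fc (508 bytes) this performs 127 word copies. An optimized memcpy() is free to use wider or overlapping accesses for the same length, which is what an explicit loop like this avoids.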

Error log when using memcpy():

vop_power_enable regsbak: 0xffffff80050ad040 regs: 0xffffffc00a85d000 len: 0x1fc
Unable to handle kernel paging request at virtual address ffffffc00a55d1bc
Mem abort info:
  ESR = 0x0000000096000021
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x21: alignment fault
Data abort info:
  ISV = 0, ISS = 0x00000021
  CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000001cc4000
[ffffffc00a55d1bc] pgd=100000007ffff003, p4d=100000007ffff003, pud=100000007ffff003, pmd=1000000002d7c003, pte=00680000ff460713
Internal error: Oops: 0000000096000021 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 112 Comm: recovery Not tainted 6.1.25 #1091
Hardware name: Rockchip PX30 evb ddr3 board (DT)
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __memcpy+0x168/0x250
lr : vop_initial+0x218/0x9e8
sp : ffffffc00ab1b920
x29: ffffffc00ab1b920 x28: 000000000000032a x27: ffffff8006611780
x26: ffffff80033f3c00 x25: ffffffc00994d6ac x24: ffffff8002eae010
x23: ffffff80033f3c18 x22: 00000000000002d0 x21: ffffff8004938100
x20: ffffffc009979341 x19: ffffff8004ae0040 x18: ffffffffffffffff
x17: 0000000000000000 x16: 00000000000000c0 x15: 0000000000000004
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000080000000000
x5 : ffffff8005e0ca3c x4 : ffffffc00a55d1fc x3 : ffffff8005e0c9c0
x2 : ffffffffffffffec x1 : ffffffc00a55d1c0 x0 : ffffff8005e0c840
Call trace:
 __memcpy+0x168/0x250
 vop_crtc_atomic_enable+0x120/0x2c30
 drm_atomic_helper_commit_modeset_enables+0x50/0x268
 rockchip_drm_atomic_helper_commit_tail_rpm+0x48/0x190
 commit_tail+0x108/0x1f4
 drm_atomic_helper_commit+0x204/0x220
 drm_atomic_commit+0xa0/0xc8
 drm_atomic_helper_set_config+0x60/0xdc
 drm_mode_setcrtc+0x29c/0x6cc
 drm_ioctl+0x248/0x3b8
 drm_compat_ioctl+0xd8/0xe8
 __arm64_compat_sys_ioctl+0x14c/0x168
 invoke_syscall+0x3c/0xfc
 el0_svc_common+0xb8/0x160
 do_el0_svc_compat+0x18/0x20
 el0_svc_compat+0x24/0x8c
 el0t_32_sync_handler+0x70/0xa8
 el0t_32_sync+0x17c/0x180

Change-Id: I4e13298e66797df39233b91a3163c156a4e175bc
Signed-off-by: Liang Chen <cl@rock-chips.com>
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
hejiawencc referenced this issue in LubanCat/kernel Dec 8, 2023
Example: RK3588

Use I2S2_2CH as Clk-Gen to serve TDM_MULTI_LANES

I2S2_2CH ----> BCLK,I2S_LRCK --------> I2S0_8CH_TX (Slave TRCM-TXONLY)
    |
    |--------> BCLK,TDM_SYNC --------> TDM Device (Slave)

Note:

I2S2_2CH_MCLK: BCLK
I2S2_2CH_SCLK: I2S_LRCK (GPIO2_B7)
I2S2_2CH_LRCK: TDM_SYNC (GPIO2_C0)

DT:

&i2s0_8ch {
       status = "okay";
       assigned-clocks = <&cru I2S0_8CH_MCLKOUT>;
       assigned-clock-parents = <&cru MCLK_I2S0_8CH_TX>;
       i2s-lrck-gpio = <&gpio1 RK_PC5 GPIO_ACTIVE_HIGH>;
       tdm-fsync-gpio = <&gpio1 RK_PC2 GPIO_ACTIVE_HIGH>;
       rockchip,tdm-multi-lanes;
       rockchip,tdm-tx-lanes = <2>; //e.g. TDM16 x 2
       rockchip,tdm-rx-lanes = <2>; //e.g. TDM16 x 2
       rockchip,clk-src = <&i2s2_2ch>;
       pinctrl-names = "default";
       pinctrl-0 = <&i2s0_lrck
                    &i2s0_sclk
                    &i2s0_sdi0
                    &i2s0_sdi1
                    &i2s0_sdo0
                    &i2s0_sdo1>;
};

&i2s2_2ch {
       status = "okay";
       assigned-clocks = <&cru I2S2_2CH_MCLKOUT>;
       assigned-clock-parents = <&cru MCLK_I2S2_2CH>;
       pinctrl-names = "default";
       pinctrl-0 = <&i2s2m0_mclk
                    &i2s2m0_lrck
                    &i2s2m0_sclk>;
};

Usage: TDM16 x 2 Playback

amixer contents

numid=3,iface=MIXER,name='Receive SDIx Select'
  ; type=ENUMERATED,access=rw------,values=1,items=5
  ; Item #0 'Auto'
  ; Item #1 'SDIx1'
  ; Item #2 'SDIx2'
  ; Item #3 'SDIx3'
  ; Item #4 'SDIx4'
  : values=0
numid=2,iface=MIXER,name='Transmit SDOx Select'
  ; type=ENUMERATED,access=rw------,values=1,items=5
  ; Item #0 'Auto'
  ; Item #1 'SDOx1'
  ; Item #2 'SDOx2'
  ; Item #3 'SDOx3'
  ; Item #4 'SDOx4'
  : values=0

/# amixer sset "Transmit SDOx Select" "SDOx2"
Simple mixer control 'Transmit SDOx Select',0
  Capabilities: enum
  Items: 'Auto' 'SDOx1' 'SDOx2' 'SDOx3' 'SDOx4'
  Item0: 'SDOx2'

/# aplay -D hw:0,0 --period-size=1024 --buffer-size=4096 -r 48000 \
   -c 32 -f s32_le /dev/zero
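
The `-c 32` passed to aplay above matches TDM16 x 2, read here as 16 slots per lane times 2 lanes. A minimal sketch of that arithmetic follows; the helper names `tdm_channels` and `frame_bytes` are hypothetical and not part of the patch:

```shell
#!/bin/sh
# Hypothetical helpers (not part of the patch): derive aplay parameters
# for a multi-lane TDM setup.

# tdm_channels SLOTS_PER_LANE LANES -> total channel count for aplay -c
tdm_channels() {
        echo $(( $1 * $2 ))
}

# frame_bytes CHANNELS BYTES_PER_SAMPLE -> size of one audio frame in bytes
frame_bytes() {
        echo $(( $1 * $2 ))
}

ch=$(tdm_channels 16 2)        # TDM16 x 2 lanes
fb=$(frame_bytes "$ch" 4)      # s32_le is 4 bytes per sample
echo "channels=$ch frame_bytes=$fb period_bytes=$(( fb * 1024 ))"
```

With the values from the command above, this prints `channels=32 frame_bytes=128 period_bytes=131072`.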

Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: I6996e05c73a9d68bbeb9562eb6e68e4c99b52d85
hejiawencc referenced this issue in LubanCat/kernel Jan 10, 2024
This patch adds support for DMA-based digital loopback (DLP).

BACKGROUND
Audio products with AEC require a loopback signal for echo cancellation.
Hardware loopback is not always available on some products, either
because of a hardware limitation (such as an internal acodec) or because
it was dropped to cut hardware cost.

This patch adds software DLP support for such products.

Enable:

  CONFIG_SND_SOC_ROCKCHIP_DLP

  &i2s {
      rockchip,digital-loopback;
  };

Mode List:

  amixer contents
  numid=2,iface=MIXER,name='Software Digital Loopback Mode'
    ; type=ENUMERATED,access=rw------,values=1,items=7
    ; Item #0 'Disabled'
    ; Item #1 '2CH: 1 Loopback + 1 Mic'
    ; Item #2 '2CH: 1 Mic + 1 Loopback'
    ; Item #3 '2CH: 1 Mic + 1 Loopback-mixed'
    ; Item #4 '2CH: 2 Loopbacks'
    ; Item #5 '4CH: 2 Mics + 2 Loopbacks'
    ; Item #6 '4CH: 2 Mics + 1 Loopback-mixed'
    : values=0

Testenv:

SDO0 is wired directly to SDI0 to provide an external digital loopback
as the reference.

Testcase: dlp.sh

  #!/bin/sh

  item=0
  id=`amixer contents | grep "Software Digital Loopback" | \
      awk -F ',' '{print $1}'`

  items=`amixer contents | grep -A 1 "Software Digital Loopback" | \
         grep items | awk -F 'items=' '{print $2}'`

  echo "Software Digital Loopback: $id, items: $items"

  mode_chs() {
          case $1 in
          [0-4])
                  echo "2"
                  ;;
          [5-6])
                  echo "4"
                  ;;
          *)
                  echo "2"
                  ;;
          esac
  }

  while true
  do
          ch=`mode_chs $item`
          amixer -c 0 cset $id $item
          arecord -D hw:0,0 --period-size=1024 --buffer-size=4096 -r 48000 -c $ch -f s16_le \
                  -d 15 sine/dlp_$item.wav &
          sleep 2
          for i in $(seq 1 10)
          do
                  aplay -D hw:0,0 --period-size=1024 --buffer-size=8192 $((ch))ch.wav -d 1
          done
          pid=$(ps | egrep "aplay|arecord" | grep -v grep | awk '{print $1}' | sort -r)
          for p in $pid
          do
                  wait $p 2>/dev/null
          done
          item=$((item+1))
          if [ $item -ge $items ]; then
                  sleep 1
                  break
          fi
  done
  echo "Done"
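
The control lookup at the top of dlp.sh can be tried without audio hardware by feeding it captured `amixer contents` text. The sketch below reuses the same grep/awk pipelines; `sample_contents`, `dlp_id` and `dlp_items` are hypothetical names, and the sample text is copied from the mode list in this message:

```shell
#!/bin/sh
# Canned `amixer contents` output, copied from the mode list in this message.
sample_contents() {
        cat <<'EOF'
numid=2,iface=MIXER,name='Software Digital Loopback Mode'
  ; type=ENUMERATED,access=rw------,values=1,items=7
  ; Item #0 'Disabled'
  : values=0
EOF
}

# Print the control identifier (e.g. "numid=2") read from stdin.
dlp_id() {
        grep "Software Digital Loopback" | awk -F ',' '{print $1}'
}

# Print the number of enum items of the control read from stdin.
dlp_items() {
        grep -A 1 "Software Digital Loopback" | grep items | \
                awk -F 'items=' '{print $2}'
}

echo "id: $(sample_contents | dlp_id), items: $(sample_contents | dlp_items)"
```

On a real board, `amixer contents` would replace `sample_contents` in both pipelines.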

Result:

Run the shell test and verify dlp_x.wav:

* Alignment: ~1 sample shift (loopback <-> mics).
* Integrity: no glitches, no data loss.
* AEC: aligning the loopback and mic samples and applying a simple AEC
  gives a clean waveform.
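
For the integrity point, one cheap first check is to compare the size of each dlp_x.wav against the PCM payload implied by the arecord parameters in dlp.sh. A minimal sketch, assuming a hypothetical helper `pcm_bytes`; the WAV header adds a small amount on top, and this only catches truncated recordings, not glitches inside them:

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): expected PCM payload size
# of a recording, in bytes.
# pcm_bytes SECONDS RATE CHANNELS BYTES_PER_SAMPLE
pcm_bytes() {
        echo $(( $1 * $2 * $3 * $4 ))
}

# dlp.sh records 15 s at 48000 Hz in s16_le (2 bytes per sample).
echo "2ch modes: $(pcm_bytes 15 48000 2 2) bytes"
echo "4ch modes: $(pcm_bytes 15 48000 4 2) bytes"
```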

Logs:
...
numid=2,iface=MIXER,name='Software Digital Loopback Mode'
  ; type=ENUMERATED,access=rw------,values=1,items=7
  ; Item #0 'Disabled'
  ; Item #1 '2CH: 1 Loopback + 1 Mic'
  ; Item #2 '2CH: 1 Mic + 1 Loopback'
  ; Item #3 '2CH: 1 Mic + 1 Loopback-mixed'
  ; Item #4 '2CH: 2 Loopbacks'
  ; Item #5 '4CH: 2 Mics + 2 Loopbacks'
  ; Item #6 '4CH: 2 Mics + 1 Loopback-mixed'
  : values=2
Recording WAVE 'sine/dlp_2.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Playing WAVE '2ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
...
numid=2,iface=MIXER,name='Software Digital Loopback Mode'
  ; type=ENUMERATED,access=rw------,values=1,items=7
  ; Item #0 'Disabled'
  ; Item #1 '2CH: 1 Loopback + 1 Mic'
  ; Item #2 '2CH: 1 Mic + 1 Loopback'
  ; Item #3 '2CH: 1 Mic + 1 Loopback-mixed'
  ; Item #4 '2CH: 2 Loopbacks'
  ; Item #5 '4CH: 2 Mics + 2 Loopbacks'
  ; Item #6 '4CH: 2 Mics + 1 Loopback-mixed'
  : values=6
Recording WAVE 'sine/dlp_6.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Playing WAVE '4ch.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Channels 4
Done

Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
Change-Id: I5772f0694f7a14a0f0bd1f0777b6c4cdbd781a64