Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Building kernel from source #68

Open
peda-r opened this issue Feb 15, 2018 · 1 comment
Open

Building kernel from source #68

peda-r opened this issue Feb 15, 2018 · 1 comment

Comments

@peda-r
Copy link
Contributor

peda-r commented Feb 15, 2018

Hi!

I'm trying to build my own kernel and run that with some debian distribution on a rk3288-firefly board. After some searching I started with the guide at http://rockchip.wikidot.com/linux-user-guide, but later changed to http://opensource.rock-chips.com/wiki_Linux_user_guide which is similar but a little bit more up-to-date (or that's my guess anyway). But the directions seem to fail for me. Here's what I do:

mkdir rk-linux
cd rk-linux
repo init --repo-url=https://github.com/rockchip-linux/repo -u https://github.com/rockchip-linux/manifests -b master
repo sync
build/mk-kernel.sh rk3288-firefly
build/mk-uboot.sh rk3288-firefly
build/mk-image.sh -c rk3288 -t system -r rk-debian-build/linaro-rootfs.img
build/flash_tool.sh -c rk3288 -p system -i out/system.img

Before running flash_tool.sh, I put the device in MaskRom mode, and the flashing seems to work fine. I get this output:

PARTITIONS OFFSET: 0 sectors.
Downloading bootloader succeeded.
Write LBA from file (100%)
Reset Device OK.

After that the board is restarted automatically (as indicated) and I get this on the serial console:

U-Boot SPL 2017.09-g515bbbdeca (Feb 15 2018 - 17:09:58)
Returning to boot ROM...

U-Boot 2017.09-g515bbbdeca (Feb 15 2018 - 17:09:58 +0100)
Model: Firefly-RK3288
DRAM: 2 GiB
MMC: dwmmc@ff0c0000: 1, dwmmc@ff0f0000: 0
*** Warning - bad CRC, using default environment

And then nothing happens. I got the linaro-rootfs.img file from a tar-ball named "rootfs-debian-20171225(fix a stupid mistake, affect performance since 20170901).tar.gz" that I got from the rockchip Google Drive, but that seems unlikely to matter when the problem appears to be that u-boot can't find the next stage to run (i.e. kernel).

What am I doing wrong?

Cheers,
Peter

PS: the board works (appears to work anyway) with pre-built images that I have downloaded. But I want to build myself, and eventually make a tweak or two...

@wzyy2
Copy link
Contributor

wzyy2 commented Mar 9, 2018

You can flash them dividually to figure out which part is wrong.

friendlyarm pushed a commit to friendlyarm/kernel-rockchip that referenced this issue Nov 17, 2020
[ Upstream commit 0daf2bf ]

Each EMAD transaction stores the skb used to issue the EMAD request
('trans->tx_skb') so that the request could be retried in case of a
timeout. The skb can be freed when a corresponding response is received
or as part of the retry logic (e.g., failed retransmit, exceeded maximum
number of retries).

The two tasks (i.e., response processing and retransmits) are
synchronized by the atomic 'trans->active' field which ensures that
responses to inactive transactions are ignored.

In case of a failed retransmit the transaction is finished and all of
its resources are freed. However, the current code does not mark it as
inactive. Syzkaller was able to hit a race condition in which a
concurrent response is processed while the transaction's resources are
being freed, resulting in a use-after-free [1].

Fix the issue by making sure to mark the transaction as inactive after a
failed retransmit and free its resources only if a concurrent task did
not already do that.

[1]
BUG: KASAN: use-after-free in consume_skb+0x30/0x370
net/core/skbuff.c:833
Read of size 4 at addr ffff88804f570494 by task syz-executor.0/1004

CPU: 0 PID: 1004 Comm: syz-executor.0 Not tainted 5.8.0-rc7+ rockchip-linux#68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250
mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
 instrument_atomic_read include/linux/instrumented.h:56 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 refcount_read include/linux/refcount.h:147 [inline]
 skb_unref include/linux/skbuff.h:1044 [inline]
 consume_skb+0x30/0x370 net/core/skbuff.c:833
 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
 mlxsw_emad_process_response drivers/net/ethernet/mellanox/mlxsw/core.c:651 [inline]
 mlxsw_emad_rx_listener_func+0x5c9/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:672
 mlxsw_core_skb_receive+0x4df/0x770 drivers/net/ethernet/mellanox/mlxsw/core.c:2063
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711

Allocated by task 1006:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 slab_post_alloc_hook mm/slab.h:586 [inline]
 slab_alloc_node mm/slub.c:2824 [inline]
 slab_alloc mm/slub.c:2832 [inline]
 kmem_cache_alloc+0xcd/0x2e0 mm/slub.c:2837
 __build_skb+0x21/0x60 net/core/skbuff.c:311
 __netdev_alloc_skb+0x1e2/0x360 net/core/skbuff.c:464
 netdev_alloc_skb include/linux/skbuff.h:2810 [inline]
 mlxsw_emad_alloc drivers/net/ethernet/mellanox/mlxsw/core.c:756 [inline]
 mlxsw_emad_reg_access drivers/net/ethernet/mellanox/mlxsw/core.c:787 [inline]
 mlxsw_core_reg_access_emad+0x1ab/0x1420 drivers/net/ethernet/mellanox/mlxsw/core.c:1817
 mlxsw_reg_trans_query+0x39/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:1831
 mlxsw_sp_sb_pm_occ_clear drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:260 [inline]
 mlxsw_sp_sb_occ_max_clear+0xbff/0x10a0 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:1365
 mlxsw_devlink_sb_occ_max_clear+0x76/0xb0 drivers/net/ethernet/mellanox/mlxsw/core.c:1037
 devlink_nl_cmd_sb_occ_max_clear_doit+0x1ec/0x280 net/core/devlink.c:1765
 genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:714 [inline]
 genl_rcv_msg+0x617/0x980 net/netlink/genetlink.c:731
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2470
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:742
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:671
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2359
 ___sys_sendmsg+0xff/0x170 net/socket.c:2413
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2446
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 73:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kmem_cache_free+0xbe/0x380 mm/slub.c:3088
 kfree_skbmem net/core/skbuff.c:622 [inline]
 kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:616
 __kfree_skb net/core/skbuff.c:679 [inline]
 consume_skb net/core/skbuff.c:837 [inline]
 consume_skb+0xe1/0x370 net/core/skbuff.c:831
 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
 mlxsw_emad_transmit_retry.isra.0+0x9d/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:613
 mlxsw_emad_trans_timeout_work+0x43/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:625
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88804f5703c0
 which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 212 bytes inside of
 224-byte region [ffff88804f5703c0, ffff88804f5704a0)
The buggy address belongs to the page:
page:ffffea00013d5c00 refcount:1 mapcount:0 mapping:0000000000000000
index:0x0
flags: 0x100000000000200(slab)
raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806c625400
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88804f570380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 ffff88804f570400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88804f570480: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
 ffff88804f570500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88804f570580: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kwiboo pushed a commit to Kwiboo/linux-rockchip that referenced this issue Nov 29, 2020
[ Upstream commit 0daf2bf ]

Each EMAD transaction stores the skb used to issue the EMAD request
('trans->tx_skb') so that the request could be retried in case of a
timeout. The skb can be freed when a corresponding response is received
or as part of the retry logic (e.g., failed retransmit, exceeded maximum
number of retries).

The two tasks (i.e., response processing and retransmits) are
synchronized by the atomic 'trans->active' field which ensures that
responses to inactive transactions are ignored.

In case of a failed retransmit the transaction is finished and all of
its resources are freed. However, the current code does not mark it as
inactive. Syzkaller was able to hit a race condition in which a
concurrent response is processed while the transaction's resources are
being freed, resulting in a use-after-free [1].

Fix the issue by making sure to mark the transaction as inactive after a
failed retransmit and free its resources only if a concurrent task did
not already do that.

[1]
BUG: KASAN: use-after-free in consume_skb+0x30/0x370
net/core/skbuff.c:833
Read of size 4 at addr ffff88804f570494 by task syz-executor.0/1004

CPU: 0 PID: 1004 Comm: syz-executor.0 Not tainted 5.8.0-rc7+ rockchip-linux#68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250
mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
 instrument_atomic_read include/linux/instrumented.h:56 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 refcount_read include/linux/refcount.h:147 [inline]
 skb_unref include/linux/skbuff.h:1044 [inline]
 consume_skb+0x30/0x370 net/core/skbuff.c:833
 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
 mlxsw_emad_process_response drivers/net/ethernet/mellanox/mlxsw/core.c:651 [inline]
 mlxsw_emad_rx_listener_func+0x5c9/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:672
 mlxsw_core_skb_receive+0x4df/0x770 drivers/net/ethernet/mellanox/mlxsw/core.c:2063
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711

Allocated by task 1006:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 slab_post_alloc_hook mm/slab.h:586 [inline]
 slab_alloc_node mm/slub.c:2824 [inline]
 slab_alloc mm/slub.c:2832 [inline]
 kmem_cache_alloc+0xcd/0x2e0 mm/slub.c:2837
 __build_skb+0x21/0x60 net/core/skbuff.c:311
 __netdev_alloc_skb+0x1e2/0x360 net/core/skbuff.c:464
 netdev_alloc_skb include/linux/skbuff.h:2810 [inline]
 mlxsw_emad_alloc drivers/net/ethernet/mellanox/mlxsw/core.c:756 [inline]
 mlxsw_emad_reg_access drivers/net/ethernet/mellanox/mlxsw/core.c:787 [inline]
 mlxsw_core_reg_access_emad+0x1ab/0x1420 drivers/net/ethernet/mellanox/mlxsw/core.c:1817
 mlxsw_reg_trans_query+0x39/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:1831
 mlxsw_sp_sb_pm_occ_clear drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:260 [inline]
 mlxsw_sp_sb_occ_max_clear+0xbff/0x10a0 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:1365
 mlxsw_devlink_sb_occ_max_clear+0x76/0xb0 drivers/net/ethernet/mellanox/mlxsw/core.c:1037
 devlink_nl_cmd_sb_occ_max_clear_doit+0x1ec/0x280 net/core/devlink.c:1765
 genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:714 [inline]
 genl_rcv_msg+0x617/0x980 net/netlink/genetlink.c:731
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2470
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:742
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:671
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2359
 ___sys_sendmsg+0xff/0x170 net/socket.c:2413
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2446
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 73:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kmem_cache_free+0xbe/0x380 mm/slub.c:3088
 kfree_skbmem net/core/skbuff.c:622 [inline]
 kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:616
 __kfree_skb net/core/skbuff.c:679 [inline]
 consume_skb net/core/skbuff.c:837 [inline]
 consume_skb+0xe1/0x370 net/core/skbuff.c:831
 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
 mlxsw_emad_transmit_retry.isra.0+0x9d/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:613
 mlxsw_emad_trans_timeout_work+0x43/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:625
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88804f5703c0
 which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 212 bytes inside of
 224-byte region [ffff88804f5703c0, ffff88804f5704a0)
The buggy address belongs to the page:
page:ffffea00013d5c00 refcount:1 mapcount:0 mapping:0000000000000000
index:0x0
flags: 0x100000000000200(slab)
raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806c625400
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88804f570380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 ffff88804f570400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88804f570480: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
 ffff88804f570500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88804f570580: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kwiboo pushed a commit to Kwiboo/linux-rockchip that referenced this issue Dec 30, 2020
[ Upstream commit 0daf2bf ]

Each EMAD transaction stores the skb used to issue the EMAD request
('trans->tx_skb') so that the request could be retried in case of a
timeout. The skb can be freed when a corresponding response is received
or as part of the retry logic (e.g., failed retransmit, exceeded maximum
number of retries).

The two tasks (i.e., response processing and retransmits) are
synchronized by the atomic 'trans->active' field which ensures that
responses to inactive transactions are ignored.

In case of a failed retransmit the transaction is finished and all of
its resources are freed. However, the current code does not mark it as
inactive. Syzkaller was able to hit a race condition in which a
concurrent response is processed while the transaction's resources are
being freed, resulting in a use-after-free [1].

Fix the issue by making sure to mark the transaction as inactive after a
failed retransmit and free its resources only if a concurrent task did
not already do that.

[1]
BUG: KASAN: use-after-free in consume_skb+0x30/0x370
net/core/skbuff.c:833
Read of size 4 at addr ffff88804f570494 by task syz-executor.0/1004

CPU: 0 PID: 1004 Comm: syz-executor.0 Not tainted 5.8.0-rc7+ rockchip-linux#68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250
mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
 instrument_atomic_read include/linux/instrumented.h:56 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 refcount_read include/linux/refcount.h:147 [inline]
 skb_unref include/linux/skbuff.h:1044 [inline]
 consume_skb+0x30/0x370 net/core/skbuff.c:833
 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
 mlxsw_emad_process_response drivers/net/ethernet/mellanox/mlxsw/core.c:651 [inline]
 mlxsw_emad_rx_listener_func+0x5c9/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:672
 mlxsw_core_skb_receive+0x4df/0x770 drivers/net/ethernet/mellanox/mlxsw/core.c:2063
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711

Allocated by task 1006:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 slab_post_alloc_hook mm/slab.h:586 [inline]
 slab_alloc_node mm/slub.c:2824 [inline]
 slab_alloc mm/slub.c:2832 [inline]
 kmem_cache_alloc+0xcd/0x2e0 mm/slub.c:2837
 __build_skb+0x21/0x60 net/core/skbuff.c:311
 __netdev_alloc_skb+0x1e2/0x360 net/core/skbuff.c:464
 netdev_alloc_skb include/linux/skbuff.h:2810 [inline]
 mlxsw_emad_alloc drivers/net/ethernet/mellanox/mlxsw/core.c:756 [inline]
 mlxsw_emad_reg_access drivers/net/ethernet/mellanox/mlxsw/core.c:787 [inline]
 mlxsw_core_reg_access_emad+0x1ab/0x1420 drivers/net/ethernet/mellanox/mlxsw/core.c:1817
 mlxsw_reg_trans_query+0x39/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:1831
 mlxsw_sp_sb_pm_occ_clear drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:260 [inline]
 mlxsw_sp_sb_occ_max_clear+0xbff/0x10a0 drivers/net/ethernet/mellanox/mlxsw/spectrum_buffers.c:1365
 mlxsw_devlink_sb_occ_max_clear+0x76/0xb0 drivers/net/ethernet/mellanox/mlxsw/core.c:1037
 devlink_nl_cmd_sb_occ_max_clear_doit+0x1ec/0x280 net/core/devlink.c:1765
 genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:714 [inline]
 genl_rcv_msg+0x617/0x980 net/netlink/genetlink.c:731
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2470
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:742
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:671
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2359
 ___sys_sendmsg+0xff/0x170 net/socket.c:2413
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2446
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 73:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kmem_cache_free+0xbe/0x380 mm/slub.c:3088
 kfree_skbmem net/core/skbuff.c:622 [inline]
 kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:616
 __kfree_skb net/core/skbuff.c:679 [inline]
 consume_skb net/core/skbuff.c:837 [inline]
 consume_skb+0xe1/0x370 net/core/skbuff.c:831
 mlxsw_emad_trans_finish+0x64/0x1c0 drivers/net/ethernet/mellanox/mlxsw/core.c:592
 mlxsw_emad_transmit_retry.isra.0+0x9d/0xc0 drivers/net/ethernet/mellanox/mlxsw/core.c:613
 mlxsw_emad_trans_timeout_work+0x43/0x50 drivers/net/ethernet/mellanox/mlxsw/core.c:625
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88804f5703c0
 which belongs to the cache skbuff_head_cache of size 224
The buggy address is located 212 bytes inside of
 224-byte region [ffff88804f5703c0, ffff88804f5704a0)
The buggy address belongs to the page:
page:ffffea00013d5c00 refcount:1 mapcount:0 mapping:0000000000000000
index:0x0
flags: 0x100000000000200(slab)
raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806c625400
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88804f570380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 ffff88804f570400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88804f570480: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
 ffff88804f570500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88804f570580: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
StephenInVamrs referenced this issue in radxa/kernel Apr 25, 2022
[  177.363952][    C0] SError Interrupt on CPU0, code 0xbe000011 -- SError
[  177.363954][    C0] CPU: 0 PID: 1 Comm: init Not tainted 5.10.66 #68
[  177.363956][    C0] Hardware name: Rockchip RK3588 EVB1 LP4 V10 Board (DT)
[  177.363958][    C0] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[  177.363959][    C0] pc : disable_irq+0x10/0xa0
[  177.363960][    C0] lr : csi2_shutdown+0x50/0x6c
[  177.363961][    C0] sp : ffffffc01211bc20
[  177.363963][    C0] x29: ffffffc01211bc50 x28: ffffff8100b98040
[  177.363967][    C0] x27: 0000000000000000 x26: ffffffc011f7e000
[  177.363970][    C0] x25: ffffffc012099000 x24: ffffff8103325890
[  177.363973][    C0] x23: ffffff8103325818 x22: ffffffc011e48988
[  177.363976][    C0] x21: ffffff8103325810 x20: ffffffc01183bdf0
[  177.363979][    C0] x19: ffffff8107785088 x18: ffffffc012115050
[  177.363982][    C0] x17: 0000000000000009 x16: 0000000000000009
[  177.363985][    C0] x15: 0000000000000000 x14: fffffffeffe00000
[  177.363988][    C0] x13: 0000000000019ba0 x12: 0000000029aaaaf1
[  177.363991][    C0] x11: 001e33635ac745bd x10: ffffffc013790028
[  177.363994][    C0] x9 : ffffffc01379002c x8 : 00000000ffffffff
[  177.363997][    C0] x7 : bbbbbbbbbbbbbbbb x6 : 0000000000000040
[  177.364000][    C0] x5 : 0000000000190018 x4 : ffffff8103d19980
[  177.364003][    C0] x3 : ffffff8100003c90 x2 : 0000000000000001
[  177.364006][    C0] x1 : 0000000000000002 x0 : 000000000000003f
[  177.364010][    C0] Kernel panic - not syncing: Asynchronous SError Interrupt
[  177.364011][    C0] CPU: 0 PID: 1 Comm: init Not tainted 5.10.66 #68
[  177.364013][    C0] Hardware name: Rockchip RK3588 EVB1 LP4 V10 Board (DT)
[  177.364014][    C0] Call trace:
[  177.364015][    C0]  dump_backtrace+0x0/0x1c8
[  177.364016][    C0]  show_stack+0x1c/0x2c
[  177.364017][    C0]  dump_stack_lvl+0xdc/0x12c
[  177.364018][    C0]  dump_stack+0x1c/0x64
[  177.364019][    C0]  panic+0x150/0x3a4
[  177.364020][    C0]  test_taint+0x0/0x30
[  177.364022][    C0]  arm64_serror_panic+0x78/0x84
[  177.364023][    C0]  do_serror+0xe0/0x100
[  177.364024][    C0]  el1_error+0x94/0x118
[  177.364025][    C0]  disable_irq+0x10/0xa0
[  177.364026][    C0]  platform_drv_shutdown+0x24/0x34
[  177.364028][    C0]  device_shutdown+0x174/0x214
[  177.364029][    C0]  kernel_restart+0x44/0x110
[  177.364030][    C0]  __arm64_sys_reboot+0x148/0x21c
[  177.364031][    C0]  el0_svc_common+0xc0/0x23c
[  177.364032][    C0]  do_el0_svc+0x28/0x88
[  177.364033][    C0]  el0_svc+0x14/0x24
[  177.364035][    C0]  el0_sync_handler+0x88/0xec
[  177.364036][    C0]  el0_sync+0x1a8/0x1c0
[  177.365312][    C0] rockchip-thermal fec00000.tsadc: channal 0: temperature(35 C)

Fixes: 3435f13 ("media: rockchip: vicap add shutdown function")
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
Change-Id: I58a76ac614c1a0ad13cf67a8001f6e9ee3d06fe8
Caesar-github pushed a commit that referenced this issue Sep 3, 2022
…lock_t for PREEMPT_RT

[   87.220277] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974
[   87.220282] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1
[   87.220284] INFO: lockdep is turned off.
[   87.220287] irq event stamp: 11238
[   87.220300] hardirqs last  enabled at (11237): [<ffffff8008183dec>] tick_nohz_idle_enter+0x74/0xc0
[   87.220307] hardirqs last disabled at (11238): [<ffffff80080fef74>] do_idle+0xbc/0x290
[   87.220314] softirqs last  enabled at (0): [<ffffff80080b2cc8>] copy_process.isra.5.part.6+0x328/0x1c90
[   87.220317] softirqs last disabled at (0): [<0000000000000000>]           (null)
[   87.220319] Preemption disabled at:
[   87.220325] [<ffffff8008099168>] secondary_start_kernel+0xb8/0x1e8
[   87.220333] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W         4.19.232-rt104 #68
[   87.220335] Hardware name: Rockchip RK3568 EVB1 DDR4 V10 Board (DT)
[   87.220338] Call trace:
[   87.220343]  dump_backtrace+0x0/0x180
[   87.220347]  show_stack+0x14/0x20
[   87.220352]  dump_stack+0xa4/0xd8
[   87.220357]  ___might_sleep+0x1cc/0x228
[   87.220362]  rt_spin_lock+0xa0/0xe8
[   87.220367]  __log_abort_or_abnormal_wake+0x30/0xb8
[   87.220370]  log_abnormal_wakeup_reason+0x68/0x88
[   87.220374]  irq_may_run+0x98/0xb0
[   87.220379]  handle_fasteoi_irq+0x30/0x1b8
[   87.220382]  generic_handle_irq+0x24/0x38
[   87.220386]  __handle_domain_irq+0x84/0xf8
[   87.220389]  gic_handle_irq+0xc4/0x184
[   87.220392]  el1_irq+0xec/0x1ac
[   87.220398]  cpuidle_enter_state+0xb8/0x3d8
[   87.220401]  cpuidle_enter+0x18/0x20
[   87.220405]  call_cpuidle+0x24/0x48
[   87.220408]  do_idle+0x230/0x290
[   87.220411]  cpu_startup_entry+0x24/0x28
[   87.220415]  secondary_start_kernel+0x150/0x1e8

Signed-off-by: Liang Chen <cl@rock-chips.com>
Change-Id: I383164b2a646250d34dc7c085e7a1d297889581e
scpcom pushed a commit to scpcom/linux that referenced this issue Jul 1, 2023
[ Upstream commit 011a871 ]

When using DMA mode we are facing with Oops:
[  396.458157] Unable to handle kernel access to user memory without uaccess routines at virtual address 000000000000000c
[  396.469374] Oops [#1]
[  396.471839] Modules linked in:
[  396.475144] CPU: 0 PID: 114 Comm: arecord Not tainted 6.0.0-00164-g9a8eccdaf2be-dirty rockchip-linux#68
[  396.483619] Hardware name: YMP ELCT FPGA (DT)
[  396.488156] epc : dmaengine_pcm_open+0x1d2/0x342
[  396.493227]  ra : dmaengine_pcm_open+0x1d2/0x342
[  396.498140] epc : ffffffff807fe346 ra : ffffffff807fe346 sp : ffffffc804e138f0
[  396.505602]  gp : ffffffff817bf730 tp : ffffffd8042c8ac0 t0 : 6500000000000000
[  396.513045]  t1 : 0000000000000064 t2 : 656e69676e65616d s0 : ffffffc804e13990
[  396.520477]  s1 : ffffffd801b86a18 a0 : 0000000000000026 a1 : ffffffff816920f8
[  396.527897]  a2 : 0000000000000010 a3 : fffffffffffffffe a4 : 0000000000000000
[  396.535319]  a5 : 0000000000000000 a6 : ffffffd801b87040 a7 : 0000000000000038
[  396.542740]  s2 : ffffffd801b94a00 s3 : 0000000000000000 s4 : ffffffd80427f5e8
[  396.550153]  s5 : ffffffd80427f5e8 s6 : ffffffd801b44410 s7 : fffffffffffffff5
[  396.557569]  s8 : 0000000000000800 s9 : 0000000000000001 s10: ffffffff8066d254
[  396.564978]  s11: ffffffd8059cf768 t3 : ffffffff817d5577 t4 : ffffffff817d5577
[  396.572391]  t5 : ffffffff817d5578 t6 : ffffffc804e136e8
[  396.577876] status: 0000000200000120 badaddr: 000000000000000c cause: 000000000000000d
[  396.586007] [<ffffffff806839f4>] snd_soc_component_open+0x1a/0x68
[  396.592439] [<ffffffff807fdd62>] __soc_pcm_open+0xf0/0x502
[  396.598217] [<ffffffff80685d86>] soc_pcm_open+0x2e/0x4e
[  396.603741] [<ffffffff8066cea4>] snd_pcm_open_substream+0x442/0x68e
[  396.610313] [<ffffffff8066d1ea>] snd_pcm_open+0xfa/0x212
[  396.615868] [<ffffffff8066d39c>] snd_pcm_capture_open+0x3a/0x60
[  396.622048] [<ffffffff8065b35a>] snd_open+0xa8/0x17a
[  396.627421] [<ffffffff801ae036>] chrdev_open+0xa0/0x218
[  396.632893] [<ffffffff801a5a28>] do_dentry_open+0x17c/0x2a6
[  396.638713] [<ffffffff801a6d9a>] vfs_open+0x1e/0x26
[  396.643850] [<ffffffff801b8544>] path_openat+0x96e/0xc96
[  396.649518] [<ffffffff801b9390>] do_filp_open+0x7c/0xf6
[  396.655034] [<ffffffff801a6ff2>] do_sys_openat2+0x8a/0x11e
[  396.660765] [<ffffffff801a735a>] sys_openat+0x50/0x7c
[  396.666068] [<ffffffff80003aca>] ret_from_syscall+0x0/0x2
[  396.674964] ---[ end trace 0000000000000000 ]---

It happens because of play_dma_data/capture_dma_data pointers are NULL.
Current implementation assigns these pointers at snd_soc_dai_driver
startup() callback and reset them back to NULL at shutdown(). But
soc_pcm_open() sequence uses DMA pointers in dmaengine_pcm_open()
before snd_soc_dai_driver startup().
Most generic DMA capable I2S drivers use snd_soc_dai_driver probe()
callback to init DMA pointers only once at probe. So move DMA init
to dw_i2s_dai_probe and drop shutdown() and startup() callbacks.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Link: https://lore.kernel.org/r/20230512110343.66664-1-fido_max@inbox.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
scpcom pushed a commit to scpcom/linux that referenced this issue Jul 1, 2023
[ Upstream commit 011a871 ]

When using DMA mode we are facing with Oops:
[  396.458157] Unable to handle kernel access to user memory without uaccess routines at virtual address 000000000000000c
[  396.469374] Oops [#1]
[  396.471839] Modules linked in:
[  396.475144] CPU: 0 PID: 114 Comm: arecord Not tainted 6.0.0-00164-g9a8eccdaf2be-dirty rockchip-linux#68
[  396.483619] Hardware name: YMP ELCT FPGA (DT)
[  396.488156] epc : dmaengine_pcm_open+0x1d2/0x342
[  396.493227]  ra : dmaengine_pcm_open+0x1d2/0x342
[  396.498140] epc : ffffffff807fe346 ra : ffffffff807fe346 sp : ffffffc804e138f0
[  396.505602]  gp : ffffffff817bf730 tp : ffffffd8042c8ac0 t0 : 6500000000000000
[  396.513045]  t1 : 0000000000000064 t2 : 656e69676e65616d s0 : ffffffc804e13990
[  396.520477]  s1 : ffffffd801b86a18 a0 : 0000000000000026 a1 : ffffffff816920f8
[  396.527897]  a2 : 0000000000000010 a3 : fffffffffffffffe a4 : 0000000000000000
[  396.535319]  a5 : 0000000000000000 a6 : ffffffd801b87040 a7 : 0000000000000038
[  396.542740]  s2 : ffffffd801b94a00 s3 : 0000000000000000 s4 : ffffffd80427f5e8
[  396.550153]  s5 : ffffffd80427f5e8 s6 : ffffffd801b44410 s7 : fffffffffffffff5
[  396.557569]  s8 : 0000000000000800 s9 : 0000000000000001 s10: ffffffff8066d254
[  396.564978]  s11: ffffffd8059cf768 t3 : ffffffff817d5577 t4 : ffffffff817d5577
[  396.572391]  t5 : ffffffff817d5578 t6 : ffffffc804e136e8
[  396.577876] status: 0000000200000120 badaddr: 000000000000000c cause: 000000000000000d
[  396.586007] [<ffffffff806839f4>] snd_soc_component_open+0x1a/0x68
[  396.592439] [<ffffffff807fdd62>] __soc_pcm_open+0xf0/0x502
[  396.598217] [<ffffffff80685d86>] soc_pcm_open+0x2e/0x4e
[  396.603741] [<ffffffff8066cea4>] snd_pcm_open_substream+0x442/0x68e
[  396.610313] [<ffffffff8066d1ea>] snd_pcm_open+0xfa/0x212
[  396.615868] [<ffffffff8066d39c>] snd_pcm_capture_open+0x3a/0x60
[  396.622048] [<ffffffff8065b35a>] snd_open+0xa8/0x17a
[  396.627421] [<ffffffff801ae036>] chrdev_open+0xa0/0x218
[  396.632893] [<ffffffff801a5a28>] do_dentry_open+0x17c/0x2a6
[  396.638713] [<ffffffff801a6d9a>] vfs_open+0x1e/0x26
[  396.643850] [<ffffffff801b8544>] path_openat+0x96e/0xc96
[  396.649518] [<ffffffff801b9390>] do_filp_open+0x7c/0xf6
[  396.655034] [<ffffffff801a6ff2>] do_sys_openat2+0x8a/0x11e
[  396.660765] [<ffffffff801a735a>] sys_openat+0x50/0x7c
[  396.666068] [<ffffffff80003aca>] ret_from_syscall+0x0/0x2
[  396.674964] ---[ end trace 0000000000000000 ]---

It happens because of play_dma_data/capture_dma_data pointers are NULL.
Current implementation assigns these pointers at snd_soc_dai_driver
startup() callback and reset them back to NULL at shutdown(). But
soc_pcm_open() sequence uses DMA pointers in dmaengine_pcm_open()
before snd_soc_dai_driver startup().
Most generic DMA capable I2S drivers use snd_soc_dai_driver probe()
callback to init DMA pointers only once at probe. So move DMA init
to dw_i2s_dai_probe and drop shutdown() and startup() callbacks.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Link: https://lore.kernel.org/r/20230512110343.66664-1-fido_max@inbox.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants