-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Config request: Increase COMMAND_LINE_SIZE #140
Comments
|
I would think that it shouldn't be a problem to do so, the 512 character limit is mostly intended to save space on systems with very small amounts of memory. The full amount of space stays allocated permanently, and only gets released on a reboot or kexec. |
|
I believe we are using: However the (start.elf) firmware does have a limit of 500. I'll increase that. |
|
Latest firmware should support 1024 characters. |
Starting with commit 0b52297 ("reset: Add support for shared reset controls") there is a reference count for reset control assertions. The goal is to allow resets to be shared by multiple devices and an assert will take effect only when all instances have asserted the reset. In order to preserve backwards-compatibility, all reset controls become exclusive by default. This is to ensure that reset_control_assert() can immediately assert in hardware. However, this new behaviour triggers the following warning in the EHCI driver for Tegra: [ 3.365019] ------------[ cut here ]------------ [ 3.369639] WARNING: CPU: 0 PID: 1 at drivers/reset/core.c:187 __of_reset_control_get+0x16c/0x23c [ 3.382151] Modules linked in: [ 3.385214] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-next-20160503 #140 [ 3.392769] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) [ 3.399046] [<c010fa50>] (unwind_backtrace) from [<c010b120>] (show_stack+0x10/0x14) [ 3.406787] [<c010b120>] (show_stack) from [<c0347dcc>] (dump_stack+0x90/0xa4) [ 3.414007] [<c0347dcc>] (dump_stack) from [<c011f4fc>] (__warn+0xe8/0x100) [ 3.420964] [<c011f4fc>] (__warn) from [<c011f5c4>] (warn_slowpath_null+0x20/0x28) [ 3.428525] [<c011f5c4>] (warn_slowpath_null) from [<c03cc8cc>] (__of_reset_control_get+0x16c/0x23c) [ 3.437648] [<c03cc8cc>] (__of_reset_control_get) from [<c0526858>] (tegra_ehci_probe+0x394/0x518) [ 3.446600] [<c0526858>] (tegra_ehci_probe) from [<c04516d8>] (platform_drv_probe+0x4c/0xb0) [ 3.455029] [<c04516d8>] (platform_drv_probe) from [<c044fe78>] (driver_probe_device+0x1ec/0x330) [ 3.463892] [<c044fe78>] (driver_probe_device) from [<c0450074>] (__driver_attach+0xb8/0xbc) [ 3.472320] [<c0450074>] (__driver_attach) from [<c044e1ec>] (bus_for_each_dev+0x68/0x9c) [ 3.480489] [<c044e1ec>] (bus_for_each_dev) from [<c044f338>] (bus_add_driver+0x1a0/0x218) [ 3.488743] [<c044f338>] (bus_add_driver) from [<c0450768>] (driver_register+0x78/0xf8) [ 3.496738] [<c0450768>] (driver_register) from [<c010178c>] (do_one_initcall+0x40/0x170) [ 3.504909] [<c010178c>] (do_one_initcall) from [<c0c00ddc>] (kernel_init_freeable+0x158/0x1f8) [ 3.513600] [<c0c00ddc>] (kernel_init_freeable) from [<c0810784>] (kernel_init+0x8/0x114) [ 3.521770] [<c0810784>] (kernel_init) from [<c0107778>] (ret_from_fork+0x14/0x3c) [ 3.529361] ---[ end trace 4bda87dbe4ecef8a ]--- The reason is that Tegra SoCs have three EHCI controllers, each with a separate reset line. However the first controller contains UTMI pads configuration registers that are shared with its siblings and that are reset as part of the first controller's reset. There is special code in the driver to assert and deassert this shared reset at probe time, and it does so irrespective of which controller is probed first to ensure that these shared registers are reset before any of the controllers are initialized. Unfortunately this means that if the first controller gets probed first, it will request its own reset line and will subsequently request the same reset line again (temporarily) to perform the reset. This used to work fine before the above-mentioned commit, but now triggers the new WARN. Work around this by making sure we reuse the controller's reset if the controller happens to be the first controller. Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…erence KASAN allocates memory from the page allocator as part of kmem_cache_free(), and that can reference current->mempolicy through any number of allocation functions. It needs to be NULL'd out before the final reference is dropped to prevent a use-after-free bug: BUG: KASAN: use-after-free in alloc_pages_current+0x363/0x370 at addr ffff88010b48102c CPU: 0 PID: 15425 Comm: trinity-c2 Not tainted 4.8.0-rc2+ #140 ... Call Trace: dump_stack kasan_object_err kasan_report_error __asan_report_load2_noabort alloc_pages_current <-- use after free depot_save_stack save_stack kasan_slab_free kmem_cache_free __mpol_put <-- free do_exit This patch sets current->mempolicy to NULL before dropping the final reference. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1608301442180.63329@chino.kir.corp.google.com Fixes: cd11016 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
…erence commit c11600e upstream. KASAN allocates memory from the page allocator as part of kmem_cache_free(), and that can reference current->mempolicy through any number of allocation functions. It needs to be NULL'd out before the final reference is dropped to prevent a use-after-free bug: BUG: KASAN: use-after-free in alloc_pages_current+0x363/0x370 at addr ffff88010b48102c CPU: 0 PID: 15425 Comm: trinity-c2 Not tainted 4.8.0-rc2+ #140 ... Call Trace: dump_stack kasan_object_err kasan_report_error __asan_report_load2_noabort alloc_pages_current <-- use after free depot_save_stack save_stack kasan_slab_free kmem_cache_free __mpol_put <-- free do_exit This patch sets current->mempolicy to NULL before dropping the final reference. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1608301442180.63329@chino.kir.corp.google.com Fixes: cd11016 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ncsi_channel_monitor() misses stopping the channel monitor in several places that it should, causing a WARN_ON_ONCE() to trigger when the monitor is re-started later, eg: [ 459.040000] WARNING: CPU: 0 PID: 1093 at net/ncsi/ncsi-manage.c:269 ncsi_start_channel_monitor+0x7c/0x90 [ 459.040000] CPU: 0 PID: 1093 Comm: kworker/0:3 Not tainted 4.10.17-gaca2fdd #140 [ 459.040000] Hardware name: ASpeed SoC [ 459.040000] Workqueue: events ncsi_dev_work [ 459.040000] [<80010094>] (unwind_backtrace) from [<8000d950>] (show_stack+0x20/0x24) [ 459.040000] [<8000d950>] (show_stack) from [<801dbf70>] (dump_stack+0x20/0x28) [ 459.040000] [<801dbf70>] (dump_stack) from [<80018d7c>] (__warn+0xe0/0x108) [ 459.040000] [<80018d7c>] (__warn) from [<80018e70>] (warn_slowpath_null+0x30/0x38) [ 459.040000] [<80018e70>] (warn_slowpath_null) from [<803f6a08>] (ncsi_start_channel_monitor+0x7c/0x90) [ 459.040000] [<803f6a08>] (ncsi_start_channel_monitor) from [<803f7664>] (ncsi_configure_channel+0xdc/0x5fc) [ 459.040000] [<803f7664>] (ncsi_configure_channel) from [<803f8160>] (ncsi_dev_work+0xac/0x474) [ 459.040000] [<803f8160>] (ncsi_dev_work) from [<8002d244>] (process_one_work+0x1e0/0x450) [ 459.040000] [<8002d244>] (process_one_work) from [<8002d510>] (worker_thread+0x5c/0x570) [ 459.040000] [<8002d510>] (worker_thread) from [<80033614>] (kthread+0x124/0x164) [ 459.040000] [<80033614>] (kthread) from [<8000a5e8>] (ret_from_fork+0x14/0x2c) This also updates the monitor instead of just returning if ncsi_xmit_cmd() fails to send the get-link-status command so that the monitor properly times out. Fixes: e6f44ed "net/ncsi: Package and channel management" Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
It seems likely this block was pasted from internal_get_user_pages_fast,
which is not passed an mm struct and therefore uses current's. But
__get_user_pages_locked is passed an explicit mm, and current->mm is not
always valid. This was hit when being called from i915, which uses:
pin_user_pages_remote->
__get_user_pages_remote->
__gup_longterm_locked->
__get_user_pages_locked
Before, this would lead to an OOPS:
BUG: kernel NULL pointer dereference, address: 0000000000000064
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
CPU: 10 PID: 1431 Comm: kworker/u33:1 Tainted: P S U O 5.9.0-rc7+ #140
Hardware name: LENOVO 20QTCTO1WW/20QTCTO1WW, BIOS N2OET47W (1.34 ) 08/06/2020
Workqueue: i915-userptr-acquire __i915_gem_userptr_get_pages_worker [i915]
RIP: 0010:__get_user_pages_remote+0xd7/0x310
Call Trace:
__i915_gem_userptr_get_pages_worker+0xc8/0x260 [i915]
process_one_work+0x1ca/0x390
worker_thread+0x48/0x3c0
kthread+0x114/0x130
ret_from_fork+0x1f/0x30
CR2: 0000000000000064
This commit fixes the problem by using the mm pointer passed to the
function rather than the bogus one in current.
Fixes: 008cfe4 ("mm: Introduce mm_struct.has_pinned")
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Harald Arnesen <harald@skogtun.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ Upstream commit 0c98057 ] I got issue as follows: [ 263.886511] BUG: KASAN: use-after-free in pid_show+0x11f/0x13f [ 263.888359] Read of size 4 at addr ffff8880bf0648c0 by task cat/746 [ 263.890479] CPU: 0 PID: 746 Comm: cat Not tainted 4.19.90-dirty #140 [ 263.893162] Call Trace: [ 263.893509] dump_stack+0x108/0x15f [ 263.893999] print_address_description+0xa5/0x372 [ 263.894641] kasan_report.cold+0x236/0x2a8 [ 263.895696] __asan_report_load4_noabort+0x25/0x30 [ 263.896365] pid_show+0x11f/0x13f [ 263.897422] dev_attr_show+0x48/0x90 [ 263.898361] sysfs_kf_seq_show+0x24d/0x4b0 [ 263.899479] kernfs_seq_show+0x14e/0x1b0 [ 263.900029] seq_read+0x43f/0x1150 [ 263.900499] kernfs_fop_read+0xc7/0x5a0 [ 263.903764] vfs_read+0x113/0x350 [ 263.904231] ksys_read+0x103/0x270 [ 263.905230] __x64_sys_read+0x77/0xc0 [ 263.906284] do_syscall_64+0x106/0x360 [ 263.906797] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reproduce this issue as follows: 1. nbd-server 8000 /tmp/disk 2. nbd-client localhost 8000 /dev/nbd1 3. cat /sys/block/nbd1/pid Then trigger use-after-free in pid_show. Reason is after do step '2', nbd-client progress is already exit. So it's task_struct already freed. To solve this issue, revert part of 6521d39's modify and remove useless 'recv_task' member of nbd_device. Fixes: 6521d39 ("nbd: Remove variable 'pid'") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20211020073959.2679255-1-yebin10@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c98057 ] I got issue as follows: [ 263.886511] BUG: KASAN: use-after-free in pid_show+0x11f/0x13f [ 263.888359] Read of size 4 at addr ffff8880bf0648c0 by task cat/746 [ 263.890479] CPU: 0 PID: 746 Comm: cat Not tainted 4.19.90-dirty #140 [ 263.893162] Call Trace: [ 263.893509] dump_stack+0x108/0x15f [ 263.893999] print_address_description+0xa5/0x372 [ 263.894641] kasan_report.cold+0x236/0x2a8 [ 263.895696] __asan_report_load4_noabort+0x25/0x30 [ 263.896365] pid_show+0x11f/0x13f [ 263.897422] dev_attr_show+0x48/0x90 [ 263.898361] sysfs_kf_seq_show+0x24d/0x4b0 [ 263.899479] kernfs_seq_show+0x14e/0x1b0 [ 263.900029] seq_read+0x43f/0x1150 [ 263.900499] kernfs_fop_read+0xc7/0x5a0 [ 263.903764] vfs_read+0x113/0x350 [ 263.904231] ksys_read+0x103/0x270 [ 263.905230] __x64_sys_read+0x77/0xc0 [ 263.906284] do_syscall_64+0x106/0x360 [ 263.906797] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reproduce this issue as follows: 1. nbd-server 8000 /tmp/disk 2. nbd-client localhost 8000 /dev/nbd1 3. cat /sys/block/nbd1/pid Then trigger use-after-free in pid_show. Reason is after do step '2', nbd-client progress is already exit. So it's task_struct already freed. To solve this issue, revert part of 6521d39's modify and remove useless 'recv_task' member of nbd_device. Fixes: 6521d39 ("nbd: Remove variable 'pid'") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20211020073959.2679255-1-yebin10@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e2daec4 ] I got follow issue: [ 247.381177] INFO: task kworker/u10:0:47 blocked for more than 120 seconds. [ 247.382644] Not tainted 4.19.90-dirty #140 [ 247.383502] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 247.385027] Call Trace: [ 247.388384] schedule+0xb8/0x3c0 [ 247.388966] schedule_timeout+0x2b4/0x380 [ 247.392815] wait_for_completion+0x367/0x510 [ 247.397713] flush_workqueue+0x32b/0x1340 [ 247.402700] drain_workqueue+0xda/0x3c0 [ 247.403442] destroy_workqueue+0x7b/0x690 [ 247.405014] nbd_config_put.cold+0x2f9/0x5b6 [ 247.405823] recv_work+0x1fd/0x2b0 [ 247.406485] process_one_work+0x70b/0x1610 [ 247.407262] worker_thread+0x5a9/0x1060 [ 247.408699] kthread+0x35e/0x430 [ 247.410918] ret_from_fork+0x1f/0x30 We can reproduce issue as follows: 1. Inject memory fault in nbd_start_device -1244,10 +1248,18 @@ static int nbd_start_device(struct nbd_device *nbd) nbd_dev_dbg_init(nbd); for (i = 0; i < num_connections; i++) { struct recv_thread_args *args; - - args = kzalloc(sizeof(*args), GFP_KERNEL); + + if (i == 1) { + args = NULL; + printk("%s: inject malloc error\n", __func__); + } + else + args = kzalloc(sizeof(*args), GFP_KERNEL); 2. Inject delay in recv_work -757,6 +760,8 @@ static void recv_work(struct work_struct *work) blk_mq_complete_request(blk_mq_rq_from_pdu(cmd)); } + printk("%s: comm=%s pid=%d\n", __func__, current->comm, current->pid); + mdelay(5 * 1000); nbd_config_put(nbd); atomic_dec(&config->recv_threads); wake_up(&config->recv_wq); 3. Create nbd server nbd-server 8000 /tmp/disk 4. Create nbd client nbd-client localhost 8000 /dev/nbd1 Then will trigger above issue. Reason is when add delay in recv_work, lead to release the last reference of 'nbd->config_refs'. nbd_config_put will call flush_workqueue to make all work finish. Obviously, it will lead to deadloop. To solve this issue, according to Josef's suggestion move 'recv_work' init from start device to nbd_dev_add, then destroy 'recv_work'when nbd device teardown. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20211102015237.2309763-5-yebin10@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Hey there,
Given the amount of args prepended to the kernel command line before adding cmdline.txt, and given the amount of stuff required in cmdline.txt by default, I've found myself hitting the 512 character limit for the kernel command line.
This can result in an unbootable system due to crucial options being truncated.
Could COMMAND_LINE_SIZE in include/asm-generic/setup.h please be set to something bigger than 512?
Cheers.
The text was updated successfully, but these errors were encountered: