Initial hack at kernel build system support for Rust #3
5539004 to bf6e269
scripts/Makefile.build
Outdated
# TODO: release/debug
$(obj)/%.rust.a: $(src)/Cargo.toml $(wildcard $(src)/src/*.rs) $(srctree)/arc/$(SRCARCH)/$(ARCH)-kernel-target.json
	cd $(src); env -u MAKE -u MAKEFLAGS cargo xbuild --target=$(srctree)/arc/$(SRCARCH)/$(ARCH)-kernel-target.json
	cp $(src)/target/x86_64-linux-kernel-module/debug/lib%.a $(obj)/%.rust.a
`$(obj)/%.rust.a` can probably be `$@`
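A minimal sketch of that suggestion, reusing the paths from the rule under review and showing only the copy step. `$@` expands to the target of the rule. The sketch additionally assumes the crate's library is named after the pattern stem and writes the stem as `$*`, since a bare `%` is not substituted inside a recipe; that second change goes beyond what the comment asked for.

```make
# Sketch only: the target triple and paths are copied from the rule under review.
# $@ is the rule's target, i.e. $(obj)/<stem>.rust.a
# $* is the stem matched by % (assumption: the library is lib<stem>.a)
$(obj)/%.rust.a: $(src)/Cargo.toml $(wildcard $(src)/src/*.rs)
	cp $(src)/target/x86_64-linux-kernel-module/debug/lib$*.a $@
```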
OK, my attempt to build the kernel in this PR is not succeeding. Built with
init/Kconfig
Outdated
	def_bool $(success,cargo --version)

config HAS_CARGO_XBUILD
	def_bool $(success,cargo xbuild --version)
`cargo xbuild` requires nightly Rust. I would suggest that we get the target upstreamed into the Rust project and into stable Rust, which will make it much more palatable to the Linux kernel.
Do you think this is a hard requirement for Rust support moving into mainline? I want to make sure I have my yak stack^W^W dependency chain right :-)
While we do have optional things in the kernel that require third-party tools and we also preemptively add support for upcoming GCC (stable) releases, this is about adding support for a new language which will likely require treewide changes to do properly. Therefore, yes, it would help a lot to show that the proposed changes are reasonably stable.
Not a hard requirement, but you're going to have enough interesting arguments getting this upstream that if we can forestall a couple of them we should. :)
Compelling enough for me! Since I know you're also involved in upstream Rust work, do you have an opinion on process for this? Does it need an RFC, or is a PR enough?
A PR should suffice. CC me and I'll review it.
Awesome -- fine to start with just x86_64?
init/Kconfig
Outdated
	depends on HAS_CARGO
	depends on HAS_CARGO_XBUILD
	help
	  Whether to support building modules written in Rust.
Even though you autodetect the availability of Rust and Cargo, for the moment at least the request from upstream was to make sure this wasn't enabled by `make allyesconfig` or `make allmodconfig`. (For instance, just because a developer has rust and cargo installed for other reasons, doesn't necessarily mean they want their kernel tests to build this for now.)
To avoid that, I'd suggest the following pattern, inspired by a similar approach and requirement for link-time optimization (LTO). Rename `config RUST` to `config RUST_MENU`, and then add the following below it:

config RUST_DISABLE
	bool
	depends on RUST_MENU
	help
	  This option disables the support for Rust, so that make
	  allyesconfig and make allmodconfig will not enable it. To
	  build modules written in Rust, leave this option set to 'n'.

config RUST
	bool
	default y
	depends on RUST_MENU && !RUST_DISABLE
Thanks! I wasn't sure what the right pattern was here.
Just using "depends on !COMPILE_TEST" is sufficient.
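As a rough sketch of that simpler alternative: `make allyesconfig` and `make allmodconfig` set `COMPILE_TEST=y`, so an option that depends on `!COMPILE_TEST` stays off in both. The prompt text, help text and `HAS_RUST` dependency are taken from other hunks in this thread; combining them this way is an assumption, not the patch as merged.

```kconfig
config RUST
	bool "Enables building kernel modules written in Rust"
	depends on HAS_RUST
	# allyesconfig/allmodconfig turn COMPILE_TEST on, which turns this off.
	depends on !COMPILE_TEST
	help
	  Whether to support building modules written in Rust.
```

Compared with the `RUST_DISABLE` pattern, this needs no extra symbol, at the cost of tying Rust support to the meaning of `COMPILE_TEST`.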
I made a couple of comments on items in the code, but I think they're hidden because I commented on a previous version. Links:
On August 31, 2019 11:38:40 AM PDT, Alex Gaynor ***@***.***> wrote:
> alex commented on this pull request.
>
> > @@ -2181,3 +2181,20 @@ config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
> >  # <asm/syscall_wrapper.h>.
> >  config ARCH_HAS_SYSCALL_WRAPPER
> >  	def_bool n
> > +
> > +config HAS_RUST
> > +	def_bool $(success,rustc --version)
> > +
> > +config HAS_CARGO
> > +	def_bool $(success,cargo --version)
> > +
> > +config HAS_CARGO_XBUILD
> > +	def_bool $(success,cargo xbuild --version)
>
> Awesome -- fine to start with just x86_64?

Sounds good to me!
Fantastic, thanks much! Hopefully will have a PR up this long weekend!
Just for context, I've paused on this to work on:
Ok, this is now working (requires some changes I haven't landed yet to the rust lib side)! There's still some TODO comments though. The next step is going to be to do fishinabarrel/linux-kernel-module-rust#177, which should make this work with no patches required on that side. Once that happens I'll come back to do these TODOs and do other polish work here.
@@ -1927,14 +1927,27 @@ config HAS_CARGO
 config HAS_CARGO_XBUILD
 	def_bool $(success,cargo xbuild --version)

-config RUST
+config MENU_RUST
 	bool "Enables building kernel modules written in Rust"
 	depends on HAS_RUST
Seems like we only need to know whether "cargo xbuild" works, and that is sufficient for "HAS_RUST", in the sense that there's nothing we can do with only the existing HAS_RUST or HAS_CARGO. Only HAS_CARGO_XBUILD is meaningful (it requires rustc and cargo), and the kernel needs full 'cargo xbuild' support.
`cargo xbuild` isn't even required anymore -- as of a week or two ago, everything we need is in upstream cargo!
I wonder whether we need to check for any of this. If Rust modules get into mainline, we will build them as any other, so we will assume we have `rustc` etc. available no matter what (and we will want to be able to change it like we do with `CC`). And if there are no modules, we don't care anyway.
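One way to read "change it like we do with `CC`" is a pair of overridable tool variables in the top-level Makefile. This is only a sketch: `CARGO` matches the variable used in the later `scripts/Makefile.build` hunk, while `RUSTC` and both default values are assumptions.

```make
# Assumed defaults, by analogy with how CC is defined.
# A command-line assignment such as `make CARGO=/opt/rust/bin/cargo`
# takes precedence over these.
RUSTC	= rustc
CARGO	= cargo
```

With variables like these there is nothing to detect at configuration time; a missing tool simply fails the build of the modules that need it.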
config RUST
	bool
	default y
	depends on RUST_MENU && !RUST_DISABLE
I don't think any of this is needed -- just leave it "config RUST" with a depends on HAS_RUST that does the "cargo xbuild" test. An "allmodconfig" will not include things that require RUST if HAS_RUST isn't set, etc.
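A sketch of what that simplification might look like, assuming `HAS_RUST` takes over the `cargo xbuild` probe and the other `HAS_*` symbols go away. The prompt string is the one from the `MENU_RUST` hunk; the rest is illustrative.

```kconfig
config HAS_RUST
	def_bool $(success,cargo xbuild --version)

config RUST
	bool "Enables building kernel modules written in Rust"
	depends on HAS_RUST
```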
Are you saying the whole `RUST_DISABLE` should be removed entirely, or just that this particular line can be simplified?
0e7d217 to 8f07a3e
Ok. I've gotten most of the TODOs. One of the major remaining ones is release vs. debug builds. Does anyone have an opinion on a) which should be the default, b) what the right way to control which you're getting is?
On September 22, 2019 5:15:47 PM PDT, Alex Gaynor ***@***.***> wrote:
> Ok. I've gotten most of the TODOs. One of the major remaining ones is
> release vs. debug builds.
> Does anyone have an opinion on a) which should be the default, b) what
> the right way to control which you're getting is?

Same as the C code: Build release by default, and control debug symbols, optimization, and optimize-for-size the same way C does. (Each of those has separate kconfig options, the latter two mutually exclusive.)
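A sketch of how the Rust build could follow the C settings. The Kconfig symbols are the existing ones used for C code; `rust_flags-y` is a hypothetical variable name, and the particular `opt-level` and `debuginfo` values are assumptions, not something decided in this thread.

```make
# Hypothetical mapping from the existing C options to rustc codegen flags.
# Entries whose CONFIG_ symbol is unset accumulate in rust_flags- and are
# never used.
rust_flags-$(CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE)	+= -C opt-level=2
rust_flags-$(CONFIG_CC_OPTIMIZE_FOR_SIZE)		+= -C opt-level=s
rust_flags-$(CONFIG_DEBUG_INFO)				+= -C debuginfo=2
```

The cargo invocation could then be run with `RUSTFLAGS="$(rust_flags-y)"` in its environment and `--release` on its command line, so that release is the default and the Kconfig options only adjust it.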
@@ -283,6 +283,19 @@ quiet_cmd_cc_lst_c = MKLST $@
$(obj)/%.lst: $(src)/%.c FORCE
	$(call if_changed_dep,cc_lst_c)

# Compile Rust sources
# --------------------
Use the same length as the other sections for the `---` line.
# TODO: release/debug
$(obj)/%.rust.o: $(src)/Cargo.toml $(src)/Cargo.lock $(wildcard $(src)/src/*.rs) FORCE
	cd $(src); env -u MAKE -u MAKEFLAGS KDIR="$(CURDIR)/$(srctree)" $(CARGO) build -Z build-std=core,alloc --target=$(CONFIG_ARCH_RUST_TARGET)
Split long lines with backslashes.
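Applied to the `%.rust.o` rule under review, that could look like the following sketch. The content of the rule is unchanged; only the line breaks are new.

```make
$(obj)/%.rust.o: $(src)/Cargo.toml $(src)/Cargo.lock \
		$(wildcard $(src)/src/*.rs) FORCE
	cd $(src); env -u MAKE -u MAKEFLAGS KDIR="$(CURDIR)/$(srctree)" \
		$(CARGO) build -Z build-std=core,alloc \
		--target=$(CONFIG_ARCH_RUST_TARGET)
```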
On Mon, Sep 23, 2019 at 09:59:58AM -0700, Miguel Ojeda wrote:
> ojeda commented on this pull request.
>
> > @@ -1927,14 +1927,27 @@ config HAS_CARGO
> >  config HAS_CARGO_XBUILD
> >  	def_bool $(success,cargo xbuild --version)
> >
> > -config RUST
> > +config MENU_RUST
> >  	bool "Enables building kernel modules written in Rust"
> >  	depends on HAS_RUST
>
> I wonder whether we need to check for any of this. If Rust modules get into mainline, we will build them as any other, so we will assume we have `rustc` etc. available no matter what (and we will want to be able to change it like we do with `CC`). And if there are no modules, we don't care anyway.

This was the specific requirement for merging upstream. People running
"make allyesconfig" or "make allmodconfig" don't want to have to have
Rust installed.

(Analogously, LTO required a newer toolchain and a *lot* of memory, and
people running "make allyesconfig" or "make allmodconfig" didn't want to
have to have that.)
I am not saying we require Rust, but rather that if no Rust module is built in all{yes,mod}config, we don't need to care whether
Or, to put it another way: LTO and some other config options (like debug ones) are optional, because we can either use them or not; however, Rust is always mandatory for those modules that are written in it.
@ojeda Ah, sorry, I see what you're saying. You're not talking about the config structure with a re-disable option so that it gets disabled on
I do agree that you could just unconditionally use the tools, rather than detecting them, if they're enabled. The "re-disable" mechanism is still needed, though, so that if someone has rustc installed it still isn't used in the kernel build without specifically being enabled.
@joshtriplett No need to be sorry! Yeah, I was referring to the detection (
For the re-disabling mechanism, I guess we cannot do anything else if we want to start including some Rust code in mainline (i.e. not just the build mechanism).
9db5573 to 475e659