-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to build kernel? #7
Comments
pimpmaneaton
pushed a commit
to BlissRoms-Devices/android_kernel_oneplus_sm8150
that referenced
this issue
Nov 12, 2019
[ Upstream commit 0216234c2eed1367a318daeb9f4a97d8217412a0 ] We release wrong pointer on error path in cpu_cache_level__read function, leading to segfault: (gdb) r record ls Starting program: /root/perf/tools/perf/perf record ls ... [ perf record: Woken up 1 times to write data ] double free or corruption (out) Thread 1 "perf" received signal SIGABRT, Aborted. 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 (gdb) bt #0 0x00007ffff7463798 in raise () from /lib64/power9/libc.so.6 PeterCxy#1 0x00007ffff7443bac in abort () from /lib64/power9/libc.so.6 OnePlusOSS#2 0x00007ffff74af8bc in __libc_message () from /lib64/power9/libc.so.6 OnePlusOSS#3 0x00007ffff74b92b8 in malloc_printerr () from /lib64/power9/libc.so.6 OnePlusOSS#4 0x00007ffff74bb874 in _int_free () from /lib64/power9/libc.so.6 OnePlusOSS#5 0x0000000010271260 in __zfree (ptr=0x7fffffffa0b0) at ../../lib/zalloc.. OnePlusOSS#6 0x0000000010139340 in cpu_cache_level__read (cache=0x7fffffffa090, cac.. OnePlusOSS#7 0x0000000010143c90 in build_caches (cntp=0x7fffffffa118, size=<optimiz.. ... Releasing the proper pointer. Fixes: 720e98b ("perf tools: Add perf data cache feature") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org: # v4.6+ Link: http://lore.kernel.org/lkml/20190912105235.10689-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
pimpmaneaton
pushed a commit
to BlissRoms-Devices/android_kernel_oneplus_sm8150
that referenced
this issue
Nov 12, 2019
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 PeterCxy#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 OnePlusOSS#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 OnePlusOSS#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 OnePlusOSS#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 OnePlusOSS#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 OnePlusOSS#6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 PeterCxy#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 OnePlusOSS#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 OnePlusOSS#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 OnePlusOSS#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 OnePlusOSS#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 OnePlusOSS#6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 OnePlusOSS#7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
May 16, 2020
[ Upstream commit e6ac075882b2afcdf2d5ab328ce4ab42a1eb9593 ] Like all other virtual devices (macvlan, vlan), the operstate of a macsec device should match the state of its lower device. This is done by calling netif_stacked_transfer_operstate from its netdevice notifier. We also need to call netif_stacked_transfer_operstate when a new macsec device is created, so that its operstate is set properly. This is only relevant when we try to bring the device up directly when we create it. Radu Rendec proposed a similar patch, inspired from the 802.1q driver, that included changing the administrative state of the macsec device, instead of just the operstate. This version is similar to what the macvlan driver does, and updates only the operstate. Change-Id: I27b2a5951814cf2b4150ccd1fa108ee7c1a79adc Fixes: c09440f ("macsec: introduce IEEE 802.1AE driver") Reported-by: Radu Rendec <radu.rendec@gmail.com> Reported-by: Patrick Talbert <ptalbert@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Git-commit: 0fa5ff416504fd4c8a5ea3a1f0b4ba0b4204032c Git-repo: https://github.com/aquantia/linux-4.14-atlantic-forwarding Signed-off-by: Jinesh K. Jayakumar <jineshk@codeaurora.org>
Jackeagle
pushed a commit
to BlissRoms-Devices/android_kernel_oneplus_sm8150
that referenced
this issue
Sep 29, 2020
commit 6bec6ad upstream. When setting page_owner = on, the following warning can be seen in the boot log: WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:2537 drain_all_pages+0x171/0x1a0 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc7-next-20180109-1-default+ OnePlusOSS#7 Hardware name: Dell Inc. Latitude E7470/0T6HHJ, BIOS 1.11.3 11/09/2016 RIP: 0010:drain_all_pages+0x171/0x1a0 Call Trace: init_page_owner+0x4e/0x260 start_kernel+0x3e6/0x4a6 ? set_init_arg+0x55/0x55 secondary_startup_64+0xa5/0xb0 Code: c5 ed ff 89 df 48 c7 c6 20 3b 71 82 e8 f9 4b 52 00 3b 05 d7 0b f8 00 89 c3 72 d5 5b 5d 41 5 This warning is shown because we are calling drain_all_pages() in init_early_allocated_pages(), but mm_percpu_wq is not up yet, it is being set up later on in kernel_init_freeable() -> init_mm_internals(). Link: http://lkml.kernel.org/r/20180109153921.GA13070@techadventures.net Signed-off-by: Oscar Salvador <osalvador@techadventures.net> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Ayush Mittal <ayush.m@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jackeagle
pushed a commit
to BlissRoms-Devices/android_kernel_oneplus_sm8150
that referenced
this issue
Sep 29, 2020
[ Upstream commit e24c6447ccb7b1a01f9bf0aec94939e6450c0b4d ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) PeterCxy#1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 OnePlusOSS#2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 OnePlusOSS#3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 OnePlusOSS#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 OnePlusOSS#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 OnePlusOSS#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 OnePlusOSS#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 OnePlusOSS#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 OnePlusOSS#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 OnePlusOSS#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 OnePlusOSS#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 OnePlusOSS#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 OnePlusOSS#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 OnePlusOSS#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 OnePlusOSS#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 OnePlusOSS#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 OnePlusOSS#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 OnePlusOSS#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 OnePlusOSS#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 OnePlusOSS#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 OnePlusOSS#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 OnePlusOSS#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 OnePlusOSS#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
srgrusso
pushed a commit
to BlissRoms-Devices/android_kernel_oneplus_sm8150
that referenced
this issue
Oct 14, 2021
It was reported by Sergey Senozhatsky that if THP (Transparent Huge Page) and frontswap (via zswap) are both enabled, when memory goes low so that swap is triggered, segfault and memory corruption will occur in random user space applications as follow, kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000] #0 0x00007fc08889ae0d _int_malloc (libc.so.6) PeterCxy#1 0x00007fc08889c2f3 malloc (libc.so.6) OnePlusOSS#2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt) OnePlusOSS#3 0x0000560e6005e75c n/a (urxvt) OnePlusOSS#4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt) OnePlusOSS#5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt) OnePlusOSS#6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt) OnePlusOSS#7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt) OnePlusOSS#8 0x0000560e6005cb55 ev_run (urxvt) OnePlusOSS#9 0x0000560e6003b9b9 main (urxvt) OnePlusOSS#10 0x00007fc08883af4a __libc_start_main (libc.so.6) OnePlusOSS#11 0x0000560e6003f9da _start (urxvt) After bisection, it was found the first bad commit is bd4c82c ("mm, THP, swap: delay splitting THP after swapped out"). The root cause is as follows: When the pages are written to swap device during swapping out in swap_writepage(), zswap (fontswap) is tried to compress the pages to improve performance. But zswap (frontswap) will treat THP as a normal page, so only the head page is saved. After swapping in, tail pages will not be restored to their original contents, causing memory corruption in the applications. This is fixed by refusing to save page in the frontswap store functions if the page is a THP. So that the THP will be swapped out to swap device. Another choice is to split THP if frontswap is enabled. But it is found that the frontswap enabling isn't flexible. For example, if CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if zswap itself isn't enabled. Frontswap has multiple backends, to make it easy for one backend to enable THP support, the THP checking is put in backend frontswap store functions instead of the general interfaces. Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com Fixes: bd4c82c ("mm, THP, swap: delay splitting THP after swapped out") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> [put THP checking in backend] Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Shaohua Li <shli@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Shakeel Butt <shakeelb@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> [4.14] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: alk3pInjection <webmaster@raspii.tech>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Nov 21, 2021
https://bugzilla.kernel.org/show_bug.cgi?id=208565 PID: 257 TASK: ecdd0000 CPU: 0 COMMAND: "init" #0 [<c0b420ec>] (__schedule) from [<c0b423c8>] PeterCxy#1 [<c0b423c8>] (schedule) from [<c0b459d4>] OnePlusOSS#2 [<c0b459d4>] (rwsem_down_read_failed) from [<c0b44fa0>] OnePlusOSS#3 [<c0b44fa0>] (down_read) from [<c044233c>] OnePlusOSS#4 [<c044233c>] (f2fs_truncate_blocks) from [<c0442890>] OnePlusOSS#5 [<c0442890>] (f2fs_truncate) from [<c044d408>] OnePlusOSS#6 [<c044d408>] (f2fs_evict_inode) from [<c030be18>] OnePlusOSS#7 [<c030be18>] (evict) from [<c030a558>] OnePlusOSS#8 [<c030a558>] (iput) from [<c047c600>] OnePlusOSS#9 [<c047c600>] (f2fs_sync_node_pages) from [<c0465414>] OnePlusOSS#10 [<c0465414>] (f2fs_write_checkpoint) from [<c04575f4>] OnePlusOSS#11 [<c04575f4>] (f2fs_sync_fs) from [<c0441918>] OnePlusOSS#12 [<c0441918>] (f2fs_do_sync_file) from [<c0441098>] OnePlusOSS#13 [<c0441098>] (f2fs_sync_file) from [<c0323fa0>] OnePlusOSS#14 [<c0323fa0>] (vfs_fsync_range) from [<c0324294>] OnePlusOSS#15 [<c0324294>] (do_fsync) from [<c0324014>] OnePlusOSS#16 [<c0324014>] (sys_fsync) from [<c0108bc0>] This can be caused by flush_dirty_inode() in f2fs_sync_node_pages() where iput() requires f2fs_lock_op() again resulting in livelock. Reported-by: Zhiguo Niu <Zhiguo.Niu@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Nov 21, 2021
This patch is to fix a crash: OnePlusOSS#3 [ffffb6580689f898] oops_end at ffffffffa2835bc2 OnePlusOSS#4 [ffffb6580689f8b8] no_context at ffffffffa28766e7 OnePlusOSS#5 [ffffb6580689f920] async_page_fault at ffffffffa320135e [exception RIP: f2fs_is_compressed_page+34] RIP: ffffffffa2ba83a2 RSP: ffffb6580689f9d8 RFLAGS: 00010213 RAX: 0000000000000001 RBX: fffffc0f50b34bc0 RCX: 0000000000002122 RDX: 0000000000002123 RSI: 0000000000000c00 RDI: fffffc0f50b34bc0 RBP: ffff97e815a40178 R8: 0000000000000000 R9: ffff97e83ffc9000 R10: 0000000000032300 R11: 0000000000032380 R12: ffffb6580689fa38 R13: fffffc0f50b34bc0 R14: ffff97e825cbd000 R15: 0000000000000c00 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 OnePlusOSS#6 [ffffb6580689f9d8] __is_cp_guaranteed at ffffffffa2b7ea98 OnePlusOSS#7 [ffffb6580689f9f0] f2fs_submit_page_write at ffffffffa2b81a69 OnePlusOSS#8 [ffffb6580689fa30] f2fs_do_write_meta_page at ffffffffa2b99777 OnePlusOSS#9 [ffffb6580689fae0] __f2fs_write_meta_page at ffffffffa2b75f1a OnePlusOSS#10 [ffffb6580689fb18] f2fs_sync_meta_pages at ffffffffa2b77466 OnePlusOSS#11 [ffffb6580689fc98] do_checkpoint at ffffffffa2b78e46 OnePlusOSS#12 [ffffb6580689fd88] f2fs_write_checkpoint at ffffffffa2b79c29 OnePlusOSS#13 [ffffb6580689fdd0] f2fs_sync_fs at ffffffffa2b69d95 OnePlusOSS#14 [ffffb6580689fe20] sync_filesystem at ffffffffa2ad2574 OnePlusOSS#15 [ffffb6580689fe30] generic_shutdown_super at ffffffffa2a9b582 OnePlusOSS#16 [ffffb6580689fe48] kill_block_super at ffffffffa2a9b6d1 OnePlusOSS#17 [ffffb6580689fe60] kill_f2fs_super at ffffffffa2b6abe1 OnePlusOSS#18 [ffffb6580689fea0] deactivate_locked_super at ffffffffa2a9afb6 OnePlusOSS#19 [ffffb6580689feb8] cleanup_mnt at ffffffffa2abcad4 OnePlusOSS#20 [ffffb6580689fee0] task_work_run at ffffffffa28bca28 OnePlusOSS#21 [ffffb6580689ff00] exit_to_usermode_loop at ffffffffa28050b7 OnePlusOSS#22 [ffffb6580689ff38] do_syscall_64 at ffffffffa280560e OnePlusOSS#23 [ffffb6580689ff50] entry_SYSCALL_64_after_hwframe at ffffffffa320008c This occurred when umount f2fs if enable F2FS_FS_COMPRESSION with F2FS_IO_TRACE. Fixes it by adding IS_IO_TRACED_PAGE to check validity of pid for page_private. Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Jul 17, 2022
commit 050fad7 upstream. Recently during testing, I ran into the following panic: [ 207.892422] Internal error: Accessing user space memory outside uaccess.h routines: 96000004 [PeterCxy#1] SMP [ 207.901637] Modules linked in: binfmt_misc [...] [ 207.966530] CPU: 45 PID: 2256 Comm: test_verifier Tainted: G W 4.17.0-rc3+ OnePlusOSS#7 [ 207.974956] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017 [ 207.982428] pstate: 60400005 (nZCv daif +PAN -UAO) [ 207.987214] pc : bpf_skb_load_helper_8_no_cache+0x34/0xc0 [ 207.992603] lr : 0xffff000000bdb754 [ 207.996080] sp : ffff000013703ca0 [ 207.999384] x29: ffff000013703ca0 x28: 0000000000000001 [ 208.004688] x27: 0000000000000001 x26: 0000000000000000 [ 208.009992] x25: ffff000013703ce0 x24: ffff800fb4afcb00 [ 208.015295] x23: ffff00007d2f5038 x22: ffff00007d2f5000 [ 208.020599] x21: fffffffffeff2a6f x20: 000000000000000a [ 208.025903] x19: ffff000009578000 x18: 0000000000000a03 [ 208.031206] x17: 0000000000000000 x16: 0000000000000000 [ 208.036510] x15: 0000ffff9de83000 x14: 0000000000000000 [ 208.041813] x13: 0000000000000000 x12: 0000000000000000 [ 208.047116] x11: 0000000000000001 x10: ffff0000089e7f18 [ 208.052419] x9 : fffffffffeff2a6f x8 : 0000000000000000 [ 208.057723] x7 : 000000000000000a x6 : 00280c6160000000 [ 208.063026] x5 : 0000000000000018 x4 : 0000000000007db6 [ 208.068329] x3 : 000000000008647a x2 : 19868179b1484500 [ 208.073632] x1 : 0000000000000000 x0 : ffff000009578c08 [ 208.078938] Process test_verifier (pid: 2256, stack limit = 0x0000000049ca7974) [ 208.086235] Call trace: [ 208.088672] bpf_skb_load_helper_8_no_cache+0x34/0xc0 [ 208.093713] 0xffff000000bdb754 [ 208.096845] bpf_test_run+0x78/0xf8 [ 208.100324] bpf_prog_test_run_skb+0x148/0x230 [ 208.104758] sys_bpf+0x314/0x1198 [ 208.108064] el0_svc_naked+0x30/0x34 [ 208.111632] Code: 91302260 f9400001 f9001fa1 d2800001 (29500680) [ 208.117717] ---[ end trace 263cb8a59b5bf29f ]--- The program itself which caused this had a long jump over the whole instruction sequence where all of the inner instructions required heavy expansions into multiple BPF instructions. Additionally, I also had BPF hardening enabled which requires once more rewrites of all constant values in order to blind them. Each time we rewrite insns, bpf_adj_branches() would need to potentially adjust branch targets which cross the patchlet boundary to accommodate for the additional delta. Eventually that lead to the case where the target offset could not fit into insn->off's upper 0x7fff limit anymore where then offset wraps around becoming negative (in s16 universe), or vice versa depending on the jump direction. Therefore it becomes necessary to detect and reject any such occasions in a generic way for native eBPF and cBPF to eBPF migrations. For the latter we can simply check bounds in the bpf_convert_filter()'s BPF_EMIT_JMP helper macro and bail out once we surpass limits. The bpf_patch_insn_single() for native eBPF (and cBPF to eBPF in case of subsequent hardening) is a bit more complex in that we need to detect such truncations before hitting the bpf_prog_realloc(). Thus the latter is split into an extra pass to probe problematic offsets on the original program in order to fail early. With that in place and carefully tested I no longer hit the panic and the rewrites are rejected properly. The above example panic I've seen on bpf-next, though the issue itself is generic in that a guard against this issue in bpf seems more appropriate in this case. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> [ab: Dropped BPF_PSEUDO_CALL hardening, introoduced in 4.16] Signed-off-by: Alessio Balsini <balsini@android.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Jul 17, 2022
[ Upstream commit f63d24baff787e13b723d86fe036f84bdbc35045 ] This fixes the following trace caused by receiving HCI_EV_DISCONN_PHY_LINK_COMPLETE which does call hci_conn_del without first checking if conn->type is in fact AMP_LINK and in case it is do properly cleanup upper layers with hci_disconn_cfm: ================================================================== BUG: KASAN: use-after-free in hci_send_acl+0xaba/0xc50 Read of size 8 at addr ffff88800e404818 by task bluetoothd/142 CPU: 0 PID: 142 Comm: bluetoothd Not tainted 5.17.0-rc5-00006-gda4022eeac1a OnePlusOSS#7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x45/0x59 print_address_description.constprop.0+0x1f/0x150 kasan_report.cold+0x7f/0x11b hci_send_acl+0xaba/0xc50 l2cap_do_send+0x23f/0x3d0 l2cap_chan_send+0xc06/0x2cc0 l2cap_sock_sendmsg+0x201/0x2b0 sock_sendmsg+0xdc/0x110 sock_write_iter+0x20f/0x370 do_iter_readv_writev+0x343/0x690 do_iter_write+0x132/0x640 vfs_writev+0x198/0x570 do_writev+0x202/0x280 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RSP: 002b:00007ffce8a099b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000014 Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 14 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RDX: 0000000000000001 RSI: 00007ffce8a099e0 RDI: 0000000000000015 RAX: ffffffffffffffda RBX: 00007ffce8a099e0 RCX: 00007f788fc3cf77 R10: 00007ffce8af7080 R11: 0000000000000246 R12: 000055e4ccf75580 RBP: 0000000000000015 R08: 0000000000000002 R09: 0000000000000001 </TASK> R13: 000055e4ccf754a0 R14: 000055e4ccf75cd0 R15: 000055e4ccf4a6b0 Allocated by task 45: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 hci_chan_create+0x9a/0x2f0 l2cap_conn_add.part.0+0x1a/0xdc0 l2cap_connect_cfm+0x236/0x1000 le_conn_complete_evt+0x15a7/0x1db0 hci_le_conn_complete_evt+0x226/0x2c0 hci_le_meta_evt+0x247/0x450 hci_event_packet+0x61b/0xe90 hci_rx_work+0x4d5/0xc50 process_one_work+0x8fb/0x15a0 worker_thread+0x576/0x1240 kthread+0x29d/0x340 ret_from_fork+0x1f/0x30 Freed by task 45: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0xfb/0x130 kfree+0xac/0x350 hci_conn_cleanup+0x101/0x6a0 hci_conn_del+0x27e/0x6c0 hci_disconn_phylink_complete_evt+0xe0/0x120 hci_event_packet+0x812/0xe90 hci_rx_work+0x4d5/0xc50 process_one_work+0x8fb/0x15a0 worker_thread+0x576/0x1240 kthread+0x29d/0x340 ret_from_fork+0x1f/0x30 The buggy address belongs to the object at ffff88800c0f0500 The buggy address is located 24 bytes inside of which belongs to the cache kmalloc-128 of size 128 The buggy address belongs to the page: 128-byte region [ffff88800c0f0500, ffff88800c0f0580) flags: 0x100000000000200(slab|node=0|zone=1) page:00000000fe45cd86 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xc0f0 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 raw: 0100000000000200 ffffea00003a2c80 dead000000000004 ffff8880078418c0 page dumped because: kasan: bad access detected ffff88800c0f0400: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc Memory state around the buggy address: >ffff88800c0f0500: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88800c0f0480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88800c0f0580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ================================================================== ffff88800c0f0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Reported-by: Sönke Huster <soenke.huster@eknoes.de> Tested-by: Sönke Huster <soenke.huster@eknoes.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Jul 17, 2022
[ Upstream commit af68656d66eda219b7f55ce8313a1da0312c79e1 ] While handling PCI errors (AER flow) driver tries to disable NAPI [napi_disable()] after NAPI is deleted [__netif_napi_del()] which causes unexpected system hang/crash. System message log shows the following: ======================================= [ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures. [ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)' [ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->error_detected(IO frozen) [ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports: 'need reset' [ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking bnx2x->error_detected(IO frozen) [ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports: 'need reset' [ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset' [ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow: [ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows: [ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow: [ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows: [ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset' [ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->slot_reset() [ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing... [ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset --> driver unload Uninterruptible tasks ===================== crash> ps | grep UN 213 2 11 c000000004c89e00 UN 0.0 0 0 [eehd] 215 2 0 c000000004c80000 UN 0.0 0 0 [kworker/0:2] 2196 1 28 c000000004504f00 UN 0.1 15936 11136 wickedd 4287 1 9 c00000020d076800 UN 0.0 4032 3008 agetty 4289 1 20 c00000020d056680 UN 0.0 7232 3840 agetty 32423 2 26 c00000020038c580 UN 0.0 0 0 [kworker/26:3] 32871 4241 27 c0000002609ddd00 UN 0.1 18624 11648 sshd 32920 10130 16 c00000027284a100 UN 0.1 48512 12608 sendmail 33092 32987 0 c000000205218b00 UN 0.1 48512 12608 sendmail 33154 4567 16 c000000260e51780 UN 0.1 48832 12864 pickup 33209 4241 36 c000000270cb6500 UN 0.1 18624 11712 sshd 33473 33283 0 c000000205211480 UN 0.1 48512 12672 sendmail 33531 4241 37 c00000023c902780 UN 0.1 18624 11648 sshd EEH handler hung while bnx2x sleeping and holding RTNL lock =========================================================== crash> bt 213 PID: 213 TASK: c000000004c89e00 CPU: 11 COMMAND: "eehd" #0 [c000000004d477e0] __schedule at c000000000c70808 PeterCxy#1 [c000000004d478b0] schedule at c000000000c70ee0 OnePlusOSS#2 [c000000004d478e0] schedule_timeout at c000000000c76dec OnePlusOSS#3 [c000000004d479c0] msleep at c0000000002120cc OnePlusOSS#4 [c000000004d479f0] napi_disable at c000000000a06448 ^^^^^^^^^^^^^^^^ OnePlusOSS#5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x] OnePlusOSS#6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x] OnePlusOSS#7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc OnePlusOSS#8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8 OnePlusOSS#9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64 And the sleeping source code ============================ crash> dis -ls c000000000a06448 FILE: ../net/core/dev.c LINE: 6702 6697 { 6698 might_sleep(); 6699 set_bit(NAPI_STATE_DISABLE, &n->state); 6700 6701 while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) * 6702 msleep(1); 6703 while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state)) 6704 msleep(1); 6705 6706 hrtimer_cancel(&n->timer); 6707 6708 clear_bit(NAPI_STATE_DISABLE, &n->state); 6709 } EEH calls into bnx2x twice based on the system log above, first through bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes the following call chains: bnx2x_io_error_detected() +-> bnx2x_eeh_nic_unload() +-> bnx2x_del_all_napi() +-> __netif_napi_del() bnx2x_io_slot_reset() +-> bnx2x_netif_stop() +-> bnx2x_napi_disable() +->napi_disable() Fix this by correcting the sequence of NAPI APIs usage, that is delete the NAPI after disabling it. Fixes: 7fa6f34 ("bnx2x: AER revised") Reported-by: David Christensen <drc@linux.vnet.ibm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220426153913.6966-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Sep 27, 2022
commit 38c9c22a85aeed28d0831f230136e9cf6fa2ed44 upstream. Syzkaller reported use-after-free bug as follows: ================================================================== BUG: KASAN: use-after-free in ntfs_ucsncmp+0x123/0x130 Read of size 2 at addr ffff8880751acee8 by task a.out/879 CPU: 7 PID: 879 Comm: a.out Not tainted 5.19.0-rc4-next-20220630-00001-gcc5218c8bd2c-dirty OnePlusOSS#7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x1c0/0x2b0 print_address_description.constprop.0.cold+0xd4/0x484 print_report.cold+0x55/0x232 kasan_report+0xbf/0xf0 ntfs_ucsncmp+0x123/0x130 ntfs_are_names_equal.cold+0x2b/0x41 ntfs_attr_find+0x43b/0xb90 ntfs_attr_lookup+0x16d/0x1e0 ntfs_read_locked_attr_inode+0x4aa/0x2360 ntfs_attr_iget+0x1af/0x220 ntfs_read_locked_inode+0x246c/0x5120 ntfs_iget+0x132/0x180 load_system_files+0x1cc6/0x3480 ntfs_fill_super+0xa66/0x1cf0 mount_bdev+0x38d/0x460 legacy_get_tree+0x10d/0x220 vfs_get_tree+0x93/0x300 do_new_mount+0x2da/0x6d0 path_mount+0x496/0x19d0 __x64_sys_mount+0x284/0x300 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f3f2118d9ea Code: 48 8b 0d a9 f4 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 76 f4 0b 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc269deac8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3f2118d9ea RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffc269dec00 RBP: 00007ffc269dec80 R08: 00007ffc269deb00 R09: 00007ffc269dec44 R10: 0000000000000000 R11: 0000000000000202 R12: 000055f81ab1d220 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> The buggy address belongs to the physical page: page:0000000085430378 refcount:1 mapcount:1 mapping:0000000000000000 index:0x555c6a81d pfn:0x751ac memcg:ffff888101f7e180 anon flags: 0xfffffc00a0014(uptodate|lru|mappedtodisk|swapbacked|node=0|zone=1|lastcpupid=0x1fffff) raw: 000fffffc00a0014 ffffea0001bf2988 ffffea0001de2448 ffff88801712e201 raw: 0000000555c6a81d 0000000000000000 0000000100000000 ffff888101f7e180 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880751acd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8880751ace00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8880751ace80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ ffff8880751acf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8880751acf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== The reason is that struct ATTR_RECORD->name_offset is 6485, end address of name string is out of bounds. Fix this by adding sanity check on end address of attribute name string. [akpm@linux-foundation.org: coding-style cleanups] [chenxiaosong2@huawei.com: cleanup suggested by Hawkins Jiawei] Link: https://lkml.kernel.org/r/20220709064511.3304299-1-chenxiaosong2@huawei.com Link: https://lkml.kernel.org/r/20220707105329.4020708-1-chenxiaosong2@huawei.com Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Cc: Anton Altaparmakov <anton@tuxera.com> Cc: ChenXiaoSong <chenxiaosong2@huawei.com> Cc: Yongqiang Liu <liuyongqiang13@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Sep 27, 2022
…loc() commit d29f59051d3a07b81281b2df2b8c9dfe4716067f upstream. The voice allocator sometimes begins allocating from near the end of the array and then wraps around, however snd_emu10k1_pcm_channel_alloc() accesses the newly allocated voices as if it never wrapped around. This results in out of bounds access if the first voice has a high enough index so that first_voice + requested_voice_count > NUM_G (64). The more voices are requested, the more likely it is for this to occur. This was initially discovered using PipeWire, however it can be reproduced by calling aplay multiple times with 16 channels: aplay -r 48000 -D plughw:CARD=Live,DEV=3 -c 16 /dev/zero UBSAN: array-index-out-of-bounds in sound/pci/emu10k1/emupcm.c:127:40 index 65 is out of range for type 'snd_emu10k1_voice [64]' CPU: 1 PID: 31977 Comm: aplay Tainted: G W IOE 6.0.0-rc2-emu10k1+ OnePlusOSS#7 Hardware name: ASUSTEK COMPUTER INC P5W DH Deluxe/P5W DH Deluxe, BIOS 3002 07/22/2010 Call Trace: <TASK> dump_stack_lvl+0x49/0x63 dump_stack+0x10/0x16 ubsan_epilogue+0x9/0x3f __ubsan_handle_out_of_bounds.cold+0x44/0x49 snd_emu10k1_playback_hw_params+0x3bc/0x420 [snd_emu10k1] snd_pcm_hw_params+0x29f/0x600 [snd_pcm] snd_pcm_common_ioctl+0x188/0x1410 [snd_pcm] ? exit_to_user_mode_prepare+0x35/0x170 ? do_syscall_64+0x69/0x90 ? syscall_exit_to_user_mode+0x26/0x50 ? do_syscall_64+0x69/0x90 ? exit_to_user_mode_prepare+0x35/0x170 snd_pcm_ioctl+0x27/0x40 [snd_pcm] __x64_sys_ioctl+0x95/0xd0 do_syscall_64+0x5c/0x90 ? do_syscall_64+0x69/0x90 ? do_syscall_64+0x69/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Signed-off-by: Tasos Sahanidis <tasos@tasossah.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/3707dcab-320a-62ff-63c0-73fc201ef756@tasossah.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Sep 27, 2022
[ Upstream commit 84a53580c5d2138c7361c7c3eea5b31827e63b35 ] The SRv6 layer allows defining HMAC data that can later be used to sign IPv6 Segment Routing Headers. This configuration is realised via netlink through four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual length of the SECRET attribute, it is possible to provide invalid combinations (e.g., secret = "", secretlen = 64). This case is not checked in the code and with an appropriately crafted netlink message, an out-of-bounds read of up to 64 bytes (max secret length) can occur past the skb end pointer and into skb_shared_info: Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 208 memcpy(hinfo->secret, secret, slen); (gdb) bt #0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 PeterCxy#1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600, extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>, family=<optimized out>) at net/netlink/genetlink.c:731 OnePlusOSS#2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00, family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775 OnePlusOSS#3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792 OnePlusOSS#4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>) at net/netlink/af_netlink.c:2501 OnePlusOSS#5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803 OnePlusOSS#6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000) at net/netlink/af_netlink.c:1319 OnePlusOSS#7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>) at net/netlink/af_netlink.c:1345 OnePlusOSS#8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921 ... (gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end $1 = 0xffff88800b1b76c0 (gdb) p/x secret $2 = 0xffff88800b1b76c0 (gdb) p slen $3 = 64 '@' The OOB data can then be read back from userspace by dumping HMAC state. This commit fixes this by ensuring SECRETLEN cannot exceed the actual length of SECRET. Reported-by: Lucas Leong <wmliang.tw@gmail.com> Tested: verified that EINVAL is correctly returned when secretlen > len(secret) Fixes: 4f4853d ("ipv6: sr: implement API to control SR HMAC structure") Signed-off-by: David Lebrun <dlebrun@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Jun 12, 2023
commit 675751bb20634f981498c7d66161584080cc061e upstream. If something was written to the buffer just before destruction, it may be possible (maybe not in a real system, but it did happen in ARCH=um with time-travel) to destroy the ringbuffer before the IRQ work ran, leading this KASAN report (or a crash without KASAN): BUG: KASAN: slab-use-after-free in irq_work_run_list+0x11a/0x13a Read of size 8 at addr 000000006d640a48 by task swapper/0 CPU: 0 PID: 0 Comm: swapper Tainted: G W O 6.3.0-rc1 OnePlusOSS#7 Stack: 60c4f20f 0c203d48 41b58ab3 60f224fc 600477fa 60f35687 60c4f20f 601273dd 00000008 6101eb00 6101eab0 615be548 Call Trace: [<60047a58>] show_stack+0x25e/0x282 [<60c609e0>] dump_stack_lvl+0x96/0xfd [<60c50d4c>] print_report+0x1a7/0x5a8 [<603078d3>] kasan_report+0xc1/0xe9 [<60308950>] __asan_report_load8_noabort+0x1b/0x1d [<60232844>] irq_work_run_list+0x11a/0x13a [<602328b4>] irq_work_tick+0x24/0x34 [<6017f9dc>] update_process_times+0x162/0x196 [<6019f335>] tick_sched_handle+0x1a4/0x1c3 [<6019fd9e>] tick_sched_timer+0x79/0x10c [<601812b9>] __hrtimer_run_queues.constprop.0+0x425/0x695 [<60182913>] hrtimer_interrupt+0x16c/0x2c4 [<600486a3>] um_timer+0x164/0x183 [...] Allocated by task 411: save_stack_trace+0x99/0xb5 stack_trace_save+0x81/0x9b kasan_save_stack+0x2d/0x54 kasan_set_track+0x34/0x3e kasan_save_alloc_info+0x25/0x28 ____kasan_kmalloc+0x8b/0x97 __kasan_kmalloc+0x10/0x12 __kmalloc+0xb2/0xe8 load_elf_phdrs+0xee/0x182 [...] The buggy address belongs to the object at 000000006d640800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 584 bytes inside of freed 1024-byte region [000000006d640800, 000000006d640c00) Add the appropriate irq_work_sync() so the work finishes before the buffers are destroyed. Prior to the commit in the Fixes tag below, there was only a single global IRQ work, so this issue didn't exist. Link: https://lore.kernel.org/linux-trace-kernel/20230427175920.a76159263122.I8295e405c44362a86c995e9c2c37e3e03810aa56@changeid Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 1569345 ("tracing/ring-buffer: Move poll wake ups into ring buffer code") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Jun 12, 2023
[ Upstream commit 05bb0167c80b8f93c6a4e0451b7da9b96db990c2 ] ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e Before this change we see the following UBSAN stack trace in Fuchsia: #0 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 PeterCxy#1.2 0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f PeterCxy#1.1 0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f PeterCxy#1 0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f OnePlusOSS#2 0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d OnePlusOSS#3 0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d OnePlusOSS#4 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 OnePlusOSS#5 0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369 OnePlusOSS#6 0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac OnePlusOSS#7 0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2 OnePlusOSS#8 0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052 OnePlusOSS#9 0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8 OnePlusOSS#10 0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98 OnePlusOSS#11 0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac OnePlusOSS#12 0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d OnePlusOSS#13 0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603 Add a simple check that avoids incrementing a pointer by zero, but otherwise behaves as before. Note that our findings are against ACPICA 20221020, but the same code exists on master. Link: acpica/acpica@770653e3 Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
I have some error too. |
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Dec 9, 2023
[ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f PeterCxy#1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 OnePlusOSS#2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee OnePlusOSS#3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 OnePlusOSS#4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 OnePlusOSS#5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c OnePlusOSS#6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] OnePlusOSS#7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] OnePlusOSS#8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f OnePlusOSS#9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 OnePlusOSS#10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] OnePlusOSS#11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc OnePlusOSS#12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] OnePlusOSS#13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] OnePlusOSS#14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] OnePlusOSS#15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] OnePlusOSS#16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 OnePlusOSS#17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] OnePlusOSS#18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] OnePlusOSS#19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 OnePlusOSS#20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Dec 18, 2023
[ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f PeterCxy#1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 OnePlusOSS#2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee OnePlusOSS#3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 OnePlusOSS#4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 OnePlusOSS#5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c OnePlusOSS#6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] OnePlusOSS#7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] OnePlusOSS#8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f OnePlusOSS#9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 OnePlusOSS#10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] OnePlusOSS#11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc OnePlusOSS#12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] OnePlusOSS#13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] OnePlusOSS#14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] OnePlusOSS#15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] OnePlusOSS#16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 OnePlusOSS#17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] OnePlusOSS#18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] OnePlusOSS#19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 OnePlusOSS#20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
mhmdeve
pushed a commit
to mhmdeve/kernel_oneplus_sm8150
that referenced
this issue
Apr 28, 2024
…ock() commit 1a3e1f40962c445b997151a542314f3c6097f8c3 upstream. NOTE: This is a partial backport since we only need the refcnt between memcg and stock to fix the problem stated below, and in this way multiple versions use the same code and align with each other. There was a kernel panic happened on an in-house environment running 3.10, and the same problem was reproduced on 4.19: general protection fault: 0000 [OnePlusOSS#1] SMP PTI CPU: 1 PID: 2085 Comm: bash Kdump: loaded Tainted: G L 4.19.90+ OnePlusOSS#7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010 drain_all_stock+0xad/0x140 Code: 00 00 4d 85 ff 74 2c 45 85 c9 74 27 4d 39 fc 74 42 41 80 bc 24 28 04 00 00 00 74 17 49 8b 04 24 49 8b 17 48 8b 88 90 02 00 00 <48> 39 8a 90 02 00 00 74 02 eb 86 48 63 88 3c 01 00 00 39 8a 3c 01 RSP: 0018:ffffa7efc5813d70 EFLAGS: 00010202 RAX: ffff8cb185548800 RBX: ffff8cb89f420160 RCX: ffff8cb1867b6000 RDX: babababababababa RSI: 0000000000000001 RDI: 0000000000231876 RBP: 0000000000000000 R08: 0000000000000415 R09: 0000000000000002 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8cb186f89040 R13: 0000000000020160 R14: 0000000000000001 R15: ffff8cb186b27040 FS: 00007f4a308d3740(0000) GS:ffff8cb89f440000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe4d634a68 CR3: 000000010b022000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mem_cgroup_force_empty_write+0x31/0xb0 cgroup_file_write+0x60/0x140 ? __check_object_size+0x136/0x147 kernfs_fop_write+0x10e/0x190 __vfs_write+0x37/0x1b0 ? selinux_file_permission+0xe8/0x130 ? security_file_permission+0x2e/0xb0 vfs_write+0xb6/0x1a0 ksys_write+0x57/0xd0 do_syscall_64+0x63/0x250 ? async_page_fault+0x8/0x30 entry_SYSCALL_64_after_hwframe+0x5c/0xc1 Modules linked in: ... It is found that in case of stock->nr_pages == 0, the memcg on stock->cached could be freed due to its refcnt decreased to 0, which made stock->cached become a dangling pointer. It could cause a UAF problem in drain_all_stock() in the following concurrent scenario. Note that drain_all_stock() doesn't disable irq but only preemption. CPU1 CPU2 ============================================================================== stock->cached = memcgA (freed) drain_all_stock(memcgB) rcu_read_lock() memcg = CPU1's stock->cached (memcgA) (interrupted) refill_stock(memcgC) drain_stock(memcgA) stock->cached = memcgC stock->nr_pages += xxx (> 0) stock->nr_pages > 0 mem_cgroup_is_descendant(memcgA, memcgB) [UAF] rcu_read_unlock() This problem is, unintentionally, fixed at 5.9, where commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting") adds memcg refcnt for stock. Therefore affected LTS versions include 4.19 and 5.4. For 4.19, memcg's css offline process doesn't call drain_all_stock(). so it's easier for the released memcg to be left on the stock. For 5.4, although mem_cgroup_css_offline() does call drain_all_stock(), but the flushing could be skipped when stock->nr_pages happens to be 0, and besides the async draining could be delayed and take place after the UAF problem has happened. Fix this problem by adding (and decreasing) memcg's refcnt when memcg is put onto (and removed from) stock, just like how commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting") does. After all, "being on the stock" is a kind of reference with regards to memcg. As such, it's guaranteed that a css on stock would not be freed. It's good to mention that refill_stock() is executed in an irq-disabled context, so the drain_stock() patched with css_put() would not actually free memcgA until the end of refill_stock(), since css_put() is an RCU free and it's still in grace period. For CPU2, the access to CPU1's stock->cached is protected by rcu_read_lock(), so in this case it gets either NULL from stock->cached or a memcgA that is still good. Cc: stable@vger.kernel.org # 4.19 5.4 Fixes: cdec2e4 ("memcg: coalesce charging via percpu storage") Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 9e46a20) Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
May 15, 2024
…ock() commit 1a3e1f40962c445b997151a542314f3c6097f8c3 upstream. NOTE: This is a partial backport since we only need the refcnt between memcg and stock to fix the problem stated below, and in this way multiple versions use the same code and align with each other. There was a kernel panic happened on an in-house environment running 3.10, and the same problem was reproduced on 4.19: general protection fault: 0000 [PeterCxy#1] SMP PTI CPU: 1 PID: 2085 Comm: bash Kdump: loaded Tainted: G L 4.19.90+ OnePlusOSS#7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010 drain_all_stock+0xad/0x140 Code: 00 00 4d 85 ff 74 2c 45 85 c9 74 27 4d 39 fc 74 42 41 80 bc 24 28 04 00 00 00 74 17 49 8b 04 24 49 8b 17 48 8b 88 90 02 00 00 <48> 39 8a 90 02 00 00 74 02 eb 86 48 63 88 3c 01 00 00 39 8a 3c 01 RSP: 0018:ffffa7efc5813d70 EFLAGS: 00010202 RAX: ffff8cb185548800 RBX: ffff8cb89f420160 RCX: ffff8cb1867b6000 RDX: babababababababa RSI: 0000000000000001 RDI: 0000000000231876 RBP: 0000000000000000 R08: 0000000000000415 R09: 0000000000000002 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8cb186f89040 R13: 0000000000020160 R14: 0000000000000001 R15: ffff8cb186b27040 FS: 00007f4a308d3740(0000) GS:ffff8cb89f440000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe4d634a68 CR3: 000000010b022000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mem_cgroup_force_empty_write+0x31/0xb0 cgroup_file_write+0x60/0x140 ? __check_object_size+0x136/0x147 kernfs_fop_write+0x10e/0x190 __vfs_write+0x37/0x1b0 ? selinux_file_permission+0xe8/0x130 ? security_file_permission+0x2e/0xb0 vfs_write+0xb6/0x1a0 ksys_write+0x57/0xd0 do_syscall_64+0x63/0x250 ? async_page_fault+0x8/0x30 entry_SYSCALL_64_after_hwframe+0x5c/0xc1 Modules linked in: ... It is found that in case of stock->nr_pages == 0, the memcg on stock->cached could be freed due to its refcnt decreased to 0, which made stock->cached become a dangling pointer. It could cause a UAF problem in drain_all_stock() in the following concurrent scenario. Note that drain_all_stock() doesn't disable irq but only preemption. CPU1 CPU2 ============================================================================== stock->cached = memcgA (freed) drain_all_stock(memcgB) rcu_read_lock() memcg = CPU1's stock->cached (memcgA) (interrupted) refill_stock(memcgC) drain_stock(memcgA) stock->cached = memcgC stock->nr_pages += xxx (> 0) stock->nr_pages > 0 mem_cgroup_is_descendant(memcgA, memcgB) [UAF] rcu_read_unlock() This problem is, unintentionally, fixed at 5.9, where commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting") adds memcg refcnt for stock. Therefore affected LTS versions include 4.19 and 5.4. For 4.19, memcg's css offline process doesn't call drain_all_stock(). so it's easier for the released memcg to be left on the stock. For 5.4, although mem_cgroup_css_offline() does call drain_all_stock(), but the flushing could be skipped when stock->nr_pages happens to be 0, and besides the async draining could be delayed and take place after the UAF problem has happened. Fix this problem by adding (and decreasing) memcg's refcnt when memcg is put onto (and removed from) stock, just like how commit 1a3e1f40962c ("mm: memcontrol: decouple reference counting from page accounting") does. After all, "being on the stock" is a kind of reference with regards to memcg. As such, it's guaranteed that a css on stock would not be freed. It's good to mention that refill_stock() is executed in an irq-disabled context, so the drain_stock() patched with css_put() would not actually free memcgA until the end of refill_stock(), since css_put() is an RCU free and it's still in grace period. For CPU2, the access to CPU1's stock->cached is protected by rcu_read_lock(), so in this case it gets either NULL from stock->cached or a memcgA that is still good. Cc: stable@vger.kernel.org # 4.19 5.4 Fixes: cdec2e4 ("memcg: coalesce charging via percpu storage") Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r
pushed a commit
to elginsk8r/android_kernel_oneplus_sm8150
that referenced
this issue
Jun 26, 2024
[ Upstream commit f8bbc07ac535593139c875ffa19af924b1084540 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 PeterCxy#1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 OnePlusOSS#2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e OnePlusOSS#3 [fffffe00003fced0] do_nmi at ffffffff8922660d OnePlusOSS#4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 OnePlusOSS#5 [ffffa655314979e8] io_serial_in at ffffffff89792594 OnePlusOSS#6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 OnePlusOSS#7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 OnePlusOSS#8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 OnePlusOSS#9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 OnePlusOSS#10 [ffffa65531497ac8] console_unlock at ffffffff89316124 OnePlusOSS#11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 OnePlusOSS#12 [ffffa65531497b68] printk at ffffffff89318306 OnePlusOSS#13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 OnePlusOSS#14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] OnePlusOSS#15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] OnePlusOSS#16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] OnePlusOSS#17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] OnePlusOSS#18 [ffffa65531497f10] kthread at ffffffff892d2e72 OnePlusOSS#19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 68459b8e3ee554ce71878af9eb69659b9462c588) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
@OnePlusOSSAdmin @bettywangOP @hecaiqiang
I cloned OnePlusOSS/android_kernel_oneplus_sm8150 kernel code, but compile failed. Could you tell me how to compile successfully? Thanks.
I used config file is arch/arm64/configs/vendor/sm8150_defconfig, cross compile tool is aarch-linux-android-, CC compiler is clang. Command is below:
make O=out sm8150_defconfig
make O=out REAL_CC=../gclang9.0.6/bin/clang -j8
I am looking forward for your reply. Thank you!
The text was updated successfully, but these errors were encountered: