-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When Phicomm-n1 is under no load, the system load is around 2 #9
Comments
It is not clear what system, what kernel. And you need to see what processes are running (that spends resources). |
load average: 2.46 2.43 2.37 |
This element is associated with interrupts. There are recipes on the Internet how to determine which firmware the activity may be associated with and how to disable them. |
Return to normal after deleting the following code: |
[ Upstream commit aadcef64b22f668c1a107b86d3521d9cac915c24 ] As Jiqun Li reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202883 sometimes, dead lock when make system call SYS_getdents64 with fsync() is called by another process. monkey running on android9.0 1. task 9785 held sbi->cp_rwsem and waiting lock_page() 2. task 10349 held mm_sem and waiting sbi->cp_rwsem 3. task 9709 held lock_page() and waiting mm_sem so this is a dead lock scenario. task stack is show by crash tools as following crash_arm64> bt ffffffc03c354080 PID: 9785 TASK: ffffffc03c354080 CPU: 1 COMMAND: "RxIoScheduler-3" >> 150balbes#7 [ffffffc01b50fac0] __lock_page at ffffff80081b11e8 crash-arm64> bt 10349 PID: 10349 TASK: ffffffc018b83080 CPU: 1 COMMAND: "BUGLY_ASYNC_UPL" >> 150balbes#3 [ffffffc01f8cfa40] rwsem_down_read_failed at ffffff8008a93afc PC: 00000033 LR: 00000000 SP: 00000000 PSTATE: ffffffffffffffff crash-arm64> bt 9709 PID: 9709 TASK: ffffffc03e7f3080 CPU: 1 COMMAND: "IntentService[A" >> 150balbes#3 [ffffffc001e67850] rwsem_down_read_failed at ffffff8008a93afc >> 150balbes#8 [ffffffc001e67b80] el1_ia at ffffff8008084fc4 PC: ffffff8008274114 [compat_filldir64+120] LR: ffffff80083584d4 [f2fs_fill_dentries+448] SP: ffffffc001e67b80 PSTATE: 80400145 X29: ffffffc001e67b80 X28: 0000000000000000 X27: 000000000000001a X26: 00000000000093d7 X25: ffffffc070d52480 X24: 0000000000000008 X23: 0000000000000028 X22: 00000000d43dfd60 X21: ffffffc001e67e90 X20: 0000000000000011 X19: ffffff80093a4000 X18: 0000000000000000 X17: 0000000000000000 X16: 0000000000000000 X15: 0000000000000000 X14: ffffffffffffffff X13: 0000000000000008 X12: 0101010101010101 X11: 7f7f7f7f7f7f7f7f X10: 6a6a6a6a6a6a6a6a X9: 7f7f7f7f7f7f7f7f X8: 0000000080808000 X7: ffffff800827409c X6: 0000000080808000 X5: 0000000000000008 X4: 00000000000093d7 X3: 000000000000001a X2: 0000000000000011 X1: ffffffc070d52480 X0: 0000000000800238 >> 150balbes#9 [ffffffc001e67be0] f2fs_fill_dentries at ffffff80083584d0 PC: 0000003c LR: 00000000 SP: 00000000 PSTATE: 000000d9 X12: f48a02ff X11: d4678960 X10: d43dfc00 X9: d4678ae4 X8: 00000058 X7: d4678994 X6: d43de800 X5: 000000d9 X4: d43dfc0c X3: d43dfc10 X2: d46799c8 X1: 00000000 X0: 00001068 Below potential deadlock will happen between three threads: Thread A Thread B Thread C - f2fs_do_sync_file - f2fs_write_checkpoint - down_write(&sbi->node_change) -- 1) - do_page_fault - down_write(&mm->mmap_sem) -- 2) - do_wp_page - f2fs_vm_page_mkwrite - getdents64 - f2fs_read_inline_dir - lock_page -- 3) - f2fs_sync_node_pages - lock_page -- 3) - __do_map_lock - down_read(&sbi->node_change) -- 1) - f2fs_fill_dentries - dir_emit - compat_filldir64 - do_page_fault - down_read(&mm->mmap_sem) -- 2) Since f2fs_readdir is protected by inode.i_rwsem, there should not be any updates in inode page, we're safe to lookup dents in inode page without its lock held, so taking off the lock to improve concurrency of readdir and avoid potential deadlock. Reported-by: Jiqun Li <jiqun.li@unisoc.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
…_map [ Upstream commit 39df730b09774bd860e39ea208a48d15078236cb ] Detected via gcc's ASan: Direct leak of 2048 byte(s) in 64 object(s) allocated from: 6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370) 7 150balbes#1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43 8 150balbes#2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85 9 150balbes#3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250 10 150balbes#4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382 11 150balbes#5 0x556b0f0e3231 in print_events util/parse-events.c:2514 12 150balbes#6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 13 150balbes#7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 14 150balbes#8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 15 150balbes#9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 16 150balbes#10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520 17 150balbes#11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 8989605 ("perf tools: Do not put a variable sized type not at the end of a struct") Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54569ba4b06d5baedae4614bde33a25a191473ba ] Detected with gcc's ASan: Direct leak of 66 byte(s) in 5 object(s) allocated from: #0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) 150balbes#1 0x560c8761034d in collect_config util/config.c:597 150balbes#2 0x560c8760d9cb in get_value util/config.c:169 150balbes#3 0x560c8760dfd7 in perf_parse_file util/config.c:285 150balbes#4 0x560c8760e0d2 in perf_config_from_file util/config.c:476 150balbes#5 0x560c876108fd in perf_config_set__init util/config.c:661 150balbes#6 0x560c87610c72 in perf_config_set__new util/config.c:709 150balbes#7 0x560c87610d2f in perf_config__init util/config.c:718 150balbes#8 0x560c87610e5d in perf_config util/config.c:730 150balbes#9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442 150balbes#10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: 20105ca ("perf config: Introduce perf_config_set class") Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8bde8516893da5a5fdf06121f74d11b52ab92df5 ] Detected with gcc's ASan: Direct leak of 4356 byte(s) in 120 object(s) allocated from: #0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) 150balbes#1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215 150balbes#2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339 150balbes#3 0x55719af66272 in print_events util/parse-events.c:2542 150balbes#4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 150balbes#5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 150balbes#6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 150balbes#7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 150balbes#8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520 150balbes#9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 40218da ("perf list: Show SDT and pre-cached events") Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42dfa451d825a2ad15793c476f73e7bbc0f9d312 ] Using gcc's ASan, Changbin reports: ================================================================= ==7494==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) 150balbes#1 0x5625e5330a5e in zalloc util/util.h:23 150balbes#2 0x5625e5330a9b in perf_counts__new util/counts.c:10 150balbes#3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 150balbes#4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 150balbes#5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 150balbes#6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 150balbes#7 0x5625e51528e6 in run_test tests/builtin-test.c:358 150balbes#8 0x5625e5152baf in test_and_print tests/builtin-test.c:388 150balbes#9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 150balbes#10 0x5625e515572f in cmd_test tests/builtin-test.c:722 150balbes#11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 150balbes#12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 150balbes#13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 150balbes#14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 150balbes#15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 72 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) 150balbes#1 0x5625e532560d in zalloc util/util.h:23 150balbes#2 0x5625e532566b in xyarray__new util/xyarray.c:10 150balbes#3 0x5625e5330aba in perf_counts__new util/counts.c:15 150balbes#4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 150balbes#5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 150balbes#6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 150balbes#7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 150balbes#8 0x5625e51528e6 in run_test tests/builtin-test.c:358 150balbes#9 0x5625e5152baf in test_and_print tests/builtin-test.c:388 150balbes#10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 150balbes#11 0x5625e515572f in cmd_test tests/builtin-test.c:722 150balbes#12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 150balbes#13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 150balbes#14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 150balbes#15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 150balbes#16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) His patch took care of evsel->prev_raw_counts, but the above backtraces are about evsel->counts, so fix that instead. Reported-by: Changbin Du <changbin.du@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/n/tip-hd1x13g59f0nuhe4anxhsmfp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
…_event_on_all_cpus test [ Upstream commit 93faa52e8371f0291ee1ff4994edae2b336b6233 ] ================================================================= ==7497==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f0333a88f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) 150balbes#1 0x5625e5326213 in cpu_map__trim_new util/cpumap.c:45 150balbes#2 0x5625e5326703 in cpu_map__read util/cpumap.c:103 150balbes#3 0x5625e53267ef in cpu_map__read_all_cpu_map util/cpumap.c:120 150balbes#4 0x5625e5326915 in cpu_map__new util/cpumap.c:135 150balbes#5 0x5625e517b355 in test__openat_syscall_event_on_all_cpus tests/openat-syscall-all-cpus.c:36 150balbes#6 0x5625e51528e6 in run_test tests/builtin-test.c:358 150balbes#7 0x5625e5152baf in test_and_print tests/builtin-test.c:388 150balbes#8 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 150balbes#9 0x5625e515572f in cmd_test tests/builtin-test.c:722 150balbes#10 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 150balbes#11 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 150balbes#12 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 150balbes#13 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 150balbes#14 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: f30a79b ("perf tools: Add reference counting for cpu_map object") Link: http://lkml.kernel.org/r/20190316080556.3075-15-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f97a8991d3b998e518f56794d879f645964de649 ] ================================================================= ==7506==ERROR: LeakSanitizer: detected memory leaks Direct leak of 13 byte(s) in 3 object(s) allocated from: #0 0x7f03339d6070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) 150balbes#1 0x5625e53aaef0 in expr__find_other util/expr.y:221 150balbes#2 0x5625e51bcd3f in test__expr tests/expr.c:52 150balbes#3 0x5625e51528e6 in run_test tests/builtin-test.c:358 150balbes#4 0x5625e5152baf in test_and_print tests/builtin-test.c:388 150balbes#5 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 150balbes#6 0x5625e515572f in cmd_test tests/builtin-test.c:722 150balbes#7 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 150balbes#8 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 150balbes#9 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 150balbes#10 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 150balbes#11 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 0751673 ("perf tools: Add a simple expression parser for JSON") Link: http://lkml.kernel.org/r/20190316080556.3075-16-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d982b33133284fa7efa0e52ae06b88f9be3ea764 ] ================================================================= ==20875==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1160 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) 150balbes#1 0x55bd50005599 in zalloc util/util.h:23 150balbes#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327 150balbes#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216 150balbes#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69 150balbes#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358 150balbes#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388 150balbes#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583 150balbes#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722 150balbes#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 150balbes#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 150balbes#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 150balbes#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520 150balbes#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 19 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) 150balbes#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields") Link: http://lkml.kernel.org/r/20190316080556.3075-17-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ff612ba7849964b1898fd3ccd1f56941129c6aab ] We've been seeing the following sporadically throughout our fleet panic: kernel BUG at fs/btrfs/relocation.c:4584! netversion: 5.0-0 Backtrace: #0 [ffffc90003adb880] machine_kexec at ffffffff81041da8 150balbes#1 [ffffc90003adb8c8] __crash_kexec at ffffffff8110396c 150balbes#2 [ffffc90003adb988] crash_kexec at ffffffff811048ad 150balbes#3 [ffffc90003adb9a0] oops_end at ffffffff8101c19a 150balbes#4 [ffffc90003adb9c0] do_trap at ffffffff81019114 150balbes#5 [ffffc90003adba00] do_error_trap at ffffffff810195d0 150balbes#6 [ffffc90003adbab0] invalid_op at ffffffff81a00a9b [exception RIP: btrfs_reloc_cow_block+692] RIP: ffffffff8143b614 RSP: ffffc90003adbb68 RFLAGS: 00010246 RAX: fffffffffffffff7 RBX: ffff8806b9c32000 RCX: ffff8806aad00690 RDX: ffff880850b295e0 RSI: ffff8806b9c32000 RDI: ffff88084f205bd0 RBP: ffff880849415000 R8: ffffc90003adbbe0 R9: ffff88085ac90000 R10: ffff8805f7369140 R11: 0000000000000000 R12: ffff880850b295e0 R13: ffff88084f205bd0 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 150balbes#7 [ffffc90003adbbb0] __btrfs_cow_block at ffffffff813bf1cd 150balbes#8 [ffffc90003adbc28] btrfs_cow_block at ffffffff813bf4b3 150balbes#9 [ffffc90003adbc78] btrfs_search_slot at ffffffff813c2e6c The way relocation moves data extents is by creating a reloc inode and preallocating extents in this inode and then copying the data into these preallocated extents. Once we've done this for all of our extents, we'll write out these dirty pages, which marks the extent written, and goes into btrfs_reloc_cow_block(). From here we get our current reloc_control, which _should_ match the reloc_control for the current block group we're relocating. However if we get an ENOSPC in this path at some point we'll bail out, never initiating writeback on this inode. Not a huge deal, unless we happen to be doing relocation on a different block group, and this block group is now rc->stage == UPDATE_DATA_PTRS. This trips the BUG_ON() in btrfs_reloc_cow_block(), because we expect to be done modifying the data inode. We are in fact done modifying the metadata for the data inode we're currently using, but not the one from the failed block group, and thus we BUG_ON(). (This happens when writeback finishes for extents from the previous group, when we are at btrfs_finish_ordered_io() which updates the data reloc tree (inode item, drops/adds extent items, etc).) Fix this by writing out the reloc data inode always, and then breaking out of the loop after that point to keep from tripping this BUG_ON() later. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> [ add note from Filipe ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
… allocation commit a1ad1cc9704f64c169261a76e1aee1cf1ae51832 upstream. After memory allocation failure vc_allocate() doesn't clean up data which has been initialized in visual_init(). In case of fbcon this leads to divide-by-0 in fbcon_init() on next open of the same tty. memory allocation in vc_allocate() may fail here: 1097: vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_KERNEL); on next open() fbcon_init() skips vc_font.data initialization: 1088: if (!p->fontdata) { division by zero in fbcon_init() happens here: 1149: new_cols /= vc->vc_font.width; Additional check is needed in fbcon_deinit() to prevent usage of uninitialized vc_screenbuf: 1251: if (vc->vc_hi_font_mask && vc->vc_screenbuf) 1252: set_vc_hi_font(vc, false); Crash: 150balbes#6 [ffffc90001eafa60] divide_error at ffffffff81a00be4 [exception RIP: fbcon_init+463] RIP: ffffffff814b860f RSP: ffffc90001eafb18 RFLAGS: 00010246 ... 150balbes#7 [ffffc90001eafb60] visual_init at ffffffff8154c36e 150balbes#8 [ffffc90001eafb80] vc_allocate at ffffffff8154f53c 150balbes#9 [ffffc90001eafbc8] con_install at ffffffff8154f624 ... Signed-off-by: Grzegorz Halat <ghalat@redhat.com> Reviewed-by: Oleksandr Natalenko <oleksandr@redhat.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0a255e795ab976481565f6ac178314b34fbf891 upstream. A deadlock with this stacktrace was observed. The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio shrinker and the shrinker depends on I/O completion in the dm-bufio subsystem. In order to fix the deadlock (and other similar ones), we set the flag PF_MEMALLOC_NOIO at loop thread entry. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 150balbes#1 [ffff8813dedfb990] schedule at ffffffff8173fa27 150balbes#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec 150balbes#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 150balbes#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f 150balbes#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 150balbes#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 150balbes#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] 150balbes#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] 150balbes#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] 150balbes#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce 150balbes#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 150balbes#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f 150balbes#13 [ffff8813dedfbec0] kthread at ffffffff810a8428 150balbes#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 PID: 14127 TASK: ffff881455749c00 CPU: 11 COMMAND: "loop1" #0 [ffff88272f5af228] __schedule at ffffffff8173f405 150balbes#1 [ffff88272f5af280] schedule at ffffffff8173fa27 150balbes#2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e 150balbes#3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5 150balbes#4 [ffff88272f5af330] mutex_lock at ffffffff81742133 150balbes#5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio] 150balbes#6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd 150balbes#7 [ffff88272f5af470] shrink_zone at ffffffff811ad778 150balbes#8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34 150balbes#9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8 150balbes#10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3 150balbes#11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71 150balbes#12 [ffff88272f5af760] new_slab at ffffffff811f4523 150balbes#13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5 150balbes#14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b 150balbes#15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3 150balbes#16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3 150balbes#17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs] 150balbes#18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994 150balbes#19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs] 150balbes#20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop] 150balbes#21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop] 150balbes#22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c 150balbes#23 [ffff88272f5afec0] kthread at ffffffff810a8428 150balbes#24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 76e43c8ccaa35c30d5df853013561145a0f750a5 ] When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock. This may have to wait for fuse_iqueue::waitq.lock to be released by one of many places that take it with IRQs enabled. Since the IRQ handler may take kioctx::ctx_lock, lockdep reports that a deadlock is possible. Fix it by protecting the state of struct fuse_iqueue with a separate spinlock, and only accessing fuse_iqueue::waitq using the versions of the waitqueue functions which do IRQ-safe locking internally. Reproducer: #include <fcntl.h> #include <stdio.h> #include <sys/mount.h> #include <sys/stat.h> #include <sys/syscall.h> #include <unistd.h> #include <linux/aio_abi.h> int main() { char opts[128]; int fd = open("/dev/fuse", O_RDWR); aio_context_t ctx = 0; struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd }; struct iocb *cbp = &cb; sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd); mkdir("mnt", 0700); mount("foo", "mnt", "fuse", 0, opts); syscall(__NR_io_setup, 1, &ctx); syscall(__NR_io_submit, ctx, 1, &cbp); } Beginning of lockdep output: ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 5.3.0-rc5 150balbes#9 Not tainted ----------------------------------------------------- syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline] 000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline] 000000003590ceda (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825 and this task is already holding: 0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline] 0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline] 0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825 which would create a new lock dependency: (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.} but this new dependency connects a SOFTIRQ-irq-safe lock: (&(&ctx->ctx_lock)->rlock){..-.} [...] Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com Fixes: bfe4037 ("aio: implement IOCB_CMD_POLL") Cc: <stable@vger.kernel.org> # v4.19+ Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…_clear_bit [ Upstream commit fadcbd2901a0f7c8721f3bdb69eac95c272dc8ed ] We need to move "spin_lock_irq(&bitmap->counts.lock)" before unmap previous storage, otherwise panic like belows could happen as follows. [ 902.353802] sdl: detected capacity change from 1077936128 to 3221225472 [ 902.616948] general protection fault: 0000 [150balbes#1] SMP [snip] [ 902.618588] CPU: 12 PID: 33698 Comm: md0_raid1 Tainted: G O 4.14.144-1-pserver 150balbes#4.14.144-1.1~deb10 [ 902.618870] Hardware name: Supermicro SBA-7142G-T4/BHQGE, BIOS 3.00 10/24/2012 [ 902.619120] task: ffff9ae1860fc600 task.stack: ffffb52e4c704000 [ 902.619301] RIP: 0010:bitmap_file_clear_bit+0x90/0xd0 [md_mod] [ 902.619464] RSP: 0018:ffffb52e4c707d28 EFLAGS: 00010087 [ 902.619626] RAX: ffe8008b0d061000 RBX: ffff9ad078c87300 RCX: 0000000000000000 [ 902.619792] RDX: ffff9ad986341868 RSI: 0000000000000803 RDI: ffff9ad078c87300 [ 902.619986] RBP: ffff9ad0ed7a8000 R08: 0000000000000000 R09: 0000000000000000 [ 902.620154] R10: ffffb52e4c707ec0 R11: ffff9ad987d1ed44 R12: ffff9ad0ed7a8360 [ 902.620320] R13: 0000000000000003 R14: 0000000000060000 R15: 0000000000000800 [ 902.620487] FS: 0000000000000000(0000) GS:ffff9ad987d00000(0000) knlGS:0000000000000000 [ 902.620738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 902.620901] CR2: 000055ff12aecec0 CR3: 0000001005207000 CR4: 00000000000406e0 [ 902.621068] Call Trace: [ 902.621256] bitmap_daemon_work+0x2dd/0x360 [md_mod] [ 902.621429] ? find_pers+0x70/0x70 [md_mod] [ 902.621597] md_check_recovery+0x51/0x540 [md_mod] [ 902.621762] raid1d+0x5c/0xeb0 [raid1] [ 902.621939] ? try_to_del_timer_sync+0x4d/0x80 [ 902.622102] ? del_timer_sync+0x35/0x40 [ 902.622265] ? schedule_timeout+0x177/0x360 [ 902.622453] ? call_timer_fn+0x130/0x130 [ 902.622623] ? find_pers+0x70/0x70 [md_mod] [ 902.622794] ? md_thread+0x94/0x150 [md_mod] [ 902.622959] md_thread+0x94/0x150 [md_mod] [ 902.623121] ? wait_woken+0x80/0x80 [ 902.623280] kthread+0x119/0x130 [ 902.623437] ? kthread_create_on_node+0x60/0x60 [ 902.623600] ret_from_fork+0x22/0x40 [ 902.624225] RIP: bitmap_file_clear_bit+0x90/0xd0 [md_mod] RSP: ffffb52e4c707d28 Because mdadm was running on another cpu to do resize, so bitmap_resize was called to replace bitmap as below shows. PID: 38801 TASK: ffff9ad074a90e00 CPU: 0 COMMAND: "mdadm" [exception RIP: queued_spin_lock_slowpath+56] [snip] -- <NMI exception stack> -- 150balbes#5 [ffffb52e60f17c58] queued_spin_lock_slowpath at ffffffff9c0b27b8 150balbes#6 [ffffb52e60f17c58] bitmap_resize at ffffffffc0399877 [md_mod] 150balbes#7 [ffffb52e60f17d30] raid1_resize at ffffffffc0285bf9 [raid1] 150balbes#8 [ffffb52e60f17d50] update_size at ffffffffc038a31a [md_mod] 150balbes#9 [ffffb52e60f17d70] md_ioctl at ffffffffc0395ca4 [md_mod] And the procedure to keep resize bitmap safe is allocate new storage space, then quiesce, copy bits, replace bitmap, and re-start. However the daemon (bitmap_daemon_work) could happen even the array is quiesced, which means when bitmap_file_clear_bit is triggered by raid1d, then it thinks it should be fine to access store->filemap since counts->lock is held, but resize could change the storage without the protection of the lock. Cc: Jack Wang <jinpu.wang@cloud.ionos.com> Cc: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When Phicomm-n1 is under no load, the system load is around 2
Is there a patch that can be fixed?
The text was updated successfully, but these errors were encountered: