Please publish the complete WiFi source code kernel #15

Open
pgw308 opened this issue May 7, 2021 · 51 comments

Comments

@pgw308

pgw308 commented May 7, 2021

Why is the 5 GHz WiFi signal of a ROM built from your source code particularly weak, while the 5 GHz WiFi signal on OxygenOS is particularly strong? Please publish the complete or specially optimized WiFi code.

pgw308 changed the title from "Please post your modified WiFi kernel or code" to "Please publish the complete WiFi source code kernel" on May 7, 2021
@Hikari-no-Tenshi

I guess the problem is not just the kernel or the WiFi kernel modules.
I've tried a custom ROM with the Oxygen kernel and Oxygen WiFi modules and still had the same weak signal.
More likely the needed changes are in the system (wifi-service, wificond). The WiFi firmware may expect some commands from the system to configure its transmission power.

If OnePlus reads this, please leave a comment to confirm my guess.

@pgw308
Author

pgw308 commented May 7, 2021

I guess the problem is not just the kernel or the WiFi kernel modules.
I've tried a custom ROM with the Oxygen kernel and Oxygen WiFi modules and still had the same weak signal.
More likely the needed changes are in the system (wifi-service, wificond). The WiFi firmware may expect some commands from the system to configure its transmission power.

If OnePlus reads this, please leave a comment to confirm my guess.

Yes, there may be many other things that they haven’t released. I heard that this problem affects third-party firmware on the entire 7 series. WiFi on the 7 Pro, 7T, and 7T Pro has similar problems.

@Hikari-no-Tenshi

Because these changes are not part of the open source release, you can forget about them being published.
You will have to find a solution yourself.
Some time ago I tried to fix the 5GHz band power, but had no luck, so I dropped my attempts.
I'm 99% sure that wifi-service.jar controls the WiFi power and, with the help of vendorcmdtool, sends commands to the WiFi driver.

@pgw308
Author

pgw308 commented May 7, 2021

Because these changes are not part of the open source release, you can forget about them being published.
You will have to find a solution yourself.
Some time ago I tried to fix the 5GHz band power, but had no luck, so I dropped my attempts.
I'm 99% sure that wifi-service.jar controls the WiFi power and, with the help of vendorcmdtool, sends commands to the WiFi driver.

Wow! If that succeeds, the 5 GHz WiFi problem will be resolved, and you will have made a major contribution to the 7T custom ROM community.

@pgw308
Author

pgw308 commented May 7, 2021

Because these changes are not part of the open source release, you can forget about them being published.
You will have to find a solution yourself.
Some time ago I tried to fix the 5GHz band power, but had no luck, so I dropped my attempts.
I'm 99% sure that wifi-service.jar controls the WiFi power and, with the help of vendorcmdtool, sends commands to the WiFi driver.

And I found that not only is the 5 GHz WiFi signal weak, the 2.4 GHz WiFi signal is also a bit weak.

@luk1337

luk1337 commented May 7, 2021

@vl3550 @Hikari-no-Tenshi try this change https://review.lineageos.org/c/LineageOS/android_device_oneplus_sm8150-common/+/309551

@RealJohnGalt

@vl3550 @Hikari-no-Tenshi try this change https://review.lineageos.org/c/LineageOS/android_device_oneplus_sm8150-common/+/309551

No change whatsoever in 5GHz WLAN performance. It also appears to load the same firmware with or without this change on my HD1905.

@luk1337

luk1337 commented May 7, 2021

Run adb logcat | grep -i cnss and post the output.

@RealJohnGalt

Run adb logcat | grep -i cnss and post the output.

05-07 13:05:28.878 1236 1236 I cnss-daemon: nl80211 response handler invoked
05-07 13:05:28.878 1236 1236 I cnss-daemon: nl80211_response_handler: cmd 103, vendorID 4980, subcmd 13 received
These lines are spammed repeatedly.

@luk1337

luk1337 commented May 7, 2021

Start logging at the beginning of booting up.

@RealJohnGalt

Start logging at the beginning of booting up.

http://ix.io/3m5L

It seems it hung at some point; I will try to get a better log.

@luk1337

luk1337 commented May 7, 2021

05-07 13:07:21.767  1272  1526 I cnss-daemon: pj_id=4, hw_id=14, rf_id=2
05-07 13:07:21.767  1272  1526 I cnss-daemon: BDF file properties is set as : 0
05-07 13:07:21.767  1272  1526 I cnss-daemon: it is 18865, begin to load the 18865 BDF file
05-07 13:07:21.771  1272  1526 I cnss-daemon: wlfw_send_bdf_download_req: BDF file : 4bdwlan.b0a
05-07 13:07:21.775  1272  1526 I cnss-daemon: wlfw_send_bdf_download_req: bdf type 0,result 0, error 0

It loads the proper FW now.
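
For anyone repeating this check, the board data file (BDF) name can be pulled out of a saved logcat capture with a few lines of Python. This is only a sketch: the regular expression is modelled on the cnss-daemon lines quoted above, the function name `find_bdf_files` is made up for illustration, and other builds may log a different format.

```python
import re

# Matches the cnss-daemon line that names the BDF file it requests.
# The pattern is an assumption based on the log lines quoted in this thread.
BDF_LINE = re.compile(r"cnss-daemon: wlfw_send_bdf_download_req: BDF file : (\S+)")


def find_bdf_files(log_text: str) -> list:
    """Return every BDF file name reported in the log, in order of appearance."""
    return BDF_LINE.findall(log_text)


# Sample input copied from the log excerpt above.
sample = """\
05-07 13:07:21.767  1272  1526 I cnss-daemon: it is 18865, begin to load the 18865 BDF file
05-07 13:07:21.771  1272  1526 I cnss-daemon: wlfw_send_bdf_download_req: BDF file : 4bdwlan.b0a
05-07 13:07:21.775  1272  1526 I cnss-daemon: wlfw_send_bdf_download_req: bdf type 0,result 0, error 0
"""

print(find_bdf_files(sample))
```

Comparing the reported name before and after applying the change shows whether a different BDF is actually being requested.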

@luk1337

luk1337 commented May 7, 2021

You should now see that 2.4GHz TX power is increased ( 20 -> 23, check iw reg get )
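
The TX power check can be scripted as well. A minimal sketch, assuming `iw reg get` prints its rules in the usual `(start - end @ bandwidth), (gain, power)` form; the helper name `max_tx_power` and the sample text are invented for illustration and are not output from a real device.

```python
import re
from typing import Optional

# One regulatory rule line, e.g. "(2402 - 2472 @ 40), (N/A, 20), (N/A)".
# Captures the start frequency, end frequency (MHz) and the power limit (dBm).
RULE = re.compile(r"\((\d+) - (\d+) @ \d+\), \((?:N/A|\d+), (\d+)\)")


def max_tx_power(reg_text: str, low_mhz: int, high_mhz: int) -> Optional[int]:
    """Highest power limit among rules overlapping [low_mhz, high_mhz], or None."""
    powers = [
        int(power)
        for start, end, power in RULE.findall(reg_text)
        if int(start) <= high_mhz and int(end) >= low_mhz
    ]
    return max(powers) if powers else None


# Hypothetical `iw reg get` output; the numbers are placeholders, not measurements.
sample_reg = """\
global
country 00: DFS-UNSET
\t(2402 - 2472 @ 40), (N/A, 23), (N/A)
\t(5170 - 5250 @ 80), (N/A, 20), (N/A), AUTO-BW
"""

print("2.4GHz limit:", max_tx_power(sample_reg, 2400, 2500))
print("5GHz limit:", max_tx_power(sample_reg, 5000, 6000))
```

Running it against output captured before and after the change makes the 20 -> 23 difference on 2.4GHz easy to spot.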

@RealJohnGalt

RealJohnGalt commented May 7, 2021

You should now see that 2.4GHz TX power is increased ( 20 -> 23, check iw reg get )

Will do, apologies for the false alarm. I tested range with my drone, and as usual it was poor on 5ghz compared to my other device and my old OOS test.

Really nice work if this helps some people.

@gotenksIN

seems to be working nice for me, thanks :D

@luk1337

luk1337 commented May 7, 2021

BTW, it's kinda stupid that OnePlus hardcoded setting these props in /system/bin/init...
( my commit message originally said that it was QMI but after a closer look, it was actually init. )

@gotenksIN

ikr, apparently similar things with perms for input nodes
those actually broke prox with custom kernels even on oos xD

@luk1337

luk1337 commented May 7, 2021

Hmm...I thought OOS just used oneplus.sensor.infrared.proximity instead of android.sensor.proximity and that's why it didn't need chmod.

@gotenksIN

according to multiple posts on custom kernel threads (like https://forum.xda-developers.com/t/kernel-25-04-2021-4-14-231-android-11-kirisakura-1-1-6_r-for-op7-pro-aka-guacamole.3933916/post-84798999) it was broken until he added it in bootscripts to chmod

@luk1337

luk1337 commented May 7, 2021

Wouldn't be surprised if oss kernel was just slightly broken...

@luk1337

luk1337 commented May 7, 2021

btw @gotenksIN I assume that ALS is not working properly for you on non-OOS? I looked through various forks of my sm8150-common tree and sadly none of them had made any changes to ALS correction code...

@gotenksIN

yeah, although it does seem to be fairly consistent compared to q sensors hal, although nowhere near the improvements oos 11 ob2 and newer brought
tried different brightness configs and all but so far the only usable change would be to just increase the debounce values but I feel that it's too ugly :3
but oos does seem to have overlays in OdmOverlay-framework-res.apk with config_limitMinLux which we have no idea what it does but is probably related

@luk1337

luk1337 commented May 7, 2021

Well...hacking around brightness overlays is only working around the actual problem — ALS correction code just doesn't really work on 7T series devices. It works relatively ok on 7 Pro and maybe 7 but I don't have 7 so can't really talk much about it.

@idoybh

idoybh commented May 7, 2021

Well...hacking around brightness overlays is only working around the actual problem — ALS correction code just doesn't really work on 7T series devices. It works relatively ok on 7 Pro and maybe 7 but I don't have 7 so can't really talk much about it.

It works ok-ish on 7p. But it's still way worse than OOS

@luk1337

luk1337 commented May 7, 2021

Try to disassemble stock libsensorservice. I was amazed at how complex the whole thing was, there are like 20 variables controlling the whole correction code.

        android::String8::append(v2, "fusionlight args list:\n");
        android::String8::appendFormat(v2, "R_max: %f\n", *(float *)&R_max);
        android::String8::appendFormat(v2, "G_max: %f\n", *(float *)&G_max);
        android::String8::appendFormat(v2, "B_max: %f\n", *(float *)&B_max);
        android::String8::appendFormat(v2, "W_max: %f\n", *(float *)&W_max);
        android::String8::appendFormat(v2, "R_max_cal: %f\n", *(float *)&dword_400F4);
        android::String8::appendFormat(v2, "G_max_cal: %f\n", *(float *)&dword_400F8);
        android::String8::appendFormat(v2, "B_max_cal: %f\n", *(float *)&dword_400FC);
        android::String8::appendFormat(v2, "W_max_cal: %f\n", *(float *)&dword_40100);
        android::String8::appendFormat(v2, "R_Comp_1: %f\n", *(float *)&dword_40104);
        android::String8::appendFormat(v2, "R_Comp_2: %f\n", *(float *)&dword_40108);
        android::String8::appendFormat(v2, "R_Comp_3: %f\n", *(float *)&dword_4010C);
        android::String8::appendFormat(v2, "G_Comp_1: %f\n", *(float *)&dword_40114);
        android::String8::appendFormat(v2, "G_Comp_2: %f\n", *(float *)&dword_40118);
        android::String8::appendFormat(v2, "G_Comp_3: %f\n", *(float *)&dword_4011C);
        android::String8::appendFormat(v2, "B_Comp_1: %f\n", *(float *)&dword_40124);
        android::String8::appendFormat(v2, "B_Comp_2: %f\n", *(float *)&dword_40128);
        android::String8::appendFormat(v2, "B_Comp_3: %f\n", *(float *)&dword_4012C);
        android::String8::appendFormat(v2, "Greyscale_1: %f\n", *(float *)&dword_40134);
        android::String8::appendFormat(v2, "Greyscale_2: %f\n", *(float *)&dword_40138);
        android::String8::appendFormat(v2, "Greyscale_3: %f\n", *(float *)&dword_4013C);
        android::String8::appendFormat(v2, "W_Comp_1: %f\n", *(float *)&dword_40140);
        android::String8::appendFormat(v2, "W_Comp_2: %f\n", *(float *)&dword_40144);
        android::String8::appendFormat(v2, "W_Comp_3: %f\n", *(float *)&dword_40148);
        android::String8::appendFormat(v2, "rou_coe: %f\n", *(float *)&dword_40154);
        android::String8::appendFormat(v2, "threshold: %d\n", (unsigned int)dword_40264);
        return android::String8::appendFormat(v2, "level_cal_arg: %f\n", *(float *)&dword_40150);
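
To compare these parameters between builds, the dumped text could be turned into a dictionary. A rough sketch, assuming the dump is available as plain text in exactly the `name: value` layout implied by the format strings above; where that text is emitted, the function name `parse_fusionlight_dump`, and the numbers in the sample are all assumptions or made-up values, not data from a device.

```python
def parse_fusionlight_dump(text: str) -> dict:
    """Collect the "name: value" pairs that follow the fusionlight header line."""
    values = {}
    in_section = False
    for line in text.splitlines():
        line = line.strip()
        if line == "fusionlight args list:":
            in_section = True
            continue
        if not in_section:
            continue
        name, sep, raw = line.partition(": ")
        if not sep:
            break  # first line that is not "name: value" ends the section
        values[name] = float(raw)
    return values


# Made-up sample in the layout suggested by the format strings; values are placeholders.
sample_dump = """\
some unrelated line
fusionlight args list:
R_max: 1.500000
threshold: 12
level_cal_arg: 0.250000

another unrelated line
"""

for name, value in parse_fusionlight_dump(sample_dump).items():
    print(name, value)
```

This only reads the parameters back; it says nothing about how the correction logic uses them, which still has to be reverse engineered.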

@pgw308
Author

pgw308 commented May 8, 2021

So now the WiFi signal problem has been resolved. Is the issue of automatic brightness being discussed now?

@pgw308
Author

pgw308 commented May 8, 2021

Try to disassemble stock libsensorservice. I was amazed at how complex the whole thing was, there are like 20 variables controlling the whole correction code.

Does this mean that if you use this solution, the automatic brightness problem will be solved?

@luk1337

luk1337 commented May 8, 2021

No, this means that you can't read the most basic code. The actual logic has to be reverse engineered and it's too complex for me.

@Hikari-no-Tenshi

Last year I asked some guys for help reverse engineering the libsensorservice lib. They told me that this lib is built with flags that make reverse engineering very hard or impossible.
I also asked OnePlus to share the correction code (just in case). They politely refused.

The community has to make its own correction algorithm.
I tried to make something useful in this direction, but it is too complicated for my skills.

@Hikari-no-Tenshi

I think the correction must also take into account night mode, color profile, and DC dimming.
Someone with professional measuring tools for LCD display brightness could measure the brightness of the display in different modes to correct the OnePlus values.

@Hikari-no-Tenshi

I've tried to do something for night mode: Hikari-no-Tenshi/android_device_oneplus_guacamoleb-merged@e2e4da0
The code may be ugly, but I'm not a professional programmer.

@pgw308
Author

pgw308 commented May 8, 2021

I think the correction must also take into account night mode, color profile, and DC dimming.
Someone with professional measuring tools for LCD display brightness could measure the brightness of the display in different modes to correct the OnePlus values.

@YuHuang65 Yes, there is also a problem with the color profile. OnePlus does need to be corrected.

@gotenksIN

@vl3550 @Hikari-no-Tenshi try this change https://review.lineageos.org/c/LineageOS/android_device_oneplus_sm8150-common/+/309551

well another good outcome out of this was that now that we are loading proper wlan firmware, wifi aware and wifi rtt works perfectly, just enabled feature flags and props in yaap/device_oneplus_sm8150-common@dcf2bbe
image

@luk1337

luk1337 commented May 9, 2021

That app only checks if permission/feature is available...

@gotenksIN

gotenksIN commented May 9, 2021

well logs are pretty happy about it being enabled as well

05-09 13:33:35.039  1411  1411 I SystemServerTiming: StartRttService
05-09 13:33:35.040  1411  1411 I SystemServiceManager: Starting com.android.server.wifi.rtt.RttService
05-09 13:33:35.040  1411  1411 I RttService: Registering wifirtt
05-09 13:33:35.041  1411  1411 I SystemServerTiming: StartWifiAware
05-09 13:33:35.041  1411  1411 I SystemServiceManager: Starting com.android.server.wifi.aware.WifiAwareService
05-09 13:33:35.042  1411  1411 I WifiAwareService: Registering wifiaware
05-09 13:33:35.042  1411  1411 I SystemServerTiming: StartWifiP2P
05-09 13:33:35.042  1411  1411 I SystemServiceManager: Starting com.android.server.wifi.p2p.WifiP2pService

I haven't seen any crashes after that

@luk1337

luk1337 commented May 9, 2021

I'd look for some more detailed info than "services start". Granted I don't really know how to test Aware/RTT.

@gotenksIN

gotenksIN commented May 9, 2021

well last I tried, it spammed logs miserably about not being able to connect to the wifi interface we specified in the props.
will see if I can test it between devices that support it

Edit: taking a fresh log from boot

05-09 14:59:15.849  1444  1486 I ActivityManagerTiming: OnBootPhase_1000_com.android.server.wifi.aware.WifiAwareService
05-09 14:59:15.849  1444  2204 D WifiService: Handle boot completed
05-09 14:59:15.849  1444  1486 I WifiAwareService: Late initialization of Wi-Fi Aware service
05-09 14:59:16.288  1444  2245 D WIFI_AWARE_FACTORY: got request NetworkRequest [ TRACK_DEFAULT id=26, [ Capabilities: INTERNET&NOT_RESTRICTED&TRUSTED Uid: 10176 AdministratorUids: [] RequestorUid: 10176 RequestorPackageName: com.qualcomm.qti.cne] ] with score 50 and providerId 4

this is all I get apart from service started.

@gotenksIN

I'd look for some more detailed info than "services start". Granted I don't really know how to test Aware/RTT.

can check with https://play.google.com/store/apps/details?id=com.google.android.apps.location.rtt.wifinanscan, sadly I don't have any other device on hand which does support wifi aware

@luk1337

luk1337 commented May 9, 2021

None of my devices declare support for it...I could theoretically use two hacked OnePlus phones for it tho.

@luk1337

luk1337 commented May 9, 2021

Apparently, CtsVerifier has tests for WiFi Aware but it appears that I can't even pass mWifiAwareManager.isAvailable() check there...

@luk1337

luk1337 commented May 9, 2021

ok, after enabling some qcacld-3.0 debugging i got the following msg:
[ 20.945963] NAN separate vdev supported by host, not supported by firmware
so RIP Aware/NAN.
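
A saved kernel log can be checked for the same message with a short script. This is a sketch only: the message text is copied from the comment above, the function name `firmware_lacks_nan` is made up, and the message only appears once the extra qcacld-3.0 debugging mentioned above is enabled.

```python
# Message text as quoted in this thread; other driver versions may word it differently.
NAN_UNSUPPORTED = "NAN separate vdev supported by host, not supported by firmware"


def firmware_lacks_nan(kernel_log: str) -> bool:
    """True if any line of the kernel log carries the quoted NAN message."""
    return any(NAN_UNSUPPORTED in line for line in kernel_log.splitlines())


# Sample line copied from the comment above.
sample_log = "[ 20.945963] NAN separate vdev supported by host, not supported by firmware\n"

print(firmware_lacks_nan(sample_log))
```

A False result does not prove that the firmware supports NAN; it may only mean the debug message was never logged.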

@gotenksIN

gotenksIN commented May 9, 2021

F, so stock enables this for keks or are we missing more firmware? since caf seems to enable this by default on msmnile targets

@luk1337

luk1337 commented May 9, 2021

Stock does not have WiFi Aware/NAN.

@gotenksIN

I see, well thanks for info

@timocapa

@luk1337

luk1337 commented May 10, 2021

Unlikely.

elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Nov 21, 2021
https://bugzilla.kernel.org/show_bug.cgi?id=208565

PID: 257    TASK: ecdd0000  CPU: 0   COMMAND: "init"
  #0 [<c0b420ec>] (__schedule) from [<c0b423c8>]
  #1 [<c0b423c8>] (schedule) from [<c0b459d4>]
  #2 [<c0b459d4>] (rwsem_down_read_failed) from [<c0b44fa0>]
  #3 [<c0b44fa0>] (down_read) from [<c044233c>]
  #4 [<c044233c>] (f2fs_truncate_blocks) from [<c0442890>]
  #5 [<c0442890>] (f2fs_truncate) from [<c044d408>]
  #6 [<c044d408>] (f2fs_evict_inode) from [<c030be18>]
  #7 [<c030be18>] (evict) from [<c030a558>]
  #8 [<c030a558>] (iput) from [<c047c600>]
  #9 [<c047c600>] (f2fs_sync_node_pages) from [<c0465414>]
 #10 [<c0465414>] (f2fs_write_checkpoint) from [<c04575f4>]
 #11 [<c04575f4>] (f2fs_sync_fs) from [<c0441918>]
 #12 [<c0441918>] (f2fs_do_sync_file) from [<c0441098>]
 #13 [<c0441098>] (f2fs_sync_file) from [<c0323fa0>]
 #14 [<c0323fa0>] (vfs_fsync_range) from [<c0324294>]
 #15 [<c0324294>] (do_fsync) from [<c0324014>]
 #16 [<c0324014>] (sys_fsync) from [<c0108bc0>]

This can be caused by flush_dirty_inode() in f2fs_sync_node_pages() where
iput() requires f2fs_lock_op() again resulting in livelock.

Reported-by: Zhiguo Niu <Zhiguo.Niu@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Nov 21, 2021
This patch is to fix a crash:

 #3 [ffffb6580689f898] oops_end at ffffffffa2835bc2
 #4 [ffffb6580689f8b8] no_context at ffffffffa28766e7
 #5 [ffffb6580689f920] async_page_fault at ffffffffa320135e
    [exception RIP: f2fs_is_compressed_page+34]
    RIP: ffffffffa2ba83a2  RSP: ffffb6580689f9d8  RFLAGS: 00010213
    RAX: 0000000000000001  RBX: fffffc0f50b34bc0  RCX: 0000000000002122
    RDX: 0000000000002123  RSI: 0000000000000c00  RDI: fffffc0f50b34bc0
    RBP: ffff97e815a40178   R8: 0000000000000000   R9: ffff97e83ffc9000
    R10: 0000000000032300  R11: 0000000000032380  R12: ffffb6580689fa38
    R13: fffffc0f50b34bc0  R14: ffff97e825cbd000  R15: 0000000000000c00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffffb6580689f9d8] __is_cp_guaranteed at ffffffffa2b7ea98
 #7 [ffffb6580689f9f0] f2fs_submit_page_write at ffffffffa2b81a69
 #8 [ffffb6580689fa30] f2fs_do_write_meta_page at ffffffffa2b99777
 #9 [ffffb6580689fae0] __f2fs_write_meta_page at ffffffffa2b75f1a
 #10 [ffffb6580689fb18] f2fs_sync_meta_pages at ffffffffa2b77466
 #11 [ffffb6580689fc98] do_checkpoint at ffffffffa2b78e46
 #12 [ffffb6580689fd88] f2fs_write_checkpoint at ffffffffa2b79c29
 #13 [ffffb6580689fdd0] f2fs_sync_fs at ffffffffa2b69d95
 #14 [ffffb6580689fe20] sync_filesystem at ffffffffa2ad2574
 #15 [ffffb6580689fe30] generic_shutdown_super at ffffffffa2a9b582
 #16 [ffffb6580689fe48] kill_block_super at ffffffffa2a9b6d1
 #17 [ffffb6580689fe60] kill_f2fs_super at ffffffffa2b6abe1
 #18 [ffffb6580689fea0] deactivate_locked_super at ffffffffa2a9afb6
 #19 [ffffb6580689feb8] cleanup_mnt at ffffffffa2abcad4
 #20 [ffffb6580689fee0] task_work_run at ffffffffa28bca28
 #21 [ffffb6580689ff00] exit_to_usermode_loop at ffffffffa28050b7
 #22 [ffffb6580689ff38] do_syscall_64 at ffffffffa280560e
 #23 [ffffb6580689ff50] entry_SYSCALL_64_after_hwframe at ffffffffa320008c

This occurred when umount f2fs if enable F2FS_FS_COMPRESSION
with F2FS_IO_TRACE. Fixes it by adding IS_IO_TRACED_PAGE to check
validity of pid for page_private.

Signed-off-by: Yu Changchun <yuchangchun1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
srgrusso pushed a commit to BlissRoms-Devices/android_kernel_oneplus_sm8150 that referenced this issue Jan 31, 2022
…elect()

commit e0a2c28da11e2c2b963fc01d50acbf03045ac732 upstream.

In resp_mode_select() sanity check the block descriptor len to avoid UAF.

BUG: KASAN: use-after-free in resp_mode_select+0xa4c/0xb40 drivers/scsi/scsi_debug.c:2509
Read of size 1 at addr ffff888026670f50 by task scsicmd/15032

CPU: 1 PID: 15032 Comm: scsicmd Not tainted 5.15.0-01d0625 #15
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Call Trace:
 <TASK>
 dump_stack_lvl+0x89/0xb5 lib/dump_stack.c:107
 print_address_description.constprop.9+0x28/0x160 mm/kasan/report.c:257
 kasan_report.cold.14+0x7d/0x117 mm/kasan/report.c:443
 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report_generic.c:306
 resp_mode_select+0xa4c/0xb40 drivers/scsi/scsi_debug.c:2509
 schedule_resp+0x4af/0x1a10 drivers/scsi/scsi_debug.c:5483
 scsi_debug_queuecommand+0x8c9/0x1e70 drivers/scsi/scsi_debug.c:7537
 scsi_queue_rq+0x16b4/0x2d10 drivers/scsi/scsi_lib.c:1521
 blk_mq_dispatch_rq_list+0xb9b/0x2700 block/blk-mq.c:1640
 __blk_mq_sched_dispatch_requests+0x28f/0x590 block/blk-mq-sched.c:325
 blk_mq_sched_dispatch_requests+0x105/0x190 block/blk-mq-sched.c:358
 __blk_mq_run_hw_queue+0xe5/0x150 block/blk-mq.c:1762
 __blk_mq_delay_run_hw_queue+0x4f8/0x5c0 block/blk-mq.c:1839
 blk_mq_run_hw_queue+0x18d/0x350 block/blk-mq.c:1891
 blk_mq_sched_insert_request+0x3db/0x4e0 block/blk-mq-sched.c:474
 blk_execute_rq_nowait+0x16b/0x1c0 block/blk-exec.c:63
 sg_common_write.isra.18+0xeb3/0x2000 drivers/scsi/sg.c:837
 sg_new_write.isra.19+0x570/0x8c0 drivers/scsi/sg.c:775
 sg_ioctl_common+0x14d6/0x2710 drivers/scsi/sg.c:941
 sg_ioctl+0xa2/0x180 drivers/scsi/sg.c:1166
 __x64_sys_ioctl+0x19d/0x220 fs/ioctl.c:52
 do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:50
 entry_SYSCALL_64_after_hwframe+0x44/0xae arch/x86/entry/entry_64.S:113

Link: https://lore.kernel.org/r/1637262208-28850-1-git-send-email-george.kennedy@oracle.com
Reported-by: syzkaller <syzkaller@googlegroups.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Jul 17, 2022
[ Upstream commit 4224cfd7fb6523f7a9d1c8bb91bb5df1e38eb624 ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Jul 17, 2022
commit 6d5aa418b3bd42cdccc36e94ee199af423ef7c84 upstream.

The reference to `explicit_in_reply_to` is pointless as when the
reference was added in the form of "#15" [1], Section 15) was "The
canonical patch format".
The reference of "#15" had not been properly updated in a couple of
reorganizations during the plain-text SubmittingPatches era.

Fix it by using `the_canonical_patch_format`.

[1]: 2ae19ac ("Documentation: Add "how to write a good patch summary" to SubmittingPatches")

Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Fixes: 5903019 ("Documentation/SubmittingPatches: convert it to ReST markup")
Fixes: 9b2c767 ("Documentation/SubmittingPatches: enrich the Sphinx output")
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: stable@vger.kernel.org # v4.9+
Link: https://lore.kernel.org/r/64e105a5-50be-23f2-6cae-903a2ea98e18@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Jul 17, 2022
commit 050133e1aa2cb49bb17be847d48a4431598ef562 upstream.

commit 0622cab ("bonding: fix 802.3ad aggregator reselection"),
resolve case, when there is several aggregation groups in the same bond.
bond_3ad_unbind_slave will invalidate (clear) aggregator when
__agg_active_ports return zero. So, ad_clear_agg can be executed even, when
num_of_ports!=0. Than bond_3ad_unbind_slave can be executed again for,
previously cleared aggregator. NOTE: at this time bond_3ad_unbind_slave
will not update slave ports list, because lag_ports==NULL. So, here we
got slave ports, pointing to freed aggregator memory.

Fix with checking actual number of ports in group (as was before
commit 0622cab ("bonding: fix 802.3ad aggregator reselection") ),
before ad_clear_agg().

The KASAN logs are as follows:

[  767.617392] ==================================================================
[  767.630776] BUG: KASAN: use-after-free in bond_3ad_state_machine_handler+0x13dc/0x1470
[  767.638764] Read of size 2 at addr ffff00011ba9d430 by task kworker/u8:7/767
[  767.647361] CPU: 3 PID: 767 Comm: kworker/u8:7 Tainted: G           O 5.15.11 #15
[  767.655329] Hardware name: DNI AmazonGo1 A7040 board (DT)
[  767.660760] Workqueue: lacp_1 bond_3ad_state_machine_handler
[  767.666468] Call trace:
[  767.668930]  dump_backtrace+0x0/0x2d0
[  767.672625]  show_stack+0x24/0x30
[  767.675965]  dump_stack_lvl+0x68/0x84
[  767.679659]  print_address_description.constprop.0+0x74/0x2b8
[  767.685451]  kasan_report+0x1f0/0x260
[  767.689148]  __asan_load2+0x94/0xd0
[  767.692667]  bond_3ad_state_machine_handler+0x13dc/0x1470

Fixes: 0622cab ("bonding: fix 802.3ad aggregator reselection")
Co-developed-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20220629012914.361-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Dec 9, 2023
[ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ]

The following call trace shows a deadlock issue due to recursive locking of
mutex "device_mutex". First lock acquire is in target_for_each_device() and
second in target_free_device().

 PID: 148266   TASK: ffff8be21ffb5d00  CPU: 10   COMMAND: "iscsi_ttx"
  #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f
  #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224
  #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee
  #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7
  #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3
  #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c
  #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod]
  #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod]
  #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f
  #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583
 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod]
 #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc
 #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod]
 #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod]
 #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod]
 #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod]
 #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07
 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod]
 #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod]
 #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080
 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364
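
The recursive acquisition in the trace above can be modeled in user space. This is only a sketch, not the kernel code: `init_device_mutex()`, `free_device()` and `for_each_device()` are hypothetical stand-ins for the paths named in the trace, and a POSIX error-checking mutex is assumed so that the second lock attempt returns `EDEADLK` instead of blocking forever as a non-recursive kernel mutex would.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's device_mutex. */
static pthread_mutex_t device_mutex;

/* Create device_mutex as an error-checking mutex so that a recursive
 * lock attempt is reported (EDEADLK) rather than hanging the caller. */
static int init_device_mutex(void)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!ret)
        ret = pthread_mutex_init(&device_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Models target_free_device(): takes device_mutex on its own. */
static int free_device(void)
{
    int ret = pthread_mutex_lock(&device_mutex);

    if (ret)
        return ret; /* EDEADLK if the calling thread already holds it */
    pthread_mutex_unlock(&device_mutex);
    return 0;
}

/* Models target_for_each_device(): holds device_mutex across the callback. */
static int for_each_device(int (*fn)(void))
{
    int ret;

    assert(fn != NULL);
    ret = pthread_mutex_lock(&device_mutex);
    if (ret)
        return ret;
    ret = fn(); /* the callback may end up freeing a device */
    pthread_mutex_unlock(&device_mutex);
    return ret;
}
```

Calling `free_device()` directly succeeds, while `for_each_device(free_device)` reports `EDEADLK`, which corresponds to the point where the kernel thread in the trace stops in `__mutex_lock`.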

Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Dec 18, 2023
[ Upstream commit 0b0747d507bffb827e40fc0f9fb5883fffc23477 ]

The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations; it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Dec 18, 2023
[ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ]

The following call trace shows a deadlock issue due to recursive locking of
mutex "device_mutex". First lock acquire is in target_for_each_device() and
second in target_free_device().

 PID: 148266   TASK: ffff8be21ffb5d00  CPU: 10   COMMAND: "iscsi_ttx"
  #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f
  #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224
  #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee
  #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7
  #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3
  #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c
  #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod]
  #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod]
  #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f
  #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583
 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod]
 #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc
 #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod]
 #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod]
 #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod]
 #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod]
 #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07
 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod]
 #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod]
 #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080
 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364

Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue May 1, 2024
…s_del_by_dev()

[ Upstream commit 01a564bab4876007ce35f312e16797dfe40e4823 ]

I got the below warning trace:

WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
Call Trace:
 rtnl_dellink
 rtnetlink_rcv_msg
 netlink_rcv_skb
 netlink_unicast
 netlink_sendmsg
 __sock_sendmsg
 ____sys_sendmsg
 ___sys_sendmsg
 __sys_sendmsg
 do_syscall_64
 entry_SYSCALL_64_after_hwframe

It can be reproduced via:

    ip netns add ns1
    ip netns exec ns1 ip link add bond0 type bond mode 0
    ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
    ip netns exec ns1 ip link set bond_slave_1 master bond0
[1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
[2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
[3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
[4] ip netns exec ns1 ip link set bond_slave_1 nomaster
[5] ip netns exec ns1 ip link del veth2
    ip netns del ns1

This is all caused by command [1] turning off the rx-vlan-filter function
of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
[2] [3] add the same vid to slave and master respectively, causing
command [4] to empty slave->vlan_info. The following command [5] triggers
this problem.

To fix this problem, we should add VLAN_FILTER feature checks in
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
addition or deletion of vlan_vid information.
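
The effect of such a feature check can be sketched with a small user-space model. This is not the upstream patch: `struct model_dev` and the `model_*` helpers are hypothetical, a per-device boolean stands in for the rx-vlan-filter feature, and per-vid reference counts stand in for `vlan_info`.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_VID 16u /* hypothetical small vid space for the model */

/* Hypothetical, heavily simplified stand-in for a net device. */
struct model_dev {
    bool rx_vlan_filter;                   /* models the rx-vlan-filter feature */
    unsigned int vid_refs[MODEL_MAX_VID];  /* per-vid reference counts */
};

static void model_vid_add(struct model_dev *dev, unsigned int vid)
{
    assert(vid < MODEL_MAX_VID);
    dev->vid_refs[vid]++;
}

static void model_vid_del(struct model_dev *dev, unsigned int vid)
{
    assert(vid < MODEL_MAX_VID);
    if (dev->vid_refs[vid] > 0)
        dev->vid_refs[vid]--;
}

/* Copy by_dev's vids onto dev, but only if by_dev filters VLANs. */
static void model_vids_add_by_dev(struct model_dev *dev,
                                  const struct model_dev *by_dev)
{
    unsigned int vid;

    if (!by_dev->rx_vlan_filter)
        return; /* the feature check: nothing is propagated */
    for (vid = 0; vid < MODEL_MAX_VID; vid++)
        if (by_dev->vid_refs[vid])
            model_vid_add(dev, vid);
}

/* Drop by_dev's vids from dev under the same condition, so a delete
 * never removes a reference that the matching add did not take. */
static void model_vids_del_by_dev(struct model_dev *dev,
                                  const struct model_dev *by_dev)
{
    unsigned int vid;

    if (!by_dev->rx_vlan_filter)
        return;
    for (vid = 0; vid < MODEL_MAX_VID; vid++)
        if (by_dev->vid_refs[vid])
            model_vid_del(dev, vid);
}
```

In this model, with filtering off on the bond, releasing the slave leaves the slave's own reference on vid 0 in place, which is the state that command [4] destroyed before the fix.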

Fixes: 348a144 ("vlan: introduce functions to do mass addition/deletion of vids by another device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Jun 26, 2024
…s_del_by_dev()

[ Upstream commit 01a564bab4876007ce35f312e16797dfe40e4823 ]

I got the below warning trace:

WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
Call Trace:
 rtnl_dellink
 rtnetlink_rcv_msg
 netlink_rcv_skb
 netlink_unicast
 netlink_sendmsg
 __sock_sendmsg
 ____sys_sendmsg
 ___sys_sendmsg
 __sys_sendmsg
 do_syscall_64
 entry_SYSCALL_64_after_hwframe

It can be reproduced via:

    ip netns add ns1
    ip netns exec ns1 ip link add bond0 type bond mode 0
    ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
    ip netns exec ns1 ip link set bond_slave_1 master bond0
[1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
[2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
[3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
[4] ip netns exec ns1 ip link set bond_slave_1 nomaster
[5] ip netns exec ns1 ip link del veth2
    ip netns del ns1

This is all caused by command [1] turning off the rx-vlan-filter function
of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
[2] [3] add the same vid to slave and master respectively, causing
command [4] to empty slave->vlan_info. The following command [5] triggers
this problem.

To fix this problem, we should add VLAN_FILTER feature checks in
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
addition or deletion of vlan_vid information.

Fixes: 348a144 ("vlan: introduce functions to do mass addition/deletion of vids by another device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Jun 26, 2024
…s_del_by_dev()

[ Upstream commit 01a564bab4876007ce35f312e16797dfe40e4823 ]

I got the below warning trace:

WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
Call Trace:
 rtnl_dellink
 rtnetlink_rcv_msg
 netlink_rcv_skb
 netlink_unicast
 netlink_sendmsg
 __sock_sendmsg
 ____sys_sendmsg
 ___sys_sendmsg
 __sys_sendmsg
 do_syscall_64
 entry_SYSCALL_64_after_hwframe

It can be reproduced via:

    ip netns add ns1
    ip netns exec ns1 ip link add bond0 type bond mode 0
    ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
    ip netns exec ns1 ip link set bond_slave_1 master bond0
[1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
[2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
[3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
[4] ip netns exec ns1 ip link set bond_slave_1 nomaster
[5] ip netns exec ns1 ip link del veth2
    ip netns del ns1

This is all caused by command [1] turning off the rx-vlan-filter function
of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
[2] [3] add the same vid to slave and master respectively, causing
command [4] to empty slave->vlan_info. The following command [5] triggers
this problem.

To fix this problem, we should add VLAN_FILTER feature checks in
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
addition or deletion of vlan_vid information.

Fixes: 348a144 ("vlan: introduce functions to do mass addition/deletion of vids by another device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit e1e5104)
[Vegard: update vlan_hw_filter_capable() calls to work around the fact
 that we don't have commit 9daae9b
 ("net: Call add/kill vid ndo on vlan filter feature toggling") from
 v4.17.]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
elginsk8r pushed a commit to elginsk8r/android_kernel_oneplus_sm8150 that referenced this issue Jun 26, 2024
[ Upstream commit f8bbc07ac535593139c875ffa19af924b1084540 ]

vhost_worker will call tun callbacks to receive packets. If too many
illegal packets arrive, tun_do_read will keep dumping packet contents.
When the console is enabled, it will cost much more CPU time to dump
the packets and a soft lockup will be detected.

net_ratelimit mechanism can be used to limit the dumping rate.
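
The idea behind rate-limited dumping can be sketched in user space as an interval/burst window. This is a simplified model and not the kernel's implementation: `struct model_ratelimit` and `model_ratelimit()` are hypothetical names, and time is passed in explicitly as caller-defined ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical interval/burst rate limiter state. */
struct model_ratelimit {
    unsigned long interval; /* window length in caller-defined ticks */
    unsigned int burst;     /* messages allowed per window */
    unsigned long begin;    /* tick at which the current window started */
    unsigned int printed;   /* messages allowed in the current window */
    unsigned int missed;    /* messages suppressed so far */
    bool started;           /* false until the first call */
};

/* Returns true if the caller may emit a message at tick 'now'. */
static bool model_ratelimit(struct model_ratelimit *rs, unsigned long now)
{
    assert(rs != NULL);

    /* Open a new window on first use or once the interval has elapsed. */
    if (!rs->started || now - rs->begin >= rs->interval) {
        rs->begin = now;
        rs->printed = 0;
        rs->started = true;
    }

    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }

    rs->missed++; /* over budget: suppress the dump */
    return false;
}
```

A caller would guard the expensive dump with `if (model_ratelimit(&rs, now))`, so a flood of bad packets produces at most `burst` dumps per window instead of one per packet.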

PID: 33036    TASK: ffff949da6f20000  CPU: 23   COMMAND: "vhost-32980"
 #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253
 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3
 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e
 #3 [fffffe00003fced0] do_nmi at ffffffff8922660d
 #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663
    [exception RIP: io_serial_in+20]
    RIP: ffffffff89792594  RSP: ffffa655314979e8  RFLAGS: 00000002
    RAX: ffffffff89792500  RBX: ffffffff8af428a0  RCX: 0000000000000000
    RDX: 00000000000003fd  RSI: 0000000000000005  RDI: ffffffff8af428a0
    RBP: 0000000000002710   R8: 0000000000000004   R9: 000000000000000f
    R10: 0000000000000000  R11: ffffffff8acbf64f  R12: 0000000000000020
    R13: ffffffff8acbf698  R14: 0000000000000058  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594
 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470
 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6
 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605
 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558
 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124
 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07
 #12 [ffffa65531497b68] printk at ffffffff89318306
 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765
 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun]
 #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun]
 #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net]
 #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost]
 #18 [ffffa65531497f10] kthread at ffffffff892d2e72
 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f

Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors")
Signed-off-by: Lei Chen <lei.chen@smartx.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 68459b8e3ee554ce71878af9eb69659b9462c588)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>