Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

where to ask for support? #5

Open
jobenvil opened this issue May 29, 2016 · 35 comments
Open

where to ask for support? #5

jobenvil opened this issue May 29, 2016 · 35 comments

Comments

@jobenvil
Copy link

Hi @tobetter where is the place to post or ask for help where you are active?
I would ask which kernel starts to supports uas on odroidxu4? I see on def_conf is istill uncommented. I have usb device controller (JMicron 567) which supports scsi + uas. Thanks!

@tobetter
Copy link
Owner

@jobenvil it's ok to post here your questions or drop me an email any time. As of now, I've not tested UAS yet. So no proof XU4 kernel can support it perfectly as of now.

@jobenvil
Copy link
Author

Thanks Kim.

Enabling in .config CONFIG_USB_UAS=y I have no succes, even with 4.7.-rc.2-next.

I bought two different USB-SATA-Adapter which should support uas:

  • JMicron USA Technology Corp. JMS567
root@hiperborea ~ # lsusb
Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 005 Device 002: ID 0bda:8153 Realtek Semiconductor Corp.
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 004: ID 152d:3562 JMicron Technology Corp. / JMicron USA Technology Corp. JMS567 SATA 6Gb/s bridge
Bus 004 Device 003: ID 05e3:0616 Genesys Logic, Inc. hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

root@hiperborea ~ # lsusb -vvv -d   152d:3562 | grep -ie 'id\|maxpower\|streams\|protocol'
Bus 004 Device 004: ID 152d:3562 JMicron Technology Corp. / JMicron USA Technology Corp. JMS567 SATA 6Gb/s bridge
  bDeviceProtocol         0
  idVendor           0x152d JMicron Technology Corp. / JMicron USA Technology Corp.
  idProduct          0x3562 JMS567 SATA 6Gb/s bridge
    MaxPower                2mA
      bInterfaceProtocol     80 Bulk-Only
      bInterfaceProtocol     98
        MaxStreams             32
        MaxStreams             32
        MaxStreams             32

and

  • ASMedia Technology Inc. ASM1153 SATA 3Gb/s bridge
root@hiperborea ~ # lsusb
Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 005 Device 003: ID 0bda:8153 Realtek Semiconductor Corp.
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 003: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
Bus 004 Device 002: ID 05e3:0616 Genesys Logic, Inc. hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


root@hiperborea ~ # lsusb -vvv -d  174c:55aa | grep -ie 'id\|maxpower\|streams\|protocol'
Bus 004 Device 003: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
  bDeviceProtocol         0
  idVendor           0x174c ASMedia Technology Inc.
  idProduct          0x55aa ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge
    MaxPower                0mA
      bInterfaceProtocol     80 Bulk-Only
      bInterfaceProtocol     98
        MaxStreams             32
        MaxStreams             32
        MaxStreams             32

but kernel messages indicate that uas will not work and therefore Bulk-Only protocol will rule:

[    4.404523] usb 4-1.1: new SuperSpeed USB device number 3 using xhci-hcd
[    4.431002] usb 4-1.1: New USB device found, idVendor=174c, idProduct=55aa
[    4.441566] usb 4-1.1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[    4.452431] usb 4-1.1: Product: ASMT1051
[    4.459905] usb 4-1.1: Manufacturer: asmedia
[    4.467710] usb 4-1.1: SerialNumber: 12345678DCA3
[    4.479543] usb 4-1.1: USB controller xhci-hcd.2.auto does not support streams, which are required by the UAS driver.
[    4.493897] usb 4-1.1: Please try an other USB controller if you wish to use UAS.
[    4.505279] usb 4-1.1: USB controller xhci-hcd.2.auto does not support streams, which are required by the UAS driver.
[    4.519732] usb 4-1.1: Please try an other USB controller if you wish to use UAS.

According to MaxPower = 0 mW and uas-detect.h , I should have the ASM1153. Some hint or direction to dig? Thanks in advance.

@tobetter
Copy link
Owner

Hey @jobenvil,

What's the product of ASM1153?
Any reason you have to use ASM1153, not JMS567?

@jobenvil
Copy link
Author

jobenvil commented Jun 14, 2016

No special reason, I just bought to test them because they have USB attached SCSI support.

Update: With iotf master branch 4.6.0+ are recogniced both adapters like USB Attached SCSI / uas:

  • Example lsusb -v JM567:
Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           4
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     98
      iInterface             10 MSC USB Attached SCSI

dmesg JM567

[    8.665627] usb 4-1.1: new SuperSpeed USB device number 3 using xhci-hcd
[    8.686851] usb 4-1.1: New USB device found, idVendor=152d, idProduct=3562
[    8.692283] usb 4-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    8.699563] usb 4-1.1: Product: AD TO BE II
[    8.703713] usb 4-1.1: Manufacturer: ADMKIV
[    8.707874] usb 4-1.1: SerialNumber: DB123456789628
[    8.724382] scsi host0: uas
[    8.727286] scsi 0:0:0:0: Direct-Access     ADplus   SuperVer         6302 PQ: 0 ANSI: 6
[    8.767014] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/466 GiB)
[    8.767224] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    8.778375] sd 0:0:0:0: [sda] 4096-byte physical blocks
[    8.784305] sd 0:0:0:0: [sda] Write Protect is off
[    8.788308] sd 0:0:0:0: [sda] Mode Sense: 53 00 00 08
[    8.788680] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    8.881738]  sda: sda1 sda2
[    8.885725] sd 0:0:0:0: [sda] Attached SCSI disk
...
...
[   12.901106] usbcore: registered new interface driver r8152
[   12.931370] usbcore: registered new interface driver cdc_ether
[   13.098440] usb 6-1: reset SuperSpeed USB device number 2 using xhci-hcd
[   13.185062] r8152 6-1:1.0 eth0: v1.08.3
[   13.402296] usb 4-1.1: USB disconnect, device number 3
[   13.421351] sd 0:0:0:0: [sda] tag#0 uas_zap_pending 0 uas-tag 1 inflight: CMD
[   13.421376] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 10 00 00 f0 00
[   13.421407] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
[   13.421420] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 10 00 00 f0 00
[   13.421430] blk_update_request: I/O error, dev sda, sector 272
[   13.426899] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   13.565685] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x07 driverbyte=0x00
[   17.135625] usb 4-1.1: new SuperSpeed USB device number 4 using xhci-hcd
[   17.151751] usb 4-1.1: New USB device found, idVendor=152d, idProduct=3562
[   17.151765] usb 4-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   17.151774] usb 4-1.1: Product: AD TO BE II
[   17.151783] usb 4-1.1: Manufacturer: ADMKIV
[   17.151793] usb 4-1.1: SerialNumber: DB123456789628
[   17.155010] scsi host0: uas
[   17.157219] scsi 0:0:0:0: Direct-Access     ADplus   SuperVer         6302 PQ: 0 ANSI: 6
[   17.206579] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   17.206749] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/466 GiB)
[   17.206769] sd 0:0:0:0: [sda] 4096-byte physical blocks
[   17.207577] sd 0:0:0:0: [sda] Write Protect is off
[   17.207593] sd 0:0:0:0: [sda] Mode Sense: 53 00 00 08
[   17.207997] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   17.269708]  sda: sda1 sda2
[   17.272640] sd 0:0:0:0: [sda] Attached SCSI disk


  • ASMedia ASM1153:
[    3.534477] usb 4-1.1: new SuperSpeed USB device number 3 using xhci-hcd
[    3.555420] usb 4-1.1: New USB device found, idVendor=174c, idProduct=55aa
[    3.561388] usb 4-1.1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[    3.568613] usb 4-1.1: Product: ASMT1051
[    3.572096] usb 4-1.1: Manufacturer: asmedia
[    3.576515] usb 4-1.1: SerialNumber: 12345678DCA3
[    3.593457] scsi host0: uas
[    3.595907] scsi 0:0:0:0: Direct-Access     ASMT     2115             0    PQ: 0 ANSI: 6
[    3.615155] random: nonblocking pool is initialized
[    3.646398] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    3.646614] sd 0:0:0:0: [sda] Spinning up disk...
[    4.024476] raid6: int32x1  gen()   429 MB/s
[    4.109377] raid6: int32x1  xor()   325 MB/s
[    4.194421] raid6: int32x2  gen()   625 MB/s
[    4.279385] raid6: int32x2  xor()   471 MB/s
[    4.364420] raid6: int32x4  gen()   719 MB/s
[    4.449358] raid6: int32x4  xor()   581 MB/s
[    4.534393] raid6: int32x8  gen()   947 MB/s
[    4.619338] raid6: int32x8  xor()   592 MB/s
[    4.622125] raid6: using algorithm int32x8 gen() 947 MB/s
[    4.627518] raid6: .... xor() 592 MB/s, rmw enabled
[    4.632370] raid6: using intx1 recovery algorithm
[    4.637818] xor: measuring software checksum speed
[    4.649400] .
[    4.689356]    arm4regs  :  2943.200 MB/sec
[    4.739369]    8regs     :  2458.400 MB/sec
[    4.789342]    32regs    :  2351.200 MB/sec
[    4.792046] xor: using function: arm4regs (2943.200 MB/sec)
[    4.810989] Btrfs loaded, assert=on
[    5.654403] ....ready
[    8.670778] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/466 GiB)
[    8.677382] sd 0:0:0:0: [sda] Write Protect is off
[    8.681583] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
[    8.681878] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    8.800104]  sda: sda1 sda2
[    8.803908] sd 0:0:0:0: [sda] Attached SCSI disk

and corresponding lsusb -v:

    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           4
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     98
      iInterface              0

I cannot update you about the uas performance compared with other kernels since I don't have a stable environment to test them. When I have it, I will update you. You may like close the issue.

@tobetter
Copy link
Owner

@jobenvil Ok, thank you for update. I recently got a board of JMicro, different chipset with yours. I was planning to adapt it to XU4. FYI, I've started to create a branch with v4.6 recently, hope it could work.
https://github.com/tobetter/linux/commits/odroidxu4-v4.6

@jobenvil
Copy link
Author

FYI, I've started to create a branch with v4.6 recently, hope it could work.
https://github.com/tobetter/linux/commits/odroidxu4-v4.6

@tobetter thank you, this will be very usefull since I'm still testing.

@uDude
Copy link

uDude commented Jun 14, 2016

Ams1053 works well but does not have pm.
On Jun 14, 2016 5:02 AM, "José Benlloch" notifications@github.com wrote:

FYI, I've started to create a branch with v4.6 recently, hope it could
work.
https://github.com/tobetter/linux/commits/odroidxu4-v4.6

@tobetter https://github.com/tobetter thank you, this will be very
usefull since I'm still testing.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeCKi486kmRnFYru-TlcslWieyyKQks5qLopOgaJpZM4IpQzo
.

@tobetter
Copy link
Owner

@uDude could you share more detail about your running environment? Like kernel version and platform?
@jobenvil would you mind to share your expectation in terms of the performance and how you are planning to measure it?

@uDude
Copy link

uDude commented Jun 14, 2016

Sure,

I have xu4 to an asmedia 1053 based chip set (
http://www.datoptic.com/ec/usb3-to-sata3-6gb-s-bridge.html). This connects
to western digital red 3T disks.

Kernel is the tobetter 4.2.x.

I get about 144 MB/s write speed.

I am using CloudDB, Ubuntu 16.04. These are server only so I cannot
comment on sound or video.
On Jun 14, 2016 6:48 AM, "Dongjin Kim" notifications@github.com wrote:

@uDude https://github.com/uDude could you share more detail about your
running environment? Like kernel version and platform?
@jobenvil https://github.com/jobenvil would you mind to share your
expectation in terms of the performance and how you are planning to measure
it?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeIHW0UgS9a1G1HovuTWpzgLr0KBiks5qLqMFgaJpZM4IpQzo
.

@jobenvil
Copy link
Author

@tobetter I read some time ago this sunxi article regarding UASP and found interesting why not to get enabled in odroidxu4 if is only a matter of find out a working UAS adapter. After that, @tkaiser from ARMBIAN asked in ODROID Forum for check the uas performance in OdroidXU4 devices. This is the history behind.

and how you are planning to measure it?

Don't expect more than what I know, like hdparm, dd, iozone, HELIOS LanTest. Basically blocking the uas on kernel module load. @uDude if you have uas enabled, can you try blacklisting the uas module and test performance w/o?

would you mind to share your expectation in terms of the performance

UAS is really performing better than w/o. Check the notes here. On Lemaker Banana Pro running Kernel 4.6.2 / Ubuntu 16.04 using the USB2.0 ports with JM567 I got around 40MB/S for write and read, what it is worth to consider the re-use of the USB2.0 ports again.

@uDude
Copy link

uDude commented Jun 15, 2016

Hmmmm,

I do my builds on the XU4 and somewhere in the recent commits the kernel
compile is broken. See below:

drivers/phy/phy-exynos5-usbdrd.c: In function
‘exynos5420_usbdrd_phy_calibrate’:
drivers/phy/phy-exynos5-usbdrd.c:652:10: warning: ‘return’ with a value, in
function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c:666:10: warning: ‘return’ with a value, in
function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c:699:9: warning: ‘return’ with a value, in
function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c: At top level:
drivers/phy/phy-exynos5-usbdrd.c:804:28: error: initialization from
incompatible pointer type [-Werror=incompatible-pointer-types]
.phy_exynos_calibrate = exynos5420_usbdrd_phy_calibrate,
^
drivers/phy/phy-exynos5-usbdrd.c:804:28: note: (near initialization for
‘exynos5420_usbdrd_phy.phy_exynos_calibrate’)

I didn't think the aufs patch updated the usbdrd. Regardless, when I get a
chance I have a separate branch that compiles that I'll enable the
requested changes on and let you know. I want to do the speed tests
against identical kernels with only that one difference.

I had terrible performance with some jmicro chipsets, but the ASMedia
1053 (no port-multiplier) works great for me.

It will be a few days be patient and I'll get you the requested
comparisons. Note my builds giving 140 MB/s (not 140 Mb/s) are using the
@tobetter defconfig.

uDude

On Tue, Jun 14, 2016 at 10:03 PM, José Benlloch notifications@github.com
wrote:

@tobetter https://github.com/tobetter I read some time ago this sunxi
article regarding UASP http://linux-sunxi.org/USB/UAS and found
interesting why not to get enabled in odroidxu4 if is only a matter of find
out a working UAS adapter. After that, @tkaiser from ARMBIAN asked in
ODROID Forum http://forum.odroid.com/viewtopic.php?f=97&t=17349 for
check the uas performance in OdroidXU4 devices. This is the history behind.

and how you are planning to measure it?

Don't expect more than what I know, like hdparm, dd, iozone, HELIOS
LanTest. Basically blocking the uas on kernel module load. @uDude
https://github.com/uDude if you have uas enabled, can you try
blacklisting the uas module and test performance w/o?

would you mind to share your expectation in terms of the performance

UAS is really performing better than w/o. Check the notes here.
http://pastebin.com/aGZyeryT On Lemaker Banana Pro running Kernel 4.6.2
/ Ubuntu 16.04 using the USB2.0 ports with JM567 I got around 40MB/S for
write and read, what it is worth to consider the re-use of the USB2.0 ports
again.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeIKll4Ek5BRzRsT8e0iTI74tAZKsks5qLyUzgaJpZM4IpQzo
.

@uDude
Copy link

uDude commented Jun 15, 2016

Change the return value of the function to static int from static void and
all is well. I say that w/o testing only that the build completes.

On Wed, Jun 15, 2016 at 1:55 AM, Mike Kyle mfkyle@gmail.com wrote:

Hmmmm,

I do my builds on the XU4 and somewhere in the recent commits the kernel
compile is broken. See below:

drivers/phy/phy-exynos5-usbdrd.c: In function
‘exynos5420_usbdrd_phy_calibrate’:
drivers/phy/phy-exynos5-usbdrd.c:652:10: warning: ‘return’ with a value,
in function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c:666:10: warning: ‘return’ with a value,
in function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c:699:9: warning: ‘return’ with a value, in
function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c: At top level:
drivers/phy/phy-exynos5-usbdrd.c:804:28: error: initialization from
incompatible pointer type [-Werror=incompatible-pointer-types]
.phy_exynos_calibrate = exynos5420_usbdrd_phy_calibrate,
^
drivers/phy/phy-exynos5-usbdrd.c:804:28: note: (near initialization for
‘exynos5420_usbdrd_phy.phy_exynos_calibrate’)

I didn't think the aufs patch updated the usbdrd. Regardless, when I get
a chance I have a separate branch that compiles that I'll enable the
requested changes on and let you know. I want to do the speed tests
against identical kernels with only that one difference.

I had terrible performance with some jmicro chipsets, but the ASMedia
1053 (no port-multiplier) works great for me.

It will be a few days be patient and I'll get you the requested
comparisons. Note my builds giving 140 MB/s (not 140 Mb/s) are using the
@tobetter defconfig.

uDude

On Tue, Jun 14, 2016 at 10:03 PM, José Benlloch notifications@github.com
wrote:

@tobetter https://github.com/tobetter I read some time ago this sunxi
article regarding UASP http://linux-sunxi.org/USB/UAS and found
interesting why not to get enabled in odroidxu4 if is only a matter of find
out a working UAS adapter. After that, @tkaiser from ARMBIAN asked in
ODROID Forum http://forum.odroid.com/viewtopic.php?f=97&t=17349 for
check the uas performance in OdroidXU4 devices. This is the history behind.

and how you are planning to measure it?

Don't expect more than what I know, like hdparm, dd, iozone, HELIOS
LanTest. Basically blocking the uas on kernel module load. @uDude
https://github.com/uDude if you have uas enabled, can you try
blacklisting the uas module and test performance w/o?

would you mind to share your expectation in terms of the performance

UAS is really performing better than w/o. Check the notes here.
http://pastebin.com/aGZyeryT On Lemaker Banana Pro running Kernel
4.6.2 / Ubuntu 16.04 using the USB2.0 ports with JM567 I got around 40MB/S
for write and read, what it is worth to consider the re-use of the USB2.0
ports again.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeIKll4Ek5BRzRsT8e0iTI74tAZKsks5qLyUzgaJpZM4IpQzo
.

@tobetter
Copy link
Owner

@uDude I think you need this patch,
13c3c92

Guys, this is good input for me. Actually I am not very expert at such performance tuning and integration, but very interesting...I don't have ASMedia device, would try to purchase if possible to compare with JMicron.

Please keep me in your track. I hope to share you guys with v4.6 kernel in debian or Ubuntu server on next week if possible.

@uDude
Copy link

uDude commented Jun 15, 2016

I have two 4.6 kernels ready to test. I need to obtain an adaptor with a
chipset that supports UASP. My ASMEDIA 1053 does not. The 1053E does. I
prefer to stick with a chipset that I know works well with the xu4 ss chip.

I'll scour amazon and see what I can find. If this works It will be worth
the $$ for me. If you know of one please drop me a line.

On Wed, Jun 15, 2016 at 2:29 AM, Mike Kyle mfkyle@gmail.com wrote:

Change the return value of the function to static int from static void and
all is well. I say that w/o testing only that the build completes.

On Wed, Jun 15, 2016 at 1:55 AM, Mike Kyle mfkyle@gmail.com wrote:

Hmmmm,

I do my builds on the XU4 and somewhere in the recent commits the kernel
compile is broken. See below:

drivers/phy/phy-exynos5-usbdrd.c: In function
‘exynos5420_usbdrd_phy_calibrate’:
drivers/phy/phy-exynos5-usbdrd.c:652:10: warning: ‘return’ with a value,
in function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c:666:10: warning: ‘return’ with a value,
in function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c:699:9: warning: ‘return’ with a value,
in function returning void
return ret;
^
drivers/phy/phy-exynos5-usbdrd.c: At top level:
drivers/phy/phy-exynos5-usbdrd.c:804:28: error: initialization from
incompatible pointer type [-Werror=incompatible-pointer-types]
.phy_exynos_calibrate = exynos5420_usbdrd_phy_calibrate,
^
drivers/phy/phy-exynos5-usbdrd.c:804:28: note: (near initialization for
‘exynos5420_usbdrd_phy.phy_exynos_calibrate’)

I didn't think the aufs patch updated the usbdrd. Regardless, when I get
a chance I have a separate branch that compiles that I'll enable the
requested changes on and let you know. I want to do the speed tests
against identical kernels with only that one difference.

I had terrible performance with some jmicro chipsets, but the ASMedia
1053 (no port-multiplier) works great for me.

It will be a few days be patient and I'll get you the requested
comparisons. Note my builds giving 140 MB/s (not 140 Mb/s) are using the
@tobetter defconfig.

uDude

On Tue, Jun 14, 2016 at 10:03 PM, José Benlloch <notifications@github.com

wrote:

@tobetter https://github.com/tobetter I read some time ago this sunxi
article regarding UASP http://linux-sunxi.org/USB/UAS and found
interesting why not to get enabled in odroidxu4 if is only a matter of find
out a working UAS adapter. After that, @tkaiser from ARMBIAN asked in
ODROID Forum http://forum.odroid.com/viewtopic.php?f=97&t=17349 for
check the uas performance in OdroidXU4 devices. This is the history behind.

and how you are planning to measure it?

Don't expect more than what I know, like hdparm, dd, iozone, HELIOS
LanTest. Basically blocking the uas on kernel module load. @uDude
https://github.com/uDude if you have uas enabled, can you try
blacklisting the uas module and test performance w/o?

would you mind to share your expectation in terms of the performance

UAS is really performing better than w/o. Check the notes here.
http://pastebin.com/aGZyeryT On Lemaker Banana Pro running Kernel
4.6.2 / Ubuntu 16.04 using the USB2.0 ports with JM567 I got around 40MB/S
for write and read, what it is worth to consider the re-use of the USB2.0
ports again.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeIKll4Ek5BRzRsT8e0iTI74tAZKsks5qLyUzgaJpZM4IpQzo
.

@jobenvil
Copy link
Author

Some comments:

I had terrible performance with some jmicro chipsets, but the ASMedia 1053 (no port-multiplier) works great for me.

Yes, but newer JMicron chipsets performs well, see these chipsets here and explained. In the past some UAS ASMedia Chipset were blacklisted. Ensure that you get ASM1153 chip and it will work with UAS.

I want to do the speed tests against identical kernels with only that one difference.

Thats the point. It is not a matter of max read/write results, instead a porcentual gain since the other parameters depend on harddisk model, filesystem, I/O scheduler, mounting flags, etc. Don't expect more gain with UAS if your maximal specifications for your hardrive are already 140MB/s as well

I'll scour amazon and see what I can find. If this works It will be worth the $$ for me. If you know of one please drop me a line.

I have these two which really support UAS. See my third post.

@uDude
Copy link

uDude commented Jun 15, 2016

Not sure where I put it, but I have a 1T Samsung Evo sad that should be
suitable for the tests.
On Jun 15, 2016 2:46 AM, "José Benlloch" notifications@github.com wrote:

Some comments:

I had terrible performance with some jmicro chipsets, but the ASMedia
1053 (no port-multiplier) works great for me.

Yes, but newer JMicron chipsets performs well, see these chipsets here
and explained http://linux-sunxi.org/USB/UAS. In the past some UAS
ASMedia Chipset were blacklisted. Ensure that you get ASM1153 chip and it
will work with UAS.

I want to do the speed tests against identical kernels with only that
one difference
.

Thats the point. It is not a matter of max read/write results, instead a
porcentual gain since the other parameters depend on harddisk model,
filesystem, I/O scheduler, mounting flags, etc. Don't expect more gain with
UAS if your maximal specifications for your hardrive are already 140MB/s as
well

I'll scour amazon and see what I can find. If this works It will be worth
the $$ for me. If you know of one please drop me a line.

I have these two which really support UAS. See my third post.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeHB72WgH3MsCwICMIjNn3Xy1lhX4ks5qL7v9gaJpZM4IpQzo
.

@uDude
Copy link

uDude commented Jun 15, 2016

Regardless, it is clear that UASP should be enabled as a module by default.
On Jun 15, 2016 8:26 AM, "Mike Kyle" mfkyle@gmail.com wrote:

Not sure where I put it, but I have a 1T Samsung Evo sad that should be
suitable for the tests.
On Jun 15, 2016 2:46 AM, "José Benlloch" notifications@github.com wrote:

Some comments:

I had terrible performance with some jmicro chipsets, but the ASMedia
1053 (no port-multiplier) works great for me.

Yes, but newer JMicron chipsets performs well, see these chipsets here
and explained http://linux-sunxi.org/USB/UAS. In the past some UAS
ASMedia Chipset were blacklisted. Ensure that you get ASM1153 chip and it
will work with UAS.

I want to do the speed tests against identical kernels with only that
one difference
.

Thats the point. It is not a matter of max read/write results, instead a
porcentual gain since the other parameters depend on harddisk model,
filesystem, I/O scheduler, mounting flags, etc. Don't expect more gain with
UAS if your maximal specifications for your hardrive are already 140MB/s as
well

I'll scour amazon and see what I can find. If this works It will be worth
the $$ for me. If you know of one please drop me a line.

I have these two which really support UAS. See my third post.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeHB72WgH3MsCwICMIjNn3Xy1lhX4ks5qL7v9gaJpZM4IpQzo
.

@uDude
Copy link

uDude commented Jun 21, 2016

So I got a 1153 device in and used an EVO SSD.

W/O UAS I am getting 185 MB/s, with UAS I am getting 210 MB/s (a 15%
increase). Not as attractive as you'd like to see with the EVO, but still
screaming fast for an ARM SBC.

uDude

On Wed, Jun 15, 2016 at 2:30 PM, Mike Kyle mfkyle@gmail.com wrote:

Regardless, it is clear that UASP should be enabled as a module by default.
On Jun 15, 2016 8:26 AM, "Mike Kyle" mfkyle@gmail.com wrote:

Not sure where I put it, but I have a 1T Samsung Evo sad that should be
suitable for the tests.
On Jun 15, 2016 2:46 AM, "José Benlloch" notifications@github.com
wrote:

Some comments:

I had terrible performance with some jmicro chipsets, but the ASMedia
1053 (no port-multiplier) works great for me.

Yes, but newer JMicron chipsets performs well, see these chipsets here
and explained http://linux-sunxi.org/USB/UAS. In the past some UAS
ASMedia Chipset were blacklisted. Ensure that you get ASM1153 chip and it
will work with UAS.

I want to do the speed tests against identical kernels with only that
one difference
.

Thats the point. It is not a matter of max read/write results, instead a
porcentual gain since the other parameters depend on harddisk model,
filesystem, I/O scheduler, mounting flags, etc. Don't expect more gain with
UAS if your maximal specifications for your hardrive are already 140MB/s as
well

I'll scour amazon and see what I can find. If this works It will be
worth the $$ for me. If you know of one please drop me a line.

I have these two which really support UAS. See my third post.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeHB72WgH3MsCwICMIjNn3Xy1lhX4ks5qL7v9gaJpZM4IpQzo
.

@jobenvil
Copy link
Author

jobenvil commented Jun 21, 2016

@uDude awesome, quite amazing speeds! If you have some time free, could you please perform these tests:

iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 4K
iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 1024K
iozone -O -i 0 -i 1 -i 2 -e -+n -r 4K -s 2000m

you can install the iozone deb package from here. Ensure before run iozone that CPU 's governor is settled to performance:

cpufreq-set -c 0 -g performance; cpufreq-set -c 4 -g performance

@rhkean
Copy link

rhkean commented Jun 21, 2016

@jobenvil ,

I've been running the hardkernel 4.2-rc1 kernel with UAS for about 3 months with 12 TB of hard drive space active. I haven't tested with @tobetter 's 4.2 kernel since early April, but I'm pretty sure it worked also. I came over here to see about testing the 4.6 kernel and discovered this discussion.

DietPi.com forum post on speed comparison here

I have 2 Seagate USB3 5TB drives and a 2TB SATA3 drive installed in a VANTEC NST-370A31-BK USB3.1 enclosure.

note the "Driver=uas"

root@keanbean:~# lsusb -t
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/2p, 5000M
        |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 5000M
            |__ Port 1: Dev 4, If 0, Class=Hub, Driver=hub/4p, 5000M
                |__ Port 1: Dev 6, If 0, Class=Mass Storage, Driver=uas, 5000M
                |__ Port 2: Dev 7, If 0, Class=Mass Storage, Driver=uas, 5000M
                |__ Port 3: Dev 8, If 0, Class=Mass Storage, Driver=uas, 5000M
            |__ Port 4: Dev 5, If 0, Class=Hub, Driver=hub/4p, 5000M
                |__ Port 1: Dev 9, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/2p, 480M
        |__ Port 1: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
            |__ Port 1: Dev 5, If 0, Class=Hub, Driver=hub/4p, 480M
            |__ Port 4: Dev 6, If 0, Class=Hub, Driver=hub/4p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=exynos-ohci/3p, 12M
    |__ Port 1: Dev 2, If 0, Class=Wireless, Driver=btusb, 12M
    |__ Port 1: Dev 2, If 1, Class=Wireless, Driver=btusb, 12M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=exynos-ehci/3p, 480M

@jobenvil
Copy link
Author

@rhkean thanks for the comments. Yes, indeed in my 3rd post, it is indicated that both enclosures work properly with uas. Earlier kernels <4 has already support on UAS.Yes, HardKernel odroidx4_defconfig has UAS module not activated. It is necessary to enable it. This could be a easy PR ;-)
If you have time, can you test with iozone commands.

@rhkean
Copy link

rhkean commented Jun 22, 2016

how does iozone work? There's no option to specify a device to test? And the man page makes reference system-wide performance. If that is the case, wouldn't it be testing all drives and averaging them? And, if that is so, wouldn't that include your boot device which could be pretty slow if you're using a sub-class-10 uSD?

Anyhow... tests are running... I'll paste the results when i have them.

@rhkean
Copy link

rhkean commented Jun 22, 2016

nevermind... I'm an idiot.... It just clicked in my head that the test is dependent on $PWD

@rhkean
Copy link

rhkean commented Jun 22, 2016

Here's from one of the Seagate Expansion 5TB drive

root@keanbean:/srv/data/data2# cpufreq-info -o
          minimum CPU frequency  -  maximum CPU frequency  -  governor
CPU  0       200000 kHz ( 14 %)  -    1400000 kHz (100 %)  -  performance
CPU  1       200000 kHz ( 10 %)  -    2000000 kHz (100 %)  -  performance
CPU  2       200000 kHz ( 10 %)  -    2000000 kHz (100 %)  -  performance
CPU  3       200000 kHz ( 10 %)  -    2000000 kHz (100 %)  -  performance
CPU  4       200000 kHz ( 10 %)  -    2000000 kHz (100 %)  -  performance
CPU  5       200000 kHz ( 14 %)  -    1400000 kHz (100 %)  -  performance
CPU  6       200000 kHz ( 14 %)  -    1400000 kHz (100 %)  -  performance
CPU  7       200000 kHz ( 14 %)  -    1400000 kHz (100 %)  -  performance
root@keanbean:/srv/data/data2# iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 4K
        Iozone: Performance Test of File I/O
                Version $Revision: 3.397 $
                Compiled for 32 bit mode.
                Build: linux

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                     Ben England.

        Run began: Tue Jun 21 22:28:18 2016

        Auto Mode
        Using maximum file size of 4096000 kilobytes.
        File size set to 4096000 KB
        Record Size 4 KB
        Command line used: iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 4K
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         4096000       4   94362  102709    71886    71798

iozone test complete.
root@keanbean:/srv/data/data2# iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 1024K
        Iozone: Performance Test of File I/O
                Version $Revision: 3.397 $
                Compiled for 32 bit mode.
                Build: linux

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                     Ben England.

        Run began: Tue Jun 21 22:31:40 2016

        Auto Mode
        Using maximum file size of 4096000 kilobytes.
        File size set to 4096000 KB
        Record Size 1024 KB
        Command line used: iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 1024K
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         4096000    1024  107345  110874   104107   108021

iozone test complete.
root@keanbean:/srv/data/data2# iozone -O -i 0 -i 1 -i 2 -e -+n -r 4K -s 2000m
        Iozone: Performance Test of File I/O
                Version $Revision: 3.397 $
                Compiled for 32 bit mode.
                Build: linux

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                     Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                     Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                     Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                     Ben England.

        Run began: Tue Jun 21 22:34:17 2016

        OPS Mode. Output is in operations per second.
        Include fsync in write timing
        No retest option selected
        Record Size 4 KB
        File size set to 2048000 KB
        Command line used: iozone -O -i 0 -i 1 -i 2 -e -+n -r 4K -s 2000m
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         2048000       4   25653       0    23837        0     329    1007

iozone test complete.

@uDude
Copy link

uDude commented Jun 22, 2016

I would use 40GB rather than 10. The reason performance a hours jump abbot
60 or 70%. The write performance will stay in the 15 to 30 percent increaae.

I'll try \of get to iozone testing on identical @tobetter 4.2 kernel with
and w/o uas. Informal testing shows noticeable increases using an EVO 840
1T SSD. Hard numbers coming from a setup here.
On Jun 21, 2016 10:14 PM, "Robert Kean" notifications@github.com wrote:

Here's from one of the Seagate Expansion 5TB drive

root@keanbean:/srv/data/data2# cpufreq-info -o
minimum CPU frequency - maximum CPU frequency - governor
CPU 0 200000 kHz ( 14 %) - 1400000 kHz (100 %) - performance
CPU 1 200000 kHz ( 10 %) - 2000000 kHz (100 %) - performance
CPU 2 200000 kHz ( 10 %) - 2000000 kHz (100 %) - performance
CPU 3 200000 kHz ( 10 %) - 2000000 kHz (100 %) - performance
CPU 4 200000 kHz ( 10 %) - 2000000 kHz (100 %) - performance
CPU 5 200000 kHz ( 14 %) - 1400000 kHz (100 %) - performance
CPU 6 200000 kHz ( 14 %) - 1400000 kHz (100 %) - performance
CPU 7 200000 kHz ( 14 %) - 1400000 kHz (100 %) - performance
root@keanbean:/srv/data/data2# iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 4K
Iozone: Performance Test of File I/O
Version $Revision: 3.397 $
Compiled for 32 bit mode.
Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                 Ben England.

    Run began: Tue Jun 21 22:28:18 2016

    Auto Mode
    Using maximum file size of 4096000 kilobytes.
    File size set to 4096000 KB
    Record Size 4 KB
    Command line used: iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 4K
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                        random  random    bkwd   record   stride
          KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
     4096000       4   94362  102709    71886    71798

iozone test complete.
root@keanbean:/srv/data/data2# iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 1024K
Iozone: Performance Test of File I/O
Version $Revision: 3.397 $
Compiled for 32 bit mode.
Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                 Ben England.

    Run began: Tue Jun 21 22:31:40 2016

    Auto Mode
    Using maximum file size of 4096000 kilobytes.
    File size set to 4096000 KB
    Record Size 1024 KB
    Command line used: iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 1024K
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                        random  random    bkwd   record   stride
          KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
     4096000    1024  107345  110874   104107   108021

iozone test complete.
root@keanbean:/srv/data/data2# iozone -O -i 0 -i 1 -i 2 -e -+n -r 4K -s 2000m
Iozone: Performance Test of File I/O
Version $Revision: 3.397 $
Compiled for 32 bit mode.
Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer.
                 Ben England.

    Run began: Tue Jun 21 22:34:17 2016

    OPS Mode. Output is in operations per second.
    Include fsync in write timing
    No retest option selected
    Record Size 4 KB
    File size set to 2048000 KB
    Command line used: iozone -O -i 0 -i 1 -i 2 -e -+n -r 4K -s 2000m
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
                                                        random  random    bkwd   record   stride
          KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
     2048000       4   25653       0    23837        0     329    1007

iozone test complete.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMeDehqax4ZNlqZc3PIUqscgtOwFOYks5qOLajgaJpZM4IpQzo
.

@jobenvil
Copy link
Author

@uDude Info: for me was not working blocking the uas module. When I blocked it, the device was not listed and therefore not possible to test. It was necessary to recompile.

@uDude
Copy link

uDude commented Jun 23, 2016

I tested with separate kernels, as well.
On Jun 22, 2016 12:03 AM, "José Benlloch" notifications@github.com wrote:

@uDude https://github.com/uDude Info: for me was not working blocking
the uas module. When I blocked it, the device was not listed and therefore
not possible to test. It was necessary to recompile.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AIiMePuPU59D2ywEMhho0a8AMq55__V1ks5qONApgaJpZM4IpQzo
.

@rhkean
Copy link

rhkean commented Jun 23, 2016

VANTEC NST-370A31-BK USB 3.1 enclosure

chipset: Bus 004 Device 008: ID 174c:1351 ASMedia Technology Inc.
ASM1351 chipset

interestingly, the writes are a 2-15% slower, but the reads are 40-45% faster
iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 4K

module write rewrite read reread
usb-storage 184369 194620 144026 145958
uas 171599 185977 208177 212003

iozone -a -g 4000m -s 4000m -i 0 -i 1 -r 1024K

module write rewrite read reread
usb-storage 189773 193279 147610 150657
uas 161176 189238 211724 212117

iozone -O -i 0 -i 1 -i 2 -e -+n -r 4K -s 2000m

module write rewrite read reread random read random write
usb-storage 45745 0 36518 0 294 224
uas 39719 0 51881 0 330 225

@jobenvil
Copy link
Author

@rhkean is the second iozone command with 4kor 1024k? Is your Kernel still 4.2-rc1? Your info is very wellcome, since I don't have such performant harddrives or enclosures. I'm observing right now that independently of the uas, the results are tinted or oscillating to much -in my case- because the properly task distribution of the tasks in the diferent type of CPUs. Because less frequency cores are being used and they get in some point of the tests saturated. I tested harddisk and adapters in another ARMv7 SoC with two equal cores and the results were not oscillating so much.

@rhkean
Copy link

rhkean commented Jun 23, 2016

is the second iozone command with 4kor 1024k?

1024K .... sorry, bad copy/paste.

Is your Kernel still 4.2-rc1?

user1@odroidxu4:~$ uname -a
Linux odroidxu4.local.net 4.2.0-rc1+ #2 SMP PREEMPT Sun Jun 19 08:42:58 BST 2016 armv7l GNU/Linux

tobetter pushed a commit that referenced this issue Oct 26, 2023
commit b502c87 upstream.

If an UNDEFINED exception is taken from EL1, and do_undefinstr() doesn't
find any suitable undef_hook, it will call:

	BUG_ON(!user_mode(regs))

... and the kernel will report a failure witin do_undefinstr() rather
than reporting the original context that the UNDEFINED exception was
taken from. The pt_regs and ESR value reported within the BUG() handler
will be from within do_undefinstr() and the code dump will be for the
BRK in BUG_ON(), which isn't sufficient to debug the cause of the
original exception.

This patch makes the reporting better by having do_undefinstr() call
die() directly in this case to report the original context from which
the UNDEFINED exception was taken.

Prior to this patch, an undefined instruction is reported as:

| kernel BUG at arch/arm64/kernel/traps.c:497!
| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc3-00127-geff044f1b04e-dirty #3
| Hardware name: linux,dummy-virt (DT)
| pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : do_undefinstr+0x28c/0x2ac
| lr : do_undefinstr+0x298/0x2ac
| sp : ffff800009f63bc0
| x29: ffff800009f63bc0 x28: ffff800009f73c00 x27: ffff800009644a70
| x26: ffff8000096778a8 x25: 0000000000000040 x24: 0000000000000000
| x23: 00000000800000c5 x22: ffff800009894060 x21: ffff800009f63d90
| x20: 0000000000000000 x19: ffff800009f63c40 x18: 0000000000000006
| x17: 0000000000403000 x16: 00000000bfbfd000 x15: ffff800009f63830
| x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000019
| x11: 0101010101010101 x10: 0000000000161b98 x9 : 0000000000000000
| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
| x5 : ffff800009f761d0 x4 : 0000000000000000 x3 : ffff80000a2b80f8
| x2 : 0000000000000000 x1 : ffff800009f73c00 x0 : 00000000800000c5
| Call trace:
|  do_undefinstr+0x28c/0x2ac
|  el1_undef+0x2c/0x4c
|  el1h_64_sync_handler+0x84/0xd0
|  el1h_64_sync+0x64/0x68
|  setup_arch+0x550/0x598
|  start_kernel+0x88/0x6ac
|  __primary_switched+0xb8/0xc0
| Code: 17ffff95 a9425bf5 17ffffb8 a9025bf5 (d4210000)

With this patch applied, an undefined instruction is reported as:

| Internal error: Oops - Undefined instruction: 0 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc3-00128-gf27cfcc80e52-dirty #5
| Hardware name: linux,dummy-virt (DT)
| pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : setup_arch+0x550/0x598
| lr : setup_arch+0x50c/0x598
| sp : ffff800009f63d90
| x29: ffff800009f63d90 x28: 0000000081000200 x27: ffff800009644a70
| x26: ffff8000096778c8 x25: 0000000000000040 x24: 0000000000000000
| x23: 0000000000000100 x22: ffff800009f69a58 x21: ffff80000a2b80b8
| x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000006
| x17: 0000000000403000 x16: 00000000bfbfd000 x15: ffff800009f63830
| x14: ffffffffffffffff x13: 0000000000000000 x12: 0000000000000019
| x11: 0101010101010101 x10: 0000000000161b98 x9 : 0000000000000000
| x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
| x5 : 0000000000000008 x4 : 0000000000000010 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
| Call trace:
|  setup_arch+0x550/0x598
|  start_kernel+0x88/0x6ac
|  __primary_switched+0xb8/0xc0
| Code: b4000080 90ffed80 912ac000 97db745f (00000000)

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20220913101732.3925290-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Nov 23, 2023
[ Upstream commit a84fbf2 ]

Generating metrics llc_code_read_mpi_demand_plus_prefetch,
llc_data_read_mpi_demand_plus_prefetch,
llc_miss_local_memory_bandwidth_read,
llc_miss_local_memory_bandwidth_write,
nllc_miss_remote_memory_bandwidth_read, memory_bandwidth_read,
memory_bandwidth_write, uncore_frequency, upi_data_transmit_bw,
C2_Pkg_Residency, C3_Core_Residency, C3_Pkg_Residency,
C6_Core_Residency, C6_Pkg_Residency, C7_Core_Residency,
C7_Pkg_Residency, UNCORE_FREQ and tma_info_system_socket_clks would
trigger an address sanitizer heap-buffer-overflows on a SkylakeX.

```
==2567752==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020003ed098 at pc 0x5621a816654e bp 0x7fffb55d4da0 sp 0x7fffb55d4d98
READ of size 4 at 0x5020003eee78 thread T0
    #0 0x558265d6654d in aggr_cpu_id__is_empty tools/perf/util/cpumap.c:694:12
    #1 0x558265c914da in perf_stat__get_aggr tools/perf/builtin-stat.c:1490:6
    #2 0x558265c914da in perf_stat__get_global_cached tools/perf/builtin-stat.c:1530:9
    #3 0x558265e53290 in should_skip_zero_counter tools/perf/util/stat-display.c:947:31
    #4 0x558265e53290 in print_counter_aggrdata tools/perf/util/stat-display.c:985:18
    #5 0x558265e51931 in print_counter tools/perf/util/stat-display.c:1110:3
    #6 0x558265e51931 in evlist__print_counters tools/perf/util/stat-display.c:1571:5
    #7 0x558265c8ec87 in print_counters tools/perf/builtin-stat.c:981:2
    #8 0x558265c8cc71 in cmd_stat tools/perf/builtin-stat.c:2837:3
    #9 0x558265bb9bd4 in run_builtin tools/perf/perf.c:323:11
    #10 0x558265bb98eb in handle_internal_command tools/perf/perf.c:377:8
    #11 0x558265bb9389 in run_argv tools/perf/perf.c:421:2
    #12 0x558265bb9389 in main tools/perf/perf.c:537:3
```

The issue was the use of testing a cpumap with NULL rather than using
empty, as a map containing the dummy value isn't NULL and the -1
results in an empty aggr map being allocated which legitimately
overflows when any member is accessed.

Fixes: 8a96f45 ("perf stat: Avoid SEGV if core.cpus isn't set")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230906003912.3317462-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Dec 18, 2023
commit d8b90d6 upstream.

When scanning namespaces, it is possible to get valid data from the first
call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
call in nvme_update_ns_info_block().  In particular, if the NSID becomes
inactive between the two commands, a storage device may return a buffer
filled with zero as per 4.1.5.1.  In this case, we can get a kernel crash
due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
be set to zero.

PID: 326      TASK: ffff95fec3cd8000  CPU: 29   COMMAND: "kworker/u98:10"
 #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7
 #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa
 #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788
 #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb
 #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce
 #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595
 #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6
 #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926
    [exception RIP: blk_stack_limits+434]
    RIP: ffffffff92191872  RSP: ffffad8f8702fc80  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff95efa0c91800  RCX: 0000000000000001
    RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000000000001
    RBP: 00000000ffffffff   R8: ffff95fec7df35a8   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000001  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff95fed33c09a8
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core]
 #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core]

This happened when the check for valid data was moved out of nvme_identify_ns()
into one of the callers.  Fix this by checking in both callers.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186
Fixes: 0dd6fff ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
Cc: stable@vger.kernel.org
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Dec 18, 2023
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   #15 [ffff88aa841efb88] device_del at ffffffff82179d23
   #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   #20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Jan 2, 2024
[ Upstream commit 1469417 ]

Trying to suspend to RAM on SAMA5D27 EVK leads to the following lockdep
warning:

 ============================================
 WARNING: possible recursive locking detected
 6.7.0-rc5-wt+ torvalds#532 Not tainted
 --------------------------------------------
 sh/92 is trying to acquire lock:
 c3cf306c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 but task is already holding lock:
 c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&irq_desc_lock_class);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by sh/92:
  #0: c3aa0258 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0xd8/0x178
  #1: c4c2df44 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x138/0x284
  #2: c32684a0 (kn->active){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x148/0x284
  #3: c232b6d4 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x13c/0x4e8
  #4: c387b088 (&dev->mutex){....}-{3:3}, at: __device_suspend+0x1e8/0x91c
  #5: c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 stack backtrace:
 CPU: 0 PID: 92 Comm: sh Not tainted 6.7.0-rc5-wt+ torvalds#532
 Hardware name: Atmel SAMA5
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x34/0x48
  dump_stack_lvl from __lock_acquire+0x19ec/0x3a0c
  __lock_acquire from lock_acquire.part.0+0x124/0x2d0
  lock_acquire.part.0 from _raw_spin_lock_irqsave+0x5c/0x78
  _raw_spin_lock_irqsave from __irq_get_desc_lock+0xe8/0x100
  __irq_get_desc_lock from irq_set_irq_wake+0xa8/0x204
  irq_set_irq_wake from atmel_gpio_irq_set_wake+0x58/0xb4
  atmel_gpio_irq_set_wake from irq_set_irq_wake+0x100/0x204
  irq_set_irq_wake from gpio_keys_suspend+0xec/0x2b8
  gpio_keys_suspend from dpm_run_callback+0xe4/0x248
  dpm_run_callback from __device_suspend+0x234/0x91c
  __device_suspend from dpm_suspend+0x224/0x43c
  dpm_suspend from dpm_suspend_start+0x9c/0xa8
  dpm_suspend_start from suspend_devices_and_enter+0x1e0/0xa84
  suspend_devices_and_enter from pm_suspend+0x460/0x4e8
  pm_suspend from state_store+0x78/0xe4
  state_store from kernfs_fop_write_iter+0x1a0/0x284
  kernfs_fop_write_iter from vfs_write+0x38c/0x6f4
  vfs_write from ksys_write+0xd8/0x178
  ksys_write from ret_fast_syscall+0x0/0x1c
 Exception stack(0xc52b3fa8 to 0xc52b3ff0)
 3fa0:                   00000004 005a0ae8 00000001 005a0ae8 00000004 00000001
 3fc0: 00000004 005a0ae8 00000001 00000004 00000004 b6c616c0 00000020 0059d190
 3fe0: 00000004 b6c61678 aec5a041 aebf1a26

This warning is raised because pinctrl-at91-pio4 uses chained IRQ. Whenever
a wake up source configures an IRQ through irq_set_irq_wake, it will
lock the corresponding IRQ desc, and then call irq_set_irq_wake on "parent"
IRQ which will do the same on its own IRQ desc, but since those two locks
share the same class, lockdep reports this as an issue.

Fix lockdep false positive by setting a different class for parent and
children IRQ

Fixes: 7761808 ("pinctrl: introduce driver for Atmel PIO4 controller")
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Link: https://lore.kernel.org/r/20231215-lockdep_warning-v1-1-8137b2510ed5@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Jan 2, 2024
commit a8f68b1 upstream.

Validate next header's offset in ->next_header() so that it isn't
smaller than MID_HEADER_SIZE(server) and then standard_receive3() or
->receive() ends up writing off the end of the buffer because
'pdu_length - MID_HEADER_SIZE(server)' wraps up to a huge length:

  BUG: KASAN: slab-out-of-bounds in _copy_to_iter+0x4fc/0x840
  Write of size 701 at addr ffff88800caf407f by task cifsd/1090

  CPU: 0 PID: 1090 Comm: cifsd Not tainted 6.7.0-rc4 #5
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
  rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x4a/0x80
   print_report+0xcf/0x650
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __phys_addr+0x46/0x90
   kasan_report+0xd8/0x110
   ? _copy_to_iter+0x4fc/0x840
   ? _copy_to_iter+0x4fc/0x840
   kasan_check_range+0x105/0x1b0
   __asan_memcpy+0x3c/0x60
   _copy_to_iter+0x4fc/0x840
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? hlock_class+0x32/0xc0
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __pfx__copy_to_iter+0x10/0x10
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? lock_is_held_type+0x90/0x100
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __might_resched+0x278/0x360
   ? __pfx___might_resched+0x10/0x10
   ? srso_alias_return_thunk+0x5/0xfbef5
   __skb_datagram_iter+0x2c2/0x460
   ? __pfx_simple_copy_to_iter+0x10/0x10
   skb_copy_datagram_iter+0x6c/0x110
   tcp_recvmsg_locked+0x9be/0xf40
   ? __pfx_tcp_recvmsg_locked+0x10/0x10
   ? mark_held_locks+0x5d/0x90
   ? srso_alias_return_thunk+0x5/0xfbef5
   tcp_recvmsg+0xe2/0x310
   ? __pfx_tcp_recvmsg+0x10/0x10
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? lock_acquire+0x14a/0x3a0
   ? srso_alias_return_thunk+0x5/0xfbef5
   inet_recvmsg+0xd0/0x370
   ? __pfx_inet_recvmsg+0x10/0x10
   ? __pfx_lock_release+0x10/0x10
   ? do_raw_spin_trylock+0xd1/0x120
   sock_recvmsg+0x10d/0x150
   cifs_readv_from_socket+0x25a/0x490 [cifs]
   ? __pfx_cifs_readv_from_socket+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   cifs_read_from_socket+0xb5/0x100 [cifs]
   ? __pfx_cifs_read_from_socket+0x10/0x10 [cifs]
   ? __pfx_lock_release+0x10/0x10
   ? do_raw_spin_trylock+0xd1/0x120
   ? _raw_spin_unlock+0x23/0x40
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __smb2_find_mid+0x126/0x230 [cifs]
   cifs_demultiplex_thread+0xd39/0x1270 [cifs]
   ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs]
   ? __pfx_lock_release+0x10/0x10
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? mark_held_locks+0x1a/0x90
   ? lockdep_hardirqs_on_prepare+0x136/0x210
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __kthread_parkme+0xce/0xf0
   ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs]
   kthread+0x18d/0x1d0
   ? kthread+0xdb/0x1d0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x34/0x60
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1b/0x30
   </TASK>

Fixes: 8ce79ec ("cifs: update multiplex loop to handle compounded responses")
Cc: stable@vger.kernel.org
Reported-by: Robert Morris <rtm@csail.mit.edu>
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Jan 2, 2024
commit 33eae65 upstream.

A small CIFS buffer (448 bytes) isn't big enough to hold
SMB2_QUERY_INFO request along with user's input data from
CIFS_QUERY_INFO ioctl.  That is, if the user passed an input buffer >
344 bytes, the client will memcpy() off the end of @req->Buffer in
SMB2_query_info_init() thus causing the following KASAN splat:

  BUG: KASAN: slab-out-of-bounds in SMB2_query_info_init+0x242/0x250 [cifs]
  Write of size 1023 at addr ffff88801308c5a8 by task a.out/1240

  CPU: 1 PID: 1240 Comm: a.out Not tainted 6.7.0-rc4 #5
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
  rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x4a/0x80
   print_report+0xcf/0x650
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __phys_addr+0x46/0x90
   kasan_report+0xd8/0x110
   ? SMB2_query_info_init+0x242/0x250 [cifs]
   ? SMB2_query_info_init+0x242/0x250 [cifs]
   kasan_check_range+0x105/0x1b0
   __asan_memcpy+0x3c/0x60
   SMB2_query_info_init+0x242/0x250 [cifs]
   ? __pfx_SMB2_query_info_init+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? smb_rqst_len+0xa6/0xc0 [cifs]
   smb2_ioctl_query_info+0x4f4/0x9a0 [cifs]
   ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs]
   ? __pfx_cifsConvertToUTF16+0x10/0x10 [cifs]
   ? kasan_set_track+0x25/0x30
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __kasan_kmalloc+0x8f/0xa0
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? cifs_strndup_to_utf16+0x12d/0x1a0 [cifs]
   ? __build_path_from_dentry_optional_prefix+0x19d/0x2d0 [cifs]
   ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs]
   cifs_ioctl+0x11c7/0x1de0 [cifs]
   ? __pfx_cifs_ioctl+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? rcu_is_watching+0x23/0x50
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __rseq_handle_notify_resume+0x6cd/0x850
   ? __pfx___schedule+0x10/0x10
   ? blkcg_iostat_update+0x250/0x290
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? ksys_write+0xe9/0x170
   __x64_sys_ioctl+0xc9/0x100
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7f893dde49cf
  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48
  89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89>
  c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
  RSP: 002b:00007ffc03ff4160 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007ffc03ff4378 RCX: 00007f893dde49cf
  RDX: 00007ffc03ff41d0 RSI: 00000000c018cf07 RDI: 0000000000000003
  RBP: 00007ffc03ff4260 R08: 0000000000000410 R09: 0000000000000001
  R10: 00007f893dce7300 R11: 0000000000000246 R12: 0000000000000000
  R13: 00007ffc03ff4388 R14: 00007f893df15000 R15: 0000000000406de0
   </TASK>

Fix this by increasing size of SMB2_QUERY_INFO request buffers and
validating input length to prevent other callers from overflowing @Req
in SMB2_query_info_init() as well.

Fixes: f5b05d6 ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
Cc: stable@vger.kernel.org
Reported-by: Robert Morris <rtm@csail.mit.edu>
Signed-off-by: Paulo Alcantara <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Jan 2, 2024
[ Upstream commit e3e82fc ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   #3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   #15 [ffff88aa841efb88] device_del at ffffffff82179d23
   #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   #20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn
Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Jan 17, 2024
[ Upstream commit 1469417 ]

Trying to suspend to RAM on SAMA5D27 EVK leads to the following lockdep
warning:

 ============================================
 WARNING: possible recursive locking detected
 6.7.0-rc5-wt+ torvalds#532 Not tainted
 --------------------------------------------
 sh/92 is trying to acquire lock:
 c3cf306c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 but task is already holding lock:
 c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&irq_desc_lock_class);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by sh/92:
  #0: c3aa0258 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0xd8/0x178
  #1: c4c2df44 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x138/0x284
  #2: c32684a0 (kn->active){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x148/0x284
  #3: c232b6d4 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x13c/0x4e8
  #4: c387b088 (&dev->mutex){....}-{3:3}, at: __device_suspend+0x1e8/0x91c
  #5: c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 stack backtrace:
 CPU: 0 PID: 92 Comm: sh Not tainted 6.7.0-rc5-wt+ torvalds#532
 Hardware name: Atmel SAMA5
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x34/0x48
  dump_stack_lvl from __lock_acquire+0x19ec/0x3a0c
  __lock_acquire from lock_acquire.part.0+0x124/0x2d0
  lock_acquire.part.0 from _raw_spin_lock_irqsave+0x5c/0x78
  _raw_spin_lock_irqsave from __irq_get_desc_lock+0xe8/0x100
  __irq_get_desc_lock from irq_set_irq_wake+0xa8/0x204
  irq_set_irq_wake from atmel_gpio_irq_set_wake+0x58/0xb4
  atmel_gpio_irq_set_wake from irq_set_irq_wake+0x100/0x204
  irq_set_irq_wake from gpio_keys_suspend+0xec/0x2b8
  gpio_keys_suspend from dpm_run_callback+0xe4/0x248
  dpm_run_callback from __device_suspend+0x234/0x91c
  __device_suspend from dpm_suspend+0x224/0x43c
  dpm_suspend from dpm_suspend_start+0x9c/0xa8
  dpm_suspend_start from suspend_devices_and_enter+0x1e0/0xa84
  suspend_devices_and_enter from pm_suspend+0x460/0x4e8
  pm_suspend from state_store+0x78/0xe4
  state_store from kernfs_fop_write_iter+0x1a0/0x284
  kernfs_fop_write_iter from vfs_write+0x38c/0x6f4
  vfs_write from ksys_write+0xd8/0x178
  ksys_write from ret_fast_syscall+0x0/0x1c
 Exception stack(0xc52b3fa8 to 0xc52b3ff0)
 3fa0:                   00000004 005a0ae8 00000001 005a0ae8 00000004 00000001
 3fc0: 00000004 005a0ae8 00000001 00000004 00000004 b6c616c0 00000020 0059d190
 3fe0: 00000004 b6c61678 aec5a041 aebf1a26

This warning is raised because pinctrl-at91-pio4 uses chained IRQ. Whenever
a wake up source configures an IRQ through irq_set_irq_wake, it will
lock the corresponding IRQ desc, and then call irq_set_irq_wake on "parent"
IRQ which will do the same on its own IRQ desc, but since those two locks
share the same class, lockdep reports this as an issue.

Fix lockdep false positive by setting a different class for parent and
children IRQ

Fixes: 7761808 ("pinctrl: introduce driver for Atmel PIO4 controller")
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Link: https://lore.kernel.org/r/20231215-lockdep_warning-v1-1-8137b2510ed5@bootlin.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Jan 17, 2024
[ Upstream commit 33eae65 ]

A small CIFS buffer (448 bytes) isn't big enough to hold
SMB2_QUERY_INFO request along with user's input data from
CIFS_QUERY_INFO ioctl.  That is, if the user passed an input buffer >
344 bytes, the client will memcpy() off the end of @req->Buffer in
SMB2_query_info_init() thus causing the following KASAN splat:

  BUG: KASAN: slab-out-of-bounds in SMB2_query_info_init+0x242/0x250 [cifs]
  Write of size 1023 at addr ffff88801308c5a8 by task a.out/1240

  CPU: 1 PID: 1240 Comm: a.out Not tainted 6.7.0-rc4 #5
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
  rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x4a/0x80
   print_report+0xcf/0x650
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __phys_addr+0x46/0x90
   kasan_report+0xd8/0x110
   ? SMB2_query_info_init+0x242/0x250 [cifs]
   ? SMB2_query_info_init+0x242/0x250 [cifs]
   kasan_check_range+0x105/0x1b0
   __asan_memcpy+0x3c/0x60
   SMB2_query_info_init+0x242/0x250 [cifs]
   ? __pfx_SMB2_query_info_init+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? smb_rqst_len+0xa6/0xc0 [cifs]
   smb2_ioctl_query_info+0x4f4/0x9a0 [cifs]
   ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs]
   ? __pfx_cifsConvertToUTF16+0x10/0x10 [cifs]
   ? kasan_set_track+0x25/0x30
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __kasan_kmalloc+0x8f/0xa0
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? cifs_strndup_to_utf16+0x12d/0x1a0 [cifs]
   ? __build_path_from_dentry_optional_prefix+0x19d/0x2d0 [cifs]
   ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs]
   cifs_ioctl+0x11c7/0x1de0 [cifs]
   ? __pfx_cifs_ioctl+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? rcu_is_watching+0x23/0x50
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __rseq_handle_notify_resume+0x6cd/0x850
   ? __pfx___schedule+0x10/0x10
   ? blkcg_iostat_update+0x250/0x290
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? ksys_write+0xe9/0x170
   __x64_sys_ioctl+0xc9/0x100
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7f893dde49cf
  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48
  89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89>
  c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
  RSP: 002b:00007ffc03ff4160 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007ffc03ff4378 RCX: 00007f893dde49cf
  RDX: 00007ffc03ff41d0 RSI: 00000000c018cf07 RDI: 0000000000000003
  RBP: 00007ffc03ff4260 R08: 0000000000000410 R09: 0000000000000001
  R10: 00007f893dce7300 R11: 0000000000000246 R12: 0000000000000000
  R13: 00007ffc03ff4388 R14: 00007f893df15000 R15: 0000000000406de0
   </TASK>

Fix this by increasing size of SMB2_QUERY_INFO request buffers and
validating input length to prevent other callers from overflowing @Req
in SMB2_query_info_init() as well.

Fixes: f5b05d6 ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
Cc: stable@vger.kernel.org
Reported-by: Robert Morris <rtm@csail.mit.edu>
Signed-off-by: Paulo Alcantara <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Feb 6, 2024
[ Upstream commit fc3a553 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com>
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Changye Wu <wuchangye@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Feb 8, 2024
[ Upstream commit fc3a553 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com>
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Changye Wu <wuchangye@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Feb 19, 2024
commit 817349b upstream.

Upstream commit bac1ec5 ("usb: xhci: Set quirk for
XHCI_SG_TRB_CACHE_SIZE_QUIRK") introduced a new quirk in XHCI
which fixes XHC timeout, which was seen on synopsys XHCs while
using SG buffers. But the support for this quirk isn't present
in the DWC3 layer.

We will encounter this XHCI timeout/hung issue if we run iperf
loopback tests using RTL8156 ethernet adaptor on DWC3 targets
with scatter-gather enabled. This gets resolved after enabling
the XHCI_SG_TRB_CACHE_SIZE_QUIRK. This patch enables it using
the xhci device property since its needed for DWC3 controller.

In Synopsys DWC3 databook,
Table 9-3: xHCI Debug Capability Limitations
Chained TRBs greater than TRB cache size: The debug capability
driver must not create a multi-TRB TD that describes smaller
than a 1K packet that spreads across 8 or more TRBs on either
the IN TR or the OUT TR.

Cc: stable@vger.kernel.org #5.11
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240116055816.1169821-2-quic_prashk@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Feb 20, 2024
commit 817349b upstream.

Upstream commit bac1ec5 ("usb: xhci: Set quirk for
XHCI_SG_TRB_CACHE_SIZE_QUIRK") introduced a new quirk in XHCI
which fixes XHC timeout, which was seen on synopsys XHCs while
using SG buffers. But the support for this quirk isn't present
in the DWC3 layer.

We will encounter this XHCI timeout/hung issue if we run iperf
loopback tests using RTL8156 ethernet adaptor on DWC3 targets
with scatter-gather enabled. This gets resolved after enabling
the XHCI_SG_TRB_CACHE_SIZE_QUIRK. This patch enables it using
the xhci device property since its needed for DWC3 controller.

In Synopsys DWC3 databook,
Table 9-3: xHCI Debug Capability Limitations
Chained TRBs greater than TRB cache size: The debug capability
driver must not create a multi-TRB TD that describes smaller
than a 1K packet that spreads across 8 or more TRBs on either
the IN TR or the OUT TR.

Cc: stable@vger.kernel.org #5.11
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240116055816.1169821-2-quic_prashk@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Mar 13, 2024
[ Upstream commit d3ea125 ]

If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from
event log cache to user triggers a kernel bug.

[ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)!
[ 1987.170845] ------------[ cut here ]------------
[ 1987.176086] kernel BUG at mm/usercopy.c:102!
[ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5
[ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
[ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd]
[ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90
[ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f
[ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246
[ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000
[ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff
[ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff
[ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a
[ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899
[ 1987.284710] FS:  0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000
[ 1987.293850] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0
[ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1987.324527] PKRU: 55555554
[ 1987.327622] Call Trace:
[ 1987.330424]  <TASK>
[ 1987.332826]  ? show_regs+0x6e/0x80
[ 1987.336703]  ? die+0x3c/0xa0
[ 1987.339988]  ? do_trap+0xd4/0xf0
[ 1987.343662]  ? do_error_trap+0x75/0xa0
[ 1987.347922]  ? usercopy_abort+0x72/0x90
[ 1987.352277]  ? exc_invalid_op+0x57/0x80
[ 1987.356634]  ? usercopy_abort+0x72/0x90
[ 1987.360988]  ? asm_exc_invalid_op+0x1f/0x30
[ 1987.365734]  ? usercopy_abort+0x72/0x90
[ 1987.370088]  __check_heap_object+0xb7/0xd0
[ 1987.374739]  __check_object_size+0x175/0x2d0
[ 1987.379588]  idxd_copy_cr+0xa9/0x130 [idxd]
[ 1987.384341]  idxd_evl_fault_work+0x127/0x390 [idxd]
[ 1987.389878]  process_one_work+0x13e/0x300
[ 1987.394435]  ? __pfx_worker_thread+0x10/0x10
[ 1987.399284]  worker_thread+0x2f7/0x420
[ 1987.403544]  ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 1987.409171]  ? __pfx_worker_thread+0x10/0x10
[ 1987.414019]  kthread+0x107/0x140
[ 1987.417693]  ? __pfx_kthread+0x10/0x10
[ 1987.421954]  ret_from_fork+0x3d/0x60
[ 1987.426019]  ? __pfx_kthread+0x10/0x10
[ 1987.430281]  ret_from_fork_asm+0x1b/0x30
[ 1987.434744]  </TASK>

The issue arises because event log cache is created using
kmem_cache_create() which is not suitable for user copy.

Fix the issue by creating event log cache with
kmem_cache_create_usercopy(), ensuring safe user copy.

Fixes: c2f156b ("dmaengine: idxd: create kmem cache for event log fault items")
Reported-by: Tony Zhu <tony.zhu@intel.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Lijun Pan <lijun.pan@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20240209191412.1050270-1-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Mar 25, 2024
[ Upstream commit fc3a553 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com>
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Changye Wu <wuchangye@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Mar 25, 2024
commit 817349b upstream.

Upstream commit bac1ec5 ("usb: xhci: Set quirk for
XHCI_SG_TRB_CACHE_SIZE_QUIRK") introduced a new quirk in XHCI
which fixes XHC timeout, which was seen on synopsys XHCs while
using SG buffers. But the support for this quirk isn't present
in the DWC3 layer.

We will encounter this XHCI timeout/hung issue if we run iperf
loopback tests using RTL8156 ethernet adaptor on DWC3 targets
with scatter-gather enabled. This gets resolved after enabling
the XHCI_SG_TRB_CACHE_SIZE_QUIRK. This patch enables it using
the xhci device property since its needed for DWC3 controller.

In Synopsys DWC3 databook,
Table 9-3: xHCI Debug Capability Limitations
Chained TRBs greater than TRB cache size: The debug capability
driver must not create a multi-TRB TD that describes smaller
than a 1K packet that spreads across 8 or more TRBs on either
the IN TR or the OUT TR.

Cc: stable@vger.kernel.org #5.11
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20240116055816.1169821-2-quic_prashk@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
…pty list

[ Upstream commit 55b6c73 ]

If all peers are removed via wg_peer_remove_all(), rather than setting
peer_list to empty, the peer is added to a temporary list with a head on
the stack of wg_peer_remove_all(). If a netlink dump is resumed and the
cursored peer is one that has been removed via wg_peer_remove_all(), it
will iterate from that peer and then attempt to dump freed peers.

Fix this by instead checking peer->is_dead, which was explictly created
for this purpose. Also move up the device_update_lock lockdep assertion,
since reading is_dead relies on that.

It can be reproduced by a small script like:

    echo "Setting config..."
    ip link add dev wg0 type wireguard
    wg setconf wg0 /big-config
    (
            while true; do
                    echo "Showing config..."
                    wg showconf wg0 > /dev/null
            done
    ) &
    sleep 4
    wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n")

Resulting in:

    BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20
    Read of size 8 at addr ffff88811956ec70 by task wg/59
    CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5
    Call Trace:
     <TASK>
     dump_stack_lvl+0x47/0x70
     print_address_description.constprop.0+0x2c/0x380
     print_report+0xab/0x250
     kasan_report+0xba/0xf0
     __lock_acquire+0x182a/0x1b20
     lock_acquire+0x191/0x4b0
     down_read+0x80/0x440
     get_peer+0x140/0xcb0
     wg_get_device_dump+0x471/0x1130

Cc: stable@vger.kernel.org
Fixes: e7096c1 ("net: WireGuard secure network tunnel")
Reported-by: Lillian Berry <lillian@star-ark.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
commit 4be9075 upstream.

The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the
corresponding ttm_resource_manager is not allocated.
This leads to a crash when trying to read from this file.

Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file
only when the corresponding ttm_resource_manager is allocated.

crash> bt
PID: 3133409  TASK: ffff8fe4834a5000  CPU: 3    COMMAND: "grep"
 #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3
 #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a
 #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1
 #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1
 #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913
 #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c
 #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887
 #7 [ffffb954506b3d40] page_fault at ffffffffb360116e
    [exception RIP: ttm_resource_manager_debug+0x11]
    RIP: ffffffffc04afd11  RSP: ffffb954506b3df0  RFLAGS: 00010246
    RAX: ffff8fe41a6d1200  RBX: 0000000000000000  RCX: 0000000000000940
    RDX: 0000000000000000  RSI: ffffffffc04b4338  RDI: 0000000000000000
    RBP: ffffb954506b3e08   R8: ffff8fee3ffad000   R9: 0000000000000000
    R10: ffff8fe41a76a000  R11: 0000000000000001  R12: 00000000ffffffff
    R13: 0000000000000001  R14: ffff8fe5bb6f3900  R15: ffff8fe41a6d1200
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm]
 #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3
    RIP: 00007f4c4eda8985  RSP: 00007ffdbba9e9f8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000037e000  RCX: 00007f4c4eda8985
    RDX: 000000000037e000  RSI: 00007f4c41573000  RDI: 0000000000000003
    RBP: 000000000037e000   R8: 0000000000000000   R9: 000000000037fe30
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007f4c41573000
    R13: 0000000000000003  R14: 00007f4c41572010  R15: 0000000000000003
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Fixes: af4a25b ("drm/vmwgfx: Add debugfs entries for various ttm resource managers")
Cc: <stable@vger.kernel.org>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
[ Upstream commit 0bef512 ]

Based on a syzbot report, it appears many virtual
drivers do not yet use netdev_lockdep_set_classes(),
triggerring lockdep false positives.

WARNING: possible recursive locking detected
6.8.0-rc4-next-20240212-syzkaller #0 Not tainted

syz-executor.0/19016 is trying to acquire lock:
 ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
 ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340

but task is already holding lock:
 ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
 ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
  lock(_xmit_ETHER#2);
  lock(_xmit_ETHER#2);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

9 locks held by syz-executor.0/19016:
  #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
  #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x82c/0x1040 net/core/rtnetlink.c:6603
  #1: ffffc90000a08c00 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x600 kernel/time/timer.c:1697
  #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
  #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
  #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228
  #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
  #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline]
  #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline]
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline]
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline]
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325
  #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
  #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
  #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
  #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
  #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
  #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228
  #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
  #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline]
  #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline]
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline]
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline]
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325

stack backtrace:
CPU: 1 PID: 19016 Comm: syz-executor.0 Not tainted 6.8.0-rc4-next-20240212-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
 <IRQ>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
  check_deadlock kernel/locking/lockdep.c:3062 [inline]
  validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856
  __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
  lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
  __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
  _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
  spin_lock include/linux/spinlock.h:351 [inline]
  __netif_tx_lock include/linux/netdevice.h:4452 [inline]
  sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
  __dev_xmit_skb net/core/dev.c:3784 [inline]
  __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325
  neigh_output include/net/neighbour.h:542 [inline]
  ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235
  iptunnel_xmit+0x540/0x9b0 net/ipv4/ip_tunnel_core.c:82
  ip_tunnel_xmit+0x20ee/0x2960 net/ipv4/ip_tunnel.c:831
  erspan_xmit+0x9de/0x1460 net/ipv4/ip_gre.c:720
  __netdev_start_xmit include/linux/netdevice.h:4989 [inline]
  netdev_start_xmit include/linux/netdevice.h:5003 [inline]
  xmit_one net/core/dev.c:3555 [inline]
  dev_hard_start_xmit+0x242/0x770 net/core/dev.c:3571
  sch_direct_xmit+0x2b6/0x5f0 net/sched/sch_generic.c:342
  __dev_xmit_skb net/core/dev.c:3784 [inline]
  __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325
  neigh_output include/net/neighbour.h:542 [inline]
  ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235
  igmpv3_send_cr net/ipv4/igmp.c:723 [inline]
  igmp_ifc_timer_expire+0xb71/0xd90 net/ipv4/igmp.c:813
  call_timer_fn+0x17e/0x600 kernel/time/timer.c:1700
  expire_timers kernel/time/timer.c:1751 [inline]
  __run_timers+0x621/0x830 kernel/time/timer.c:2038
  run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2051
  __do_softirq+0x2bc/0x943 kernel/softirq.c:554
  invoke_softirq kernel/softirq.c:428 [inline]
  __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
  irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1076 [inline]
  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1076
 </IRQ>
 <TASK>
  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
 RIP: 0010:resched_offsets_ok kernel/sched/core.c:10127 [inline]
 RIP: 0010:__might_resched+0x16f/0x780 kernel/sched/core.c:10142
Code: 00 4c 89 e8 48 c1 e8 03 48 ba 00 00 00 00 00 fc ff df 48 89 44 24 38 0f b6 04 10 84 c0 0f 85 87 04 00 00 41 8b 45 00 c1 e0 08 <01> d8 44 39 e0 0f 85 d6 00 00 00 44 89 64 24 1c 48 8d bc 24 a0 00
RSP: 0018:ffffc9000ee069e0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8880296a9e00
RDX: dffffc0000000000 RSI: ffff8880296a9e00 RDI: ffffffff8bfe8fa0
RBP: ffffc9000ee06b00 R08: ffffffff82326877 R09: 1ffff11002b5ad1b
R10: dffffc0000000000 R11: ffffed1002b5ad1c R12: 0000000000000000
R13: ffff8880296aa23c R14: 000000000000062a R15: 1ffff92001dc0d44
  down_write+0x19/0x50 kernel/locking/rwsem.c:1578
  kernfs_activate fs/kernfs/dir.c:1403 [inline]
  kernfs_add_one+0x4af/0x8b0 fs/kernfs/dir.c:819
  __kernfs_create_file+0x22e/0x2e0 fs/kernfs/file.c:1056
  sysfs_add_file_mode_ns+0x24a/0x310 fs/sysfs/file.c:307
  create_files fs/sysfs/group.c:64 [inline]
  internal_create_group+0x4f4/0xf20 fs/sysfs/group.c:152
  internal_create_groups fs/sysfs/group.c:192 [inline]
  sysfs_create_groups+0x56/0x120 fs/sysfs/group.c:218
  create_dir lib/kobject.c:78 [inline]
  kobject_add_internal+0x472/0x8d0 lib/kobject.c:240
  kobject_add_varg lib/kobject.c:374 [inline]
  kobject_init_and_add+0x124/0x190 lib/kobject.c:457
  netdev_queue_add_kobject net/core/net-sysfs.c:1706 [inline]
  netdev_queue_update_kobjects+0x1f3/0x480 net/core/net-sysfs.c:1758
  register_queue_kobjects net/core/net-sysfs.c:1819 [inline]
  netdev_register_kobject+0x265/0x310 net/core/net-sysfs.c:2059
  register_netdevice+0x1191/0x19c0 net/core/dev.c:10298
  bond_newlink+0x3b/0x90 drivers/net/bonding/bond_netlink.c:576
  rtnl_newlink_create net/core/rtnetlink.c:3506 [inline]
  __rtnl_newlink net/core/rtnetlink.c:3726 [inline]
  rtnl_newlink+0x158f/0x20a0 net/core/rtnetlink.c:3739
  rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6606
  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
  netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
  netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
  netlink_sendmsg+0xa3c/0xd70 net/netlink/af_netlink.c:1908
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x221/0x270 net/socket.c:745
  __sys_sendto+0x3a4/0x4f0 net/socket.c:2191
  __do_sys_sendto net/socket.c:2203 [inline]
  __se_sys_sendto net/socket.c:2199 [inline]
  __x64_sys_sendto+0xde/0x100 net/socket.c:2199
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7fc3fa87fa9c

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240212140700.2795436-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
[ Upstream commit 1947b92 ]

Parallel testing appears to show a race between allocating and setting
evsel ids. As there is a bounds check on the xyarray it yields a segv
like:

```
AddressSanitizer:DEADLYSIGNAL

=================================================================

==484408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010

==484408==The signal is caused by a WRITE memory access.

==484408==Hint: address points to the zero page.

    #0 0x55cef5d4eff4 in perf_evlist__id_hash tools/lib/perf/evlist.c:256
    #1 0x55cef5d4f132 in perf_evlist__id_add tools/lib/perf/evlist.c:274
    #2 0x55cef5d4f545 in perf_evlist__id_add_fd tools/lib/perf/evlist.c:315
    #3 0x55cef5a1923f in store_evsel_ids util/evsel.c:3130
    #4 0x55cef5a19400 in evsel__store_ids util/evsel.c:3147
    #5 0x55cef5888204 in __run_perf_stat tools/perf/builtin-stat.c:832
    #6 0x55cef5888c06 in run_perf_stat tools/perf/builtin-stat.c:960
    #7 0x55cef58932db in cmd_stat tools/perf/builtin-stat.c:2878
...
```

Avoid this crash by early exiting the perf_evlist__id_add_fd and
perf_evlist__id_add is the access is out-of-bounds.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240229070757.796244-1-irogers@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
…pty list

[ Upstream commit 55b6c73 ]

If all peers are removed via wg_peer_remove_all(), rather than setting
peer_list to empty, the peer is added to a temporary list with a head on
the stack of wg_peer_remove_all(). If a netlink dump is resumed and the
cursored peer is one that has been removed via wg_peer_remove_all(), it
will iterate from that peer and then attempt to dump freed peers.

Fix this by instead checking peer->is_dead, which was explictly created
for this purpose. Also move up the device_update_lock lockdep assertion,
since reading is_dead relies on that.

It can be reproduced by a small script like:

    echo "Setting config..."
    ip link add dev wg0 type wireguard
    wg setconf wg0 /big-config
    (
            while true; do
                    echo "Showing config..."
                    wg showconf wg0 > /dev/null
            done
    ) &
    sleep 4
    wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n")

Resulting in:

    BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20
    Read of size 8 at addr ffff88811956ec70 by task wg/59
    CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5
    Call Trace:
     <TASK>
     dump_stack_lvl+0x47/0x70
     print_address_description.constprop.0+0x2c/0x380
     print_report+0xab/0x250
     kasan_report+0xba/0xf0
     __lock_acquire+0x182a/0x1b20
     lock_acquire+0x191/0x4b0
     down_read+0x80/0x440
     get_peer+0x140/0xcb0
     wg_get_device_dump+0x471/0x1130

Cc: stable@vger.kernel.org
Fixes: e7096c1 ("net: WireGuard secure network tunnel")
Reported-by: Lillian Berry <lillian@star-ark.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
commit 4be9075 upstream.

The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the
corresponding ttm_resource_manager is not allocated.
This leads to a crash when trying to read from this file.

Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file
only when the corresponding ttm_resource_manager is allocated.

crash> bt
PID: 3133409  TASK: ffff8fe4834a5000  CPU: 3    COMMAND: "grep"
 #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3
 #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a
 #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1
 #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1
 #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913
 #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c
 #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887
 #7 [ffffb954506b3d40] page_fault at ffffffffb360116e
    [exception RIP: ttm_resource_manager_debug+0x11]
    RIP: ffffffffc04afd11  RSP: ffffb954506b3df0  RFLAGS: 00010246
    RAX: ffff8fe41a6d1200  RBX: 0000000000000000  RCX: 0000000000000940
    RDX: 0000000000000000  RSI: ffffffffc04b4338  RDI: 0000000000000000
    RBP: ffffb954506b3e08   R8: ffff8fee3ffad000   R9: 0000000000000000
    R10: ffff8fe41a76a000  R11: 0000000000000001  R12: 00000000ffffffff
    R13: 0000000000000001  R14: ffff8fe5bb6f3900  R15: ffff8fe41a6d1200
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm]
 #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3
    RIP: 00007f4c4eda8985  RSP: 00007ffdbba9e9f8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000037e000  RCX: 00007f4c4eda8985
    RDX: 000000000037e000  RSI: 00007f4c41573000  RDI: 0000000000000003
    RBP: 000000000037e000   R8: 0000000000000000   R9: 000000000037fe30
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007f4c41573000
    R13: 0000000000000003  R14: 00007f4c41572010  R15: 0000000000000003
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Fixes: af4a25b ("drm/vmwgfx: Add debugfs entries for various ttm resource managers")
Cc: <stable@vger.kernel.org>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
[ Upstream commit 1947b92 ]

Parallel testing appears to show a race between allocating and setting
evsel ids. As there is a bounds check on the xyarray it yields a segv
like:

```
AddressSanitizer:DEADLYSIGNAL

=================================================================

==484408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010

==484408==The signal is caused by a WRITE memory access.

==484408==Hint: address points to the zero page.

    #0 0x55cef5d4eff4 in perf_evlist__id_hash tools/lib/perf/evlist.c:256
    #1 0x55cef5d4f132 in perf_evlist__id_add tools/lib/perf/evlist.c:274
    #2 0x55cef5d4f545 in perf_evlist__id_add_fd tools/lib/perf/evlist.c:315
    #3 0x55cef5a1923f in store_evsel_ids util/evsel.c:3130
    #4 0x55cef5a19400 in evsel__store_ids util/evsel.c:3147
    #5 0x55cef5888204 in __run_perf_stat tools/perf/builtin-stat.c:832
    #6 0x55cef5888c06 in run_perf_stat tools/perf/builtin-stat.c:960
    #7 0x55cef58932db in cmd_stat tools/perf/builtin-stat.c:2878
...
```

Avoid this crash by early exiting the perf_evlist__id_add_fd and
perf_evlist__id_add is the access is out-of-bounds.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240229070757.796244-1-irogers@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
…pty list

[ Upstream commit 55b6c73 ]

If all peers are removed via wg_peer_remove_all(), rather than setting
peer_list to empty, the peer is added to a temporary list with a head on
the stack of wg_peer_remove_all(). If a netlink dump is resumed and the
cursored peer is one that has been removed via wg_peer_remove_all(), it
will iterate from that peer and then attempt to dump freed peers.

Fix this by instead checking peer->is_dead, which was explictly created
for this purpose. Also move up the device_update_lock lockdep assertion,
since reading is_dead relies on that.

It can be reproduced by a small script like:

    echo "Setting config..."
    ip link add dev wg0 type wireguard
    wg setconf wg0 /big-config
    (
            while true; do
                    echo "Showing config..."
                    wg showconf wg0 > /dev/null
            done
    ) &
    sleep 4
    wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n")

Resulting in:

    BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20
    Read of size 8 at addr ffff88811956ec70 by task wg/59
    CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5
    Call Trace:
     <TASK>
     dump_stack_lvl+0x47/0x70
     print_address_description.constprop.0+0x2c/0x380
     print_report+0xab/0x250
     kasan_report+0xba/0xf0
     __lock_acquire+0x182a/0x1b20
     lock_acquire+0x191/0x4b0
     down_read+0x80/0x440
     get_peer+0x140/0xcb0
     wg_get_device_dump+0x471/0x1130

Cc: stable@vger.kernel.org
Fixes: e7096c1 ("net: WireGuard secure network tunnel")
Reported-by: Lillian Berry <lillian@star-ark.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 15, 2024
[ Upstream commit 1947b92 ]

Parallel testing appears to show a race between allocating and setting
evsel ids. As there is a bounds check on the xyarray it yields a segv
like:

```
AddressSanitizer:DEADLYSIGNAL

=================================================================

==484408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010

==484408==The signal is caused by a WRITE memory access.

==484408==Hint: address points to the zero page.

    #0 0x55cef5d4eff4 in perf_evlist__id_hash tools/lib/perf/evlist.c:256
    #1 0x55cef5d4f132 in perf_evlist__id_add tools/lib/perf/evlist.c:274
    #2 0x55cef5d4f545 in perf_evlist__id_add_fd tools/lib/perf/evlist.c:315
    #3 0x55cef5a1923f in store_evsel_ids util/evsel.c:3130
    #4 0x55cef5a19400 in evsel__store_ids util/evsel.c:3147
    #5 0x55cef5888204 in __run_perf_stat tools/perf/builtin-stat.c:832
    #6 0x55cef5888c06 in run_perf_stat tools/perf/builtin-stat.c:960
    #7 0x55cef58932db in cmd_stat tools/perf/builtin-stat.c:2878
...
```

Avoid this crash by early exiting the perf_evlist__id_add_fd and
perf_evlist__id_add is the access is out-of-bounds.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240229070757.796244-1-irogers@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 16, 2024
…pty list

[ Upstream commit 55b6c73 ]

If all peers are removed via wg_peer_remove_all(), rather than setting
peer_list to empty, the peer is added to a temporary list with a head on
the stack of wg_peer_remove_all(). If a netlink dump is resumed and the
cursored peer is one that has been removed via wg_peer_remove_all(), it
will iterate from that peer and then attempt to dump freed peers.

Fix this by instead checking peer->is_dead, which was explictly created
for this purpose. Also move up the device_update_lock lockdep assertion,
since reading is_dead relies on that.

It can be reproduced by a small script like:

    echo "Setting config..."
    ip link add dev wg0 type wireguard
    wg setconf wg0 /big-config
    (
            while true; do
                    echo "Showing config..."
                    wg showconf wg0 > /dev/null
            done
    ) &
    sleep 4
    wg setconf wg0 <(printf "[Peer]\nPublicKey=$(wg genkey)\n")

Resulting in:

    BUG: KASAN: slab-use-after-free in __lock_acquire+0x182a/0x1b20
    Read of size 8 at addr ffff88811956ec70 by task wg/59
    CPU: 2 PID: 59 Comm: wg Not tainted 6.8.0-rc2-debug+ #5
    Call Trace:
     <TASK>
     dump_stack_lvl+0x47/0x70
     print_address_description.constprop.0+0x2c/0x380
     print_report+0xab/0x250
     kasan_report+0xba/0xf0
     __lock_acquire+0x182a/0x1b20
     lock_acquire+0x191/0x4b0
     down_read+0x80/0x440
     get_peer+0x140/0xcb0
     wg_get_device_dump+0x471/0x1130

Cc: stable@vger.kernel.org
Fixes: e7096c1 ("net: WireGuard secure network tunnel")
Reported-by: Lillian Berry <lillian@star-ark.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 16, 2024
commit 4be9075 upstream.

The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the
corresponding ttm_resource_manager is not allocated.
This leads to a crash when trying to read from this file.

Add a check to create mob_ttm, system_mob_ttm, and gmr_ttm debug file
only when the corresponding ttm_resource_manager is allocated.

crash> bt
PID: 3133409  TASK: ffff8fe4834a5000  CPU: 3    COMMAND: "grep"
 #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3
 #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a
 #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1
 #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1
 #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913
 #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c
 #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887
 #7 [ffffb954506b3d40] page_fault at ffffffffb360116e
    [exception RIP: ttm_resource_manager_debug+0x11]
    RIP: ffffffffc04afd11  RSP: ffffb954506b3df0  RFLAGS: 00010246
    RAX: ffff8fe41a6d1200  RBX: 0000000000000000  RCX: 0000000000000940
    RDX: 0000000000000000  RSI: ffffffffc04b4338  RDI: 0000000000000000
    RBP: ffffb954506b3e08   R8: ffff8fee3ffad000   R9: 0000000000000000
    R10: ffff8fe41a76a000  R11: 0000000000000001  R12: 00000000ffffffff
    R13: 0000000000000001  R14: ffff8fe5bb6f3900  R15: ffff8fe41a6d1200
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm]
 #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3
    RIP: 00007f4c4eda8985  RSP: 00007ffdbba9e9f8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000037e000  RCX: 00007f4c4eda8985
    RDX: 000000000037e000  RSI: 00007f4c41573000  RDI: 0000000000000003
    RBP: 000000000037e000   R8: 0000000000000000   R9: 000000000037fe30
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007f4c41573000
    R13: 0000000000000003  R14: 00007f4c41572010  R15: 0000000000000003
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Fixes: af4a25b ("drm/vmwgfx: Add debugfs entries for various ttm resource managers")
Cc: <stable@vger.kernel.org>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tobetter pushed a commit that referenced this issue Apr 16, 2024
[ Upstream commit 0bef512 ]

Based on a syzbot report, it appears many virtual
drivers do not yet use netdev_lockdep_set_classes(),
triggerring lockdep false positives.

WARNING: possible recursive locking detected
6.8.0-rc4-next-20240212-syzkaller #0 Not tainted

syz-executor.0/19016 is trying to acquire lock:
 ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
 ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340

but task is already holding lock:
 ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
 ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
 ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
  lock(_xmit_ETHER#2);
  lock(_xmit_ETHER#2);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

9 locks held by syz-executor.0/19016:
  #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
  #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x82c/0x1040 net/core/rtnetlink.c:6603
  #1: ffffc90000a08c00 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x600 kernel/time/timer.c:1697
  #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
  #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
  #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228
  #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
  #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline]
  #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline]
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline]
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline]
  #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325
  #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
  #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline]
  #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
  #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline]
  #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline]
  #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228
  #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
  #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline]
  #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline]
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline]
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline]
  #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325

stack backtrace:
CPU: 1 PID: 19016 Comm: syz-executor.0 Not tainted 6.8.0-rc4-next-20240212-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
 <IRQ>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
  check_deadlock kernel/locking/lockdep.c:3062 [inline]
  validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856
  __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
  lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
  __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
  _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
  spin_lock include/linux/spinlock.h:351 [inline]
  __netif_tx_lock include/linux/netdevice.h:4452 [inline]
  sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
  __dev_xmit_skb net/core/dev.c:3784 [inline]
  __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325
  neigh_output include/net/neighbour.h:542 [inline]
  ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235
  iptunnel_xmit+0x540/0x9b0 net/ipv4/ip_tunnel_core.c:82
  ip_tunnel_xmit+0x20ee/0x2960 net/ipv4/ip_tunnel.c:831
  erspan_xmit+0x9de/0x1460 net/ipv4/ip_gre.c:720
  __netdev_start_xmit include/linux/netdevice.h:4989 [inline]
  netdev_start_xmit include/linux/netdevice.h:5003 [inline]
  xmit_one net/core/dev.c:3555 [inline]
  dev_hard_start_xmit+0x242/0x770 net/core/dev.c:3571
  sch_direct_xmit+0x2b6/0x5f0 net/sched/sch_generic.c:342
  __dev_xmit_skb net/core/dev.c:3784 [inline]
  __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325
  neigh_output include/net/neighbour.h:542 [inline]
  ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235
  igmpv3_send_cr net/ipv4/igmp.c:723 [inline]
  igmp_ifc_timer_expire+0xb71/0xd90 net/ipv4/igmp.c:813
  call_timer_fn+0x17e/0x600 kernel/time/timer.c:1700
  expire_timers kernel/time/timer.c:1751 [inline]
  __run_timers+0x621/0x830 kernel/time/timer.c:2038
  run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2051
  __do_softirq+0x2bc/0x943 kernel/softirq.c:554
  invoke_softirq kernel/softirq.c:428 [inline]
  __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
  irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1076 [inline]
  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1076
 </IRQ>
 <TASK>
  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
 RIP: 0010:resched_offsets_ok kernel/sched/core.c:10127 [inline]
 RIP: 0010:__might_resched+0x16f/0x780 kernel/sched/core.c:10142
Code: 00 4c 89 e8 48 c1 e8 03 48 ba 00 00 00 00 00 fc ff df 48 89 44 24 38 0f b6 04 10 84 c0 0f 85 87 04 00 00 41 8b 45 00 c1 e0 08 <01> d8 44 39 e0 0f 85 d6 00 00 00 44 89 64 24 1c 48 8d bc 24 a0 00
RSP: 0018:ffffc9000ee069e0 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8880296a9e00
RDX: dffffc0000000000 RSI: ffff8880296a9e00 RDI: ffffffff8bfe8fa0
RBP: ffffc9000ee06b00 R08: ffffffff82326877 R09: 1ffff11002b5ad1b
R10: dffffc0000000000 R11: ffffed1002b5ad1c R12: 0000000000000000
R13: ffff8880296aa23c R14: 000000000000062a R15: 1ffff92001dc0d44
  down_write+0x19/0x50 kernel/locking/rwsem.c:1578
  kernfs_activate fs/kernfs/dir.c:1403 [inline]
  kernfs_add_one+0x4af/0x8b0 fs/kernfs/dir.c:819
  __kernfs_create_file+0x22e/0x2e0 fs/kernfs/file.c:1056
  sysfs_add_file_mode_ns+0x24a/0x310 fs/sysfs/file.c:307
  create_files fs/sysfs/group.c:64 [inline]
  internal_create_group+0x4f4/0xf20 fs/sysfs/group.c:152
  internal_create_groups fs/sysfs/group.c:192 [inline]
  sysfs_create_groups+0x56/0x120 fs/sysfs/group.c:218
  create_dir lib/kobject.c:78 [inline]
  kobject_add_internal+0x472/0x8d0 lib/kobject.c:240
  kobject_add_varg lib/kobject.c:374 [inline]
  kobject_init_and_add+0x124/0x190 lib/kobject.c:457
  netdev_queue_add_kobject net/core/net-sysfs.c:1706 [inline]
  netdev_queue_update_kobjects+0x1f3/0x480 net/core/net-sysfs.c:1758
  register_queue_kobjects net/core/net-sysfs.c:1819 [inline]
  netdev_register_kobject+0x265/0x310 net/core/net-sysfs.c:2059
  register_netdevice+0x1191/0x19c0 net/core/dev.c:10298
  bond_newlink+0x3b/0x90 drivers/net/bonding/bond_netlink.c:576
  rtnl_newlink_create net/core/rtnetlink.c:3506 [inline]
  __rtnl_newlink net/core/rtnetlink.c:3726 [inline]
  rtnl_newlink+0x158f/0x20a0 net/core/rtnetlink.c:3739
  rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6606
  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
  netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
  netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
  netlink_sendmsg+0xa3c/0xd70 net/netlink/af_netlink.c:1908
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x221/0x270 net/socket.c:745
  __sys_sendto+0x3a4/0x4f0 net/socket.c:2191
  __do_sys_sendto net/socket.c:2203 [inline]
  __se_sys_sendto net/socket.c:2199 [inline]
  __x64_sys_sendto+0xde/0x100 net/socket.c:2199
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7fc3fa87fa9c

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240212140700.2795436-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
tobetter pushed a commit that referenced this issue Apr 16, 2024
[ Upstream commit 1947b92 ]

Parallel testing appears to show a race between allocating and setting
evsel ids. As there is a bounds check on the xyarray it yields a segv
like:

```
AddressSanitizer:DEADLYSIGNAL

=================================================================

==484408==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010

==484408==The signal is caused by a WRITE memory access.

==484408==Hint: address points to the zero page.

    #0 0x55cef5d4eff4 in perf_evlist__id_hash tools/lib/perf/evlist.c:256
    #1 0x55cef5d4f132 in perf_evlist__id_add tools/lib/perf/evlist.c:274
    #2 0x55cef5d4f545 in perf_evlist__id_add_fd tools/lib/perf/evlist.c:315
    #3 0x55cef5a1923f in store_evsel_ids util/evsel.c:3130
    #4 0x55cef5a19400 in evsel__store_ids util/evsel.c:3147
    #5 0x55cef5888204 in __run_perf_stat tools/perf/builtin-stat.c:832
    #6 0x55cef5888c06 in run_perf_stat tools/perf/builtin-stat.c:960
    #7 0x55cef58932db in cmd_stat tools/perf/builtin-stat.c:2878
...
```

Avoid this crash by early exiting the perf_evlist__id_add_fd and
perf_evlist__id_add is the access is out-of-bounds.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240229070757.796244-1-irogers@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants