Pi 4: rtl_test hangs with RTL2838U chips #3060

Closed
P33M opened this issue Jul 8, 2019 · 20 comments

@P33M
Contributor

P33M commented Jul 8, 2019

Doing rtl_test on a Pi 4 invariably results in:

pi@raspberrypi:~$ rtl_test
Found 1 device(s):
  0:  Realtek, RTL2838UHIDIR, SN: 000000041

Using device 0: Sweex DVB-T USB
Detached kernel driver
rtlsdr_read_reg failed with -7
rtlsdr_write_reg failed with -7
rtlsdr_read_reg failed with -7
rtlsdr_write_reg failed with -7
^C

This is a USB 2.0 device, and the issue happens with both the 0137a8 and 013701 bridge chip firmware versions. A long sequence of short control transfers is used to poke various registers in the chip, and for some reason the VLI chip goes quiet after receiving a stall response from one of these transfers.

On some occasions, the entire sequence completes successfully.
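
A note for anyone decoding the log above: rtl_test talks to the dongle through libusb, and if the number printed after "failed with" is a raw libusb return code (an assumption, not something confirmed in this thread), then -7 matches LIBUSB_ERROR_TIMEOUT. A minimal lookup sketch using the error values from libusb.h; the helper names are made up for illustration:

```python
import re

# Numeric values of libusb's error enum, as defined in libusb.h.
LIBUSB_ERRORS = {
    0: "LIBUSB_SUCCESS",
    -1: "LIBUSB_ERROR_IO",
    -2: "LIBUSB_ERROR_INVALID_PARAM",
    -3: "LIBUSB_ERROR_ACCESS",
    -4: "LIBUSB_ERROR_NO_DEVICE",
    -5: "LIBUSB_ERROR_NOT_FOUND",
    -6: "LIBUSB_ERROR_BUSY",
    -7: "LIBUSB_ERROR_TIMEOUT",
    -8: "LIBUSB_ERROR_OVERFLOW",
    -9: "LIBUSB_ERROR_PIPE",
    -10: "LIBUSB_ERROR_INTERRUPTED",
    -11: "LIBUSB_ERROR_NO_MEM",
    -12: "LIBUSB_ERROR_NOT_SUPPORTED",
    -99: "LIBUSB_ERROR_OTHER",
}

# Hypothetical helper: matches lines such as "rtlsdr_read_reg failed with -7".
FAILED_LINE = re.compile(r"^(\w+) failed with (-?\d+)$")


def decode_rtlsdr_log(line):
    """Return (function, symbolic error) for a 'failed with' line, else None."""
    m = FAILED_LINE.match(line.strip())
    if not m:
        return None
    code = int(m.group(2))
    # Assumption: the printed number is the unmodified libusb return value.
    return m.group(1), LIBUSB_ERRORS.get(code, f"unknown libusb code {code}")
```

Under that assumption, a timeout rather than LIBUSB_ERROR_PIPE (-9, the stall code) suggests the transfers are not completing at all, which fits the "goes quiet" description.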

@pelwell
Contributor

pelwell commented Jul 8, 2019

Could this be related to #3054? Can you test with 3GB?

@P33M
Contributor Author

P33M commented Jul 8, 2019

The issue persists with total_mem=3072 (or 1024, for that matter).

@P33M
Contributor Author

P33M commented Jul 8, 2019

I have Wireshark captures of a good trace and a bad trace. In the bad case, all URBs submitted to the control endpoint after the first STALL is received are returned with status -ENOENT, and the requested transfers never go out on the bus. In the functional case, the kernel completes the first transfer after the STALL not with -ENOENT but with the actual status (which is another STALL response), and all subsequent requested transfers result in bus activity.

rtl_test isn't submitting any CLEAR_FEATURE ENDPOINT_HALT requests in either of these two situations.
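
The difference between the two captures can be checked mechanically. The sketch below assumes the URB completion statuses have already been exported from a capture as a list of integers (negative errno values, with a STALL appearing as -EPIPE, which is how Linux reports it). The function name and the three labels are made up for illustration:

```python
import errno

STALL = -errno.EPIPE        # Linux reports a STALL handshake as -EPIPE
CANCELLED = -errno.ENOENT   # URB was unlinked before it completed


def classify_after_stall(statuses):
    """Classify a sequence of control-URB completion statuses.

    Returns "no-stall" if no STALL is present, "hung" if there is at least
    one completion after the first STALL and all of them are -ENOENT, and
    "recovered" otherwise.
    """
    if STALL not in statuses:
        return "no-stall"
    tail = statuses[statuses.index(STALL) + 1:]
    if tail and all(status == CANCELLED for status in tail):
        return "hung"
    return "recovered"
```

A capture shaped like the bad trace (`[0, 0, STALL, CANCELLED, CANCELLED]`) is classified as "hung", while one where a second STALL and further successful transfers follow is classified as "recovered".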

@P33M
Contributor Author

P33M commented Jul 8, 2019

I get the same result on an x86 PC with a VLI chipset. Annoyingly, dynamic_debug upsets timing enough that the Pi never completes a sequence successfully. On the PC, here are two traces:
workingtrace.txt
failtrace.txt

After the first stall response, the driver then goes and does something different.

@mutability

If you need more data / testers, there are several people seeing the same problem here: https://discussions.flightaware.com/t/new-raspberry-pi-available-pi-4/51584/28

@P33M
Contributor Author

P33M commented Jul 9, 2019

Once more, with feeling:
a.txt
b.txt
Kernel function tracing is useful. a.txt is a successful trace. b.txt is an unsuccessful trace. The very first line that is different happens on a.txt line 1340: there's a URB enqueue on cpu 0 when cpu 2 is doing other stuff.

The first sign that the hardware is unhappy is line 1353 - an event TRB with "length invalid" response.

@P33M
Contributor Author

P33M commented Jul 9, 2019

From the broken trace:

          <idle>-0     [002] d.h. 70029.195608: xhci_handle_event: EVENT: TRB 0000000455151e40 status 'Stall Error' len 0 slot 2 ep 1 type 'Transfer Event' flags e:c
          <idle>-0     [002] d.h. 70029.195622: xhci_handle_transfer: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags I:c:e:c
          <idle>-0     [002] d.h. 70029.195626: xhci_queue_trb: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 2 ep 1 flags C
          <idle>-0     [002] d.h. 70029.195627: xhci_inc_enq: CMD 00000000328b5997: enq 0x00000004526aa1e0(0x00000004526aa000) deq 0x00000004526aa1d0(0x00000004526aa000) segs 1 stream 0 free_trbs 253 bounce 0 cycle 1
          <idle>-0     [002] d.h. 70029.195632: xhci_dbg_reset_ep: Cleaning up stalled endpoint ring
          <idle>-0     [002] d.h. 70029.195636: xhci_dbg_cancel_urb: Finding endpoint context
          <idle>-0     [002] d.h. 70029.195640: xhci_dbg_cancel_urb: Cycle state = 0x1
          <idle>-0     [002] d.h. 70029.195644: xhci_dbg_cancel_urb: New dequeue segment = 00000000513830d3 (virtual)
          <idle>-0     [002] d.h. 70029.195649: xhci_dbg_cancel_urb: New dequeue pointer = 0x455151e50 (DMA)
          <idle>-0     [002] d.h. 70029.195653: xhci_dbg_reset_ep: Queueing new dequeue state
          <idle>-0     [002] d.h. 70029.195658: xhci_dbg_cancel_urb: Set TR Deq Ptr cmd, new deq seg = 00000000513830d3 (0x455151000 dma), new deq ptr = 000000008bc1c5e5 (0x455151e50 dma), new cycle = 1
          <idle>-0     [002] d.h. 70029.195661: xhci_queue_trb: CMD: Set TR Dequeue Pointer Command: deq 0000000455151e51 stream 0 slot 2 ep 1 flags C
          <idle>-0     [002] d.h. 70029.195661: xhci_inc_enq: CMD 00000000328b5997: enq 0x00000004526aa1f0(0x00000004526aa000) deq 0x00000004526aa1d0(0x00000004526aa000) segs 1 stream 0 free_trbs 252 bounce 0 cycle 1
          <idle>-0     [002] d.h. 70029.195699: xhci_urb_giveback: ep0out-control: urb 00000000048c9d96 pipe 2147487488 slot 2 length 0/1 sgs 0/0 stream 0 flags 00110000
          <idle>-0     [002] d.h. 70029.195708: xhci_inc_deq: EVENT 000000005bcbabe3: enq 0x00000004526ac000(0x00000004526ac000) deq 0x00000004526ac650(0x00000004526ac000) segs 1 stream 0 free_trbs 254 bounce 0 cycle 0
          <idle>-0     [002] dNh. 70029.195734: xhci_handle_event: EVENT: TRB 00000004526aa1d0 status 'Success' len 0 slot 2 ep 0 type 'Command Completion Event' flags e:c
          <idle>-0     [002] dNh. 70029.195735: xhci_handle_command: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 2 ep 1 flags C
          <idle>-0     [002] dNh. 70029.195736: xhci_handle_cmd_reset_ep: State stopped mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Ctrl burst 0 maxp 64 deq 0000000455151e41 avg trb len 0
          <idle>-0     [002] dNh. 70029.195740: xhci_dbg_reset_ep: Ignoring reset ep completion code of 1
          <idle>-0     [002] dNh. 70029.195742: xhci_inc_deq: CMD 00000000328b5997: enq 0x00000004526aa1f0(0x00000004526aa000) deq 0x00000004526aa1e0(0x00000004526aa000) segs 1 stream 0 free_trbs 253 bounce 0 cycle 1
          <idle>-0     [002] dNh. 70029.195743: xhci_inc_deq: EVENT 000000005bcbabe3: enq 0x00000004526ac000(0x00000004526ac000) deq 0x00000004526ac660(0x00000004526ac000) segs 1 stream 0 free_trbs 254 bounce 0 cycle 0
             cat-4566  [002] d.h. 70029.195774: xhci_handle_event: EVENT: TRB 00000004526aa1e0 status 'Success' len 0 slot 2 ep 0 type 'Command Completion Event' flags e:c
             cat-4566  [002] d.h. 70029.195775: xhci_handle_command: CMD: Set TR Dequeue Pointer Command: deq 0000000455151e51 stream 0 slot 2 ep 1 flags C
             cat-4566  [002] d.h. 70029.195777: xhci_handle_cmd_set_deq: RS 00004 high-speed Ctx Entries 3 MEL 0 us Port# 1/0 [TT Slot 0 Port# 0 TTT 0 Intr 0] Addr 2 State configured
             cat-4566  [002] d.h. 70029.195778: xhci_handle_cmd_set_deq_ep: State stopped mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Ctrl burst 0 maxp 64 deq 0000000455151e51 avg trb len 0
             cat-4566  [002] d.h. 70029.195790: xhci_dbg_cancel_urb: Successful Set TR Deq Ptr cmd, deq = @455151e50
        rtl_test-4567  [000] .... 70029.195791: xhci_urb_enqueue: ep0out-control: urb 00000000048c9d96 pipe 2147487616 slot 2 length 0/1 sgs 0/0 stream 0 flags 00110200
             cat-4566  [002] d.h. 70029.195791: xhci_inc_deq: CMD 00000000328b5997: enq 0x00000004526aa1f0(0x00000004526aa000) deq 0x00000004526aa1f0(0x00000004526aa000) segs 1 stream 0 free_trbs 254 bounce 0 cycle 1
             cat-4566  [002] d.h. 70029.195792: xhci_inc_deq: EVENT 000000005bcbabe3: enq 0x00000004526ac000(0x00000004526ac000) deq 0x00000004526ac670(0x00000004526ac000) segs 1 stream 0 free_trbs 254 bounce 0 cycle 0
        rtl_test-4567  [000] d... 70029.195826: xhci_queue_trb: CTRL: bRequestType c0 bRequest 00 wValue 00c8 wIndex 0600 wLength 1 length 8 TD size 0 intr 0 type 'Setup Stage' flags I:i:C
        rtl_test-4567  [000] d... 70029.195827: xhci_inc_enq: CTRL 0000000097bb5814: enq 0x0000000455151e60(0x0000000455151000) deq 0x0000000455151e50(0x0000000455151000) segs 2 stream 0 free_trbs 508 bounce 0 cycle 0
        rtl_test-4567  [000] d... 70029.195827: xhci_queue_trb: CTRL: Buffer 00000004579a5720 length 1 TD size 0 intr 0 type 'Data Stage' flags i:i:c:s:I:e:c
        rtl_test-4567  [000] d... 70029.195828: xhci_inc_enq: CTRL 0000000097bb5814: enq 0x0000000455151e70(0x0000000455151000) deq 0x0000000455151e50(0x0000000455151000) segs 2 stream 0 free_trbs 507 bounce 0 cycle 0
        rtl_test-4567  [000] d... 70029.195828: xhci_queue_trb: CTRL: Buffer 0000000000000000 length 0 TD size 0 intr 0 type 'Status Stage' flags I:c:e:c
        rtl_test-4567  [000] d... 70029.195829: xhci_inc_enq: CTRL 0000000097bb5814: enq 0x0000000455151e80(0x0000000455151000) deq 0x0000000455151e50(0x0000000455151000) segs 2 stream 0 free_trbs 506 bounce 0 cycle 0
        rtl_test-4567  [000] d... 70029.495853: xhci_urb_dequeue: ep0out-control: urb 00000000048c9d96 pipe 2147487616 slot 2 length 0/1 sgs 0/0 stream 0 flags 00110200
        rtl_test-4567  [000] d... 70029.495897: xhci_dbg_cancel_urb: Cancel URB 00000000048c9d96, dev 1.4, ep 0x0, starting at offset 0x455151e50
        rtl_test-4567  [000] d... 70029.495902: xhci_queue_trb: CMD: Stop Ring Command: slot 2 sp 0 ep 1 flags C
        rtl_test-4567  [000] d... 70029.495903: xhci_inc_enq: CMD 00000000328b5997: enq 0x00000004526aa200(0x00000004526aa000) deq 0x00000004526aa1f0(0x00000004526aa000) segs 1 stream 0 free_trbs 253 bounce 0 cycle 1
          <idle>-0     [002] dNh. 70029.495984: xhci_handle_event: EVENT: TRB 0000000455151e50 status 'Stopped - Length Invalid' len 0 slot 2 ep 1 type 'Transfer Event' flags e:c
          <idle>-0     [002] dNh. 70029.495996: xhci_handle_transfer: CTRL: bRequestType c0 bRequest 00 wValue 00c8 wIndex 0600 wLength 1 length 8 TD size 0 intr 0 type 'Setup Stage' flags I:i:c

My understanding of how this is going wrong:

  • We get an event reporting a stall on TRB 455151e40
  • The reset endpoint command is sent automatically by the driver (doesn't appear to cause bus traffic)
  • A Set TR Dequeue Pointer command is issued which advances the TRB dequeue pointer to 455151e50
  • A URB is submitted almost immediately which adds 3 TRBs to the ring starting at 455151e50
  • The URB then gets dequeued (perhaps because the endpoint is still regarded as halted?)
  • A "stop ring" command is issued
  • The next event we get for this ring is a "length invalid" error corresponding to the TRB that we just queued

In the working case, the URB is not dequeued.

What's missing from this trace is when the doorbell gets rung, because that is what prompts the hardware to update its view of the world. The error still happens on a Pi with isolcpus=1,2,3 set, so I don't think this is an SMP locking bug.

@P33M
Contributor Author

P33M commented Jul 9, 2019

Hmm. From the point of view of the hardware, it appears to not be going off-piste:

          <idle>-0     [002] dNh. 70029.495984: xhci_handle_event: EVENT: TRB 0000000455151e50 status 'Stopped - Length Invalid' len 0 slot 2 ep 1 type 'Transfer Event' flags e:c
          <idle>-0     [002] dNh. 70029.495996: xhci_handle_transfer: CTRL: bRequestType c0 bRequest 00 wValue 00c8 wIndex 0600 wLength 1 length 8 TD size 0 intr 0 type 'Setup Stage' flags I:i:c
          <idle>-0     [002] dNh. 70029.495997: xhci_inc_deq: EVENT 000000005bcbabe3: enq 0x00000004526ac000(0x00000004526ac000) deq 0x00000004526ac680(0x00000004526ac000) segs 1 stream 0 free_trbs 254 bounce 0 cycle 0
          <idle>-0     [002] dNh. 70029.495998: xhci_handle_event: EVENT: TRB 00000004526aa1f0 status 'Success' len 0 slot 2 ep 0 type 'Command Completion Event' flags e:c

We get an "invalid length" event followed by a "success command completion" event as per xHCI spec:

If the command is executed between TDs, then the xHC shall perform a Force Stopped Event (FSE) operation by generating a Transfer Event for the endpoint with Condition Code = Stopped - Invalid Length, TRB Pointer = current Dequeue Pointer value, and TRB Transfer Length = 0, then generate a Success Command Completion Event for the command.

It looks like the driver is in some oddball state because every URB that subsequently gets enqueued is immediately rejected afterwards.

@P33M
Contributor Author

P33M commented Jul 9, 2019

The dequeues are a red herring - they're all triggered by libusb timing out after 300ms. Given that the analyser shows no bus activity after the first stall, there's either hardware state not being handled properly by the driver or the hardware is in a broken state.
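
The 300 ms figure can be cross-checked against the broken trace earlier in this thread by measuring the time between xhci_urb_enqueue and xhci_urb_dequeue for the same URB. A small sketch follows; the regular expression and helper name are illustrative, and the two sample lines are copied from that trace with the leading whitespace trimmed:

```python
import re
from decimal import Decimal

# Matches the timestamp, tracepoint name and URB pointer of xhci URB tracepoints.
URB_EVENT = re.compile(
    r"(?P<ts>\d+\.\d+): (?P<event>xhci_urb_\w+): \S+ urb (?P<urb>[0-9a-f]+)"
)


def enqueue_to_dequeue_gaps(lines):
    """Map URB pointer -> seconds between its latest enqueue and its dequeue."""
    enqueued = {}
    gaps = {}
    for line in lines:
        m = URB_EVENT.search(line)
        if not m:
            continue
        ts = Decimal(m["ts"])  # Decimal avoids float rounding in the subtraction
        if m["event"] == "xhci_urb_enqueue":
            enqueued[m["urb"]] = ts
        elif m["event"] == "xhci_urb_dequeue" and m["urb"] in enqueued:
            gaps[m["urb"]] = ts - enqueued.pop(m["urb"])
    return gaps


TRACE = [
    "rtl_test-4567  [000] .... 70029.195791: xhci_urb_enqueue: ep0out-control: "
    "urb 00000000048c9d96 pipe 2147487616 slot 2 length 0/1 sgs 0/0 stream 0 flags 00110200",
    "rtl_test-4567  [000] d... 70029.495853: xhci_urb_dequeue: ep0out-control: "
    "urb 00000000048c9d96 pipe 2147487616 slot 2 length 0/1 sgs 0/0 stream 0 flags 00110200",
]

gaps = enqueue_to_dequeue_gaps(TRACE)
print(gaps)  # {'00000000048c9d96': Decimal('0.300062')}
```

The gap of roughly 0.3 s between enqueue and dequeue is consistent with the dequeue being driven by a userspace timeout rather than by anything the controller reported.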

I never get a "bad" URB enqueue sequence on an Intel controller in the same PC, even when the rtl_test program and interrupts are on different cores.

@P33M
Contributor Author

P33M commented Jul 10, 2019

This is a lot like the issue I'm seeing:
https://lore.kernel.org/patchwork/patch/458074/

I believe the bus goes quiet because there's a mismatch between the TRB cycle bit, the consumer cycle state (internal to the controller) and/or Linux's assumption as to what the consumer cycle state is.
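
To make the cycle-bit argument concrete, here is a toy model of a transfer ring. It is not kernel code: link TRBs and ring wrap are left out, and all names are hypothetical. The only rule it encodes is that the controller treats a TRB as valid while the TRB's cycle bit equals its own Consumer Cycle State.

```python
from dataclasses import dataclass


@dataclass
class Trb:
    cycle: int      # cycle bit written by the producer (the driver)
    name: str = ""


def consume(ring, deq_index, ccs):
    """Return the names of the TRBs the controller would execute.

    Starting at deq_index, the consumer owns a TRB only while the TRB's
    cycle bit matches its Consumer Cycle State (ccs). It stops at the
    first mismatch and waits there.
    """
    executed = []
    i = deq_index
    while i < len(ring) and ring[i].cycle == ccs:
        executed.append(ring[i].name)
        i += 1
    return executed


# Ring contents written while the producer cycle state is 0.
ring = [Trb(0, "stalled"), Trb(0, "setup"), Trb(0, "data"), Trb(0, "status")]

# The driver skips the stalled TD and tells the controller where to resume.
print(consume(ring, deq_index=1, ccs=0))  # ['setup', 'data', 'status']
print(consume(ring, deq_index=1, ccs=1))  # []
```

With a matching cycle state the next control transfer runs. With a stale one the controller sees no valid TRBs, so nothing goes out on the bus, which is the symptom described above.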

@P33M
Contributor Author

P33M commented Jul 10, 2019

Aha: you don't even need a USB device attached to replicate the hang. The internal USB2.0 hub responds with a stall response to a "get debug descriptor" request and if you spam lsusb -v -d 2109:3431, which repeatedly gets all descriptors, then eventually you get a hang.

When the cycle bit as reported by Linux is 0 and the endpoint is halted, it all goes wrong.
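
A minimal reproduction loop for the `lsusb` approach might look like the sketch below. The run limit and the per-run timeout used to decide that a run has "hung" are assumptions, not values from this thread, and `spam_lsusb` is a made-up name. The `run` parameter exists only so the loop can be exercised without hardware.

```python
import subprocess


def spam_lsusb(max_runs=1000, timeout_s=5.0, run=subprocess.run):
    """Run `lsusb -v -d 2109:3431` repeatedly.

    Returns the 1-based number of the first run that exceeded timeout_s,
    or None if all runs finished in time.
    """
    cmd = ["lsusb", "-v", "-d", "2109:3431"]
    for attempt in range(1, max_runs + 1):
        try:
            run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,  # assumed threshold for calling a run "hung"
                check=False,
            )
        except subprocess.TimeoutExpired:
            return attempt
    return None
```

Calling `spam_lsusb()` on an affected Pi 4 would be expected to return a run number eventually. A wall-clock timeout is only a heuristic, so a slow but healthy run could also trip it.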

@P33M
Contributor Author

P33M commented Jul 10, 2019

The readback value of the endpoint context DCS field on the VLI controller is always 1 (the initial state). If I compare with the Intel controller, I see the readback toggle between 1 and 0. This value is used to initialise the cycle state for the Set TR Dequeue Pointer command, and Linux then walks the transfer ring, swapping it every time it jumps over a link TRB. With the readback always 1 and the stalled TRB's cycle bit at 0, there is now a mismatch, so the HC never resumes transferring data.

Dequeue Cycle State (DCS). This bit identifies the value of the xHC Consumer Cycle State (CCS) flag for the TRB referenced by the TR Dequeue Pointer. Refer to section 4.9.2 for more information.

The hardware is supposed to update this field in the endpoint context block with the cycle state currently in use. The Intel controller is doing this, but I don't know whether the VLI controller is a) not writing the value at all or b) always writing the wrong value.
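
The mismatch can be read directly off the broken trace earlier in this thread. In the xHCI layout of the TR Dequeue Pointer field, bit 0 is DCS, bits 3:1 are the Stream Context Type and bits 63:4 are the 16-byte-aligned pointer. The helper below (a made-up name) splits the `deq` values printed by the tracepoints:

```python
def split_deq_field(deq):
    """Split a 64-bit TR Dequeue Pointer field into (pointer, sct, dcs)."""
    dcs = deq & 0x1          # bit 0: Dequeue Cycle State
    sct = (deq >> 1) & 0x7   # bits 3:1: Stream Context Type
    pointer = deq & ~0xF     # bits 63:4: 16-byte-aligned TRB address
    return pointer, sct, dcs


# Endpoint context readback after the Reset Endpoint command.
print([hex(v) for v in split_deq_field(0x0000000455151E41)])  # ['0x455151e40', '0x0', '0x1']

# Value sent in the Set TR Dequeue Pointer command.
print([hex(v) for v in split_deq_field(0x0000000455151E51)])  # ['0x455151e50', '0x0', '0x1']
```

Both values carry DCS = 1, while the xhci_inc_enq lines for the CTRL ring in the same trace report `cycle 0`. On my reading of the trace, that is the disagreement described above: the driver hands the controller a cycle state of 1 for a ring whose TRBs are being written with cycle 0.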

P33M pushed a commit to P33M/linux that referenced this issue Jul 11, 2019
Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

See: raspberrypi#3060

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
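
The idea in the commit message can be modelled in a few lines. This is a sketch of the selection logic only, not the patch itself: the function and parameter names are invented, the stream check mirrors the "non-stream endpoints at least" wording, and the later walk along the ring (toggling at link TRBs) is not modelled.

```python
TRB_CYCLE = 0x1  # bit 0 of a TRB's control dword is the cycle bit


def new_dequeue_cycle_state(ep_ctx_deq, halted_trb_control,
                            broken_dcs_quirk, has_streams=False):
    """Pick the starting cycle state when moving past a halted TD.

    ep_ctx_deq:         TR Dequeue Pointer field read back from the endpoint
                        context (bit 0 is DCS).
    halted_trb_control: control dword of the TRB the controller stopped on.
    broken_dcs_quirk:   True for controllers that do not update DCS.
    """
    if broken_dcs_quirk and not has_streams:
        # Do not trust the hardware-maintained DCS; use the TRB's own cycle bit.
        return halted_trb_control & TRB_CYCLE
    return ep_ctx_deq & 0x1


# Illustrative values: the context readback says DCS = 1, but the halted TRB
# (a Status Stage TRB with IOC set, cycle bit clear) was written with cycle 0.
print(new_dequeue_cycle_state(0x455151E41, 0x1020, broken_dcs_quirk=False))  # 1
print(new_dequeue_cycle_state(0x455151E41, 0x1020, broken_dcs_quirk=True))   # 0
```

With the quirk enabled the chosen cycle state follows the ring contents rather than the stale context field, which is what lets the controller see the next TRBs as valid.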
P33M pushed a commit to P33M/linux that referenced this issue Jul 11, 2019
P33M pushed a commit to P33M/linux that referenced this issue Jul 12, 2019
pelwell pushed a commit that referenced this issue Jul 12, 2019
@P33M
Contributor Author

P33M commented Jul 12, 2019

People seem to have already tested the fix before it's even merged - closing

@P33M P33M closed this as completed Jul 12, 2019
@pelwell
Contributor

pelwell commented Jul 12, 2019

That's the way we like it - thanks, everyone.

popcornmix added a commit to raspberrypi/firmware that referenced this issue Jul 15, 2019
kernel: i2c: bcm2835: Set clock-stretch timeout to 35ms
See: raspberrypi/linux#3064

kernel: xhci: add quirk for host controllers that don't update endpoint DCS
See: raspberrypi/linux#3060

kernel: tty: amba-pl011: Make TX optimisation conditional
See: #1017

kernel: overlays: Add real parameters to the rpi-poe overlay
kernel: overlays: Correct gpio-fan gpio flags for 4.19
See: raspberrypi/linux#2715

kernel: overlays: i2c-gpio: Fix the bus parameter
See: raspberrypi/linux#3062

kernel: overlays: Rename pi3- overlays to be less model-specific
See: raspberrypi/linux#3052

firmware: dispmanx: Fix handling of disable_overscan to not disable it totally
See: raspberrypi/linux#3059

firmware: power: Enable/disable H264 and ISP clocks with domain

firmware: arm_loader: arm_64bit=0 should disable loading of kernel8.img

firmware: dt-blob: CM has no activity LED
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Jul 15, 2019
pelwell pushed a commit that referenced this issue Jul 19, 2019
pelwell pushed a commit that referenced this issue Jul 19, 2019
pelwell pushed a commit that referenced this issue Jul 19, 2019
pelwell pushed a commit that referenced this issue Jul 19, 2019
TiejunChina pushed a commit that referenced this issue Jul 23, 2019
popcornmix pushed a commit that referenced this issue Jul 25, 2019
popcornmix pushed a commit that referenced this issue Jul 25, 2019
popcornmix pushed a commit that referenced this issue Apr 27, 2021
popcornmix pushed a commit that referenced this issue Apr 27, 2021
popcornmix pushed a commit that referenced this issue Apr 30, 2021
popcornmix pushed a commit that referenced this issue May 7, 2021
popcornmix pushed a commit that referenced this issue May 7, 2021
@MSLaaf

MSLaaf commented May 13, 2021

Has this ever been merged into mainline? I am working through the Pi patch list, and the code has totally changed in the 5.12 kernel...

popcornmix pushed a commit that referenced this issue May 13, 2021
popcornmix pushed a commit that referenced this issue May 13, 2021
popcornmix pushed a commit that referenced this issue May 19, 2021
popcornmix pushed a commit that referenced this issue May 19, 2021
popcornmix pushed a commit that referenced this issue May 25, 2021
popcornmix pushed a commit that referenced this issue Jun 8, 2021
popcornmix pushed a commit that referenced this issue Jun 14, 2021
limeng-linux pushed a commit to limeng-linux/linux-yocto-develop that referenced this issue Jun 27, 2021
commit 74b560b63f21ea66d797d6cb2a8afc2cd41b7a01 from
https://github.com/raspberrypi/linux.git rpi-5.12.y

Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

See: raspberrypi/linux#3060

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Meng Li <Meng.Li@windriver.com>
fengguang pushed a commit to 0day-ci/linux that referenced this issue Jul 2, 2021
Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

Cc: stable@vger.kernel.org
Link: raspberrypi/linux#3060
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Noltari pushed a commit to Noltari/rpi-linux that referenced this issue Aug 20, 2021
fengguang pushed a commit to 0day-ci/linux that referenced this issue Oct 9, 2021
Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]
Cc: stable@vger.kernel.org
Link: raspberrypi/linux#3060
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
ColinIanKing pushed a commit to ColinIanKing/linux-next that referenced this issue Oct 12, 2021
Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]

Link: raspberrypi/linux#3060
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
imaami pushed a commit to imaami/linux that referenced this issue Oct 19, 2021
commit 5255660 upstream.

Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]

Link: raspberrypi/linux#3060
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Whissi pushed a commit to Whissi/linux-stable that referenced this issue Oct 20, 2021
commit 5255660 upstream.

Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]

Link: raspberrypi/linux#3060
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Whissi pushed a commit to Whissi/linux-stable that referenced this issue Oct 27, 2021
commit 5255660 upstream.

Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]

Link: raspberrypi/linux#3060
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
it-is-a-robot pushed a commit to openeuler-mirror/kernel that referenced this issue Nov 16, 2021
stable inclusion
from stable-5.10.76
commit b6f32897af190d4716412e156ee0abcc16e4f1e5
bugzilla: 182988 https://gitee.com/openeuler/kernel/issues/I4IAHF

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b6f32897af190d4716412e156ee0abcc16e4f1e5

--------------------------------

commit 5255660 upstream.

Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]

Link: raspberrypi/linux#3060
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>

Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
vinzv pushed a commit to tuxedocomputers/linux that referenced this issue Dec 1, 2021
BugLink: https://bugs.launchpad.net/bugs/1952136

commit 5255660 upstream.

Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

[ bjorn: rebased to v5.14-rc2 ]

Link: raspberrypi/linux#3060
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
@sieren

sieren commented Oct 13, 2023

Any chance this bug snuck back into 6.1.0-rpi4-rpi-v8 ? (Pi 4 / RTL2832U)

@trejan

trejan commented Oct 13, 2023

> Any chance this bug snuck back into 6.1.0-rpi4-rpi-v8 ? (Pi 4 / RTL2832U)

Yeah, it did. The fix has been merged, but the new kernel hasn't been pushed out via apt yet. Run rpi-update if you can't wait, and do a backup first.

@sieren

sieren commented Oct 13, 2023

Good to know! Thanks for the swift response!
Can you point me to the commit or PR?

Update: Never mind, found it: #5642

@trejan

trejan commented Oct 13, 2023

#5642

jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
Seen on a VLI VL805 PCIe to USB controller. For non-stream endpoints
at least, if the xHC halts on a particular TRB due to an error then
the DCS field in the Out Endpoint Context maintained by the hardware
is not updated with the current cycle state.

Using the quirk XHCI_EP_CTX_BROKEN_DCS and instead fetch the DCS bit
from the TRB that the xHC stopped on.

See: raspberrypi/linux#3060

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>