Skip to content

Conversation

@ArcaneNibble
Copy link
Collaborator

There is no guarantee that these events can only happen one at a time. If we happen to miss an event, it may cause the USB control transfer state machine to fall out of sync, which can cause enumeration failures.

This might fix pybricks/support#2297

Note that the bulk of the diff is indentation changes.

@ArcaneNibble
Copy link
Collaborator Author

@dlech @BertLindeman can you test this? (potential fix for EV3 enumeration issues)

@coveralls
Copy link

coveralls commented Jul 26, 2025

Coverage Status

coverage: 56.967%. remained the same
when pulling f42d61b on ArcaneNibble:usb
into 2b2ae66 on pybricks:master.

@dlech
Copy link
Member

dlech commented Jul 26, 2025

I tried it but didn't see any change. If anything it might even be worse as I never saw it enumerate even the string descriptors even after multiple tries/reboots. USB 3.0 hosts still work though.

@ArcaneNibble
Copy link
Collaborator Author

Hrm okay. Can you get a new packet capture of an unsuccessful enumeration?

@BertLindeman
Copy link
Contributor

I do see improvement:
linux usb-2:
Start with the brick in update mode and before an ev3 boot ok. Did a few EV3 OFF and unplug USB:

[za jul 26 23:37:38 2025] usb 1-1.2: new high-speed USB device number 20 using ehci-pci
[za jul 26 23:37:38 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0006, bcdDevice= 0.00
[za jul 26 23:37:38 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[za jul 26 23:37:38 2025] usb 1-1.2: Product: EV3 Firmware Update
[za jul 26 23:37:38 2025] usb 1-1.2: Manufacturer: LEGO
[za jul 26 23:37:38 2025] usb 1-1.2: SerialNumber: 12345sejs
[za jul 26 23:37:38 2025] hid-generic 0003:0694:0006.0054: hiddev0,hidraw4: USB HID v1.10 Device [LEGO EV3 Firmware Update] on usb-0000:00:1a.0-1.2/input0
[za jul 26 23:38:01 2025] usbhid 1-1.2:1.0: can't add hid device: -71
[za jul 26 23:38:01 2025] usbhid: probe of 1-1.2:1.0 failed with error -71
[za jul 26 23:38:01 2025] usb 1-1.2: USB disconnect, device number 20
[za jul 26 23:38:01 2025] usb 1-1.2: new high-speed USB device number 21 using ehci-pci
[za jul 26 23:38:01 2025] usb 1-1.2: device descriptor read/all, error -71
[za jul 26 23:38:01 2025] usb 1-1.2: new high-speed USB device number 22 using ehci-pci
[za jul 26 23:38:01 2025] usb 1-1.2: device descriptor read/all, error -71
[za jul 26 23:38:01 2025] usb 1-1-port2: attempt power cycle
[za jul 26 23:38:02 2025] usb 1-1.2: new high-speed USB device number 23 using ehci-pci
[za jul 26 23:38:02 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[za jul 26 23:38:02 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[za jul 26 23:38:02 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[za jul 26 23:38:02 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[za jul 26 23:38:02 2025] usb 1-1.2: SerialNumber: 0016535166E2
[za jul 26 23:38:36 2025] usb 1-1.2: USB disconnect, device number 23
[za jul 26 23:38:39 2025] usb 1-1.2: new high-speed USB device number 24 using ehci-pci
[za jul 26 23:38:39 2025] usb 1-1.2: device descriptor read/all, error -71
[za jul 26 23:38:39 2025] usb 1-1.2: new high-speed USB device number 25 using ehci-pci
[za jul 26 23:38:39 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[za jul 26 23:38:39 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[za jul 26 23:38:39 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[za jul 26 23:38:39 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[za jul 26 23:38:39 2025] usb 1-1.2: SerialNumber: 0016535166E2
[za jul 26 23:38:43 2025] usb 1-1.2: USB disconnect, device number 25
[za jul 26 23:38:49 2025] usb 1-1.2: new high-speed USB device number 26 using ehci-pci
[za jul 26 23:38:49 2025] usb 1-1.2: device descriptor read/all, error -71
[za jul 26 23:38:49 2025] usb 1-1.2: new high-speed USB device number 27 using ehci-pci
[za jul 26 23:38:49 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[za jul 26 23:38:49 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[za jul 26 23:38:49 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[za jul 26 23:38:49 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[za jul 26 23:38:49 2025] usb 1-1.2: SerialNumber: 0016535166E2

linux usb-3:

[za jul 26 23:38:00 2025] usb 3-2: new high-speed USB device number 31 using xhci_hcd
[za jul 26 23:38:00 2025] usb 3-2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[za jul 26 23:38:00 2025] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[za jul 26 23:38:00 2025] usb 3-2: Product: LEGO MINDSTORMS EV3 + Pybricks
[za jul 26 23:38:00 2025] usb 3-2: Manufacturer: LEGO System A/S
[za jul 26 23:38:00 2025] usb 3-2: SerialNumber: 0016535166E2
[za jul 26 23:38:05 2025] usb 3-2: USB disconnect, device number 31
[za jul 26 23:38:09 2025] usb 3-2: new high-speed USB device number 32 using xhci_hcd
[za jul 26 23:38:09 2025] usb 3-2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[za jul 26 23:38:09 2025] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[za jul 26 23:38:09 2025] usb 3-2: Product: LEGO MINDSTORMS EV3 + Pybricks
[za jul 26 23:38:09 2025] usb 3-2: Manufacturer: LEGO System A/S
[za jul 26 23:38:09 2025] usb 3-2: SerialNumber: 0016535166E2

windows 11
OK

My eye are getting smaller.
Bedtime.

@dlech
Copy link
Member

dlech commented Jul 26, 2025

Can you get a new packet capture of an unsuccessful enumeration?

Here you go: ev3-usb2.0-no-work.zip

It is device 3.0. You can see that it just never responds to the GET DESCRIPTOR Request.

@ArcaneNibble ArcaneNibble changed the title pbio/drv/usb/usb_ev3.c: Handle multiple EP0 events at once [EV3] Fix more USB corner cases and minor bugs Jul 26, 2025
@ArcaneNibble
Copy link
Collaborator Author

I might've figured it out. @dlech can you test again?

If this works, I still want to merge all the intermediate miscellaneous fixes. They don't visibly do anything, but they're still "more correct"

@dlech
Copy link
Member

dlech commented Jul 27, 2025

It works! I didn't do super-thorough testing but it at least is working much better than before.

@BertLindeman
Copy link
Contributor

USB-2
OK: after firmware flash, no reboot needed for USB.
Assume these two messages are from pybricksdev flash:

[zo jul 27 09:54:09 2025] usbhid 1-1.2:1.0: can't add hid device: -71
[zo jul 27 09:54:09 2025] usbhid: probe of 1-1.2:1.0 failed with error -71

No longer error messages on plug/unplug.
The two messages from above in context:

dmesg output
[zo jul 27 09:53:36 2025] usb 1-1.2: new high-speed USB device number 32 using ehci-pci
[zo jul 27 09:53:36 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0006, bcdDevice= 0.00
[zo jul 27 09:53:36 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:53:36 2025] usb 1-1.2: Product: EV3 Firmware Update
[zo jul 27 09:53:36 2025] usb 1-1.2: Manufacturer: LEGO
[zo jul 27 09:53:36 2025] usb 1-1.2: SerialNumber: 12345sejs
[zo jul 27 09:53:36 2025] hid-generic 0003:0694:0006.005C: hiddev0,hidraw4: USB HID v1.10 Device [LEGO EV3 Firmware Update] on usb-0000:00:1a.0-1.2/input0
[zo jul 27 09:54:09 2025] usbhid 1-1.2:1.0: can't add hid device: -71
[zo jul 27 09:54:09 2025] usbhid: probe of 1-1.2:1.0 failed with error -71
[zo jul 27 09:54:09 2025] usb 1-1.2: USB disconnect, device number 32
[zo jul 27 09:54:09 2025] usb 1-1.2: new high-speed USB device number 33 using ehci-pci
[zo jul 27 09:54:09 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[zo jul 27 09:54:09 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:54:09 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[zo jul 27 09:54:09 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[zo jul 27 09:54:09 2025] usb 1-1.2: SerialNumber: 0016535166E2
[zo jul 27 09:54:54 2025] usb 1-1.2: USB disconnect, device number 33
[zo jul 27 09:54:59 2025] usb 1-1.2: new high-speed USB device number 34 using ehci-pci
[zo jul 27 09:54:59 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[zo jul 27 09:54:59 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:54:59 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[zo jul 27 09:54:59 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[zo jul 27 09:54:59 2025] usb 1-1.2: SerialNumber: 0016535166E2
[zo jul 27 09:55:03 2025] usb 1-1.2: USB disconnect, device number 34
[zo jul 27 09:55:06 2025] usb 1-1.2: new high-speed USB device number 35 using ehci-pci
[zo jul 27 09:55:06 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[zo jul 27 09:55:06 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:55:06 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[zo jul 27 09:55:06 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[zo jul 27 09:55:06 2025] usb 1-1.2: SerialNumber: 0016535166E2
[zo jul 27 09:55:10 2025] usb 1-1.2: USB disconnect, device number 35
[zo jul 27 09:55:11 2025] usb 1-1.2: new high-speed USB device number 36 using ehci-pci
[zo jul 27 09:55:12 2025] usb 1-1.2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[zo jul 27 09:55:12 2025] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:55:12 2025] usb 1-1.2: Product: LEGO MINDSTORMS EV3 + Pybricks
[zo jul 27 09:55:12 2025] usb 1-1.2: Manufacturer: LEGO System A/S
[zo jul 27 09:55:12 2025] usb 1-1.2: SerialNumber: 0016535166E2
[zo jul 27 09:55:21 2025] usb 1-1.2: USB disconnect, device number 36

USB-3
Extra USB disconnects are from the KVM switch I use to switch between the two computers.
The USB plug(s) I use is not on the KVM switch.

Did not flash on this pc.

dmesg output
[zo jul 27 09:39:59 2025] usb 3-2: new high-speed USB device number 33 using xhci_hcd
[zo jul 27 09:39:59 2025] usb 3-2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[zo jul 27 09:39:59 2025] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:39:59 2025] usb 3-2: Product: LEGO MINDSTORMS EV3 + Pybricks
[zo jul 27 09:39:59 2025] usb 3-2: Manufacturer: LEGO System A/S
[zo jul 27 09:39:59 2025] usb 3-2: SerialNumber: 0016535166E2
[zo jul 27 09:40:02 2025] usb 3-2: USB disconnect, device number 33
[zo jul 27 09:40:06 2025] usb 3-2: new high-speed USB device number 34 using xhci_hcd
[zo jul 27 09:40:06 2025] usb 3-2: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[zo jul 27 09:40:06 2025] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[zo jul 27 09:40:06 2025] usb 3-2: Product: LEGO MINDSTORMS EV3 + Pybricks
[zo jul 27 09:40:06 2025] usb 3-2: Manufacturer: LEGO System A/S
[zo jul 27 09:40:06 2025] usb 3-2: SerialNumber: 0016535166E2
[zo jul 27 09:40:14 2025] usb 2-1.2: USB disconnect, device number 89
[zo jul 27 09:40:14 2025] usb 2-1.2.1: USB disconnect, device number 90
[zo jul 27 09:40:14 2025] usb 2-1.2.1.1: USB disconnect, device number 92
[zo jul 27 09:40:15 2025] usb 2-1.2.2: USB disconnect, device number 91
[zo jul 27 09:40:17 2025] usb 3-2: USB disconnect, device number 34

windows
Correctly recognized.

@ArcaneNibble
Copy link
Collaborator Author

Nice, looks like we can merge this.

@ArcaneNibble
Copy link
Collaborator Author

Regarding the last commit in the series

The SETUPEND bit in the PERI_CSR0 seems to do exactly what it says. The bit gets set if a control transaction ends (either by receiving a new SETUP token, or by completing the status phase with an ACK) before the DATAEND bit is set.

However, clearing RXPKTRDY seems to be the only flag needed before the MUSB IP moves on to automatically allowing the status phase to complete. The datasheet hints at this by saying:

The interval between setting SERV_RXPKTRDY bit and DATAEND bit should be very small to avoid getting a SetupEnd error condition.

By setting them separately like we were doing before, if the host controller completed the status phase while we were still running through the IRQ handler, the USB IP would detect SetupEnd, we would interpret it as an error, and then we would fail to act on the SET_ADDRESS command, which resulted in @dlech 's packet capture showing no response to the subsequent GET_DESCRIPTOR.

We now set the bits at the exact same time, so this race window is closed.

This race window does not apply to requests with a data stage due to the extra packets which are expected.

Regarding the first commit in the series

If a host controller is fast enough, it can turn around from getting a STALL during a control transaction to immediately issuing the next setup stage before the interrupt handler ever gets invoked. This can result in multiple status flags getting set (both the SENTSTALL confirmation bit and RXPKTRDY for the new request). By dropping the second one, we get out of sync and never respond to the host correctly.

Because this is timing sensitive, it makes sense why, without the last commit in the series, it helped on Bert's machine but made David's machine worse.

Having both fixes in place should work reliably with all ranges of timing.

I don't know for sure, but I am speculating that xHCI doesn't manifest this bug because it is slower than EHCI at turning around basic operations like this if the host isn't intentionally trying to maximize the bus usage (which it won't be during initialization and enumeration)

@BertLindeman
Copy link
Contributor

BertLindeman commented Jul 29, 2025

To complete the EV3 USB test, I found a keyboard with a low-power USB hub.

Did this test with firmware at git hash ffb2184

The hub is USB 1.1 2port downstream low power hub

Test with the EV3 in dmesg:

[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: new full-speed USB device number 13 using ehci-pci
[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: not running at top speed; connect to a high speed hub
[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: New USB device found, idVendor=0694, idProduct=0005, bcdDevice= 2.00
[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: Product: LEGO MINDSTORMS EV3 + Pybricks
[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: Manufacturer: LEGO System A/S
[Tue Jul 29 10:37:05 2025] usb 2-1.4.1.3: SerialNumber: 0016535166E2

Part of a lsusb -t:

            ID 0bda:5411 Realtek Semiconductor Corp. RTS5411 Hub
            |__ Port 001: Dev 008, If 0, Class=Hub, Driver=hub/3p, 12M
                ID 04b3:301a IBM Corp. 2-port low-power hub
                |__ Port 001: Dev 012, If 0, Class=Human Interface Device, Driver=usbhid, 12M
                    ID 04b3:301b IBM Corp. SK-8815 Keyboard
                |__ Port 001: Dev 012, If 1, Class=Human Interface Device, Driver=usbhid, 12M
                    ID 04b3:301b IBM Corp. SK-8815 Keyboard
                |__ Port 003: Dev 014, If 0, Class=Vendor Specific Class, Driver=[none], 12M
                    ID 0694:0005 Lego Group Mindstorms EV3

[EDIT]
Tried to flash the EV3 firmware on the USB 1.1 port. That hangs on ev3.py line 267, in get_version

traceback
 pybricksdev flash ev3-firmware-build-4062-gitffb21843_R_USB.zip
Creating firmware...
^CTraceback (most recent call last):
  File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/bert/.local/pipx/venvs/pybricksdev/lib/python3.12/site-packages/pybricksdev/cli/flash.py", line 483, in flash_firmware
    await flash_ev3(firmware)
  File "/home/bert/.local/pipx/venvs/pybricksdev/lib/python3.12/site-packages/pybricksdev/cli/flash.py", line 385, in flash_ev3
    fw, hw = await bootloader.get_version()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bert/.local/pipx/venvs/pybricksdev/lib/python3.12/site-packages/pybricksdev/connections/ev3.py", line 267, in get_version
    return await asyncio.get_running_loop().run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/bert/.local/bin/pybricksdev", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/bert/.local/pipx/venvs/pybricksdev/lib/python3.12/site-packages/pybricksdev/cli/__init__.py", line 570, in main
    asyncio.run(subparsers.choices[args.tool].tool.run(args))
  File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/runners.py", line 123, in run
    raise KeyboardInterrupt()
KeyboardInterrupt

Can create a new issue if David thinks it is worth it.

@dlech
Copy link
Member

dlech commented Jul 29, 2025

Regarding the...

This sounds like info that would be useful to add to the commit messages.

Can create a new issue if David thinks it is worth it.

No, there isn't anything we can do about firmware flashing and USB 1.1.

There is no guarantee that these events can only happen one at a time.
If a host controller is fast enough, it can turn around from getting
a STALL during a control transaction to immediately issuing the next
setup stage before the interrupt handler ever gets invoked.
This can result in multiple status flags getting set
(both the SENTSTALL confirmation bit and RXPKTRDY for the new request).

If we happen to miss an event, it may cause the USB control transfer
state machine to fall out of sync, which can cause enumeration failures.

This commit depends on a subsequent fix to work across all ranges of timings.
Although this does not visibly change anything, the AM1808 errata document
claims that not doing this sequence can theoretically result in a timing violation
inside the logic, which can cause unpredictable results. Since it is simple,
we can just apply the recommended workaorund sequence.
The SETUPEND bit in the PERI_CSR0 seems to do _exactly_ what it says.
The bit gets set if a control transaction ends (either by receiving
a new SETUP token, or by completing the status phase with an ACK)
before the DATAEND bit is set.

However, clearing RXPKTRDY seems to be the only flag needed before the
MUSB IP moves on to automatically allowing the status phase to complete.
The datasheet hints at this by saying:

> The interval between setting SERV_RXPKTRDY bit and DATAEND bit
> should be very small to avoid getting a SetupEnd error condition.

By setting the bits separately like we were doing before, if the host
controller completed the status phase while we were still running
through the IRQ handler, the USB IP would detect SetupEnd,
we would interpret it as an error, and then we would fail to
act on, in particular, the SET_ADDRESS command. This manifested in
the device suddenly no longer responding to subsequent GET_DESCRIPTORs.

We now set the bits at the exact same time, so this race window is closed.

This race window does not apply to requests with a data stage
due to the extra packets which are expected.
@ArcaneNibble
Copy link
Collaborator Author

Just added the long explanations into the commit messages.

@dlech dlech merged commit 3e1a107 into pybricks:master Jul 29, 2025
17 checks passed
@dlech
Copy link
Member

dlech commented Jul 29, 2025

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Follow up EV3 enumeration issues seemingly related to EHCI on Linux

4 participants