Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tinyusb: Support USB device drivers written in Python #9497

Merged
merged 5 commits into from Mar 15, 2024

Conversation

projectgus
Copy link
Contributor

@projectgus projectgus commented Oct 4, 2022

The goal is to allow implementing USB device drivers in Python via TinyUSB at runtime, in addition to any existing compile-time TinyUSB class drivers (i.e. CDC and optionally MSC).

Python API with the more interesting user-facing code is at micropython/micropython-lib#558 - if checking out this branch in micropython, go to the lib/micropython-lib subdirectory and check out the feature/usbd_python branch there as well.

Port Support

This PR implements this for the rp2 and samd ports.

The port-level requirement is that the TinyUSB task is run in a "scheduled" static callback node, as implemented in #12846 for rp2 & samd.) I think this approach can be extended to most of the other ports pretty easily.

Challenging ports (future work):

  • esp32. ESP-IDF runs TinyUSB in its own FreeRTOS Task. I think it can be made to work here by using a cross-task Python callback (see jimmo's implementation here), or by changing the TinyUSB implementation to work similar to the other ports (this will save some RAM and on single core ESP32s it might even improve performance. On dual core it might reduce performance a little but it also might not.)
  • stm32. Currently doesn't use TinyUSB. It may be possible to port this approach to the ST USB stack, but prefer to make a compile-time option to build stm32 port with TinyUSB. This will be less work and more maintainable into the future.
  • zephyr. The Zephyr USB device driver API (example: hid core) looks like it would adapt to this approach easily, but haven't verified this yet.

Binary Size Impact

Where possible, the implementation has been placed into micropython-lib to reduce the baseline binary size impact. If enabled at compile time (currently not the default) the code size change is:

Port Board Code Size Delta Static RAM Size Delta
rp2 RPI_PICO 322808 -> 325704 +2896 5872 -> 5876 +4
samd SEEED_WIO_TERMINAL (samd51) 261136 -> 263984 +2848 3068 -> 3072 +4

Note: This doesn't include freezing in the "usbd" module from micropython-lib. Currently that module is approximately 15KB.

How To Use

  • Edit the mpconfigport.h file for your port or board and change MICROPY_HW_ENABLE_USB_RUNTIME_DEVICE to 1.
  • (Recommended) edit themanifest.py file for your port or board and insert require("usbd").
  • Rebuild the firmware

See micropython/micropython-lib#558 for more.

Debugging Tips

REPL debugging via the USB serial port is obviously perilous when the USB stack itself is being debugged, and the whole device needs to re-enumerate regularly for the host to see changes.

Hardware UART

Switch the REPL interface to a hardware UART if you can (for rp2, set MICROPY_HW_ENABLE_UART_REPL in ports/rps2/mpconfigport.h.) This makes mpremote mount very slow, but it means crashes that take down USB are still reported on the REPL!

TinyUSB debug

To enable TinyUSB debug output, add the lines #undef CFG_TUSB_DEBUG and #define CFG_TUSB_DEBUG 2 (or a different level) to the shared/tinyusb/tusb_config.h header. Debugging output goes to stdout, so don't do this unless you've also switched to a hardware UART.

Semihosting

Another option is semihosting. On rp2 with an SWD debugger connected, can apply this quick patch and then run pyocd as follows:

pyocd gdbserver --semihosting --target rp2040 --frequency 32m --persist 

Read from localhost port 4444 and you'll see stdout at any time that USB-CDC device is not open. (This failover behaviour is useful because semihosting is painfully slow, and also pyocd doesn't support non-UTF-8 input or output so any "raw REPL" traffic will cause it to crash.)

In recent testing the semihosting patch is still not stable when using mpremote to run Python code that triggers re-enumeration, it panics trying to write to the CDC while the device is being reset (more debugging needed).

This work was funded through GitHub Sponsors.

@projectgus projectgus force-pushed the feature/usbd_python branch 3 times, most recently from e5acbe5 to a2957c2 Compare October 4, 2022 05:44
@codecov-commenter
Copy link

codecov-commenter commented Oct 4, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.39%. Comparing base (fff66c3) to head (7f5d8c4).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #9497   +/-   ##
=======================================
  Coverage   98.39%   98.39%           
=======================================
  Files         161      161           
  Lines       21078    21078           
=======================================
  Hits        20739    20739           
  Misses        339      339           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@projectgus
Copy link
Contributor Author

Have updated this branch a bit, and also submitted a draft set of micropython-lib changes at micropython/micropython-lib#558

@github-actions
Copy link

github-actions bot commented Mar 2, 2023

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2: +2944 +0.891% RPI_PICO[incl +8(bss)]
       samd: +2864 +1.094% ADAFRUIT_ITSYBITSY_M4_EXPRESS[incl +4(bss)]

@turmoni
Copy link

turmoni commented Jun 16, 2023

I've decided I like pain and so I'm writing a barebones read-only mass storage driver for my use case using this work (not pain because the work's hard to use, just because now I have to understand USB and SCSI at the same time). I'm trying to figure out if it's possible to stall an endpoint with the current implementation? I believe that's what I'm meant to do in the case that I get an invalid request, but I'm not seeing that that's functionality that's exposed. However, I also don't really understand what I'm talking about, so stalling an endpoint might actually be a synonym for something else that is exposed.

@turmoni
Copy link

turmoni commented Jun 18, 2023

I now understand slightly more, I think, so I can give a better overview. Mass storage mostly works via data being sent through the IN and OUT endpoints, with a couple of exceptions (literally two, and one of those is reset). The correct response to an invalid request coming in on OUT is to stall IN, but I'm not seeing how I'd be able to do this. Now, I have got what I need working on Linux without doing this, but I don't know how robust it is.

@projectgus
Copy link
Contributor Author

projectgus commented Jun 19, 2023

@turmoni Great plan!

I think you are correct, there are 3 TinyUSB usbd_priv.h interface function calls: usbd_edpt_stall, usbd_edpt_clear_stall, and usbd_edpt_stalled that are not exposed via the Python API yet, but will need to be in order to implement the Mass Storage class driver and likely other driver corner cases.

I haven't looked closely at MSC yet. My vague plan was to work up in complexity from HID (mostly done), to MIDI (unexpectedly complex), to CDC, and then MSC (complex because SCSI). However, from a very quick look at the TinyUSB class driver it seems like stalling the endpoint is needed for two cases:

  1. The host sends an invalid request via USB.
  2. The SCSI storage "backend device" is not ready yet or is otherwise in a bad state.

So if your particular implementation is happy to ignore (1) or handle it in a more extreme way, and if it doesn't need (2) because the read-only backing storage is always available, you might be alright without endpoints stalling.

If you do want to add those APIs to the Python interface and feel like sending a PR to add them to this branch, then that would also be great - they will certainly be needed before this work can be completed.

@turmoni
Copy link

turmoni commented Jun 20, 2023

I think I should be OK without having to implement them for my needs. I've got my device working as much as I need it to on Linux, just Windows is causing me issues. Specifically, I keep getting maximum recursion depth exceeded and I'm trying to figure out where that's coming from so I can avoid it. I suspect it's some kind of issue where the callbacks are happening so quickly that the function that initiated the transfer hasn't finished yet, but I've yet to properly track it down.

Before realising I had some UARTs I could hook up for better logging I did get what I thought was weird corruption on my two-line LCD when I hit the recursion depth error, but looking at more verbose output over my UART I think it's just supporting my idea that it's the callback happening quickly.

I don't know if this may have been something you ran into at any point? I can provide my code that currently reliably has this issue for me on Windows 10, but it's currently between the states of "nice tidy code that has been written before testing any of it" and "reasonable somewhat tidy code that has been debugged and cleaned up", so it's now at "horrible hacky mess with various attempts at making things work shoehorned in"

@turmoni
Copy link

turmoni commented Jun 20, 2023

Thinking about it, I might as well just attach it here regardless - I do intend to add it to turmoni/micropython-lib when it's in a better (working, cleaner, more documented) state, but with the caveats that I know it's broken and in the middle of debugging I'm not too embarrassed to share the code.

To try it out, create an MSCInterface object as you have for the other examples, filesystem is either None or a bytes-like object (I'm currently feeding it a tiny FAT filesystem). uart and lcd are optional parameters for debugging (I suspect the logging might be contributing to the recursion depth error). I have the USB interface and the SCSI interface abstracted away from one another in a way that the TinyUSB code doesn't, but hopefully this makes it a bit easier to follow.

The basic flow is:

  • Standard setup stuff happens, host asks for the number of LUNs on the control channel.
  • The device prepares to accept a CBW (which contains the command) on the OUT endpoint
  • On receiving a CBW, the device validates it, sends any data that's been requested on the IN endpoint, then submits a CSW (status) report on the IN endpoint
  • When the CSW has been sent, we start again from the preparation to accept the CBW

What I'm seeing on Windows is that it's mostly working fine, until it ends up failing to do something, and then I don't think I have the ability to get it back in sync, but ideally I shouldn't need to.

(I don't think anything ever hits handle_endpoint_control_xfer, so feel free to ignore that.)
brokenmsc.txt

@projectgus
Copy link
Contributor Author

projectgus commented Jun 21, 2023

Thanks @turmoni. At a glance this looks quite good to me! I haven't had time to run it, might be a while before I have that time.

I don't think callback frequency by itself would trigger a recursion depth issue: the callbacks should all be called from the tud_task_ext "main loop task function" of TinyUSB, which is itself called from the usbd_task() method in shared/tinyusb/mp_usbd.c, which is called from MICROPY_EVENT_POLL_HOOK_FAST. That code should never be re-entrant, it handles one DCD (low-level peripheral) event at a time. Therefore, even if one transfer callback did something that triggered an immediate new callback at the hardware level then you'd still expect the inner callback to fully return all the way back through tud_task_ext, before MicroPython keeps running and hits the next event hook causing tud_task_ext to be called again and the new event to trigger from the top.

EDIT: The above is wrong!

The default stack size for rp2 port is 8KB, set via PICO_STACK_SIZE. Is it possible you're actually exceeding this, without infinite recursion? The maximum depth will be determined by the total stack depth of the "main" Python code, plus the potential "event hook" callback stack frames on top of that from a single callback event. 8KB sounds like heaps, but actually it might be getting up there...? If you're able to switch the REPL to the UART, you should get a decent stack trace when this triggers (I hope!)

@projectgus
Copy link
Contributor Author

OK, what I confidently wrote just before is totally wrong - there are currently a bunch of places where an event hook may call back into tud_task_ext(), which is quite a nasty bug with some nasty implications!

For most cases we can put an anti-recursion check into tud_task_ext(), but there are some places even this won't work (like if a callback tries to write to a USB device like the USB-CDC and that relies on event processing to make space available.)

So, it may have to be one of: treat USB xfer callbacks almost like hard interrupts and do no significant work in them, or change the callbacks to be called via mp_sched_schedule instead of mp_call_function_n_kw meaning they get "re-queued" and handled there. This is a little worse for performance, but it might end up being the way it has to be.

@turmoni
Copy link

turmoni commented Jun 21, 2023

Thanks, I did quickly do a new firmware build with the REPL on UART 0 and that didn't seem to enlighten me any further, so I suspect it is some kind of unexpected interaction. I had a quick look to see if I could hackily change how the callbacks are called, but I'm neither proficient enough at C nor the micropython codebase to be able to do anything useful there in the short time I have to look at it this evening.

@turmoni
Copy link

turmoni commented Jun 22, 2023

Right, so, I've done terrible things to your code and I feel dirty, but I think I understand one of the issues I'm having. Firstly, I replaced all my callbacks with short functions that called micropython.schedule() to schedule something that actually does the work, so hopefully if that's an issue, that's no longer a factor with what I'm doing. This still didn't fix it though, so I did some more digging.

I've copied and pasted various helper functions from the micropython source to try to figure out what the issue is that I'm now facing, and I think that it's _submit_xfer is too slow to add the callback (or something along those lines). My horrific debug output is as follows:

About to submit xfer
Submitting 55534253020000000000000000, length: 13
Result 1
In runtime_dev_xfer_cb, ep_addr: 133, result: 0, xferred: 13
Calling callback
object <bound_method> is of type bound_method
<bound_method><bound_method 2001c5f0 <_USBDevice object at 2001a5e0>.<function _xfer_cb at 0x20013380>>
In _xfer_cb!
itf: <MSCInterface object at 2001bc00>, cb: None
It's not none
Submitting xfer - 133, 55534253020000000000000000. itf = <MSCInterface object at 2001bc00>, cb = <bound_method>
Submitted xfer?

So one of the first things I realised was that I felt silly because I didn't actually have to do anything with the C code, because all it did was point me to the fact that I should be looking at _xfer_cb because that's the thing that actually invokes the right Python callback. But this does (probably) show the overall flow as it happened; line-by-line:

  • _submit_xfer is called (Python)
  • This calls _usbd.submit_xfer, and the C code is printing the value it's sending and the length
  • "Result 1" is the result from calling usbd_edpt_xfer
  • "In runtime_..." is the callback handler in the C side
  • "Calling callback" comes after the mp_obj_is_callable check, and is followed by a couple of red herrings I didn't need to care about
  • "In _xfer_cb!" is back in the Python code
  • "itf: ..., cb: None" is the important part - at the time that the callback is invoked, it doesn't think there's a user-defined callback to call
  • "It's not none" is back in the C code and can be ignored
  • "Submitting xfer" is back in the Python code, in _submit_xfer, and is a print statement I put in that happens just before the interface and callback are set.
  • "Submitted xfer?" is just before _submit_xfer returns True

This makes a lot of sense for the non-recursion-error symptoms I'm seeing, if the callback handler is being called before the callback has had a chance to be written.

Edit: of course, my debug code could be affecting the timing here and making it more prone to happen, but I don't see how it would be introducing the issue.

@turmoni
Copy link

turmoni commented Jun 22, 2023

Well having worked around this (see below) it looks like it was just an error I was exposing for myself, and it wasn't the original problem I was having, where my CSW claims to have been submitted with the right data but goes missing and doesn't make it to my USB capture in Wireshark. I guess that's tomorrow's problem for me to investigate.

    def _retry_xfer_cb(self, args):
        print("Retry!")
        (ep_addr, result, xferred_bytes) = args
        self._xfer_cb(ep_addr, result, xferred_bytes)

    def _xfer_cb(self, ep_addr, result, xferred_bytes):
        # Singleton callback from TinyUSB custom class driver when a transfer completes.
        print("In _xfer_cb!")
        try:
            itf, cb = self._eps[ep_addr]
            print(f"itf: {itf}, cb: {str(cb)}")
            if cb is None:
                micropython.schedule(self._retry_xfer_cb, (ep_addr, result, xferred_bytes))
                return

            self._eps[ep_addr] = (itf, None)
        except KeyError:
            cb = None
        if cb:
            print("Got a cb")
            cb(ep_addr, result, xferred_bytes)

@turmoni
Copy link

turmoni commented Jun 23, 2023

Right, I think I've solved my magical disappearing CSW problem, and that aspect was my fault; as per the spec:

  • If the device intends to send less data than the host indicated, then:
    • The device shall send the intended data.
      • The device may send fill data to pad up to a total of dCBWDataTransferLength.
      • If the device actually transfers less data than the host indicated, then:
        • The device may end the transfer with a short packet.
        • The device shall STALL the Bulk-In pipe.
    • The device shall set bCSWStatus to 00h or 01h.
    • The device shall set dCSWDataResidue to the difference between dCBWDataTransferLength and the actual amount of relevant data sent.

I was sending less than dCBWDataTransferLength, but not STALLing (since we can't do that yet). Padding out the response to dCBWDataTransferLength seems to solve my issue. Now to go and see how much of this code I can rip out again...

@turmoni
Copy link

turmoni commented Jun 24, 2023

I'm now trying to set up a composite device, using both HID and MSC, and my brain has been slightly fried. It looks like whichever interface I add_interface first is the only one that gets control callbacks fired for it - I can see in my USB capture that my host is sending a control transfer with a wIndex of 3, but that doesn't seem to make it through to runtime_dev_control_xfer_cb in the C code. Since there's only one relevant request that comes through for mass storage, I can somewhat fudge it and make both work for a bit for me by just initialising the endpoint regardless, but I don't know how robust this is.

(Another issue I ran into is with the Python side where the current logic for picking endpoints doesn't work because once you have an IN endpoint >=0x80, adding one to it won't get you back into OUT endpoint territory, but I'm putting that here more as a note for anyone else trying this.)

@turmoni
Copy link

turmoni commented Jun 28, 2023

OK so I've now done everything I needed to; to summarise the things I'm working around:

  • The inability to STALL endpoints
  • Control transfer callbacks not firing for anything other than the first interface that's added
  • Potential race condition where a callback might fire before the specific callback handler has been registered in self._eps
  • Potential issues with callbacks being weird due to being directly called rather than scheduled - I think the root of my actual issue there may have been with me not sending the right data, mind
  • Not really seeing a way to handle input on the control endpoint

To elaborate on that last one, since I hadn't mentioned it earlier, USB HID allows you to either receive OUT-style requests on the OUT endpoint if you've defined one, or on the control endpoint. I spent a little bit of time trying to figure out how you'd make this work generically before deciding it was easier to just add an OUT endpoint to USB HID for my purposes.

Thanks for doing this work, I'm glad I didn't have to do all my logic in C!

@projectgus
Copy link
Contributor Author

Have just pushed a major refactor based on some suggestions from @dpgeorge. Any existing code will need to be updated for this new version:

  • The machine.USBD.rst file in this PR is up to date with docs for the low-level API.
  • The micropython-lib PR is updated and uses this new API.

Any questions, concerns, or something is not working then please comment!

@projectgus

This comment was marked as resolved.

@projectgus projectgus force-pushed the feature/usbd_python branch 5 times, most recently from bd176f0 to 108e7f0 Compare March 13, 2024 01:19
@projectgus
Copy link
Contributor Author

Pushed another big update with some breaking API changes that came up in review. Check the included docs for details.

Have also updated the size numbers in the first post.

This update also squashes everything down to be merge-ready.

mp_obj_t control_xfer_cb;
mp_obj_t xfer_cb;

mp_usbd_builtin_driver_t builtin_driver;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this just be mp_obj_t, and it just stores the pointer to the current built-in driver object?

Then you don't need an enum, and the logic to get/set the attribute for this is very simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I think at the time I didn't want that detail to bleed into mp_usbd.c, but can simply declare extern consts for the various allowed constants. Will change!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's simpler to have the extern consts than the enum and the enum<->const mapping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, PTAL!

extmod/machine_usb_device.c Outdated Show resolved Hide resolved
ports/rp2/mpconfigport.h Outdated Show resolved Hide resolved
ports/samd/mpconfigport.h Outdated Show resolved Hide resolved
@projectgus projectgus force-pushed the feature/usbd_python branch 2 times, most recently from 1bcccab to 9a231ea Compare March 14, 2024 06:43
@projectgus projectgus force-pushed the feature/usbd_python branch 2 times, most recently from d7f396c to fa7a8ff Compare March 14, 2024 07:08
@slorquet
Copy link

I'm really looking forward to see this being finalized ! You are doing a huge job with a high value, thank you for that.
I really want to do a HID device with micropython without heavy adafruit stuff!

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Previously USB was always enabled, but this created some conflicts when
adding guards to other files on other ports.

Note the configuration with USB disabled hasn't been tested and probably
won't build or run without further work.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
This new machine-module driver provides a "USBDevice" singleton object and
a shim TinyUSB "runtime" driver that delegates the descriptors and all of
the TinyUSB callbacks to Python functions.  This allows writing arbitrary
USB devices in pure Python.  It's also possible to have a base built-in
USB device implemented in C (eg CDC, or CDC+MSC) and a Python USB device
added on top of that.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@dpgeorge dpgeorge merged commit 7f5d8c4 into micropython:master Mar 15, 2024
61 checks passed
@dpgeorge
Copy link
Member

Merged!!

Thanks @projectgus for all your efforts on this. It's a great feature, will be widely used.

@Luckinber
Copy link

Congrats @projectgus!

Super exciting to see this move forward, I'll see you in the lib pull request 😉

@hoihu
Copy link
Sponsor Contributor

hoihu commented Mar 15, 2024

terrific feature! Congrats!

@slorquet
Copy link

Excellent news, congratulations and thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet