Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAM V71B Initial USB Transfer Drops Data Bytes #48671

Closed
nick-kraus opened this issue Aug 4, 2022 · 4 comments · Fixed by #49888
Closed

SAM V71B Initial USB Transfer Drops Data Bytes #48671

nick-kraus opened this issue Aug 4, 2022 · 4 comments · Fixed by #49888
Assignees
Labels
area: USB Universal Serial Bus bug The issue is a bug, or the PR is fixing a bug platform: Microchip SAM Microchip SAM Platform (formerly Atmel SAM) priority: low Low impact/importance bug
Milestone

Comments

@nick-kraus
Copy link
Contributor

Describe the bug
After a reboot, the first instance of writing data from the device (sam_v71b_xult board, usbhs) to a host (PC) gets dropped and no data is available to be read from the host. After the first write, subsequent writes work as expected, even when resetting and re-configuring the usb port.

Problem was originally discovered when developing a custom firmware which acts as a usb composite device, saw same behavior on both a raw bulk endpoint loopback, and a configured cdc acm interface which was looped back over a uart interface (where the serial emulator app wouldn't register the first byte sent to the device after a reboot). In both these cases, with additional logging enabled, the board usb subsystem saw the incoming usb out packet from the host, did its processing (loopback), and as far as the board knows correctly wrote a usb in packet of the same data and size.

The same behavior is not seen on a nucleo_h743zi dev board with the same codebase.

To Reproduce
Steps to reproduce the behavior:

  1. build and flash the webusb sample onto an Atmel sam_v71b_xult dev board
  2. reset the dev board
  3. use the below python and pyusb script to view the buggy behavior

Expected behavior
Even after a full device reboot, the first usb out transaction should be properly looped back and able to be read by the host.

Impact
Major annoyance for proper device usage and testing, as any time the usb is used it now needs to be 'primed' with a write before doing any useful data, as we don't have a way of knowing after connecting if the board has been freshly rebooted.

Logs and console output
The webusb sample has been modified to have additional debug logging on the usb device. In the below snippet both of the usb loopback transfers appear identical to the device.

[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer_work: Transfer done, ep 0x02, status 0, size 3←[0m
[00:00:46.470,000] ←[0m<dbg> webusb: webusb_read_cb: cfg 0x204001a0 ep 2 size 3←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer: Transfer start, ep 0x81, data 0x204047e4, dlen 3←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer: Transfer start, ep 0x02, data 0x204047e4, dlen 512←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer_work: Transfer done, ep 0x81, status 0, size 3←[0m
[00:00:46.470,000] ←[0m<dbg> webusb: webusb_write_cb: ep 81 size 3←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer_work: Transfer cancelled or completed, ep 0x02←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer_work: Transfer done, ep 0x02, status 0, size 3←[0m
[00:00:46.470,000] ←[0m<dbg> webusb: webusb_read_cb: cfg 0x204001a0 ep 2 size 3←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer: Transfer start, ep 0x81, data 0x204047e4, dlen 3←[0m
[00:00:46.470,000] ←[0m<dbg> usb_transfer: usb_transfer: Transfer start, ep 0x02, data 0x204047e4, dlen 512←[0m
[00:00:46.471,000] ←[0m<dbg> usb_transfer: usb_transfer_work: Transfer done, ep 0x81, status 0, size 3←[0m
[00:00:46.471,000] ←[0m<dbg> webusb: webusb_write_cb: ep 81 size 3←[0m

Environment (please complete the following information):

  • OS: Windows 10 21H1
  • Toolchain: gnuarmemb, 2020-q4-major, v10.2.1
  • Zephyr Release Tag: zephyr-v3.1.0

Additional context

# python script used to cause buggy behavior
import usb.core
dev = usb.core.find(idVendor=0x2fe3, idProduct=0x000a)
# replace 0x02 and 0x81 with the necessary endpoint address, if they differ from here
# below will always print '3'
print(f'wrote {dev.write(0x02, [1, 2, 3])} bytes to usb')
# below will not print the loopback data after initial board boot or reset
# but it will print the loopback data correctly on the second run of the script (without a reset)
print(f'received {dev.read(0x81, 3)} back from usb')
# below will always print '3'
print(f'wrote {dev.write(0x02, [4, 5, 6])} bytes to usb')
# below will always print '[4, 5, 6]', even when above fails, so the data isn't just delayed
print(f'received {dev.read(0x81, 3)} back from usb')
@nick-kraus nick-kraus added the bug The issue is a bug, or the PR is fixing a bug label Aug 4, 2022
@nick-kraus nick-kraus changed the title SAM V71B Initial USB Connection Drops Data Bytes SAM V71B Initial USB Transfer Dops Data Bytes Aug 4, 2022
@nick-kraus nick-kraus changed the title SAM V71B Initial USB Transfer Dops Data Bytes SAM V71B Initial USB Transfer Drops Data Bytes Aug 4, 2022
@henrikbrixandersen henrikbrixandersen added platform: Microchip SAM Microchip SAM Platform (formerly Atmel SAM) area: USB Universal Serial Bus labels Aug 9, 2022
@fabiobaltieri fabiobaltieri added the priority: low Low impact/importance bug label Aug 9, 2022
@nandojve nandojve added this to the future milestone Aug 13, 2022
@nick-kraus
Copy link
Contributor Author

Hi @nandojve , I've been trying to look into and diagnose what exactly is happening here, and I think I've found what is going on. It looks like the pointer to the endpoint FIFO for any given endpoint isn't being initialized until after the first write attempt, causing the first write transfer on any endpoint to be dropped.

In the existing code, when usb_dc_ep_write() is called for the first time on any given IN endpoint, the FIFO pointer is null, as seen in the logs below ('into fifo ((nil))...'):

[00:02:43.402,000] <wrn> usb_transfer: Transfer start, ep 0x81, data 0x20402798, dlen 4
[00:02:43.402,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:02:43.402,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo ((nil)) of ep 0x81
[00:02:43.402,000] <wrn> usb_transfer: Transfer done, ep 0x81, status 0, size 4
[00:02:43.403,000] <wrn> usb_transfer: Transfer done, ep 0x02, status 0, size 3
[00:02:43.403,000] <wrn> usb_transfer: Transfer start, ep 0x81, data 0x20402798, dlen 3
[00:02:43.403,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:02:43.403,000] <wrn> usb_dc_sam_usbhs: putting 3 bytes of data into fifo (0xa0108000) of ep 0x81
[00:02:43.403,000] <wrn> usb_transfer: Transfer done, ep 0x81, status 0, size 3
[00:02:43.403,000] <wrn> usb_transfer: Transfer done, ep 0x02, status 0, size 2
[00:02:43.403,000] <wrn> usb_transfer: Transfer start, ep 0x81, data 0x20402798, dlen 2
[00:02:43.403,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:02:43.403,000] <wrn> usb_dc_sam_usbhs: putting 2 bytes of data into fifo (0xa0108000) of ep 0x81
[00:02:43.403,000] <wrn> usb_transfer: Transfer done, ep 0x81, status 0, size 2

By simply adding a check to see if the fifo pointer is null before filling, and if so resetting the endpoint fifo, it seems to have now corrected the incorrect behavior. I unfortunately haven't looked through the code enough to understand if this is the proper or best way to fix this issue, I mostly tried to isolate the issue and hack in whatever solution I could.

I would be more than happy to do the work to resolve this issue and get a fix into a pull request, but I would appreciate some guidance if possible by someone who knows this particular code better.

I have added the code I used to test / diagnose / verify what I have found to this branch on my fork of zephyr. The most recent commit contains a hacky fix to the problem, and the commit before contains a python test script to help show the behavior and additional log statements that help to diagnose it.

@nandojve
Copy link
Member

Hi @nick-kraus ,

One thing that help me in the past was ran testusb. I`ll see if I can check this issue on the weekend once it seems you maybe found the root of problem.

@nick-kraus
Copy link
Contributor Author

Hi @nandojve,

I spend some time this weekend trying to find any issues with the testusb sample, however I wasn't able to get the testusb application to run more than just the control transfer tests 0, 9, and 10; which all passed (even after following the directions where you specify usbtest uses the "Gadget Zero" interface). Unfortunately I only had access to a Linux VM (Ubuntu 22.04) and not a dedicated Linux box itself, which may have played a role in the difficulties I was having (connecting / keeping the device connected to the VM ended up rather flaky).

After messing around further with zephyr testusb sample, I found I wasn't even able to get the simple loopback functionality to work at all. The device enumerated with a single BULK OUT endpoint at address 0x02 and a single BULK IN endpoint at address 0x81, and while I was able to write data to the OUT endpoint, I couldn't ever receive data back from the IN endpoint (any reads timed out, indicating that no data was ever written back out to that endpoint). I am not sure if I was just doing something egregiously wrong here or not, if you could verify (or not) this behavior that would be appreciated.

Regardless, I am still able to reliably recreate the original issue using the webusb sample (modified to add extra logging), which also just uses a usb loopback (but with a different code path). I have included a python script that shows the failure I'm describing (and how it works successfully after my hacky fix); it uses the pyusb package for transferring to / from the endpoints and the pyocd package for ensuring the device is in a freshly restarted state each iteration.

Python test script
import time
import usb.core
from pyocd.core.helpers import ConnectHelper
session = ConnectHelper.session_with_chosen_probe()
session.open()
session.board.target.reset()
time.sleep(1)
dev = usb.core.find(idVendor=0x2fe3, idProduct=0x000a)
dev.write(0x02, [1, 2, 3])
if dev.read(0x81, 3).tobytes() != b'\x01\x02\x03':
    raise ValueError("test failed, data was not returned after fresh reboot")
else:
    print("test success")

When this script is run on commit f15c4c237488ca819d87e295ffcac3ad4382330f of my repo (upstream zephyr main but with some extra logging statements) is throws the "test failed" error, because the first transfer doesn't fully loopback as expected, and the following logs are printed. As you can see, the address of the internal driver FIFO of ep 0x81 is NULL at the time of the first USB transfer, which is clearly a bug.

Failing test logs
*** Booting Zephyr OS build zephyr-v3.1.0-2872-gf15c4c237488  ***
[00:00:00.180,000] <wrn> usb_dc_sam_usbhs: putting 18 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.181,000] <wrn> usb_dc_sam_usbhs: putting 0 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.191,000] <wrn> usb_dc_sam_usbhs: putting 18 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 32 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 64 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 0 bytes of data into fifo (0xa0100000) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 38 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 34 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.196,000] <wrn> usb_dc_sam_usbhs: putting 42 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.717,000] <wrn> usb_dc_sam_usbhs: putting 2 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.717,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:00:00.717,000] <wrn> usb_dc_sam_usbhs: putting 0 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.729,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.729,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.730,000] <wrn> usb_dc_sam_usbhs: putting 14 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.730,000] <wrn> usb_dc_sam_usbhs: putting 14 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.730,000] <wrn> usb_dc_sam_usbhs: putting 42 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.731,000] <wrn> usb_dc_sam_usbhs: putting 42 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.735,000] <wrn> usb_dc_sam_usbhs: putting 17 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:01.082,000] <wrn> usb_transfer: Transfer done, ep 0x02, status 0, size 3
[00:00:01.082,000] <wrn> usb_transfer: Transfer start, ep 0x81, data 0x20402798, dlen 3
[00:00:01.082,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:00:01.082,000] <wrn> usb_dc_sam_usbhs: putting 3 bytes of data into fifo ((nil)) of ep 0x81
[00:00:01.082,000] <wrn> usb_transfer: Transfer done, ep 0x81, status 0, size 3

Re-running the script on commit fd596012c515a9217777e47ef7a6e02da91e1833 of my repo (hacky fix for bug, need guidance on how to improve) will produce the successful output, since the fix I made will check that the FIFO is a valid address (not NULL) when the usb_dc_ep_write is called, and if it is not valid then it will reset the FIFO (which sets it to the correct address). In the current zephyr repo it seems that this fifo is not being initialized early enough, since it is always NULL on the first call to usb_dc_ep_write for any given (non-control) endpoint. Below are the logs showing the first call with that endpoint showing a valid address.

Successful test logs
*** Booting Zephyr OS build zephyr-v3.1.0-2873-gfd596012c515  ***
[00:00:00.182,000] <wrn> usb_dc_sam_usbhs: putting 18 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.182,000] <wrn> usb_dc_sam_usbhs: putting 0 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.193,000] <wrn> usb_dc_sam_usbhs: putting 18 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 32 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 64 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 0 bytes of data into fifo (0xa0100000) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 38 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 34 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.204,000] <wrn> usb_dc_sam_usbhs: putting 42 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.211,000] <wrn> usb_dc_sam_usbhs: putting 2 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.212,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:00:00.212,000] <wrn> usb_dc_sam_usbhs: putting 0 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.219,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.219,000] <wrn> usb_dc_sam_usbhs: putting 4 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.220,000] <wrn> usb_dc_sam_usbhs: putting 14 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.220,000] <wrn> usb_dc_sam_usbhs: putting 14 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.220,000] <wrn> usb_dc_sam_usbhs: putting 42 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.220,000] <wrn> usb_dc_sam_usbhs: putting 42 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:00.222,000] <wrn> usb_dc_sam_usbhs: putting 17 bytes of data into fifo (0xa0100008) of ep 0x80
[00:00:01.069,000] <wrn> usb_transfer: Transfer done, ep 0x02, status 0, size 3
[00:00:01.069,000] <wrn> usb_transfer: Transfer start, ep 0x81, data 0x20402798, dlen 3
[00:00:01.069,000] <wrn> usb_transfer: Transfer start, ep 0x02, data 0x20402798, dlen 512
[00:00:01.069,000] <wrn> usb_dc_sam_usbhs: putting 3 bytes of data into fifo (0xa0108000) of ep 0x81
[00:00:01.069,000] <wrn> usb_transfer: Transfer done, ep 0x81, status 0, size 3

If there is any chance that you can recreate this issue, and then also recreate the working solution, with the above repo and scripts, that would be much appreciated. Once I have got some confirmation that this issue is reproducible for others and that my quick fix is working I can begin to look into an improved fix for this problem, and eventually open a PR.

@nick-kraus
Copy link
Contributor Author

I ended up finding a good place to put the FIFO reset / initialization call, which resolved the issue, so I ended up creating pr #49888.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: USB Universal Serial Bus bug The issue is a bug, or the PR is fixing a bug platform: Microchip SAM Microchip SAM Platform (formerly Atmel SAM) priority: low Low impact/importance bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants