qdl: Allow user to decide USB OUT buffer multiplier #39

WillDoItMyself · 2023-03-05T19:42:07Z

Due to the issue described in 760b3df, qdl does not send large chunks of data to the USB driver at once, which hurts flashing performance.

This change lets the user set a multiplier for the buffer size. It may cause issues on some machines but gives a huge speed gain.
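As a rough illustration of why the multiplier matters, the sketch below assumes the OUT buffer is simply the endpoint's wMaxPacketSize times the multiplier and counts how many writes a payload then needs. The function names are hypothetical and the 512-byte packet size used in the note afterwards is an assumption for illustration; none of this is code from the PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes handed to the USB driver per write,
 * assuming buffer size = wMaxPacketSize * multiplier. */
size_t out_buffer_size(size_t max_pkt_size, size_t multiplier)
{
	assert(max_pkt_size > 0 && multiplier > 0);
	return max_pkt_size * multiplier;
}

/* Number of writes needed to send `len` bytes, rounded up. */
size_t writes_needed(size_t len, size_t max_pkt_size, size_t multiplier)
{
	size_t chunk = out_buffer_size(max_pkt_size, multiplier);

	return (len + chunk - 1) / chunk;
}
```

Under these assumptions, with a 512-byte packet size a 1 MiB payload takes 2048 writes at multiplier 1 and 16 writes at multiplier 128.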

bmx666 · 2023-04-10T20:20:44Z

Very useful feature and improvement, it saves me a lot of time, instead of 15 minutes I can flash devices in 2 minutes max! Thank you!

z3ntu · 2024-04-17T14:52:38Z

Can confirm this works quite well. With this, flashing goes from ~14MiB/s without this PR to ~44MiB/s with --multiplier 2048

| multiplier | flashing speed |
|---|---|
| 128 | 38262kB/s |
| 256 | 40948kB/s |
| 512 | 42437kB/s |
| 1024 | 43222kB/s |
| 2048 | 44038kB/s |
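To put the figures above in perspective, the diminishing returns can be worked out directly. The helper below is a hypothetical illustration, not part of qdl; it computes the percentage gain of one measured rate over another.

```c
#include <assert.h>

/* Percentage gain of `rate` over `baseline` (both in the same unit). */
double gain_percent(double baseline, double rate)
{
	assert(baseline > 0.0);
	return (rate - baseline) / baseline * 100.0;
}
```

For example, `gain_percent(38262, 44038)` comes out at roughly 15%, so raising the multiplier from 128 to 2048 adds comparatively little on top of the jump already obtained at 128.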

When rebasing on master there are some minor conflicts: struct qdl_device has moved to the file qdl.h

andersson · 2024-05-03T23:20:57Z

Looking at that problem description again, I am puzzled about the exact details and why this wasn't fixed in the kernel instead - it seems unreasonable to allow user space to DoS the kernel like that. And indeed, there is a comment in the kernel documenting that the length can be "almost arbitrarily large". I also don't understand why I accepted changing this to 512 bytes.

With that in mind, I think the proposed multiplier-based mechanism favors user control over user friendliness, and I instead suggest that we clamp the value to e.g. 128KB, and if that breaks people's systems we 1) debug the kernel and 2) consider our options for knobs etc.
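A minimal sketch of the clamping idea, assuming the cap is a compile-time constant of 128KB as floated in the comment. The macro and function names are hypothetical and do not come from qdl.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cap, taken from the "e.g. 128KB" suggestion. */
#define MAX_OUT_CHUNK ((size_t)128 * 1024)

/* Hypothetical helper: limit a requested write size to the cap. */
size_t clamp_out_chunk(size_t requested)
{
	assert(requested > 0);
	return requested < MAX_OUT_CHUNK ? requested : MAX_OUT_CHUNK;
}
```

With this, a 512-byte request passes through unchanged while a 1MB request is reduced to 128KB, without any user-facing knob.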

Since commit 760b3df ("qdl: Rework qdl_write to limit write sizes to out_maxpktsize"), outgoing transfers have been chunked into blocks of wMaxPacketSize. As reported by Wiktor Drewniak, Maksim Paimushkin, and Luca Weiss in [1], there are huge benefits to be found in reverting this change, but out of caution the limit was left untouched.

With the transition to libusb, the throughput dropped another ~15% on my machine. The numbers for HighSpeed and SuperSpeed are also in the same ballpark.

With SuperSpeed, benchmarking of different chunk sizes in the megabyte range shows an improvement over these numbers in the range of 15-20x on the same machine/board combination.

The bug report related to the reduction in size describes a host machine running out of swiotlb because we requested 1MB of physically contiguous memory. It's unclear whether attempts were made to correct the kernel's behavior.

libusb does query the capabilities of the kernel and will split the buffer and submit multiple URBs at once, as needed. While no definitive guidance has been found, multiple sources seem to recommend passing 1-2MB of buffer to libusb at a time.

So, let's move the default chunk size back up to 1MB and hope that libusb resolves the reported problem. Additionally, introduce a new option --out-chunk-size, which allows the user to override the chunk size. This allows any user to reduce the size if needed, but also to increase it as needed.

[1] linux-msm#39

Reported-by: Wiktor Drewniak
Reported-by: Maksim Paimushkin
Reported-by: Luca Weiss
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
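The chunking described in this commit message can be sketched as a loop that hands the transfer routine at most one chunk at a time. This is an illustrative sketch, not the code from #73: `write_chunked`, `write_fn`, and the counting sink are hypothetical names, and the callback stands in for whatever real bulk-write call is used.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real transfer call (for example a libusb bulk write).
 * Returns 0 on success, non-zero on error. */
typedef int (*write_fn)(const unsigned char *buf, size_t len, void *ctx);

/* Send `len` bytes in pieces of at most `chunk_size` bytes. */
int write_chunked(const unsigned char *buf, size_t len, size_t chunk_size,
		  write_fn do_write, void *ctx)
{
	assert(chunk_size > 0);

	while (len > 0) {
		size_t n = len < chunk_size ? len : chunk_size;
		int ret = do_write(buf, n, ctx);

		if (ret)
			return ret;
		buf += n;
		len -= n;
	}
	return 0;
}

/* Bookkeeping for a sink that only records what it was asked to write. */
struct write_stats {
	size_t calls;
	size_t bytes;
	size_t largest;
};

int count_sink(const unsigned char *buf, size_t len, void *ctx)
{
	struct write_stats *s = ctx;

	(void)buf;
	s->calls++;
	s->bytes += len;
	if (len > s->largest)
		s->largest = len;
	return 0;
}
```

Passing `count_sink` with a small chunk size makes the effect visible: 10 bytes with a chunk size of 4 results in three calls of 4, 4, and 2 bytes. A larger chunk size means fewer, larger calls for the same payload.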

andersson · 2024-06-10T15:34:07Z

Replaced with #73, thanks for pushing for this!

z3ntu · 2024-08-05T12:19:12Z

It's taken me a while to test the new version, but flashing without any extra arguments now takes 2.6 minutes for my device, with the flashing speed for the super partition being around 41000kB/s to 42000kB/s. That is very close to what I measured in #39 (comment), even though my USB connections are a bit different now compared to when I last tested.

Thanks @andersson!

WillDoItMyself added 3 commits March 5, 2023 19:38

Add working link to README

7452eb6

Set default multiplier to 128

bb69742

MarSianer1 added 2 commits March 6, 2024 14:48

add timeout parameter

cbd987d

this can be used to avoid endless blocking in seek call in case of no EDL device available

extend help text with new timeout parameter

7478852

Merge pull request #1 from MarSianer1/buffer_size_multiplier

a22b49c

add timeout parameter

quic-bjorande mentioned this pull request Jun 8, 2024

usb: Allow overriding bulk write size and increase default #73

Merged

andersson closed this Jun 10, 2024
