-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
qdl: Allow user to decide USB OUT buffer multiplier #39
qdl: Allow user to decide USB OUT buffer multiplier #39
Conversation
Due to issue related in 760b3df qdl is not sending big chunk of data at once to USB driver. That affects performance of flashing. This change allows user to set multiplier of buffer size. If may cause issues on some machines but gives huge speed gain.
Very useful feature and improvement, it saves me a lot of time, instead of 15 minutes I can flash devices in 2 minutes max! Thank you! |
this can be used to avoid endless blocking in seek call in case of no EDL device available
Can confirm this works quite well. With this flashing goes from ~14MiB/s without this PR to about ~44MiB/s with
When rebasing on master there's some minor conflicts, |
add timeout parameter
Looking at that problem description again, I am puzzled to the exact details and why this wasn't fixed in the kernel instead - it seems unreasonable to allow user space to DOS the kernel like that. And indeed, there is a comment in the kernel documenting that the length can be "almost arbitrarily large". I also don't understand why I accepted changing this to 512 bytes. With that in mind, I think the proposed multiplier-based mechanism favors user control over user friendliness, and I instead suggest that we clamp the value to e.g 128KB and if that breaks people's systems we 1) debug the kernel 2) consider our options for knobs etc. |
Since commit 760b3df ("qdl: Rework qdl_write to limit write sizes to out_maxpktsize") outgoing transfers has been chunked into blocks of wMaxPacketSize. As reported by Wiktor Drewniak, Maksim Paimushkin, and Luca Weiss in [1] there's huge benefits to be found in reverting this change, but out of caution the limit was untouched. With the transition to libusb, the throughput dropped another ~15% on my machine. The numbers for HighSpeed and SuperSpeed are also in the same ballpark. With SuperSpeed, benchmarking of different chunk sizes in the megabyte range shows improvement over these numbers in the range of 15-20x on the same machine/board combination. The bug report related to the reduction in size describes a host machine running out of swiotlb, from the fact that we requested 1MB physically contiguous memory. It's unclear if attempts was made to correct the kernel's behavior. libusb does inquiry the capabilities of the kernel and will split the buffer and submitting multiple URBs at once, as needed. While no definitive guidance has been found, multiple sources seems to recommend passing 1-2MB of buffer to libusb at a time. So, let's move the default chunk size back up to 1MB and hope that libusb resolves the reported problem. Additionally, introduce a new option --out-chunk-size, which allow the user to override the chunk size. This would allow any user to reduce the size if needed, but also allow the value to be increased as needed. [1] linux-msm#39 Reported-by: Wiktor Drewniak Reported-by: Maksim Paimushkin Reported-by: Luca Weiss Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Since commit 760b3df ("qdl: Rework qdl_write to limit write sizes to out_maxpktsize") outgoing transfers has been chunked into blocks of wMaxPacketSize. As reported by Wiktor Drewniak, Maksim Paimushkin, and Luca Weiss in [1] there's huge benefits to be found in reverting this change, but out of caution the limit was untouched. With the transition to libusb, the throughput dropped another ~15% on my machine. The numbers for HighSpeed and SuperSpeed are also in the same ballpark. With SuperSpeed, benchmarking of different chunk sizes in the megabyte range shows improvement over these numbers in the range of 15-20x on the same machine/board combination. The bug report related to the reduction in size describes a host machine running out of swiotlb, from the fact that we requested 1MB physically contiguous memory. It's unclear if attempts was made to correct the kernel's behavior. libusb does inquiry the capabilities of the kernel and will split the buffer and submitting multiple URBs at once, as needed. While no definitive guidance has been found, multiple sources seems to recommend passing 1-2MB of buffer to libusb at a time. So, let's move the default chunk size back up to 1MB and hope that libusb resolves the reported problem. Additionally, introduce a new option --out-chunk-size, which allow the user to override the chunk size. This would allow any user to reduce the size if needed, but also allow the value to be increased as needed. [1] #39 Reported-by: Wiktor Drewniak Reported-by: Maksim Paimushkin Reported-by: Luca Weiss Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Replaced with #73, thanks for pushing for this! |
It's taken me a while to test the new version, but flashing without any extra arguments now take 2.6 minutes for my device, with flashing speed for super partition being around 41000kB/s to 42000kB/s, so being very close to what I tested at #39 (comment), and I have a bit different USB connections now compared to when I last tested. Thanks @andersson! |
Due to issue related in 760b3df qdl is not sending big chunk of data at once to USB driver. That affects performance of flashing.
This change allows user to set multiplier of buffer size. If may cause issues on some machines but gives huge speed gain.