Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qdl: Allow user to decide USB OUT buffer multiplier #39

Closed

Conversation

WillDoItMyself
Copy link

Due to issue related in 760b3df qdl is not sending big chunk of data at once to USB driver. That affects performance of flashing.

This change allows user to set multiplier of buffer size. If may cause issues on some machines but gives huge speed gain.

Due to issue related in 760b3df qdl is not sending
big chunk of data at once to USB driver. That affects
performance of flashing. This change allows user
to set multiplier of buffer size. If may cause
issues on some machines but gives huge speed
gain.
@bmx666
Copy link

bmx666 commented Apr 10, 2023

Very useful feature and improvement, it saves me a lot of time, instead of 15 minutes I can flash devices in 2 minutes max! Thank you!

this can be used to avoid endless blocking in
seek call in case of no EDL device available
@z3ntu
Copy link

z3ntu commented Apr 17, 2024

Can confirm this works quite well. With this flashing goes from ~14MiB/s without this PR to about ~44MiB/s with --multiplier 2048

multiplier flashing speed
128 38262kB/s
256 40948kB/s
512 42437kB/s
1024 43222kB/s
2048 44038kB/s

When rebasing on master there's some minor conflicts, struct qdl_device has moved to the file qdl.h

@andersson
Copy link
Collaborator

Looking at that problem description again, I am puzzled to the exact details and why this wasn't fixed in the kernel instead - it seems unreasonable to allow user space to DOS the kernel like that. And indeed, there is a comment in the kernel documenting that the length can be "almost arbitrarily large". I also don't understand why I accepted changing this to 512 bytes.

With that in mind, I think the proposed multiplier-based mechanism favors user control over user friendliness, and I instead suggest that we clamp the value to e.g 128KB and if that breaks people's systems we 1) debug the kernel 2) consider our options for knobs etc.

andersson pushed a commit to quic-bjorande/qdl that referenced this pull request Jun 8, 2024
Since commit 760b3df ("qdl: Rework qdl_write to limit write sizes
to out_maxpktsize") outgoing transfers has been chunked into blocks of
wMaxPacketSize.

As reported by Wiktor Drewniak, Maksim Paimushkin, and Luca Weiss in [1]
there's huge benefits to be found in reverting this change, but out of
caution the limit was untouched.

With the transition to libusb, the throughput dropped another ~15% on my
machine. The numbers for HighSpeed and SuperSpeed are also in the same
ballpark.

With SuperSpeed, benchmarking of different chunk sizes in the megabyte
range shows improvement over these numbers in the range of 15-20x on the
same machine/board combination.

The bug report related to the reduction in size describes a host machine
running out of swiotlb, from the fact that we requested 1MB physically
contiguous memory. It's unclear if attempts was made to correct the
kernel's behavior. libusb does inquiry the capabilities of the kernel
and will split the buffer and submitting multiple URBs at once, as
needed.  While no definitive guidance has been found, multiple sources
seems to recommend passing 1-2MB of buffer to libusb at a time.  So,
let's move the default chunk size back up to 1MB and hope that libusb
resolves the reported problem.

Additionally, introduce a new option --out-chunk-size, which allow the
user to override the chunk size. This would allow any user to reduce the
size if needed, but also allow the value to be increased as needed.

[1] linux-msm#39

Reported-by: Wiktor Drewniak
Reported-by: Maksim Paimushkin
Reported-by: Luca Weiss
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
andersson pushed a commit that referenced this pull request Jun 10, 2024
Since commit 760b3df ("qdl: Rework qdl_write to limit write sizes
to out_maxpktsize") outgoing transfers has been chunked into blocks of
wMaxPacketSize.

As reported by Wiktor Drewniak, Maksim Paimushkin, and Luca Weiss in [1]
there's huge benefits to be found in reverting this change, but out of
caution the limit was untouched.

With the transition to libusb, the throughput dropped another ~15% on my
machine. The numbers for HighSpeed and SuperSpeed are also in the same
ballpark.

With SuperSpeed, benchmarking of different chunk sizes in the megabyte
range shows improvement over these numbers in the range of 15-20x on the
same machine/board combination.

The bug report related to the reduction in size describes a host machine
running out of swiotlb, from the fact that we requested 1MB physically
contiguous memory. It's unclear if attempts was made to correct the
kernel's behavior. libusb does inquiry the capabilities of the kernel
and will split the buffer and submitting multiple URBs at once, as
needed.  While no definitive guidance has been found, multiple sources
seems to recommend passing 1-2MB of buffer to libusb at a time.  So,
let's move the default chunk size back up to 1MB and hope that libusb
resolves the reported problem.

Additionally, introduce a new option --out-chunk-size, which allow the
user to override the chunk size. This would allow any user to reduce the
size if needed, but also allow the value to be increased as needed.

[1] #39

Reported-by: Wiktor Drewniak
Reported-by: Maksim Paimushkin
Reported-by: Luca Weiss
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
@andersson
Copy link
Collaborator

Replaced with #73, thanks for pushing for this!

@andersson andersson closed this Jun 10, 2024
@z3ntu
Copy link

z3ntu commented Aug 5, 2024

It's taken me a while to test the new version, but flashing without any extra arguments now take 2.6 minutes for my device, with flashing speed for super partition being around 41000kB/s to 42000kB/s, so being very close to what I tested at #39 (comment), and I have a bit different USB connections now compared to when I last tested.

Thanks @andersson!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants