@DePasqualeOrg
Contributor

I noticed that CPU usage in my app was above 300% during 4 concurrent model downloads. I made some changes that appear to reduce the CPU usage by about half:

  • Bytes are appended to the buffer in batches of 16 kB instead of individually.
  • ContiguousArray<UInt8> is used for collecting batches instead of Array<UInt8>.
  • The unnecessary creation of a TaskGroup on each update was removed.
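
As a rough illustration of the first two points, the batching could look something like the sketch below. This is not the code from the PR: `collectInBatches` and `flush` are hypothetical names, and the batch size of `16 * 1024` is an assumption about what "16 kB" means here. It is written against any `AsyncSequence` of `UInt8` so it can be tried without a network connection.

```swift
import Foundation

/// Hypothetical helper (not from the PR): drains a byte sequence and hands
/// the bytes to `flush` in batches instead of one at a time.
func collectInBatches<S: AsyncSequence>(
    from bytes: S,
    batchSize: Int = 16 * 1024,  // assumed meaning of "16 kB"
    flush: (ContiguousArray<UInt8>) throws -> Void
) async throws where S.Element == UInt8 {
    // ContiguousArray always uses contiguous native storage, so appends
    // avoid any bridging checks that Array may perform on Apple platforms.
    var batch = ContiguousArray<UInt8>()
    batch.reserveCapacity(batchSize)

    for try await byte in bytes {
        batch.append(byte)
        if batch.count >= batchSize {
            try flush(batch)
            // Keep the allocation so the next batch does not reallocate.
            batch.removeAll(keepingCapacity: true)
        }
    }

    // Flush whatever is left once the sequence ends.
    if !batch.isEmpty {
        try flush(batch)
    }
}
```

With a real download, `bytes` would be the `URLSession.AsyncBytes` value returned by `URLSession.bytes(for:)`, and `flush` could append `Data(batch)` to a buffer or file handle and report progress directly, without spinning up a task group for each update.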

The use of URLSession.bytes(for:) is inherently expensive because it delivers bytes one at a time through an AsyncSequence, but I'm not sure whether there's a better alternative that still allows for resumable downloads.
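
For context on the resumability constraint, one common way to resume a partial download is an HTTP `Range` request. The sketch below only builds such a request and is an assumption about the general approach, not a description of how this library does it; `resumableRequest` and `offset` are hypothetical names.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

/// Hypothetical helper: builds a request that asks the server to send only
/// the bytes from `offset` onward, where `offset` is the number of bytes
/// already saved locally.
func resumableRequest(for url: URL, resumingFrom offset: Int) -> URLRequest {
    var request = URLRequest(url: url)
    if offset > 0 {
        // Open-ended byte range: everything from `offset` to the end.
        request.setValue("bytes=\(offset)-", forHTTPHeaderField: "Range")
    }
    return request
}
```

A server that honors the range replies with 206 Partial Content. A plain 200 response means the range was ignored, and the download has to start again from the first byte.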

Member

@pcuenca left a comment

This looks ok to me. Do you have any feedback @ardaatahan?

@pcuenca merged commit d93354d into huggingface:main on Nov 26, 2025
2 checks passed
