wbruna
Contributor

@wbruna wbruna commented Sep 24, 2025

While debugging #838, I noticed that some models were being loaded with the maximum number of threads regardless of the --threads parameter, so I adjusted those calls to pass the thread count. For textual inversions, I kept the number of processing threads at 1, since spawning threads gave me higher latency than simply processing those smaller files directly.

Edit: removed the fix for #838
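
For readers unfamiliar with the pattern described above, here is a minimal sketch of how a "0 means unspecified" thread count is commonly resolved. The helper name `resolve_thread_count` is hypothetical and is not taken from this PR or from the project's code; it only illustrates why a call site that passes 0 instead of the `--threads` value ends up using every hardware thread.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper (an assumption for illustration, not the project's API):
// turn a user-supplied thread count into the number of workers to start.
// A value of 0 is treated as "not specified".
int resolve_thread_count(int requested) {
    assert(requested >= 0 && "thread count must not be negative");
    if (requested > 0) {
        return requested;  // honor the explicit value, e.g. from --threads
    }
    // Fallback: use every hardware thread the runtime reports.
    // hardware_concurrency() may return 0 if the value cannot be determined.
    unsigned int hw = std::thread::hardware_concurrency();
    return hw > 0 ? static_cast<int>(hw) : 1;
}
```

Under this sketch, a loader called with the user's value (for example `resolve_thread_count(4)`) uses 4 workers, while a loader called with 0 falls back to the machine-wide maximum. For very small inputs, passing 1 avoids thread start-up cost altogether.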

@leejet
Owner

leejet commented Sep 24, 2025

I think the fix in #844 is a bit clearer, so I chose to merge #844

@wbruna
Contributor Author

wbruna commented Sep 24, 2025

> I think the fix in #844 is a bit clearer, so I chose to merge #844

Alright. I'll rebase this, keeping only the other fixes, then.

Many code paths were defaulting to 0, which launches the maximum number of
threads regardless of the --threads parameter.
For consistency, since std::thread::hardware_concurrency can
return double the number of cores on SMT machines.
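
As a rough illustration of the SMT point in the commit message above: `std::thread::hardware_concurrency` reports a hint for the number of concurrent hardware threads (logical processors), and standard C++ has no portable call for the physical core count. The sketch below is only arithmetic under an assumed number of hardware threads per core; the function name `estimate_physical_cores` and the `threads_per_core` parameter are hypothetical and do not come from this PR.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper (illustration only): estimate the physical core count
// from a logical-thread count, assuming a known number of hardware threads
// per core (2 on typical SMT machines). The result is always at least 1.
unsigned int estimate_physical_cores(unsigned int logical_threads,
                                     unsigned int threads_per_core) {
    if (threads_per_core == 0) {
        threads_per_core = 1;  // guard against division by zero
    }
    return std::max(1u, logical_threads / threads_per_core);
}

// Convenience wrapper that feeds in the value reported by the standard library.
unsigned int estimate_physical_cores_here(unsigned int threads_per_core) {
    return estimate_physical_cores(std::thread::hardware_concurrency(),
                                   threads_per_core);
}
```

For example, a machine that reports 16 logical threads with 2 threads per core would be estimated at 8 physical cores. The actual topology has to come from the operating system, so this should be read as an approximation rather than a detection method.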
@wbruna wbruna force-pushed the fixes_tensor_loading_multithread branch from 206bfde to d4534a7 on September 24, 2025 15:37
@leejet leejet merged commit f3140ea into leejet:master Sep 24, 2025
9 checks passed
@leejet
Owner

leejet commented Sep 24, 2025

Thank you for your contribution.
