Multithreading in diart.benchmark
#85
Comments
For progress bars, see p_tqdm and tqdm with locks.
Alternative:
There are two options for progress bars:
I would accept both but strongly prefer the second.
I've been working on this lately. Rich works well with multithreading, but for some reason it's extremely slow to spawn new workers (maybe because of the GIL?). Whenever multiprocessing is not needed, rich is used by default. I'm also implementing it so that users can manually choose the progress bar they want.
Implemented in #124
Problem
Running a benchmark on a huge dataset can take a lot of time. One of the main bottlenecks is that files are processed sequentially.
Idea
Make `diart.benchmark` (and hence `diart.tune`) run concurrently on many files at once with a predefined number of workers.
It would be great if progress bars could be kept; otherwise we need to find a good solution to show progress.
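A minimal sketch of the idea, using only the standard library: a fixed-size pool processes files concurrently and a callback reports progress as each file finishes. `process_file`, `run_concurrently` and `on_progress` are hypothetical names chosen for illustration and are not part of diart. Threads are used here for simplicity; a process pool would follow the same pattern.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def process_file(path: str) -> int:
    """Hypothetical stand-in for benchmarking one file (not diart's API)."""
    return len(path)


def run_concurrently(
    paths: Iterable[str],
    num_workers: int,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> Dict[str, int]:
    """Process every path with a fixed-size pool and report progress."""
    paths = list(paths)
    results: Dict[str, int] = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = {pool.submit(process_file, p): p for p in paths}
        # as_completed yields futures in completion order, so the progress
        # callback fires as soon as each file finishes.
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            if on_progress is not None:
                on_progress(done, len(paths))
    return results
```

The callback keeps the pool independent of any particular progress bar library: it could update a tqdm or rich bar from the main thread, or simply print `done/total`.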
Another potential problem is having `N` segmentation and embedding model copies in memory, but since they're stateless there should be a workaround to share them. However, I would accept a first version with `N` models in RAM anyway and think about potential improvements afterwards.
See RxPY concurrency
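To illustrate the memory trade-off, here is a rough sketch contrasting one shared model with one copy per worker. It assumes thread-based workers and a model whose call is safe to run from several threads at once, which the issue does not confirm for diart's models. `DummyModel`, `run_shared` and `run_per_worker` are hypothetical names, not diart code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List


class DummyModel:
    """Hypothetical stateless model that counts how many copies were built."""

    instances = 0
    _lock = threading.Lock()

    def __init__(self) -> None:
        with DummyModel._lock:
            DummyModel.instances += 1

    def __call__(self, x: int) -> int:
        return x * 2  # No internal state is read or written.


def run_shared(inputs: List[int], num_workers: int) -> List[int]:
    """All worker threads call the same model object (one copy in memory)."""
    model = DummyModel()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(model, inputs))


def run_per_worker(inputs: List[int], num_workers: int) -> List[int]:
    """Each worker thread lazily builds its own copy (up to N copies)."""
    local = threading.local()

    def infer(x: int) -> int:
        if not hasattr(local, "model"):
            local.model = DummyModel()
        return local.model(x)

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(infer, inputs))
```

With process-based workers the shared variant does not carry over directly, since each process has its own address space; in that case one copy per worker is the simpler first version.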