Multithreaded queries problem #23

Closed
konstantin-sancom opened this issue Jan 25, 2023 · 5 comments

Comments

@konstantin-sancom

Hi @JorenSix,

I tried to add multithreading and found that it causes a problem:

  • in unrestricted mode (new threads are created as fast as possible), Olaf found only 26 of the 464 input files in the DB
  • with a delay of 0.5 to 1 second, the results were as good as in serial mode (no threading at all)
  • with a delay of 0.25 seconds or less, the results degrade

Could you explain whether this is by design or a bug?

@konstantin-sancom
Author

konstantin-sancom commented Jan 25, 2023

Update.
The problem is here:

    // The fft struct is reused
    PFFFT_Setup *fftSetup = processor->runner->fftSetup;
    float *fft_in = processor->runner->fft_in;
    float *fft_out = processor->runner->fft_out;

With multiple threads, these objects cannot simply be "reused".
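
One way to avoid this kind of sharing, shown here only as a minimal sketch and not as the change that was actually made to Olaf, is to give every worker thread its own scratch buffers. The field names `fft_in` and `fft_out` are borrowed from the snippet above. Everything else (`worker_ctx`, `worker_main`, `run_workers`, `BLOCK_SIZE`) is hypothetical, and the "transform" is a placeholder that just doubles the input instead of calling PFFFT.

```c
#include <pthread.h>
#include <stdlib.h>

#define BLOCK_SIZE 1024 /* hypothetical transform size, not taken from Olaf */

/* Hypothetical per-thread context: each worker owns its scratch buffers. */
typedef struct {
    float *fft_in;   /* private input buffer */
    float *fft_out;  /* private output buffer */
    int query_id;    /* which query this worker handles */
    float result;    /* written by the worker, read after join */
} worker_ctx;

/* Stand-in for the real analysis: fill the input, "transform" it by
 * doubling, and sum the output. Only this worker's buffers are touched. */
static void *worker_main(void *arg) {
    worker_ctx *ctx = arg;
    float sum = 0.0f;
    for (int i = 0; i < BLOCK_SIZE; i++) {
        ctx->fft_in[i] = (float)ctx->query_id;
        ctx->fft_out[i] = 2.0f * ctx->fft_in[i];
        sum += ctx->fft_out[i];
    }
    ctx->result = sum;
    return NULL;
}

/* Run n workers concurrently and copy each result into results[0..n-1].
 * Returns 0 on success, -1 on allocation or thread-creation failure. */
int run_workers(int n, float *results) {
    worker_ctx *ctx = calloc((size_t)n, sizeof *ctx);
    pthread_t *threads = malloc((size_t)n * sizeof *threads);
    int started = 0, status = 0;
    if (!ctx || !threads) { free(ctx); free(threads); return -1; }

    for (int i = 0; i < n; i++) {
        ctx[i].query_id = i;
        ctx[i].fft_in = malloc(BLOCK_SIZE * sizeof(float));
        ctx[i].fft_out = malloc(BLOCK_SIZE * sizeof(float));
        if (!ctx[i].fft_in || !ctx[i].fft_out ||
            pthread_create(&threads[i], NULL, worker_main, &ctx[i]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        results[i] = ctx[i].result;
    }
    for (int i = 0; i < n; i++) {
        free(ctx[i].fft_in);
        free(ctx[i].fft_out);
    }
    free(ctx);
    free(threads);
    return status;
}
```

Because no buffer is visible to more than one thread, the result for each `query_id` does not depend on how fast the threads are started. On most toolchains this needs to be built with `-pthread`.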

@JorenSix
Owner

Hi,

I am not sure about the added value of multi-threading. Perhaps your tests might prove otherwise but I think decoding and storage are the bottleneck. From the readme:

Olaf is single-threaded. The main reasons are simplicity and limitations of embedded platforms. The single-threaded design keeps the code simple. On embedded platforms with single-core CPUs multithreading makes no sense. On traditional computers there might be a performance gain by implementing multithreading. However, the time spent on decoding audio and storing fingerprints is much larger than analysis/extraction, so the gain might be limited. As a workaround, multiple processes can be used simultaneously to query the database.
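
The multi-process workaround can be sketched roughly as follows on a POSIX system. This is an illustration under stated assumptions, not Olaf code: `query_one` is a hypothetical placeholder for running one query, and `query_in_processes` is a hypothetical helper that spreads the file indices over child processes created with `fork`. Each child has its own address space, so no analysis state is shared between them.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for querying one file; returns 0 on success. */
static int query_one(int file_index) {
    (void)file_index;
    return 0;
}

/* Spread n_files over n_procs child processes (child p handles files
 * p, p + n_procs, p + 2 * n_procs, ...). Returns the number of children
 * that exited with status 0. */
int query_in_processes(int n_files, int n_procs) {
    int started = 0;
    for (int p = 0; p < n_procs; p++) {
        pid_t pid = fork();
        if (pid < 0) {
            break; /* could not create more children */
        }
        if (pid == 0) {
            /* Child: process its share, then exit without returning. */
            int failed = 0;
            for (int i = p; i < n_files; i += n_procs) {
                if (query_one(i) != 0) {
                    failed = 1;
                }
            }
            _exit(failed);
        }
        started++;
    }

    /* Parent: wait for every child that was started. */
    int ok = 0;
    for (int k = 0; k < started; k++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            ok++;
        }
    }
    return ok;
}
```

For example, `query_in_processes(464, 4)` would split 464 files over four processes and report how many of them finished cleanly.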

@JorenSix
Owner

But please do not let that stop you from experimenting and looking for places where data is shared that should not be, or for ways to improve Olaf in general. I am grateful for all constructive criticism on the code! Thanks!

@konstantin-sancom
Author

But please do not let that stop you from experimenting and looking for places where data is shared that should not be, or for ways to improve Olaf in general. I am grateful for all constructive criticism on the code! Thanks!

OK, I see now: this code is for embedded systems.
On x86 platforms there is a reason for multithreading, and I made it work, BTW.
Reading from the DB is not a bottleneck: it is thread-safe. According to the LMDB documentation, there can be many readers, as processes or threads, that do not lock each other, and there can even be one writer that does not block the readers.

@JorenSix
Owner

Congrats! I am very curious to see which changes were needed, what they look like, and what effect they have on (query) performance, especially compared with queries from multiple processes.
