Set the number of threads that OMP can use. by arjenpdevries · Pull Request #2 · cwida/PDX

arjenpdevries · 2025-04-25T19:06:39Z

Finally found some time to try out the great PDX library!

The initial results were a little too impressive: I saw a 200x speedup (instead of the 5-7x reported). Then I realized I was working on a machine with 48 cores. It is still highly impressive to see how PDX uses them all to the max, but FAISS needs a little help to use its internal multithreading.

The pull request uses the function from the FAISS Python bindings to increase the number of threads for OMP, and sets it to half the available cores as reported by the Python multiprocessing library.
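For readers who want to try this locally, here is a minimal sketch of the idea, not the exact code from the PR. It assumes the `faiss` Python package is installed and uses its `omp_set_num_threads` function; the helper names `half_of_cores` and `configure_faiss_threads` are hypothetical and only serve the illustration.

```python
import multiprocessing


def half_of_cores() -> int:
    """Return half the logical cores reported by multiprocessing, at least 1."""
    return max(1, multiprocessing.cpu_count() // 2)


def configure_faiss_threads(num_threads: int) -> bool:
    """Ask FAISS to use `num_threads` OpenMP threads.

    Returns False when FAISS is not installed, so the sketch still runs
    on machines without it.
    """
    try:
        import faiss  # assumption: the FAISS Python bindings are available
    except ImportError:
        return False
    faiss.omp_set_num_threads(num_threads)
    return True


if __name__ == "__main__":
    n = half_of_cores()
    applied = configure_faiss_threads(n)
    print(f"requested {n} OpenMP threads; applied to FAISS: {applied}")
```

Using half the cores is a conservative default; passing `1` instead would pin FAISS to a single thread for one-query-at-a-time comparisons.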

(I'd understand if you do not want to do this by default, but would recommend documenting it if so.)

arjenpdevries · 2025-04-25T20:42:39Z

PS: also fixed a minor bug. Apologies for not using git branches correctly so it went into the same pullreq.

lkuffo · 2025-05-13T16:30:34Z

Hi Arjen,
This is a good observation. When I was using FAISS on an AMD Zen4 machine, it automatically used all of the available threads for index construction. Maybe this is not the case on some CPU architectures.

I was also wondering where you saw the 200x speedup? The examples are supposed to use FAISS by performing queries one by one (no multithreading).

I will, of course, accept the PR; thanks a lot! We are working to improve the code in a separate branch.
We are also working to implement our own IVF index construction phase and quantization algorithm (to use 8 and 4 bits per dimension)!

(Sorry for the long delay in answering; I went away for a long holiday!)

arjenpdevries · 2025-05-13T16:39:00Z

No problem! The 200x referred to my first try with the "simple" example that does not use FAISS; that one used all cores, but the FAISS variant did not.

I noticed the benchmark scripts are controlling that perfectly, indeed. (Hope you had a great holiday!)

* Super IMI-PDX
* Trying Balanced Clusters
* Towards releasing 8 bit PDX
* Code is nice and refactored
* Better kernels benchmarking
* New structure
* AVX512 kernels
* AVX512 kernels
* AVX512 kernels
* AVX512 kernels
* AVX512 kernels
* AVX512 kernels
* BOND 32-bit working again
* Python bindings v0.2
* First readme iteration
* First readme iteration
* First readme iteration
* IVF probe as a true parameter
* Fixing datasets of d=128
* FAISS SQ8 benchmarking script
* Adding simple plotter
* Adding simple plotter
* Adding updated datasets link
* AVX build fix
* AVX build fix
* Final touches
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Readme
* Removing unnecessary stuff from sigmod
* README
* Removing results showcasing in examples

arjenpdevries added 2 commits April 25, 2025 21:00

Set the number of threads that OMP can use.

21d3b94

Loop variable name mismatch (caused compilation errors on my end).

1d6facd

lkuffo merged commit ad359cf into cwida:main May 14, 2025

fangshil mentioned this pull request Jul 15, 2025

Segfault when testing ./examples/pdxearch_simple.py #3

Open

lkuffo merged 2 commits into cwida:main from arjenpdevries:main
