
Speed up ivfflat build: use float instead of double for dot product #180

Merged

Conversation

pashkinelfe
Contributor

This is the first patch of patchset #178 as requested in the comments #178 (comment)

Regarding converting the distance function interface from double to float: it would require modifying the vector.sql interface, which may need a major extension upgrade. If we're ready for that, I'd happily modify the patch as requested in #178 (comment). For now I've left it as is for compatibility, and because that modification would not make ivfflat index builds much faster.

@ankane ankane mentioned this pull request Jul 12, 2023
34 tasks
@pashkinelfe pashkinelfe force-pushed the index-build-speed-optimizations-1 branch from c93807e to b798a75 Compare July 14, 2023 07:06
calculation

On ARM this makes the CPU use the vector multiply-add instruction (fmadd)
instead of vector multiplication + conversion to double + addition
(fmul + fcvt + fadd) at each vector dimension.

The output of the distance functions, and calculations that are done once
per vector pair, are left as double, since that makes no speed difference
and preserves compatibility.
@pashkinelfe pashkinelfe force-pushed the index-build-speed-optimizations-1 branch from b798a75 to e0dc34d Compare July 14, 2023 07:38
@pashkinelfe
Contributor Author

The test results are as follows:

  1. ANN-benchmark: current master vs. patch with float distance calculation (pgvector-float), vs. patch with acos cubic Lagrange approximation with sign extension (pgvector-acos), vs. both patches (pgvector-acos-float).

Base: dbpedia-openai-1M, index on inner product, queries using the <#> operator; results in dbpedia-openai-angular.png.

Float calculation speeds up select queries, probably because it is also used for distance calculations between the query vector and the ivfflat list vectors.
The acos approximation introduces a slowdown, probably due to index-quality regression at the index-build phase, which is the only place the acos calculation is used.
[chart: dbpedia-openai-angular]

  2. Index build time (by stage): master vs. patch with float distance calculation.

The speedup from the patch for the k-means and assign stages is approximately equal (in percent).
For 1000 lists, most of the time is spent in the assign stage, so the overall speedup from the patch comes mostly from the assign stage.
For 4000 lists, k-means and assign take almost equal time, so the overall speedup comes from the k-means and assign stages equally.
[chart: pgvector-idxbuild-stages]

  3. Seq scan time: master vs. patch with float distance calculation, with different distance functions, using PLAIN storage.
    [chart: pgvector-seqscan]

@ankane ankane merged commit 3950bc3 into pgvector:master Jul 18, 2023
@ankane
Member

ankane commented Jul 18, 2023

Thanks @pashkinelfe, this is really great!! (both the performance win and the benchmarks) 🚀 🚀 🚀

Looks like it unlocks fused multiply-add on both aarch64 and x86_64 (additional benchmarks).

We should probably apply it to the cosine_distance function as well.

@ankane
Member

ankane commented Jul 18, 2023

Updated cosine_distance in b710dc6.

@ankane
Member

ankane commented Aug 9, 2023

Just FYI: switched to double for vector_norm in dab8f25 to avoid overflows. It did not affect performance in my testing (it's only called once per sample and once per tuple during index builds).
