pgvector: ensure vector is sent in binary representation #335

Merged Jun 28, 2024 (1 commit)

Commits on Jun 14, 2024

  1. pgvector: ensure vector is sent in binary representation

    PostgreSQL supports two methods of passing data from client to
    server: text and binary. While for many data types the difference
    may not be noticeable, converting a vector from binary => text =>
    binary representation has a significant performance impact.
    See the previous explanation here[1].
    
    While the pgvector loading code accounts for this, the query code
    did not. This is due to the use of a list[float] type, which
    the pgvector-python adapter currently doesn't support. However,
    this adapter does support direct binary transfer if the data
    is represented as a NumPy array[2]. Testing shows that moving to
    a direct binary representation has a significant impact on
    query performance (my tests show a 3x impact) and provides
    a more accurate representation of how this workload would execute.
    
    [1] erikbern/ann-benchmarks#488
    [2] https://github.com/pgvector/pgvector-python?tab=readme-ov-file#psycopg-3
    jkatz committed Jun 14, 2024
    031f1e1
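
To make the text vs. binary distinction in the commit message concrete, here is a minimal standard-library sketch of the two encodings for a vector. It is an illustration, not the adapter's code: the function names are hypothetical, and the binary layout (a big-endian uint16 dimension count, a uint16 reserved field set to 0, then big-endian float32 values) is an assumption based on how pgvector-python appears to serialize vectors.

```python
import struct


def encode_text(values):
    """Text form: a bracketed, comma-separated literal such as '[1.0,2.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def decode_text(literal):
    """Parse the text literal back into floats (one float() call per element)."""
    body = literal.strip()[1:-1]
    return [float(x) for x in body.split(",")] if body else []


def encode_binary(values):
    """Assumed binary form: uint16 dim, uint16 reserved (0), big-endian float32s."""
    return struct.pack(f">HH{len(values)}f", len(values), 0, *values)


def decode_binary(payload):
    """Read the 4-byte header, then unpack `dim` float32 values in one call."""
    dim, _reserved = struct.unpack_from(">HH", payload)
    return list(struct.unpack_from(f">{dim}f", payload, 4))


vec = [1.0, 2.5, -0.125]          # values chosen to be exact in float32
text = encode_text(vec)           # formatted number by number
blob = encode_binary(vec)         # fixed-size: 4 header bytes + 4 bytes per element
assert decode_text(text) == vec
assert decode_binary(blob) == vec
```

The text path formats and re-parses every element as a decimal string, while the binary path is a fixed-width copy. That per-element work is what the binary => text => binary round trip adds on each query.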