feat(server)!: pgvecto.rs 0.2 and pgvector compatibility #6785

Merged
mertalev merged 37 commits into main from chore/server-pgvecto-rs-0.2 on Feb 7, 2024


Contributor

@mertalev mertalev commented Jan 31, 2024

Description

This PR upgrades the codebase to be compatible with pgvecto.rs 0.2 and future patch releases. This change is beneficial in several respects:

  • It enables searching with VBASE, a novel vector search algorithm that can fetch an arbitrary number of rows instead of a fixed number chosen upfront
  • There have been many bug fixes as well as stability and performance improvements since the 0.1.11 release
  • It makes adding and maintaining compatibility with pgvector easier
  • Updating is simpler for admins as they can use any 0.2.* release, not a specific patch release
  • Maintaining Immich is easier for third-party platforms as they have more freedom in the pgvecto.rs version they use
  • It fixes a kernel issue with some Synology servers
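
The "any 0.2.* release, not a specific patch release" point can be illustrated with a small version-range check. This is a minimal sketch, not the code from this PR: `parseVersion` and `isCompatible` are hypothetical helper names, and the rule used here (same major and minor as the minimum version, patch at least the minimum) is an assumption about what "any 0.2.*" means.

```typescript
// Hypothetical helpers for illustration; not taken from the Immich codebase.
type Version = { major: number; minor: number; patch: number };

// Parse a string such as "0.2.0" into numeric parts; throws on anything that is not x.y.z.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text.trim());
  if (!match) {
    throw new Error(`Invalid version: ${text}`);
  }
  return { major: Number(match[1]), minor: Number(match[2]), patch: Number(match[3]) };
}

// Assumed rule: accept any patch release within the minimum version's minor series.
function isCompatible(installed: string, minimum: string): boolean {
  const a = parseVersion(installed);
  const b = parseVersion(minimum);
  return a.major === b.major && a.minor === b.minor && a.patch >= b.patch;
}
```

Under this rule, `isCompatible('0.2.1', '0.2.0')` is true, while `isCompatible('0.1.11', '0.2.0')` and `isCompatible('0.3.0', '0.2.0')` are both false.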

Additionally, it now allows pgvector to be used instead. This is added for compatibility reasons - there are environments where pgvecto.rs simply cannot be used right now.

Summary of changes:

  • Check for available extension upgrades at startup and update if possible
  • Generalize vector extension code to work with both pgvecto.rs and pgvector
    • Use pgvecto.rs's pgvector compatibility mode for more code reusability between the two
  • Add an env that determines which extension is used
    • Migrations are applied based on this env, so this choice cannot be changed after a successful startup
  • Make parts of database startup run in a lock to prevent both server and microservices from making changes at the same time
  • Add vectors to the database search path
    • pgvecto.rs now uses this schema to be able to coexist with pgvector
  • Update version constraints to allow pinning to the patch/minor/major part of the minimum version
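
The startup behaviour summarized above (choosing an extension from an env value, taking a lock, and updating the extension when a newer version is available) could look roughly like the following. This is a hedged sketch under stated assumptions rather than the PR's implementation: `QueryRunner`, `resolveExtension`, `withStartupLock`, and `updateExtensionIfAvailable` are hypothetical names, defaulting to pgvecto.rs is an assumption, and a Postgres advisory lock is only one way to realize "run in a lock". The SQL uses standard Postgres features (`pg_advisory_lock`, `pg_available_extensions`, `ALTER EXTENSION ... UPDATE`).

```typescript
// Illustrative sketch only: the names below are placeholders, not Immich's actual code.
type VectorExtension = 'pgvecto.rs' | 'pgvector';

// Minimal query interface so the sketch does not depend on a specific driver.
interface QueryRunner {
  query(sql: string, params?: unknown[]): Promise<Array<Record<string, unknown>>>;
}

// Postgres extension names: pgvecto.rs installs as "vectors", pgvector as "vector".
const EXTENSION_NAME: Record<VectorExtension, string> = {
  'pgvecto.rs': 'vectors',
  pgvector: 'vector',
};

// Map an env value to an extension. Defaulting to pgvecto.rs is an assumption.
function resolveExtension(value: string | undefined): VectorExtension {
  if (value === undefined || value === '' || value === 'pgvecto.rs') {
    return 'pgvecto.rs';
  }
  if (value === 'pgvector') {
    return 'pgvector';
  }
  throw new Error(`Unsupported vector extension: ${value}`);
}

// Run a callback while holding a session-level advisory lock, so that only one
// process (server or microservices) performs startup changes at a time.
// `db` must be a single dedicated connection, not a pool, because the lock
// belongs to the session that acquired it.
async function withStartupLock<T>(db: QueryRunner, key: number, fn: () => Promise<T>): Promise<T> {
  await db.query('SELECT pg_advisory_lock($1)', [key]);
  try {
    return await fn();
  } finally {
    await db.query('SELECT pg_advisory_unlock($1)', [key]);
  }
}

// Update the extension if Postgres reports a different default version than the
// installed one. Returns true when an update statement was issued.
async function updateExtensionIfAvailable(db: QueryRunner, ext: VectorExtension): Promise<boolean> {
  const name = EXTENSION_NAME[ext];
  const rows = await db.query(
    'SELECT default_version, installed_version FROM pg_available_extensions WHERE name = $1',
    [name],
  );
  const row = rows[0];
  if (!row || !row.installed_version || row.default_version === row.installed_version) {
    return false;
  }
  // `name` comes from the fixed map above, so interpolating it here is safe.
  await db.query(`ALTER EXTENSION ${name} UPDATE`);
  return true;
}
```

A caller would resolve the extension once at startup and then wrap the update (and any migrations) in `withStartupLock`. A real implementation would presumably also check the available version against the supported range before updating; that check is omitted here.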

To do:

  • Add more unit tests

Follow-up work:

  • Refactor the vector DB code - there's some duplication and the logic is spread across the database and smart info repos as well as migrations
    • It might be good to move the existing dim size update and migration code all to the database repo
  • Update e2e to run with both extensions to make sure we don't break pgvector
  • Prepare documentation
    • The instructions in the 1.91.0 release notes will be outdated - should we consider making a page for this in the docs?
  • Do more manual testing to discover possible edge cases

Fixes #5813
Fixes #5880
Fixes #5893
Addresses discussion #5954


cloudflare-pages bot commented Jan 31, 2024

Deploying with Cloudflare Pages

Latest commit: 3e07a0a
Status: ✅  Deploy successful!
Preview URL: https://afc68123.immich.pages.dev
Branch Preview URL: https://chore-server-pgvecto-rs-0-2.immich.pages.dev


@OverHash

Should also help with #5813

@mertalev
Contributor Author

> Should also help with #5813

Right, added that to the PR description

@mertalev mertalev force-pushed the chore/server-pgvecto-rs-0.2 branch 2 times, most recently from d339abb to b8578ba Compare February 3, 2024 22:17
@mertalev mertalev marked this pull request as ready for review February 3, 2024 22:19
@alextran1502 alextran1502 changed the title feat(server): pgvecto.rs 0.2 and pgvector compatibility feat(server)!: pgvecto.rs 0.2 and pgvector compatibility Feb 4, 2024
Member

@danieldietzler danieldietzler left a comment


Nice! I really like those new error messages for wrong installs. That's a great addition and will probably help a lot in reducing support tickets related to this.

server/src/infra/1707000751533-AddVectorsToSearchPath.ts (outdated)
update version check

set ef_search for clip
Revert "pgvector compatibility"

This reverts commit 2b66a52.

pgvector compatibility: minimal edition

pgvector startup check
shortened vector extension variable name
add tests for updating extension

remove unnecessary check
update prod compose
check error message
@mertalev mertalev merged commit 56b0643 into main Feb 7, 2024
24 checks passed
@mertalev mertalev deleted the chore/server-pgvecto-rs-0.2 branch February 7, 2024 02:46
@ScrumpyJack

Thank you so much for adding pgvector compat. Fantastic!

6 participants