Replace Isar with SQLite for storing CLIP embeddings #1575

Merged
vishnukvmd merged 15 commits into main from embedding_sqlite on May 2, 2024


vishnukvmd
Member

vishnukvmd commented May 1, 2024

Description

  • This PR removes the dependency on Isar, and sets up a SQLite DB for storing embeddings.
  • The existing DB is deleted, and the new DB is populated by pulling embeddings from the server. Local migration was possible, but that would have required us to keep Isar as a dependency for an unknown period of time.
  • For 30k embeddings, DB size has dropped from ~420MB to ~115MB. The first load on a Pixel 7 has increased from ~500ms to ~600ms.
  • More details @ https://ente.io/blog/tech/sqlite-objectbox-isar/#update
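For readers unfamiliar with the approach, here is a minimal sketch of storing embeddings as float32 BLOBs in a SQLite table. It is written in Python for illustration only and is not the code from this PR. The table name, column names, `DIM`, and the helper functions are all hypothetical assumptions.

```python
import sqlite3
import struct

DIM = 512  # assumed embedding length; the PR does not state it


def encode(vec):
    """Pack a sequence of floats into a little-endian float32 blob."""
    return struct.pack(f"<{len(vec)}f", *vec)


def decode(blob):
    """Unpack a float32 blob back into a list of Python floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def open_db(path=":memory:"):
    """Open the DB and create the (hypothetical) embeddings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        " file_id INTEGER NOT NULL,"
        " model TEXT NOT NULL,"
        " embedding BLOB NOT NULL,"
        " updated_at INTEGER NOT NULL,"
        " PRIMARY KEY (file_id, model))"
    )
    return conn


def put_all(conn, rows):
    """Upsert (file_id, model, vector, updated_at) rows in one transaction."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO embeddings VALUES (?, ?, ?, ?)",
            [(f, m, encode(v), t) for f, m, v, t in rows],
        )


def load_all(conn, model):
    """Read every embedding for one model into a dict keyed by file_id."""
    cur = conn.execute(
        "SELECT file_id, embedding FROM embeddings WHERE model = ?", (model,)
    )
    return {file_id: decode(blob) for file_id, blob in cur}
```

In this sketch, repopulating from the server amounts to calling `put_all` with each fetched batch. The composite primary key means a re-fetched embedding overwrites the old row instead of duplicating it.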

Tests

  • Verified over internal builds that semantic search is working as expected

Note: This fixes the jank that occurred when a foreground process tried to read data from the Isar DB while a background process was alive.

vishnukvmd requested review from ua741 and ashilkn May 1, 2024 09:50
vishnukvmd merged commit ab471dd into main May 2, 2024
2 checks passed
vishnukvmd deleted the embedding_sqlite branch May 2, 2024 04:05