RFC: Search architecture #420
Replies: 4 comments
-
I'm not sure meilisearch and pgvecto.rs are a direct swap. pgvecto.rs is specifically targeted at operations on large ML embedding vectors, while meilisearch looks to me like a more 'standard' text search engine. That said, postgres itself has pretty good text search features built in, so it might be feasible to use those. As a general note, coming from experience on Immich, the approach of starting with a bunch of dedicated containers is awesome for getting things put together quickly. In the longer run I would suggest keeping an eye open for chances to drop containers again, since it simplifies deployment and reduces the number of things your code has to stay aligned with. Some of the simplifications that we've made in the past in Immich:
Another simplification that we're looking at is to drop the redis container, which we use for queueing, and to manage the queues inside postgres instead.
-
Can you explain why removing containers makes deployment easier? Since users should just consume docker-compose files or helm charts, I don't really get how that impacts the users.
-
For users who just apply your docker-compose file, it indeed doesn't make much difference. But many people deploy through different methods or on different platforms, or edit compose files to add things like traefik annotations and such. In those cases, people need to build an understanding of what each container does and how the containers interact. That said, I'm on board with an approach of "if you don't use our compose file, you'll need to figure it out yourself". I think the development story is a stronger argument. For example, when we used typesense for Immich it meant there was a whole separate database that we had to keep synchronised with postgres. This was a huge pain, full of bugs, and accounted for something like 20% of the lines of code in the project. Removing it has been one of the better choices we've made :)
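As a rough illustration of what "keeping a separate database synchronised" involves, the sketch below computes which documents a secondary search index would need to have upserted or deleted to match the primary store. Everything here is hypothetical (the `plan_sync` name, the dict-based stores); it is not the typesense or meilisearch API, just the reconciliation step that any such sync code has to get right.

```python
def plan_sync(primary, index):
    """Compare two id -> text mappings and return (to_upsert, to_delete).

    primary: the source of truth (e.g. rows in the main database).
    index:   what the separate search engine currently holds.
    Both are plain dicts here, purely for illustration.
    """
    # Documents that are missing from the index or whose text has drifted.
    to_upsert = sorted(k for k, v in primary.items() if index.get(k) != v)
    # Documents still in the index that no longer exist in the primary store.
    to_delete = sorted(k for k in index if k not in primary)
    return to_upsert, to_delete


primary = {1: "cat", 2: "dog", 3: "bird"}
index = {1: "cat", 2: "dgo", 4: "fish"}  # stale entry, missing entry, orphan
print(plan_sync(primary, index))
```

When search runs inside the primary database, this whole reconciliation step has nothing to reconcile against and goes away.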
-
Yeah, I can see that. The meilisearch syncing was done quickly though, and is ~400 LOC for the whole feature.
-
Feature description
Hi 👋🏼
I noticed you are using meilisearch and was wondering if you knew about pgvecto.rs, or had any plans to switch this architecture to it.
The main benefit here is the reduction of additional containers while still having blazing fast search abilities. Immich recently changed from Typesense (a meilisearch alternative) to pgvecto.rs in v1.91.0. Using pgvecto.rs does require a custom PG container with the extension installed and enabled, but that is not hard to build or deploy. tensorchord actually provides a Docker image for this, as we can see in the Immich docker-compose.
Thanks!