Documentation improvement ideas #2

KevinColemanInc · 2021-04-22T22:05:05Z

My friend and I were just chatting about adding vector search to pg the other day!

Could you address these questions in the docs (or answer them here)?

1. What algorithms are used to find the closest vectors? I see you mention Faiss, but it's unclear exactly what is used.
2. Could you provide documentation on how this scales? Would this support 1B vectors?
3. Does this support partitioned indexes?

ankane · 2021-04-22T23:59:19Z

Hey @KevinColemanInc, glad to hear others are thinking about this.

1. It uses the IVFFlat index type. It doesn't use Faiss, but from what I can tell, Faiss invented the index type, which is why it's mentioned.
2. I haven't tried it with 1B vectors, but generally product quantization is used for that scale. I plan to add support if there's demand (Ideas #1).
3. Does this mean indexes on partitioned tables?
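
For readers unfamiliar with IVFFlat, here is a minimal pure-Python sketch of the general idea: cluster the vectors into inverted lists, then at query time scan only the lists whose centroids are closest to the query. This is an illustration only, not pgvector's code. The function names, the plain k-means build step, and the `probes` argument are assumptions made for the example.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(v, centroids):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: l2(v, centroids[i]))


def build_ivfflat(vectors, lists, iterations=10, seed=0):
    """Cluster vectors into `lists` inverted lists using plain k-means.

    Returns (centroids, buckets), where buckets[i] holds the ids of the
    vectors assigned to centroid i.
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, lists)]
    for _ in range(iterations):
        groups = [[] for _ in range(lists)]
        for v in vectors:
            groups[nearest_centroid(v, centroids)].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the previous centroid if a list ends up empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    buckets = [[] for _ in range(lists)]
    for idx, v in enumerate(vectors):
        buckets[nearest_centroid(v, centroids)].append(idx)
    return centroids, buckets


def search(vectors, centroids, buckets, query, k, probes=1):
    """Scan only the `probes` lists nearest to the query, return the top k ids."""
    order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [idx for i in order[:probes] for idx in buckets[i]]
    return sorted(candidates, key=lambda idx: l2(query, vectors[idx]))[:k]
```

Scanning fewer lists is faster but can miss true neighbors. Scanning every list is equivalent to an exact brute-force search.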

KevinColemanInc · 2021-04-23T00:11:24Z

2. What about 100M vectors?
3. I'm not asking a good question. nvm.
4. New question: does the index get backed up? Is it captured by pg_dump, and can it be re-imported?

ankane · 2021-04-23T00:37:39Z

For 2: I've only tested with 1M at this point. From a storage perspective, it should have no problem storing any number of vectors up to the Postgres limits ("limited by the number of tuples that can fit onto 4,294,967,295 pages"). From a performance perspective, you'll want to increase the number of inverted lists to keep queries fast (100 by default but supports up to 32,768).
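
As a rough illustration of raising the number of inverted lists, the sketch below builds a `CREATE INDEX` statement with a chosen `lists` value. The helper names, the `items` table, and the `embedding` column are hypothetical. The sizing heuristic (rows / 1000 up to 1M rows, sqrt(rows) beyond that) is an assumption for the example and is not stated in this thread. Only the default of 100 and the maximum of 32,768 come from the comment above.

```python
import math

DEFAULT_LISTS = 100   # default mentioned in this thread
MAX_LISTS = 32768     # upper bound mentioned in this thread


def suggest_lists(rows):
    """Heuristic starting point for `lists` (an assumption, not from this thread)."""
    if rows <= 1_000_000:
        lists = rows // 1000
    else:
        lists = math.isqrt(rows)
    # Clamp into the supported range.
    return max(1, min(lists, MAX_LISTS))


def create_index_sql(table, column, lists=DEFAULT_LISTS):
    """Build the CREATE INDEX statement for an IVFFlat index with L2 distance.

    Identifiers are interpolated as-is, so pass trusted names only.
    """
    if not 1 <= lists <= MAX_LISTS:
        raise ValueError(f"lists must be between 1 and {MAX_LISTS}")
    return (
        f"CREATE INDEX ON {table} USING ivfflat "
        f"({column} vector_l2_ops) WITH (lists = {lists});"
    )


if __name__ == "__main__":
    print(create_index_sql("items", "embedding", suggest_lists(100_000_000)))
```

More lists means fewer vectors per list, so each probe scans less data. Treat the suggested value as a starting point and measure recall and latency on your own data.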

For 4: It works the same as native data and index types (it works with pg_dump/pg_restore and uses WAL for recovery and replication).

KevinColemanInc · 2021-04-24T15:32:08Z

Awesome! Thanks for responding.

KevinColemanInc closed this as completed Apr 24, 2021

ankane mentioned this issue May 2, 2021

Dockerize pgvector #5

Closed

japinli mentioned this issue Nov 15, 2023

Fix coredump about HnswFreeElement() #357

Merged
