This ANN (approximate nearest neighbor) extension for Postgres is based on the ivf-hnsw implementation of HNSW, the code for the state-of-the-art billion-scale nearest neighbor search system presented in the paper:
Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors,
Dmitry Baranchuk, Artem Babenko, Yury Malkov
The HNSW index is held in memory (built on demand), and its maximal size is limited by the maxelements index parameter. Another required parameter is the number of dimensions, dims, unless it is already specified in the column type.
The optional parameter ef specifies the number of neighbors that are considered during index construction and search (it corresponds to the efConstruction and efSearch parameters described in the article).
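create extension hnsw;
-- Embeddings are stored as real[] arrays.
create table embeddings(id integer primary key, payload real[]);
-- Build the HNSW index: at most 1,000,000 elements, 100 dimensions each.
create index on embeddings using hnsw(payload) with (maxelements=1000000, dims=100, m=32);
-- Return the ids of the 100 nearest neighbors of the query vector.
select id from embeddings order by payload <-> array[1.0, 2.0,...] limit 100;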
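
The example above does not set the optional ef parameter. The following is a minimal sketch of how it might be supplied, assuming ef is accepted in the WITH clause alongside the other index parameters (the text describes it as an index parameter but does not show it in use). The items table, the three-dimensional vectors, and the values maxelements=1000 and ef=64 are illustrative only and are not recommendations.

```sql
-- Hypothetical toy table; assumes "create extension hnsw" has already been run.
create table items(id integer primary key, payload real[]);
insert into items values
    (1, '{0.0, 0.0, 0.0}'),
    (2, '{1.0, 1.0, 1.0}'),
    (3, '{5.0, 5.0, 5.0}');

-- Assumption: ef can be passed in the WITH clause like maxelements and dims.
-- m=32 is copied from the example above.
create index on items using hnsw(payload) with (maxelements=1000, dims=3, m=32, ef=64);

-- Ask for the 2 nearest neighbors of a query vector with the same number of dimensions.
select id from items order by payload <-> array[0.9, 1.1, 1.0]::real[] limit 2;
```

Because ef covers both efConstruction and efSearch, a larger value would be expected to trade slower index builds and queries for better recall. That expectation comes from how those parameters behave in HNSW generally and has not been measured for this extension.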