Index build error: invalid memory alloc request size >= 1GB #43

Closed
ArthurMelin opened this issue Nov 17, 2022 · 3 comments

@ArthurMelin

I ran into a new problem while working with our very large dataset: the index build hits the palloc request size limit.
A single palloc request is limited to 1 GB - 1 byte, see https://github.com/postgres/postgres/blob/REL_14_5/src/backend/utils/mmgr/mcxt.c#L1077.
The samples array allocated during the index build exceeds this limit.

Reproduction steps:

CREATE TABLE embed (id integer NOT NULL, vec vector(384) NOT NULL);

Insert 1M rows into the table

SET maintenance_work_mem='16GB';
CREATE INDEX ON embed USING ivfflat (vec vector_cosine_ops) WITH (lists = 13909);
ERROR:  invalid memory alloc request size 1073774812

1073774812 = VECTOR_ARRAY_SIZE(50*13909, 384)
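
The arithmetic behind that number can be checked with a short sketch. The two header sizes below are assumptions, not values quoted in this issue: 8 bytes per vector header and 12 bytes for the array header, picked because they reproduce the reported request size exactly. The function names mirror the macro but are hypothetical Python stand-ins.

```python
# PostgreSQL's MaxAllocSize: 1 GB - 1 byte.
MAX_ALLOC_SIZE = 0x3FFFFFFF

# Assumed layout sizes (not stated in the issue).
VECTOR_HEADER_BYTES = 8         # assumed per-vector header
VECTOR_ARRAY_HEADER_BYTES = 12  # assumed array header (three 4-byte ints)
FLOAT_BYTES = 4                 # single-precision component


def vector_size(dim):
    """Bytes for one vector with `dim` float components."""
    return VECTOR_HEADER_BYTES + FLOAT_BYTES * dim


def vector_array_size(length, dim):
    """Bytes for an array holding `length` vectors of `dim` dimensions."""
    return VECTOR_ARRAY_HEADER_BYTES + length * vector_size(dim)


lists = 13909
dim = 384
samples = 50 * lists  # 50 samples per list, as in the report

request = vector_array_size(samples, dim)
print(request)                   # 1073774812
print(request > MAX_ALLOC_SIZE)  # True
```

Under these assumptions the request overshoots the limit by about 32 kB, and the same calculation with lists = 13908 stays just under it.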

Versions:

PostgreSQL 14.5
pgvector v0.3.1 (2d8b7e5)

@ankane ankane closed this as completed in ccec96b Nov 17, 2022
@ankane
Member

ankane commented Nov 17, 2022

Hey @ArthurMelin, thanks for reporting, and sorry you keep running into bugs. Fixed in the commit above. Will spin up a server with the max number of dimensions and lists to try and find any other issues.

@lukeforehand

lukeforehand commented Nov 29, 2023

We have the same problem in version 0.5.0 with the new HNSW indexing.

CREATE INDEX idx_embedding_features ON mytable USING hnsw (embedding_features vector_l2_ops);

ERROR:  invalid memory alloc request size 1073741824

This happens when creating the index after the table has been loaded with 123,313,896 8-dimensional vectors.

@ankane
Member

ankane commented Dec 4, 2023

Hi @lukeforehand, thanks for reporting. Pushed a fix in the commit above (it looks like Postgres lists only support 2^26 - 1 elements on 64-bit architectures).
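
One plausible way to see where the 1073741824 in the report comes from is sketched below. It rests on assumptions that are not stated in this thread: that the build keeps one list entry per row, that each list cell is 8 bytes on a 64-bit platform, and that the list's cell array grows to the next power of two. `list_alloc_request` is a hypothetical helper, not a Postgres function.

```python
# PostgreSQL's MaxAllocSize: 1 GB - 1 byte.
MAX_ALLOC_SIZE = 0x3FFFFFFF

# Assumed: one pointer-sized list cell per element on a 64-bit platform.
LIST_CELL_BYTES = 8


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def list_alloc_request(n_elements):
    """Bytes requested for the cell array, assuming power-of-two growth
    with a minimum capacity of 16 cells."""
    capacity = next_power_of_two(max(16, n_elements))
    return capacity * LIST_CELL_BYTES


rows = 123_313_896  # row count from the report
print(list_alloc_request(rows))                         # 1073741824
print(list_alloc_request(rows) > MAX_ALLOC_SIZE)        # True
print(list_alloc_request(2**26 - 1) <= MAX_ALLOC_SIZE)  # True
```

Under this model, a list of 2^26 - 1 elements needs 512 MB of cells and fits, while the reported row count rounds up to 2^27 cells, which is exactly 1 GB and one byte over the limit.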
