feat: bitmap pushdown for filtering operation with column index #126
Signed-off-by: silver-ymz <yinmingzhuo@gmail.com>
How to create a query using bitmap pushdown? |
Example using bitmap pushdown:
set vectors.enable_prefilter=on;
set vectors.enable_bitmap_pushdown=on;
CREATE TABLE products (
id serial primary key,
price real,
feature vector(3)
);
INSERT INTO products (price, feature) SELECT random(), ARRAY[random(), random(), random()]::real[] FROM generate_series(1, 5000);
CREATE INDEX ON products USING btree (price);
CREATE INDEX ON products USING vectors (feature l2_ops)
WITH (options = $$
capacity = 10000
[algorithm.hnsw]
$$);
SELECT id FROM products WHERE
price > 0.2 AND price <= 0.7
ORDER BY
feature <-> '[0.5, 0.5, 0.5]'
LIMIT 100;
It will generate 2 possible query plans.
Ideal plan to use bitmap pushdown will be
The current implementation collects the vector index scan path and the bitmap index scan path into a custom scan path in
The problem now is that postgres does more work when generating a custom scan plan, e.g. setting plan references and dealing with the scan relation. It seems that we need to find a way to bypass this process. |
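To make the idea behind bitmap pushdown concrete, here is a conceptual sketch in Python. It is not pgvecto.rs or postgres code: a plain set stands in for the bitmap that the btree scan on price would produce, a brute-force scan stands in for the HNSW index, and the helper names `build_bitmap` and `vector_search_with_bitmap` are hypothetical. It only illustrates the intended data flow, assuming the vector scan consults the bitmap before it scores a row.

```python
# Conceptual sketch only: brute-force search stands in for the HNSW index,
# and a Python set stands in for the bitmap produced by the btree scan.
import random


def build_bitmap(rows, low, high):
    """Emulate a bitmap index scan: ids of rows with low < price <= high."""
    return {row_id for row_id, (price, _) in rows.items() if low < price <= high}


def l2_squared(a, b):
    """Squared L2 distance; it orders results the same way as the L2 distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vector_search_with_bitmap(rows, query, bitmap, limit):
    """Emulate a vector scan with the bitmap pushed down: rows outside the
    bitmap are skipped before any distance is computed."""
    candidates = (
        (l2_squared(feature, query), row_id)
        for row_id, (_, feature) in rows.items()
        if row_id in bitmap
    )
    return [row_id for _, row_id in sorted(candidates)[:limit]]


# Same shape as the products table above: id -> (price, feature).
random.seed(0)
rows = {
    i: (random.random(), [random.random() for _ in range(3)])
    for i in range(1, 5001)
}

bitmap = build_bitmap(rows, 0.2, 0.7)
result = vector_search_with_bitmap(rows, [0.5, 0.5, 0.5], bitmap, 100)
```

In this sketch the filter is applied inside the vector search rather than on its output, so the result should still hold 100 rows even when the filter is selective. Filtering after the vector scan could return fewer than LIMIT rows.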
The SQL does not generate 2 plans in my environment. |
set vectors.enable_vector_index=off;
set enable_seqscan=off;
With these settings the planner is more likely to generate a bitmap scan. |
Can we generate bitmap scan directly from filter quals? |
It is theoretically possible, but we would need to reimplement almost all of the logic of build_index_paths, which seems quite complicated. |
I don't understand why "we need to find a way to bypass this process". Do you mean that another query plan may have a lower estimated cost? |
I mean we need to find a way to bypass the additional scan-related processing that happens when a custom scan plan is generated from a custom path. When postgres selects our injected custom path as the cheapest path, it generates a custom plan from it. During that generation it handles a lot of additional scan-related work. We don't have the proper parameters for postgres to complete that process, so it errors out. |
Hard to implement without modifying postgres. Closed for now |
WIP
close #116
The current problem is the following error during the plan-building process:
ERROR: variable not found in subplan target list
Need to debug.