Add support for fast pruning of inlined functions #1877
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -795,6 +795,33 @@ timescaledb_get_relation_info_hook(PlannerInfo *root, Oid relation_objectid, boo | |
switch (classify_relation(root, rel, &ht)) | ||
{ | ||
case TS_REL_HYPERTABLE: | ||
{ | ||
#if PG12_GE | ||
/* This only works for PG12 because for earlier versions the inheritance | ||
* expansion happens too early during the planning phase | ||
*/ | ||
RangeTblEntry *rte = planner_rt_fetch(rel->relid, root); | ||
Query *query = root->parse; | ||
/* Mark hypertable RTEs we'd like to expand ourselves. | ||
* Hypertables inside inlineable functions don't get marked during the query | ||
* preprocessing step. Therefore we do an extra try here. However, we need to | ||
* be careful for UPDATE/DELETE as Postgres (in at least version 12) plans them | ||
* in a complicated way (see planner.c:inheritance_planner). First, it runs the | ||
* UPDATE/DELETE through the planner as a simulated SELECT. It uses the results | ||
* of this fake planning to adapt its own UPDATE/DELETE plan. Then it's planned | ||
* a second time as a real UPDATE/DELETE, but with requiredPerms set to 0, as it | ||
* assumes permission checking has been done already during the first planner call. | ||
* We don't want to touch the UPDATE/DELETEs, so we need to check all the regular | ||
* conditions here that are checked during preprocess_query, as well as the | ||
* condition that rte->requiredPerms is not requiring UPDATE/DELETE on this rel. | ||
*/ | ||
if (ts_guc_enable_optimizations && ts_guc_enable_constraint_exclusion && inhparent && | ||
rte->ctename == NULL && !IS_UPDL_CMD(query) && query->resultRelation == 0 && | ||
query->rowMarks == NIL && (rte->requiredPerms & (ACL_UPDATE | ACL_DELETE)) == 0) | ||
From the conditions earlier in this if-statement, it seems like there is no need to check for Did I miss something?
See my comment above.
||
{ | ||
rte_mark_for_expansion(rte); | ||
} | ||
#endif | ||
ts_create_private_reloptinfo(rel); | ||
#if PG12_GE | ||
/* in earlier versions this is done during expand_hypertable_inheritance() below */ | ||
|
@@ -809,6 +836,7 @@ timescaledb_get_relation_info_hook(PlannerInfo *root, Oid relation_objectid, boo | |
} | ||
#endif | ||
break; | ||
} | ||
case TS_REL_CHUNK: | ||
case TS_REL_CHUNK_CHILD: | ||
{ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
-- This file and its contents are licensed under the Apache License 2.0. | ||
-- Please see the included NOTICE for copyright information and | ||
-- LICENSE-APACHE for a copy of the license. | ||
-- test hypertable classification when hypertable is not in cache | ||
-- https://github.com/timescale/timescaledb/issues/1832 | ||
\set PREFIX 'EXPLAIN (costs off)' | ||
CREATE TABLE test (a int, time timestamptz NOT NULL); | ||
SELECT create_hypertable('public.test', 'time'); | ||
create_hypertable | ||
------------------- | ||
(1,public,test,t) | ||
(1 row) | ||
|
||
INSERT INTO test SELECT i, '2020-04-01'::date-10-i from generate_series(1,20) i; | ||
CREATE OR REPLACE FUNCTION test_f(_ts timestamptz) | ||
RETURNS SETOF test LANGUAGE SQL STABLE PARALLEL SAFE | ||
AS $f$ | ||
SELECT DISTINCT ON (a) * FROM test WHERE time >= _ts ORDER BY a, time DESC | ||
$f$; | ||
:PREFIX SELECT * FROM test_f(now()); | ||
QUERY PLAN | ||
------------------------------------------------- | ||
Unique | ||
-> Sort | ||
Sort Key: test.a, test."time" DESC | ||
-> Custom Scan (ChunkAppend) on test | ||
Chunks excluded during startup: 4 | ||
(5 rows) | ||
|
||
-- create new session | ||
\c | ||
-- plan output should be identical to previous session | ||
:PREFIX SELECT * FROM test_f(now()); | ||
QUERY PLAN | ||
------------------------------------------------- | ||
Unique | ||
-> Sort | ||
Sort Key: test.a, test."time" DESC | ||
-> Custom Scan (ChunkAppend) on test | ||
Chunks excluded during startup: 4 | ||
(5 rows) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,149 @@ | ||
-- This file and its contents are licensed under the Apache License 2.0. | ||
-- Please see the included NOTICE for copyright information and | ||
-- LICENSE-APACHE for a copy of the license. | ||
-- test hypertable classification when query is in an inlineable function | ||
\set PREFIX 'EXPLAIN (costs off)' | ||
CREATE TABLE test (a int, b bigint NOT NULL); | ||
SELECT create_hypertable('public.test', 'b', chunk_time_interval=>10); | ||
create_hypertable | ||
------------------- | ||
(1,public,test,t) | ||
(1 row) | ||
|
||
INSERT INTO test SELECT i, i FROM generate_series(1, 20) i; | ||
CREATE OR REPLACE FUNCTION test_f(_ts bigint) | ||
RETURNS SETOF test LANGUAGE SQL STABLE | ||
as $f$ | ||
SELECT DISTINCT ON (a) * FROM test WHERE b >= _ts AND b <= _ts + 2 | ||
$f$; | ||
-- plans must be the same in both cases | ||
-- specifically, the first plan should not contain the parent hypertable | ||
-- as that is a sign the pruning was not done successfully | ||
:PREFIX SELECT * FROM test_f(5); | ||
QUERY PLAN | ||
------------------------------------------------------------------------------ | ||
Unique | ||
-> Sort | ||
Sort Key: _hyper_1_1_chunk.a | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= '5'::bigint) AND (b <= '7'::bigint)) | ||
(5 rows) | ||
|
||
:PREFIX SELECT DISTINCT ON (a) * FROM test WHERE b >= 5 AND b <= 5 + 2; | ||
QUERY PLAN | ||
------------------------------------------------------------------------------ | ||
Unique | ||
-> Sort | ||
Sort Key: _hyper_1_1_chunk.a | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= 5) AND (b <= 7)) | ||
(5 rows) | ||
|
||
-- test with FOR UPDATE | ||
CREATE OR REPLACE FUNCTION test_f(_ts bigint) | ||
RETURNS SETOF test LANGUAGE SQL STABLE | ||
as $f$ | ||
SELECT * FROM test WHERE b >= _ts AND b <= _ts + 2 FOR UPDATE | ||
$f$; | ||
-- pruning should not be done by TimescaleDb in this case | ||
-- specifically, the parent hypertable must exist in the output plan | ||
:PREFIX SELECT * FROM test_f(5); | ||
QUERY PLAN | ||
------------------------------------------------------------------------------------ | ||
Subquery Scan on test_f | ||
-> LockRows | ||
-> Append | ||
-> Seq Scan on test | ||
Filter: ((b >= '5'::bigint) AND (b <= '7'::bigint)) | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= '5'::bigint) AND (b <= '7'::bigint)) | ||
(7 rows) | ||
|
||
:PREFIX SELECT * FROM test WHERE b >= 5 AND b <= 5 + 2 FOR UPDATE; | ||
QUERY PLAN | ||
------------------------------------------------------------------------------ | ||
LockRows | ||
-> Append | ||
-> Seq Scan on test | ||
Filter: ((b >= 5) AND (b <= 7)) | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= 5) AND (b <= 7)) | ||
(6 rows) | ||
|
||
-- test with CTE | ||
-- these cases are just to make sure everything is alright with | ||
-- the way we identify hypertables to prune chunks - we abuse ctename | ||
-- for this purpose. So double-check if we're not breaking plans | ||
-- with CTEs here. | ||
CREATE OR REPLACE FUNCTION test_f(_ts bigint) | ||
RETURNS SETOF test LANGUAGE SQL STABLE | ||
as $f$ | ||
WITH ct AS MATERIALIZED ( | ||
SELECT DISTINCT ON (a) * FROM test WHERE b >= _ts AND b <= _ts + 2 | ||
) | ||
SELECT * FROM ct | ||
$f$; | ||
:PREFIX SELECT * FROM test_f(5); | ||
QUERY PLAN | ||
-------------------------------------------------------------------------------------- | ||
CTE Scan on ct | ||
CTE ct | ||
-> Unique | ||
-> Sort | ||
Sort Key: _hyper_1_1_chunk.a | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= '5'::bigint) AND (b <= '7'::bigint)) | ||
(7 rows) | ||
|
||
:PREFIX | ||
WITH ct AS MATERIALIZED ( | ||
SELECT DISTINCT ON (a) * FROM test WHERE b >= 5 AND b <= 5 + 2 | ||
) | ||
SELECT * FROM ct; | ||
QUERY PLAN | ||
-------------------------------------------------------------------------------------- | ||
CTE Scan on ct | ||
CTE ct | ||
-> Unique | ||
-> Sort | ||
Sort Key: _hyper_1_1_chunk.a | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= 5) AND (b <= 7)) | ||
(7 rows) | ||
|
||
-- CTE within CTE | ||
:PREFIX | ||
WITH ct AS MATERIALIZED ( | ||
SELECT * FROM test_f(5) | ||
) | ||
SELECT * FROM ct; | ||
QUERY PLAN | ||
---------------------------------------------------------------------------------------------- | ||
CTE Scan on ct | ||
CTE ct | ||
-> CTE Scan on ct ct_1 | ||
CTE ct | ||
-> Unique | ||
-> Sort | ||
Sort Key: _hyper_1_1_chunk.a | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= '5'::bigint) AND (b <= '7'::bigint)) | ||
(9 rows) | ||
|
||
-- CTE within NOT MATERIALIZED CTE | ||
:PREFIX | ||
WITH ct AS NOT MATERIALIZED ( | ||
SELECT * FROM test_f(5) | ||
) | ||
SELECT * FROM ct; | ||
QUERY PLAN | ||
-------------------------------------------------------------------------------------- | ||
CTE Scan on ct | ||
CTE ct | ||
-> Unique | ||
-> Sort | ||
Sort Key: _hyper_1_1_chunk.a | ||
-> Index Scan using _hyper_1_1_chunk_test_b_idx on _hyper_1_1_chunk | ||
Index Cond: ((b >= '5'::bigint) AND (b <= '7'::bigint)) | ||
(7 rows) | ||
|
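The CTE cases above exist because, per the test comments, the `ctename` field is reused to flag hypertables for chunk pruning. The diff does not show how `rte_mark_for_expansion` does this, so the following is only a minimal sketch of the general idea, assuming a sentinel string compared by pointer identity. The `mock_rte` struct, the marker value, and all function names are hypothetical stand-ins, not TimescaleDB's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one RangeTblEntry field used here. */
struct mock_rte
{
    const char *ctename; /* NULL for plain relations, CTE name otherwise */
};

/* Hypothetical sentinel; only its address matters, not its contents. */
static const char EXPAND_MARKER[] = "hypothetical_expand_marker";

/* Flag a plain relation entry; entries that already name a CTE are left alone. */
void
mark_for_expansion(struct mock_rte *rte)
{
    assert(rte != NULL);
    if (rte->ctename == NULL)
        rte->ctename = EXPAND_MARKER;
}

/* Pointer comparison, so a real CTE with the same spelling would not match. */
bool
is_marked_for_expansion(const struct mock_rte *rte)
{
    return rte->ctename == EXPAND_MARKER;
}

/* A genuine CTE reference has a name that is not the sentinel. */
bool
is_real_cte_reference(const struct mock_rte *rte)
{
    return rte->ctename != NULL && rte->ctename != EXPAND_MARKER;
}
```

Under this model, the `rte->ctename == NULL` test in the hook's condition is what keeps genuine CTE references from being flagged, which is the behavior the CTE plans above are meant to protect.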
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
-- This file and its contents are licensed under the Apache License 2.0. | ||
-- Please see the included NOTICE for copyright information and | ||
-- LICENSE-APACHE for a copy of the license. | ||
|
||
-- test hypertable classification when query is in an inlineable function | ||
|
||
\set PREFIX 'EXPLAIN (costs off)' | ||
|
||
CREATE TABLE test (a int, b bigint NOT NULL); | ||
SELECT create_hypertable('public.test', 'b', chunk_time_interval=>10); | ||
INSERT INTO test SELECT i, i FROM generate_series(1, 20) i; | ||
|
||
CREATE OR REPLACE FUNCTION test_f(_ts bigint) | ||
RETURNS SETOF test LANGUAGE SQL STABLE | ||
as $f$ | ||
SELECT DISTINCT ON (a) * FROM test WHERE b >= _ts AND b <= _ts + 2 | ||
$f$; | ||
|
||
-- plans must be the same in both cases | ||
-- specifically, the first plan should not contain the parent hypertable | ||
-- as that is a sign the pruning was not done successfully | ||
:PREFIX SELECT * FROM test_f(5); | ||
|
||
:PREFIX SELECT DISTINCT ON (a) * FROM test WHERE b >= 5 AND b <= 5 + 2; | ||
Comment on lines +22 to +24

There are a few negative tests missing here: cases where the pruning should not be done.

Would you like to have these tests explicitly in this file? These code paths are actually triggered by existing tests in TimescaleDb (see my comment above for one example).

Perhaps to elaborate a bit more - the code path added allows to do pruning of inlineable functions, but the logic

It would be good to have tests here that explicitly check when and how inlining takes place, both positive (for this query the function should be inlined) and negative (for this function, the query should not be inlined). It is true that this is already done elsewhere, but I think it is useful to have explicit tests for whatever changes you add, both to make it clear that you've done the job correctly and to make sure that future changes do not cause regressions. It is not necessary to have a query that demonstrates that pruning is done because of the inlining of functions (this happens elsewhere), but having tests demonstrate that the inlining takes place when expected and not when not expected is useful in maintaining the code, so, something like this:
|
||
|
||
|
||
-- test with FOR UPDATE | ||
CREATE OR REPLACE FUNCTION test_f(_ts bigint) | ||
RETURNS SETOF test LANGUAGE SQL STABLE | ||
as $f$ | ||
SELECT * FROM test WHERE b >= _ts AND b <= _ts + 2 FOR UPDATE | ||
$f$; | ||
|
||
|
||
-- pruning should not be done by TimescaleDb in this case | ||
-- specifically, the parent hypertable must exist in the output plan | ||
:PREFIX SELECT * FROM test_f(5); | ||
|
||
:PREFIX SELECT * FROM test WHERE b >= 5 AND b <= 5 + 2 FOR UPDATE; | ||
|
||
-- test with CTE | ||
-- these cases are just to make sure everything is alright with | ||
-- the way we identify hypertables to prune chunks - we abuse ctename | ||
-- for this purpose. So double-check if we're not breaking plans | ||
-- with CTEs here. | ||
CREATE OR REPLACE FUNCTION test_f(_ts bigint) | ||
RETURNS SETOF test LANGUAGE SQL STABLE | ||
as $f$ | ||
WITH ct AS MATERIALIZED ( | ||
SELECT DISTINCT ON (a) * FROM test WHERE b >= _ts AND b <= _ts + 2 | ||
) | ||
SELECT * FROM ct | ||
$f$; | ||
|
||
:PREFIX SELECT * FROM test_f(5); | ||
|
||
:PREFIX | ||
WITH ct AS MATERIALIZED ( | ||
SELECT DISTINCT ON (a) * FROM test WHERE b >= 5 AND b <= 5 + 2 | ||
) | ||
SELECT * FROM ct; | ||
|
||
-- CTE within CTE | ||
:PREFIX | ||
WITH ct AS MATERIALIZED ( | ||
SELECT * FROM test_f(5) | ||
) | ||
SELECT * FROM ct; | ||
|
||
-- CTE within NOT MATERIALIZED CTE | ||
:PREFIX | ||
WITH ct AS NOT MATERIALIZED ( | ||
SELECT * FROM test_f(5) | ||
) | ||
SELECT * FROM ct; |
AFAICT, since `resultRelation == 0` for `INSERT`, `UPDATE`, and `DELETE` it seems like the check `!IS_UPDL_CMD(query)` is redundant.

You mention that Postgres runs the queries through the planner as simulated selects, but I cannot see where that happens and attaching a debugger and trying some queries does not allow me to produce a query where `query->resultRelation == 0` and `query->commandType != CMD_SELECT`.

I have probably missed where this happens, but could you please point me to where Postgres runs the UPDATE or DELETE as a simulated SELECT or show me a statement that will trigger this situation?

It happens in `planner.c` in Postgres source, around line 1300:

Indeed that makes the `!IS_UPDL_CMD(query)` check redundant, because that'll always evaluate to true on v12. However, since I'd say Postgres applies a bit of a hack for planning UPDATEs/DELETEs in v12, I think it's good to check anyway and not rely on the Postgres hack completely. If you disagree I'd be happy to remove it though.

Queries that trigger this behavior can actually be found in TimescaleDb tests already. For example, when running the tests without the extra ACL permission check, regressions come up. Here's one for an `UPDATE`, but a similar test exists for `DELETE`.

Thank you for the pointer, it explained a lot.

From the code, it looks like `commandType` and `resultRelation` are both reset, so (as you point out) they are not both needed.

Being a little defensive is always good, but if all checks are in the condition part of the if-statement, it makes it hard for future developers to understand what is necessary and what is redundant. I would suggest that you add assertions for the "extra conditions" that you expect to hold, which will allow them to be removed in release builds but trigger failures in debug builds if Postgres changes.

After some experimenting it would seem I cannot move them from the if-statement to the assert unfortunately.

The Postgres planner code I gave you is only part of the full, rather complicated, inheritance planner. The first step of the planner is what I showed - where it simulates it as a `SELECT` to find out which partitions to process. However, it then uses the results of this to modify its own `UPDATE`/`DELETE` plan and then runs it through the planner again. This time, it actually resets the `requiredPerms` field to `0` though.

This means that:
- in the first pass, `commandType==SELECT` and `requiredPerms` are set to `UPDATE/DELETE`
- in the second pass, `commandType==UPDATE/DELETE` and `requiredPerms==0`.

The reset happens in `inherit.c`, function `expand_single_inheritance_child`.

These points together mean we do need to check all conditions in the if-statement.

I will add some extra comments to this part though.

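The two planner passes described above can be modelled in a small self-contained sketch. It uses mock types instead of the PostgreSQL headers, so the enum, the ACL bit values, and the struct layouts are illustrative assumptions; the predicate mirrors the shape of the `if` condition in the diff, with the GUC and `inhparent` checks left out. That `result_relation` is zero in the first pass and non-zero in the second is also an assumption taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for PostgreSQL definitions; the values are illustrative only. */
enum mock_cmd { MOCK_CMD_SELECT, MOCK_CMD_UPDATE, MOCK_CMD_DELETE };
#define MOCK_ACL_SELECT (1u << 1)
#define MOCK_ACL_UPDATE (1u << 2)
#define MOCK_ACL_DELETE (1u << 3)

/* Only the fields the condition looks at. */
struct mock_query
{
    enum mock_cmd command_type;
    int           result_relation; /* 0 means no result relation */
    bool          has_row_marks;   /* models query->rowMarks != NIL */
};

struct mock_rte
{
    const char  *ctename;
    unsigned int required_perms;
};

/* Mirrors the if-condition from the diff, minus the GUC/inhparent checks. */
bool
may_mark_for_expansion(const struct mock_query *q, const struct mock_rte *rte)
{
    bool is_updl = q->command_type == MOCK_CMD_UPDATE ||
                   q->command_type == MOCK_CMD_DELETE;

    return rte->ctename == NULL && !is_updl && q->result_relation == 0 &&
           !q->has_row_marks &&
           (rte->required_perms & (MOCK_ACL_UPDATE | MOCK_ACL_DELETE)) == 0;
}

/* Walks through a plain SELECT and both passes of an UPDATE. */
void
check_two_pass_model(void)
{
    /* Plain SELECT: should be marked. */
    struct mock_query sel = { MOCK_CMD_SELECT, 0, false };
    struct mock_rte   sel_rte = { NULL, MOCK_ACL_SELECT };

    /* Pass 1 of an UPDATE: looks like a SELECT, only requiredPerms gives it away. */
    struct mock_query pass1 = { MOCK_CMD_SELECT, 0, false };
    struct mock_rte   pass1_rte = { NULL, MOCK_ACL_SELECT | MOCK_ACL_UPDATE };

    /* Pass 2 of an UPDATE: requiredPerms was reset, the command type gives it away. */
    struct mock_query pass2 = { MOCK_CMD_UPDATE, 1, false };
    struct mock_rte   pass2_rte = { NULL, 0 };

    assert(may_mark_for_expansion(&sel, &sel_rte));
    assert(!may_mark_for_expansion(&pass1, &pass1_rte));
    assert(!may_mark_for_expansion(&pass2, &pass2_rte));
}
```

In this model, dropping the `required_perms` test would let the first pass through, and dropping the command checks would let the second pass through, which is why neither group can be demoted to an assertion.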
Ok, then they need to be kept, but please add tests that make sure that nobody accidentally removes them and triggers a regression. That is, a test that will fail if any of the checks are removed, but not otherwise.

Yes, I'm adding extra tests now. Will post a new version soon.