Mark partialize_agg as parallel safe #4307
Merged
@@ -0,0 +1,104 @@
-- This file and its contents are licensed under the Timescale License.
-- Please see the included NOTICE for copyright information and
-- LICENSE-TIMESCALE for a copy of the license.
-- Test that for parallel-safe aggregate function a parallel plan is generated
-- on data nodes, and for unsafe it is not. We use a manually created safe
-- function and not a builtin one, to check that we can in fact create a
-- function that is parallelized, to prevent a false negative (i.e. it's not
-- parallelized, but for a different reason, not because it's unsafe).
-- Create a relatively big table on one data node to test parallel plans and
-- avoid flakiness.
create table metrics_dist1(like metrics_dist);
select table_name from create_distributed_hypertable('metrics_dist1', 'time', 'device_id',
    data_nodes => '{"data_node_1"}');
WARNING: only one data node was assigned to the hypertable
table_name
metrics_dist1
(1 row)

insert into metrics_dist1 select * from metrics_dist order by metrics_dist limit 20000;
\set safe 'create or replace aggregate ts_debug_shippable_safe_count(*) (sfunc = int8inc, combinefunc=int8pl, stype = bigint, initcond = 0, parallel = safe);'
\set unsafe 'create or replace aggregate ts_debug_shippable_unsafe_count(*) (sfunc = int8inc, combinefunc=int8pl, stype = bigint, initcond = 0, parallel = unsafe);'
:safe
call distributed_exec(:'safe');
:unsafe
call distributed_exec(:'unsafe');
call distributed_exec($$ set parallel_tuple_cost = 0; $$);
call distributed_exec($$ set parallel_setup_cost = 0; $$);
call distributed_exec($$ set max_parallel_workers_per_gather = 1; $$);
set timescaledb.enable_remote_explain = 1;
set enable_partitionwise_aggregate = 1;
\set analyze 'explain (analyze, verbose, costs off, timing off, summary off)'
:analyze
select count(*) from metrics_dist1;
QUERY PLAN
Custom Scan (DataNodeScan) (actual rows=1 loops=1)
  Output: (count(*))
  Relations: Aggregate on (public.metrics_dist1)
  Data node: data_node_1
  Fetcher Type: Row by row
  Chunks: _dist_hyper_X_X_chunk, _dist_hyper_X_X_chunk
  Remote SQL: SELECT count(*) FROM public.metrics_dist1 WHERE _timescaledb_internal.chunks_in(public.metrics_dist1.*, ARRAY[..])
  Remote EXPLAIN:
    Finalize Aggregate (actual rows=1 loops=1)
      Output: count(*)
      -> Gather (actual rows=2 loops=1)
            Output: (PARTIAL count(*))
            Workers Planned: 1
            Workers Launched: 1
            -> Partial Aggregate (actual rows=1 loops=2)
                  Output: PARTIAL count(*)
                  Worker 0: actual rows=1 loops=1
                  -> Parallel Append (actual rows=10000 loops=2)
                        Worker 0: actual rows=0 loops=1
                        -> Parallel Seq Scan on _timescaledb_internal._dist_hyper_X_X_chunk (actual rows=17990 loops=1)
                        -> Parallel Seq Scan on _timescaledb_internal._dist_hyper_X_X_chunk (actual rows=2010 loops=1)

(22 rows)

:analyze
select ts_debug_shippable_safe_count(*) from metrics_dist1;
QUERY PLAN
Custom Scan (DataNodeScan) (actual rows=1 loops=1)
  Output: (ts_debug_shippable_safe_count(*))
  Relations: Aggregate on (public.metrics_dist1)
  Data node: data_node_1
  Fetcher Type: Row by row
  Chunks: _dist_hyper_X_X_chunk, _dist_hyper_X_X_chunk
  Remote SQL: SELECT public.ts_debug_shippable_safe_count(*) FROM public.metrics_dist1 WHERE _timescaledb_internal.chunks_in(public.metrics_dist1.*, ARRAY[..])
  Remote EXPLAIN:
    Finalize Aggregate (actual rows=1 loops=1)
      Output: public.ts_debug_shippable_safe_count(*)
      -> Gather (actual rows=2 loops=1)
            Output: (PARTIAL public.ts_debug_shippable_safe_count(*))
            Workers Planned: 1
            Workers Launched: 1
            -> Partial Aggregate (actual rows=1 loops=2)
                  Output: PARTIAL public.ts_debug_shippable_safe_count(*)
                  Worker 0: actual rows=1 loops=1
                  -> Parallel Append (actual rows=10000 loops=2)
                        Worker 0: actual rows=0 loops=1
                        -> Parallel Seq Scan on _timescaledb_internal._dist_hyper_X_X_chunk (actual rows=17990 loops=1)
                        -> Parallel Seq Scan on _timescaledb_internal._dist_hyper_X_X_chunk (actual rows=2010 loops=1)

(22 rows)

:analyze
select ts_debug_shippable_unsafe_count(*) from metrics_dist1;
QUERY PLAN
Custom Scan (DataNodeScan) (actual rows=1 loops=1)
  Output: (ts_debug_shippable_unsafe_count(*))
  Relations: Aggregate on (public.metrics_dist1)
  Data node: data_node_1
  Fetcher Type: Row by row
  Chunks: _dist_hyper_X_X_chunk, _dist_hyper_X_X_chunk
  Remote SQL: SELECT public.ts_debug_shippable_unsafe_count(*) FROM public.metrics_dist1 WHERE _timescaledb_internal.chunks_in(public.metrics_dist1.*, ARRAY[..])
  Remote EXPLAIN:
    Aggregate (actual rows=1 loops=1)
      Output: public.ts_debug_shippable_unsafe_count(*)
      -> Append (actual rows=20000 loops=1)
            -> Seq Scan on _timescaledb_internal._dist_hyper_X_X_chunk (actual rows=17990 loops=1)
            -> Seq Scan on _timescaledb_internal._dist_hyper_X_X_chunk (actual rows=2010 loops=1)

(14 rows)
@@ -0,0 +1,42 @@
-- This file and its contents are licensed under the Timescale License.
-- Please see the included NOTICE for copyright information and
-- LICENSE-TIMESCALE for a copy of the license.

-- Test that for parallel-safe aggregate function a parallel plan is generated
-- on data nodes, and for unsafe it is not. We use a manually created safe
-- function and not a builtin one, to check that we can in fact create a
-- function that is parallelized, to prevent a false negative (i.e. it's not
-- parallelized, but for a different reason, not because it's unsafe).

-- Create a relatively big table on one data node to test parallel plans and
-- avoid flakiness.
create table metrics_dist1(like metrics_dist);
select table_name from create_distributed_hypertable('metrics_dist1', 'time', 'device_id',
    data_nodes => '{"data_node_1"}');
insert into metrics_dist1 select * from metrics_dist order by metrics_dist limit 20000;

\set safe 'create or replace aggregate ts_debug_shippable_safe_count(*) (sfunc = int8inc, combinefunc=int8pl, stype = bigint, initcond = 0, parallel = safe);'
\set unsafe 'create or replace aggregate ts_debug_shippable_unsafe_count(*) (sfunc = int8inc, combinefunc=int8pl, stype = bigint, initcond = 0, parallel = unsafe);'

:safe
call distributed_exec(:'safe');
:unsafe
call distributed_exec(:'unsafe');

call distributed_exec($$ set parallel_tuple_cost = 0; $$);
call distributed_exec($$ set parallel_setup_cost = 0; $$);
call distributed_exec($$ set max_parallel_workers_per_gather = 1; $$);

set timescaledb.enable_remote_explain = 1;
set enable_partitionwise_aggregate = 1;

\set analyze 'explain (analyze, verbose, costs off, timing off, summary off)'

:analyze
select count(*) from metrics_dist1;

:analyze
select ts_debug_shippable_safe_count(*) from metrics_dist1;

:analyze
select ts_debug_shippable_unsafe_count(*) from metrics_dist1;
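
For reference, a minimal sketch of how one might inspect the parallel-safety marking that this test depends on. It is an illustration only and not part of the PR. It assumes the two aggregates from the test file above already exist in the current database, and relies on the standard PostgreSQL catalog column `pg_proc.proparallel` (`s` = safe, `r` = restricted, `u` = unsafe).

```sql
-- Illustrative check, not part of the test file: show how the two debug
-- aggregates created by the test are marked for parallel execution.
select proname, proparallel
from pg_proc
where prokind = 'a'  -- aggregates only; prokind exists in PostgreSQL 11+
  and proname in ('ts_debug_shippable_safe_count',
                  'ts_debug_shippable_unsafe_count')
order by proname;
```

Because the aggregates are also created on the data node through `distributed_exec`, the same query could be run there by connecting to the data node's database directly.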
Hmm, not a fan of this. Can you get a stable test output without this change? The number of rows is useful for other tests, so I would rather not have it removed globally.
I'm keeping the number, just replacing the double space with a single one. The formatting changed between Postgres versions, so I'm normalizing it to avoid having separate reference outputs.
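
To illustrate the kind of normalization described in this reply, here is a rough sketch written in SQL. The change under discussion is not shown in this thread and is most likely made in the test harness rather than in SQL, so both the sample line and the pattern below are hypothetical. The sketch assumes `standard_conforming_strings` is on (the PostgreSQL default), so the backslashes reach the regex engine unchanged.

```sql
-- Hypothetical sample line. Collapses a double space that sits between two
-- non-space characters into a single space; the row count itself is kept,
-- and leading indentation is untouched because it has no non-space
-- character in front of it.
select regexp_replace(
    'Worker 0:  actual rows=1 loops=1',
    '(\S)  (\S)', '\1 \2', 'g') as normalized;
```

The point is only that the numbers survive and the spacing becomes uniform, which is what allows a single reference output to be shared across Postgres versions.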