[Enhancement]: Parallel Execution Plans on Data Nodes #4306

TroyDey · 2022-05-05T21:53:02Z

What type of enhancement is this?

Performance

What subsystems and features will be improved?

Distributed hypertable

What does the enhancement do?

I have pretty good luck getting Postgres to choose parallel execution plans for regular hypertables, but it is nearly impossible to get Postgres to choose a parallel execution plan for remote queries executed on the data nodes. I have been experimenting with this using the timescaledb.enable_remote_explain = on setting and simply executing the remote SQL directly on the data node.
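
For anyone reproducing this, a minimal sketch of the workflow described above. The table `conditions` and its columns are hypothetical stand-ins, and I am assuming VERBOSE is needed for the data node plans to show up in the output:

```sql
-- Run on the access node.
-- Ask TimescaleDB to fetch and show the EXPLAIN output of the data nodes.
SET timescaledb.enable_remote_explain = on;

-- "conditions" is a hypothetical distributed hypertable.
-- The VERBOSE output includes the remote SQL sent to each data node,
-- which can then be copied and run directly on that data node.
EXPLAIN (VERBOSE)
SELECT device_id, avg(temperature)
FROM conditions
GROUP BY device_id;
```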

One barrier I found was that the partialize_agg function completely prevents a parallel plan from being chosen even when force_parallel_mode is set to on.

When I set parallel_setup_cost = 1, parallel_tuple_cost = 1, and force_parallel_mode = on, and replace the partialize_agg call with a regular aggregate such as avg, I can generate a parallel plan. However, it only ever plans one worker, even though max_worker_processes = 59, max_parallel_workers = 48, and max_parallel_workers_per_gather = 24, and the table contains 25 million rows. This has become part of another discussion here.
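
Roughly, the experiment on the data node looks like the sketch below. The table and column names are hypothetical, and the two queries are simplified versions of the remote SQL rather than the exact statements the access node sends:

```sql
-- Run directly on a data node (PostgreSQL 14 era settings).
SET parallel_setup_cost = 1;
SET parallel_tuple_cost = 1;
SET force_parallel_mode = on;

-- Confirm the worker limits in effect for this session.
SHOW max_worker_processes;
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- Variant 1: the aggregate wrapped in partialize_agg, as in the remote SQL.
-- "conditions" and "temperature" are hypothetical names.
EXPLAIN
SELECT _timescaledb_internal.partialize_agg(avg(temperature))
FROM conditions;

-- Variant 2: the same query with a regular aggregate instead.
EXPLAIN
SELECT avg(temperature)
FROM conditions;
```

Comparing the two plans shows whether a Gather node appears and how many workers are planned under it.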

I realize this is open-ended, so feel free to close. I mainly wanted to raise the issue of partialize_agg blocking parallel plans.

Implementation challenges

No response

akuzm · 2022-05-09T09:53:19Z

A comment from Gayathri about possible implementation:

(https://iobeam.slack.com/archives/CEKV5LMK3/p1651844444404349?thread_ts=1651833518.202549&cid=CEKV5LMK3)
You can parallelize it only if the passed in function is parallel safe.
FYI: there are two places where this function is used: 1) distributed hypertables and 2) the current version of caggs. I would check the PG 14 code for PARTITIONWISE_AGGREGATE_PARTIAL and see how we can make changes for the distributed case. Might be able to peek into the plan and add parallel paths? (Partial aggs are being removed for caggs, so the cagg case will soon become irrelevant.)
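
As a quick way to check the parallel-safety markings mentioned above, one could query the catalog. This is only a sketch; it assumes partialize_agg is registered under that name in pg_proc and uses avg as an example of a passed-in aggregate:

```sql
-- proparallel: 's' = parallel safe, 'r' = parallel restricted, 'u' = parallel unsafe.
-- The planner does not consider a parallel plan if the query uses any
-- function marked parallel unsafe.
SELECT p.oid::regprocedure AS function_signature,
       p.proparallel
FROM pg_proc p
WHERE p.proname IN ('partialize_agg', 'avg')
ORDER BY p.proname;
```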

TroyDey added the enhancement label May 5, 2022

horzsolt added the Query Experience Team label May 6, 2022

NunoFilipeSantos added multinode performance labels May 10, 2022

horzsolt assigned akuzm May 17, 2022

akuzm mentioned this issue May 30, 2022

Mark partialize_agg as parallel safe #4307

Merged

akuzm closed this as completed in #4307 May 31, 2022
