[Enhancement]: Parallel Execution Plans on Data Nodes #4306
Labels
enhancement
multinode
performance
Query Experience Team
What type of enhancement is this?
Performance
What subsystems and features will be improved?
Distributed hypertable
What does the enhancement do?
I have had pretty good luck getting Postgres to choose parallel execution plans for regular hypertables, but it is nearly impossible to get it to choose a parallel plan for the remote queries executed on the data nodes. I have been experimenting with this by setting timescaledb.enable_remote_explain = on and then executing the remote SQL directly on the data node.
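For anyone trying to reproduce this, the workflow above might look roughly like the following on the access node. This is only a sketch: the `conditions` table and its `value` column are hypothetical stand-ins for a distributed hypertable, and the exact plan output will vary by TimescaleDB version.

```sql
-- Run on the access node.
-- `conditions` and `value` are hypothetical names, not from the original report.
SET timescaledb.enable_remote_explain = on;

-- With VERBOSE, the plan should include the remote SQL sent to each data node,
-- along with the plan the data node chose for it.
EXPLAIN (VERBOSE)
SELECT avg(value) FROM conditions;
```

The remote SQL shown in that output can then be copied and run under EXPLAIN directly on the data node.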
One barrier I found was that the partialize_agg function completely prevents a parallel plan from being chosen even when force_parallel_mode is set to on.
When I set parallel_setup_cost = 1, parallel_tuple_cost = 1 and force_parallel_mode = on, and replace partialize_agg with a regular aggregate such as avg, I can generate a parallel plan. However, it only ever plans one worker, even though max_worker_processes = 59, max_parallel_workers = 48, max_parallel_workers_per_gather = 24, and the table contains 25 million rows. This has become part of another discussion here.
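A rough sketch of that comparison, run directly on a data node, is shown below. The `conditions` table and `value` column are hypothetical, and the schema-qualified name `_timescaledb_internal.partialize_agg` is an assumption about how the function appears in the remote SQL; it may differ between TimescaleDB versions.

```sql
-- Run directly on a data node.
-- Table and column names are hypothetical.
SET parallel_setup_cost = 1;
SET parallel_tuple_cost = 1;
SET force_parallel_mode = on;

-- Confirm the worker limits (the report had 59, 48 and 24 respectively).
SHOW max_worker_processes;
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- Partial aggregate, roughly as it appears in the remote SQL.
-- Reported behavior: no parallel plan is chosen.
EXPLAIN
SELECT _timescaledb_internal.partialize_agg(avg(value)) FROM conditions;

-- Same query with a regular aggregate.
-- Reported behavior: a parallel plan, but with only one worker planned.
EXPLAIN
SELECT avg(value) FROM conditions;
```

Comparing the two EXPLAIN outputs side by side isolates the effect of partialize_agg from the separate question of how many workers are planned.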
I realize this is open-ended, so feel free to close. I mainly wanted to raise the issue of partialize_agg blocking parallel plans.
Implementation challenges
No response