You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For use cases like a JOIN after a UNION ALL, it can be beneficial to have the UNION ALL run in a single stage.
Existing behavior
For example, consider this simplified fragment of the TPCDS Q5 query -
select 1
from (
select cs_catalog_page_sk as page_sk,
cs_sold_date_sk as date_sk
from catalog_sales
union all
select cr_catalog_page_sk as page_sk,
cr_returned_date_sk as date_sk
from catalog_returns
) salesreturns,
date_dim,
catalog_page
where date_sk = d_date_sk
and d_date between cast('2001-08-04' as date)
and date_add('day', 14, cast('2001-08-04' as date))
and page_sk = cp_catalog_page_sk
where we see that we have large data transfers from Fragment 1 & 2 to perform the UNION, only to have the very selective JOIN execute next and discard most rows
If instead, we could execute the table scans of catalog_sales and catalog_returns in a single stage, we would save the network cost of data transfer
Proposed behavior
Trino added a MultiSourcePartitionedScheduler in trinodb/trino#17265, that allows the scheduler to run multiple source partitioned table scans in a single stage. This along with changes to how exchanges are added, improved the performance of TPCDS Q02 and Q05 significantly
The Trino distributed plan for the above query is -
For use cases like a JOIN after a UNION ALL, it can be beneficial to have the UNION ALL run in a single stage.
Existing behavior
For example, consider this simplified fragment of the TPCDS Q5 query -
The distributed Presto plan for this is -
where we see that we have large data transfers from Fragment 1 & 2 to perform the UNION, only to have the very selective JOIN execute next and discard most rows
If instead, we could execute the table scans of
catalog_sales
andcatalog_returns
in a single stage, we would save the network cost of data transferProposed behavior
Trino added a
MultiSourcePartitionedScheduler
in trinodb/trino#17265, that allows the scheduler to run multiple source partitioned table scans in a single stage. This along with changes to how exchanges are added, improved the performance of TPCDS Q02 and Q05 significantlyThe Trino distributed plan for the above query is -
We can experiment with this to see if this is feasible to implement in Presto as well
The text was updated successfully, but these errors were encountered: