Skip to content

Join Graph #18250

@NGA-TRAN

Description

@NGA-TRAN

This task is part of feature #18249, aimed at illustrating what a query’s join graph looks like. We use TPC-H Q5 as the example.

TPC-H Query 5 involves 6 tables and 6 joins, as our example.

select
   n_name,
   sum(l_extendedprice * (1 - l_discount)) as revenue
from
   customer,
   orders,
   lineitem,
   supplier,
   nation,
   region
where
       c_custkey = o_custkey
 and l_orderkey = o_orderkey
 and l_suppkey = s_suppkey
 and c_nationkey = s_nationkey
 and s_nationkey = n_nationkey
 and n_regionkey = r_regionkey
 and r_name = 'ASIA'
 and o_orderdate >= date '1994-01-01' 
 and o_orderdate < date '1995-01-01'
group by
   n_name
order by
   revenue desc;

Join Graph

Since our focus is join order enumeration, we’ll represent the query as a join graph:

  • Tables are shown as circles
  • Joins appear as edges between circles:
    - Directed edges indicate many-to-one joins
    - Undirected edges indicate many-to-many joins
  • Columns inside a circle denote selection predicates (filters) on that table
    - Orders table is filtered on o_orderdate with ~15% selectivity.
    - Region table is filtered o r_name and 20% selectivity
  • Group-by column: n_name.
  • Order-by field: the aggregation
  • Color coding reflects table partitioning, sort order, and size category

These properties illustrate factors that influence join enumeration in the next section. They’re optional and can be extended or omitted based on specific needs

Image

Figure 1: Join Graph of tpc-h Q5

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions