# query execution plans

whenever we write a sql query, the database system first parses our code to generate a query tree.

then the optimizer rewrites the query tree, using relational-algebra equivalences and cost-based optimization, to generate an execution plan.

the execution plan is a sequence of operations that the database system will perform to execute the query.

the operations are things like "scan this table", "filter out rows that don't satisfy this condition", "join these two tables", and so on.

using the `explain` command, we can see the execution plan that the system came up with for our query.

In [41]:
!psql postgres -c "\h explain"

Command:     EXPLAIN
Description: show the execution plan of a statement
Syntax:
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement

where option can be one of:

    ANALYZE [ boolean ]
    VERBOSE [ boolean ]
    COSTS [ boolean ]
    SETTINGS [ boolean ]
    GENERIC_PLAN [ boolean ]
    BUFFERS [ boolean ]
    WAL [ boolean ]
    TIMING [ boolean ]
    SUMMARY [ boolean ]
    FORMAT { TEXT | XML | JSON | YAML }

URL: https://www.postgresql.org/docs/16/sql-explain.html
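as a small illustration of the parenthesized option syntax shown above, the following sketch assembles an `explain` statement from keyword options. `build_explain` is a hypothetical helper (not part of psql or any library), and it does not check that the option names are valid.

```python
def build_explain(statement: str, **options) -> str:
    """Build an EXPLAIN statement using the parenthesized option syntax.

    Hypothetical helper for illustration; option names are not validated.
    """
    if not options:
        return f"EXPLAIN {statement}"
    parts = []
    for name, value in options.items():
        if isinstance(value, bool):
            # boolean options such as ANALYZE or COSTS
            parts.append(f"{name.upper()} {'true' if value else 'false'}")
        else:
            # non-boolean options such as FORMAT
            parts.append(f"{name.upper()} {str(value).upper()}")
    return f"EXPLAIN ({', '.join(parts)}) {statement}"


query = "SELECT a, b, c FROM r NATURAL JOIN s NATURAL JOIN t;"
print(build_explain(query, analyze=True, costs=False, format="json"))
```

the printed statement can be passed to `psql postgres -c "..."` in the same way as the commands in this notebook.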



since the execution plan is a tree, you have to read it from the leaf nodes to the root node.

each child node provides the input to the parent node.

the root node produces the final result of the query; its reported cost and time include those of all its child nodes.
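to make the reading order concrete, here is a minimal sketch that walks a plan tree bottom-up. the dict is hand-written for illustration; it only assumes the `Node Type` and `Plans` keys that `explain (format json)` uses for a node and its children. the resulting order is a reading aid: the real executor pulls rows through the tree on demand rather than finishing one node before starting the next.

```python
def execution_order(node: dict) -> list[str]:
    """Return node types in post-order: children first, then the parent."""
    order = []
    for child in node.get("Plans", []):
        order.extend(execution_order(child))
    order.append(node["Node Type"])
    return order


# hand-written example tree (not real EXPLAIN output)
plan = {
    "Node Type": "Merge Join",
    "Plans": [
        {"Node Type": "Sort", "Plans": [{"Node Type": "Seq Scan"}]},
        {
            "Node Type": "Hash Join",
            "Plans": [
                {"Node Type": "Seq Scan"},
                {"Node Type": "Hash", "Plans": [{"Node Type": "Seq Scan"}]},
            ],
        },
    ],
}

for step, name in enumerate(execution_order(plan), start=1):
    print(step, name)
```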

learn more through the following resources:

_docs_

- https://www.postgresql.org/docs/current/sql-explain.html
- https://www.postgresql.org/docs/9.5/using-explain.html
- overview: https://www.postgresql.org/docs/current/index.html
- performance tips: https://www.postgresql.org/docs/current/performance-tips.html
- glossary: https://www.pgmustard.com/docs/explain

_visualization tools_

- https://explain.dalibo.com/
     - source: https://github.com/dalibo/pev2
- https://www.pgexplain.dev/ (pev2 but with a backend)
     - source: https://github.com/lfborjas/postgres-explain-visualizer
- https://tatiyants.com/pev/
     - source: https://github.com/AlexTatiyants/pev/
- https://explain.depesz.com/
- https://explain-postgresql.com/ (not so good)

# 1) what is the default triangle-join plan?

what is the most common execution plan for the following query: $r \bowtie s \bowtie t$?

the default join strategy for $r \bowtie s \bowtie t$ is a hash join first, combined with a merge join:

1. hash join $r \bowtie s$ (using $s$ to build the hash table)
    
    then sort the result and materialize it (turn it into a table so it doesn't have to be recomputed).

2. sort $t$ and merge join with the result from step 1.
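step 2 relies on both inputs being sorted on the join key. the following is a minimal in-memory sketch of a sort-merge join, not postgres's implementation; `merge_join` and the sample rows are made up for illustration.

```python
def merge_join(left, right, left_key, right_key):
    """Sort both inputs on their join key, then merge them in one pass."""
    left = sorted(left, key=left_key)
    right = sorted(right, key=right_key)
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = left_key(left[i]), right_key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # find the run of rows with this key on each side
            i_end = i
            while i_end < len(left) and left_key(left[i_end]) == kl:
                i_end += 1
            j_end = j
            while j_end < len(right) and right_key(right[j_end]) == kl:
                j_end += 1
            # every row of the left run matches every row of the right run
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append((l_row, r_row))
            i, j = i_end, j_end
    return out


# rs rows are (a, b, c), t rows are (a, c); both are joined on (c, a)
rs = [(2, 10, 8), (1, 10, 7)]
t = [(1, 7), (2, 9), (1, 7)]
pairs = merge_join(rs, t, lambda row: (row[2], row[0]), lambda row: (row[1], row[0]))
print(pairs)
```

the sort keys `(c, a)` mirror the `Sort Key` lines of the plan shown further down in this section.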

<br>

**_what is a hash join?_**

assume we want to equi join: $R \bowtie_{\text{A}=\text{B}} S$

- 1) partition phase:
	- find a hash function that can map values in the join columns to a buffer frame index between $[1;B\text{-}1]$. → the buffer frames we map the rows to are called "buckets" and the 1 remaining buffer frame is used to read new pages in.
	- read each page $p_R$ of $R$ to memory. then hash the join value of each row to find the right bucket to store a pointer in. → if buffer frames overflow, write them back to disk.
	- repeat for $p_S$ of $S$.
	- total cost: $2 \cdot (b_R  + b_S)$ → factor of 2 because of initial reading and writing back the potentially full buckets to disk.
- 2) probing phase:
	- assuming $R_i$ and $S_i$ are all rows in the $i$-th bucket (and $R_i$ is the smaller one of them): read $R_i$ into $B\text{-}2$ buffer frames. → if not possible, either hash recursively or try another algorithm. the 2 remaining buffer frames are used to read new $S_i$ pages in and store the final result.
	- read each page of $S_i$ into memory. then check each row for matches with $R_i$.
	- if a matching row is found, write it into the buffer frame dedicated to results.
	- total cost: $b_R  + b_S$ → every bucket of $R$ and $S$ is read exactly once.
- total cost of both phases: $3 \cdot (b_R  + b_S)$
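the two phases above can be sketched in memory as follows. this is a simplified illustration, not postgres's implementation: a python dict plays the role of the buckets, so nothing is partitioned to disk. `hash_join`, `hash_join_cost` and the page counts are hypothetical names and values; the cost function only restates the formulas from the list above.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join: hash the (smaller) build input, then probe it with the other input."""
    table = defaultdict(list)
    for row in build_rows:
        table[build_key(row)].append(row)
    result = []
    for row in probe_rows:
        # only rows in the matching bucket have to be compared
        for match in table.get(probe_key(row), []):
            result.append((row, match))
    return result


def hash_join_cost(b_r: int, b_s: int) -> int:
    """Page I/O cost from the text: partition phase plus probing phase."""
    partition = 2 * (b_r + b_s)  # read every page once, write the buckets back
    probe = b_r + b_s            # read every bucket once
    return partition + probe


# r rows are (a, b), s rows are (b, c); s is the build side
r = [(1, 10), (2, 20), (3, 10)]
s = [(10, 7), (30, 9)]
joined = hash_join(s, r, build_key=lambda row: row[0], probe_key=lambda row: row[1])
print(joined)
print(hash_join_cost(100, 10))  # hypothetical page counts
```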

In [60]:
# reset the environment
!chmod +x ./reset.sh && ./reset.sh >> /dev/null

In [61]:
# print all tables
!psql postgres -c "\dt+"

                                      List of relations
 Schema |    Name     | Type  |  Owner  | Persistence | Access method |  Size   | Description 
--------+-------------+-------+---------+-------------+---------------+---------+-------------
 public | badges      | table | sueszli | permanent   | heap          | 1800 kB | 
 public | comments    | table | sueszli | permanent   | heap          | 8296 kB | 
 public | posthistory | table | sueszli | permanent   | heap          | 31 MB   | 
 public | postlinks   | table | sueszli | permanent   | heap          | 112 kB  | 
 public | posts       | table | sueszli | permanent   | heap          | 15 MB   | 
 public | r           | table | sueszli | permanent   | heap          | 744 kB  | 
 public | s           | table | sueszli | permanent   | heap          | 104 kB  | 
 public | t           | table | sueszli | permanent   | heap          | 744 kB  | 
 public | tags        | table | sueszli | permanent   | heap          | 56 kB   | 
 public

In [62]:
# print schema of r, s, t tables
!psql postgres -c "\d r"
!psql postgres -c "\d s"
!psql postgres -c "\d t"

                 Table "public.r"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 
 b      | integer |           |          | 

                 Table "public.s"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 b      | integer |           |          | 
 c      | integer |           |          | 

                 Table "public.t"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 
 c      | integer |           |          | 



In [101]:
!psql postgres -c "explain analyze SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;"

                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=52724.37..89027.34 rows=2010918 width=12) (actual time=127.162..322.327 rows=2008672 loops=1)
   Merge Cond: ((t.c = s.c) AND (t.a = r.a))
   ->  Sort  (cost=1717.77..1767.77 rows=20000 width=8) (actual time=7.572..8.591 rows=10264 loops=1)
         Sort Key: t.c, t.a
         Sort Method: quicksort  Memory: 1394kB
         ->  Seq Scan on t  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.009..1.607 rows=20000 loops=1)
   ->  Materialize  (cost=51006.60..53076.46 rows=413972 width=12) (actual time=119.581..199.888 rows=2216058 loops=1)
         ->  Sort  (cost=51006.60..52041.53 rows=413972 width=12) (actual time=119.578..142.644 rows=413972 loops=1)
               Sort Key: s.c, r.a
               Sort Met

In [103]:
!psql postgres -c "explain (costs off) SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;"

                 QUERY PLAN                  
---------------------------------------------
 Merge Join
   Merge Cond: ((t.c = s.c) AND (t.a = r.a))
   ->  Sort
         Sort Key: t.c, t.a
         ->  Seq Scan on t
   ->  Materialize
         ->  Sort
               Sort Key: s.c, r.a
               ->  Hash Join
                     Hash Cond: (r.b = s.b)
                     ->  Seq Scan on r
                     ->  Hash
                           ->  Seq Scan on s
(13 rows)



# 2) which join strategy is fastest for a triangle-join?

next we want to force the optimizer to use just one join strategy, benchmark the execution time and compare the results:

- **hash join**
    - avg execution time: `232.71 ms`
    - query plan: just sequential scans and hash joins.
    - the hash join strategy was the fastest - even faster than the default strategy that the optimizer came up with. one reason for this could be that the hash join is more cache-friendly than the merge join because it doesn't require any sorting.
- **merge join**
    - avg execution time: `369.47 ms`
    - query plan: additionally sorting with quicksort and materializing the results before the root node.
- **nested loop join**
    - avg execution time: timed out (over 10 minutes)
    - query plan: two nested loops over sequential scans with materialized inner inputs. the join filters discard about 8.2 billion row pairs (`Rows Removed by Join Filter`), so the runtime is most likely dominated by these pairwise comparisons rather than by memory or disk limits.
- **default** (hash join + merge join)
    - avg execution time: `357.69 ms`
    - query plan: as explained above.
    - its average runtime falls between the pure hash and pure merge cases, so the optimizer's choice is reasonable but not the fastest option on this data.

by allocating data differently or manipulating the environment, we can strongly influence the effectiveness of the different join strategies:

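- **favoring hash joins:**
    - if one table is smaller, allowing for a quick in-memory hash table construction.
    - indexes on the join columns (`r.a`, `s.b`, `t.c`) matter little here, because a hash join builds its own hash table from a scan of the input.
- **favoring merge joins:**
    - if the join columns are already sorted.
    - if you create indexes on columns involved in join conditions (`r.a`, `s.b`, `t.c`), since an ordered index scan can replace an explicit sort.
- **favoring nested loop joins:**
    - if one input is very small, or if the inner table has an index on the join column, so that each outer row needs only a lookup instead of a full scan.
    - for this query it remains the slowest strategy by far (see the measurements in section 3).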

we could also skew the data to mess with the optimizer's cost-based optimization:

```sql
create table r as 
    select (random()*10)::int as a, -- 'a' takes only 11 distinct values (0..10)
           (random()*30)::int as b from generate_series(1, 20000);
```
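to see which distribution the expression `(random()*10)::int` produces, we can simulate it in python. this is a sketch under the assumption that python's `round` behaves like the postgres cast from float to integer (both round to the nearest integer); the seed and sample size are arbitrary.

```python
import random
from collections import Counter

random.seed(0)  # arbitrary seed, only for reproducibility

# simulate (random()*10)::int for 20000 rows
values = [round(random.random() * 10) for _ in range(20000)]
counts = Counter(values)

for value in sorted(counts):
    print(value, counts[value])
```

the values 1 to 9 appear about equally often, while 0 and 10 appear roughly half as often. the column therefore has a very small domain with many duplicates per value, which is what inflates the join result, rather than a strong skew toward low values.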

In [153]:
!uname -a

Darwin Yahyas-MBP 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:55:06 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6020 arm64


In [54]:
from enum import Enum
import os
import json

class JoinMode(Enum):
    HASHJOIN = "set enable_hashjoin=1; set enable_mergejoin=0; set enable_nestloop=0;"
    MERGEJOIN = "set enable_hashjoin=0; set enable_mergejoin=1; set enable_nestloop=0;"
    NESTLOOP = "set enable_hashjoin=0; set enable_mergejoin=0; set enable_nestloop=1;"
    DEFAULT = "set enable_hashjoin=1; set enable_mergejoin=1; set enable_nestloop=1;"

# execution plan: print the annotated plan for one join mode
def print_plan(mode: JoinMode) -> None:
    output = os.popen(f'psql postgres -c "{mode.value} explain (analyze, costs) SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;"').read()
    print(output)

# benchmark: return the "Execution Time" (in ms) reported by explain analyze
def get_exec(mode: JoinMode) -> float:
    full_query = f"{mode.value} explain (analyze, verbose, costs, settings, buffers, wal, timing, summary, format json) SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;"
    output = os.popen(f'psql postgres -c "{full_query}"').read()
    # skip the three SET lines, the header and its separator; drop the "(1 row)" footer and the trailing blank line
    output = output.splitlines()[5:-2]
    # psql ends each continued line of a multi-line value with "+"; drop that last character
    for i in range(len(output)):
        if len(output[i]) > 2:
            output[i] = output[i][:-1]
    output = "\n".join(output)
    output = json.loads(output)[0]
    return float(output["Execution Time"])

# run the query several times and print the mean execution time
def print_avg_exec(mode: JoinMode, iters: int = 10) -> None:
    vals = []
    for _ in range(iters):
        elapsed_ms = get_exec(mode)
        vals.append(elapsed_ms)
    print(f"average execution time in {mode.name}-MODE: {sum(vals) / iters:.2f} ms (in {iters} iterations)")
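the benchmark above reports only the mean. as a possible extension, the sketch below extracts the execution time from an `explain (format json)` document and summarizes several runs with mean, median and standard deviation. `execution_time_ms`, `summarize` and the sample values are hypothetical. the parser assumes its input is the bare json document, for example from calling psql with its `-q -A -t` flags; that invocation has not been verified against this setup.

```python
import json
import statistics


def execution_time_ms(explain_json: str) -> float:
    """Read the top-level "Execution Time" field of an EXPLAIN (FORMAT JSON) document."""
    return float(json.loads(explain_json)[0]["Execution Time"])


def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Mean, median and sample standard deviation of the measured times."""
    return {
        "mean": statistics.mean(samples_ms),
        "median": statistics.median(samples_ms),
        "stdev": statistics.stdev(samples_ms) if len(samples_ms) > 1 else 0.0,
    }


# hand-written document with made-up numbers, shaped like EXPLAIN (ANALYZE, FORMAT JSON) output
sample = '[{"Plan": {"Node Type": "Seq Scan"}, "Planning Time": 0.1, "Execution Time": 1.5}]'
print(execution_time_ms(sample))
print(summarize([230.0, 235.0, 233.0]))
```

the median is less sensitive than the mean to a single slow run, for example the first run after a reset when the caches are cold.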

## hash join

In [55]:
print_plan(JoinMode.HASHJOIN)
print_avg_exec(JoinMode.HASHJOIN)

SET
SET
SET
                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=13408.40..267441.51 rows=1999411 width=12) (actual time=53.004..186.939 rows=1999499 loops=1)
   Hash Cond: ((t.a = r.a) AND (t.c = s.c))
   ->  Seq Scan on t  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.007..0.859 rows=20000 loops=1)
   ->  Hash  (cost=5297.36..5297.36 rows=407936 width=12) (actual time=52.664..52.666 rows=407936 loops=1)
         Buckets: 262144  Batches: 4  Memory Usage: 6475kB
         ->  Hash Join  (cost=54.00..5297.36 rows=407936 width=12) (actual time=0.202..19.730 rows=407936 loops=1)
               Hash Cond: (r.b = s.b)
               ->  Seq Scan on r  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.008..0.877 rows=20000 loops=1)
               ->  Hash  (cost=29.00..29.

## merge join

In [56]:
print_plan(JoinMode.MERGEJOIN)
print_avg_exec(JoinMode.MERGEJOIN)

SET
SET
SET
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=54690.76..90777.15 rows=1999411 width=12) (actual time=106.186..304.897 rows=1999499 loops=1)
   Merge Cond: ((t.c = s.c) AND (t.a = r.a))
   ->  Sort  (cost=1717.77..1767.77 rows=20000 width=8) (actual time=3.632..4.831 rows=10248 loops=1)
         Sort Key: t.c, t.a
         Sort Method: quicksort  Memory: 1394kB
         ->  Seq Scan on t  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.006..0.639 rows=20000 loops=1)
   ->  Materialize  (cost=52972.99..55012.67 rows=407936 width=12) (actual time=102.550..183.899 rows=2201013 loops=1)
         ->  Sort  (cost=52972.99..53992.83 rows=407936 width=12) (actual time=102.549..125.767 rows=407936 loops=1)
               Sort Key: s.c, r.a
     

## nested loop (timed out)

In [57]:
print_plan(JoinMode.NESTLOOP)
print_avg_exec(JoinMode.NESTLOOP, 1)

SET
SET
SET
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..143378262.00 rows=1999411 width=12) (actual time=0.047..387676.777 rows=1999499 loops=1)
   Join Filter: ((r.a = t.a) AND (s.c = t.c))
   Rows Removed by Join Filter: 8156720501
   ->  Nested Loop  (cost=0.00..600323.00 rows=407936 width=12) (actual time=0.012..1987.080 rows=407936 loops=1)
         Join Filter: (r.b = s.b)
         Rows Removed by Join Filter: 39592064
         ->  Seq Scan on r  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.004..7.961 rows=20000 loops=1)
         ->  Materialize  (cost=0.00..39.00 rows=2000 width=8) (actual time=0.000..0.044 rows=2000 loops=20000)
               ->  Seq Scan on s  (cost=0.00..29.00 rows=2000 width=8) (actual time=0.002..0.127 rows=2000 loops=1)
   ->  Materia

KeyboardInterrupt: 
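the row counts in the plan above explain the long runtime: a nested loop without an index compares every pair of rows. the following arithmetic uses only numbers taken from the plan output and reproduces both `Rows Removed by Join Filter` values.

```python
# row counts taken from the nested loop plan above
r_rows, s_rows, t_rows = 20_000, 2_000, 20_000
rs_rows = 407_936       # rows produced by the inner join of r and s
final_rows = 1_999_499  # rows produced by the final join with t

inner_pairs = r_rows * s_rows   # pairs compared in the inner nested loop
outer_pairs = rs_rows * t_rows  # pairs compared in the outer nested loop

# pairs that fail the join filter = compared pairs - matching pairs
inner_removed = inner_pairs - rs_rows
outer_removed = outer_pairs - final_rows

print(inner_removed)  # 39592064
print(outer_removed)  # 8156720501
```

the outer loop alone evaluates more than 8 billion row pairs, which fits the roughly 388 seconds of actual time reported for the root node.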

## default (hash join + merge join)

In [58]:
print_plan(JoinMode.DEFAULT)
print_avg_exec(JoinMode.DEFAULT)

SET
SET
SET
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=52002.65..88089.04 rows=1999411 width=12) (actual time=133.156..329.885 rows=1999499 loops=1)
   Merge Cond: ((t.c = s.c) AND (t.a = r.a))
   ->  Sort  (cost=1717.77..1767.77 rows=20000 width=8) (actual time=7.724..9.038 rows=10248 loops=1)
         Sort Key: t.c, t.a
         Sort Method: quicksort  Memory: 1394kB
         ->  Seq Scan on t  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.015..0.668 rows=20000 loops=1)
   ->  Materialize  (cost=50284.88..52324.56 rows=407936 width=12) (actual time=125.425..206.156 rows=2201013 loops=1)
         ->  Sort  (cost=50284.88..51304.72 rows=407936 width=12) (actual time=125.401..148.574 rows=407936 loops=1)
               Sort Key: s.c, r.a
           

# 3) can we go faster by using indexes?

we can create indexes on the join columns to speed up the query.

- **hash join**
    - old avg execution time: `232.71 ms`
    - new avg execution time: `222.87 ms`
- **merge join**
    - old avg execution time: `369.47 ms`
    - new avg execution time: `354.38 ms`
- **nested loop join**
    - old avg execution time: timed out (over 10 minutes)
    - new avg execution time: `4611.20 ms`
- **default** (hash join + merge join)
    - old avg execution time: `357.69 ms`
    - new avg execution time: `344.17 ms`
    - change in query plan: none - stayed the same.
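to put the measured differences into perspective, the following sketch computes the relative improvement from the averages listed above. the nested loop join is left out because its run without indexes has no measured time.

```python
# (old, new) average execution times in ms, copied from the list above
timings_ms = {
    "hash join": (232.71, 222.87),
    "merge join": (369.47, 354.38),
    "default": (357.69, 344.17),
}


def relative_speedup(old: float, new: float) -> float:
    """Fraction of the old runtime that was saved."""
    return (old - new) / old


for name, (old, new) in timings_ms.items():
    print(f"{name}: {relative_speedup(old, new):.1%} faster with indexes")
```

all three strategies improve by roughly 4 percent. with only 10 iterations per measurement, a difference this small may be within run-to-run noise.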


In [128]:
# reset the environment
!chmod +x ./reset.sh && ./reset.sh > /dev/null

# add index to r, s, t
!psql postgres -c "CREATE INDEX r_a ON r(a);"
!psql postgres -c "CREATE INDEX s_b ON s(b);"
!psql postgres -c "CREATE INDEX t_c ON t(c);"

Did not find any relations.
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL:  database "sueszli" does not exist
Did not find any relations.
ERROR:  cannot drop the currently open database
ERROR:  current user cannot be dropped
CREATE INDEX
CREATE INDEX
CREATE INDEX


In [78]:
print_avg_exec(JoinMode.HASHJOIN)

average execution time in HASHJOIN-MODE: 222.87 ms (in 10 iterations)


In [66]:
print_avg_exec(JoinMode.MERGEJOIN)

average execution time in MERGEJOIN-MODE: 354.38 ms (in 10 iterations)


In [68]:
print_avg_exec(JoinMode.NESTLOOP)

average execution time in NESTLOOP-MODE: 4611.20 ms (in 10 iterations)


In [70]:
print_plan(JoinMode.DEFAULT)
print_avg_exec(JoinMode.DEFAULT)

SET
SET
SET
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=52002.65..88089.04 rows=1999411 width=12) (actual time=125.600..316.176 rows=1999499 loops=1)
   Merge Cond: ((t.c = s.c) AND (t.a = r.a))
   ->  Sort  (cost=1717.77..1767.77 rows=20000 width=8) (actual time=9.337..10.376 rows=10248 loops=1)
         Sort Key: t.c, t.a
         Sort Method: quicksort  Memory: 1394kB
         ->  Seq Scan on t  (cost=0.00..289.00 rows=20000 width=8) (actual time=0.007..1.711 rows=20000 loops=1)
   ->  Materialize  (cost=50284.88..52324.56 rows=407936 width=12) (actual time=116.258..194.594 rows=2201013 loops=1)
         ->  Sort  (cost=50284.88..51304.72 rows=407936 width=12) (actual time=116.255..138.487 rows=407936 loops=1)
               Sort Key: s.c, r.a
          

# 4) can we go faster by also using a semi-join?

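with the given data schema, a semi-join returns the same set of rows as the natural join, because every column of $t$ already appears in $r \bowtie s$ (under set semantics, hence the `DISTINCT` in the queries below). it also has the potential to be faster.

keep in mind: $\text{R}_1\ltimes\text{R}_2 = \pi_{\text{R}_1} (\text{R}_1\bowtie\text{R}_2)$ and $\text{R}_1\rtimes\text{R}_2 = \pi_{\text{R}_2} (\text{R}_1\bowtie\text{R}_2)$, where $\pi_{\text{R}_i}$ projects onto the columns of $\text{R}_i$.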
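the equivalence can be checked on a toy example. the sketch below implements a naive natural join and semi-join over lists of dicts; the function names and the sample rows are made up for illustration and have nothing to do with how postgres evaluates these operators.

```python
def matches(left_row: dict, right_row: dict) -> bool:
    """True if both rows agree on all columns they share."""
    shared = left_row.keys() & right_row.keys()
    return all(left_row[k] == right_row[k] for k in shared)


def natural_join(left: list[dict], right: list[dict]) -> list[dict]:
    return [{**l, **r} for l in left for r in right if matches(l, r)]


def semi_join(left: list[dict], right: list[dict]) -> list[dict]:
    # keep each left row at most once if it has any partner on the right
    return [l for l in left if any(matches(l, r) for r in right)]


def distinct(rows: list[dict]) -> set:
    return {tuple(sorted(row.items())) for row in rows}


r = [{"a": 1, "b": 10}, {"a": 2, "b": 10}]
s = [{"b": 10, "c": 7}]
t = [{"a": 1, "c": 7}, {"a": 1, "c": 7}]  # duplicate row on purpose

full = natural_join(natural_join(r, s), t)
semi = semi_join(natural_join(r, s), t)

print(len(full), len(semi))  # 2 1
print(distinct(full) == distinct(semi))  # True
```

the full join repeats a row for every matching partner in $t$, whereas the semi-join only checks for existence. this is where the potential saving comes from.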

for the sake of simplicity, we will just use the postgres stats instead of taking the average of multiple runs in this section:

- $R \bowtie S \bowtie T$ (see above)

    - execution time without index: `358.284 ms` → strategy: hash join + merge join
        - first we generate a hash table for $s$ in memory
        - then we sequentially read $r$ and probe the hash table, keeping rows with matching join values
        - we sort the result and materialize it (slow)
        - then we sort $t$ and merge join with the result from the previous step (slow)
    - execution time with index: `351.574 ms` → strategy: stayed the same

- $(R \bowtie S) \ltimes T \equiv \pi_{(R \bowtie S)} ((R \bowtie S) \bowtie T)$

    - execution time without index: `120.725 ms` → strategy: hash join + hash join
        - first we group $t$ by $t.a$ and $t.c$ through a hash aggregate which we turn into a hash table
        - in the following steps we just hash-join $t$ with $s$ and then the result with $r$
    - execution time with index: `119.577 ms` → strategy: hash join + merge join
        - same order as before, but the optimizer chooses to sort and merge join the result with $r$ in the last step

- $(T \bowtie S) \ltimes R \equiv \pi_{(T \bowtie S)} ((T \bowtie S) \bowtie R)$

    - execution time without index: `108.822 ms` → strategy: hash join + hash join
        - same as RS with index but $r$ and $s$ are joined first
    - execution time with index: `111.315 ms` → strategy: stayed the same

In [163]:
!chmod +x ./reset.sh && ./reset.sh > /dev/null
!psql postgres -c "reset all;"

Did not find any relations.
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL:  database "sueszli" does not exist
ERROR:  cannot drop the currently open database
ERROR:  current user cannot be dropped


In [164]:
# RS semi join
!psql postgres -c "explain analyze SELECT DISTINCT r.a, r.b, s.c FROM r JOIN s ON r.b = s.b WHERE EXISTS (SELECT 1 FROM t WHERE t.a = r.a AND t.c = s.c);"

import os
q1 = os.popen("psql postgres -c 'SELECT DISTINCT r.a, r.b, s.c FROM r JOIN s ON r.b = s.b WHERE EXISTS (SELECT 1 FROM t WHERE t.a = r.a AND t.c = s.c);'").read()
q2 = os.popen("psql postgres -c 'SELECT DISTINCT a, b, c FROM r NATURAL JOIN s NATURAL JOIN t;'").read()
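# weak sanity check: both outputs have the same length in characters (row order may differ, so the text itself is not compared)
assert len(q1) == len(q2)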

                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=3874.50..8650.30 rows=130754 width=12) (actual time=54.688..118.280 rows=20432 loops=1)
   ->  Incremental Sort  (cost=3874.50..7669.64 rows=130754 width=12) (actual time=54.688..110.181 rows=207350 loops=1)
         Sort Key: r.a, r.b, s.c
         Presorted Key: r.a, r.b
         Full-sort Groups: 1516  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
         Pre-sorted Groups: 1405  Sort Method: quicksort  Average Memory: 28kB  Peak Memory: 28kB
         ->  Merge Join  (cost=3874.47..4118.07 rows=130754 width=12) (actual time=54.655..82.085 rows=207350 loops=1)
               Merge Cond: ((t.a = r.a) AND (s.b = r.b))
               ->  Sort  (cost=1313.60..1318.12 rows

In [165]:
# TS semi join
!psql postgres -c "explain analyze SELECT DISTINCT t.a, s.b, t.c FROM t JOIN s ON t.c = s.c WHERE EXISTS (SELECT 1 FROM r WHERE r.a = t.a AND r.b = s.b);"

import os
q1 = os.popen("psql postgres -c 'SELECT DISTINCT t.a, s.b, t.c FROM t JOIN s ON t.c = s.c WHERE EXISTS (SELECT 1 FROM r WHERE r.a = t.a AND r.b = s.b);'").read()
q2 = os.popen("psql postgres -c 'SELECT DISTINCT a, b, c FROM r NATURAL JOIN s NATURAL JOIN t;'").read()
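# weak sanity check: both outputs have the same length in characters (row order may differ, so the text itself is not compared)
assert len(q1) == len(q2)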

                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=3906.26..12851.62 rows=130754 width=12) (actual time=37.092..118.951 rows=20432 loops=1)
   ->  Incremental Sort  (cost=3906.26..11870.97 rows=130754 width=12) (actual time=37.091..107.510 rows=308658 loops=1)
         Sort Key: t.a, s.b, t.c
         Presorted Key: t.a
         Full-sort Groups: 51  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
         Pre-sorted Groups: 51  Sort Method: quicksort  Average Memory: 453kB  Peak Memory: 469kB
         ->  Merge Join  (cost=3874.47..4118.07 rows=130754 width=12) (actual time=36.354..66.012 rows=308658 loops=1)
               Merge Cond: ((r.a = t.a) AND (s.c = t.c))
               ->  Sort  (cost=1313.60..1318.12 rows=1808

In [166]:
# add index
!psql postgres -c "CREATE INDEX r_a ON r(a);"
!psql postgres -c "CREATE INDEX s_b ON s(b);"
!psql postgres -c "CREATE INDEX t_c ON t(c);"

CREATE INDEX
CREATE INDEX
CREATE INDEX


In [167]:
# RS semi join (with index)
!psql postgres -c "explain analyze SELECT DISTINCT r.a, r.b, s.c FROM r JOIN s ON r.b = s.b WHERE EXISTS (SELECT 1 FROM t WHERE t.a = r.a AND t.c = s.c);"

                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=3060.60..3560.60 rows=50000 width=12) (actual time=106.059..107.214 rows=20432 loops=1)
   Group Key: r.a, r.b, s.c
   Batches: 1  Memory Usage: 2577kB
   ->  Merge Join  (cost=2523.10..2685.60 rows=50000 width=12) (actual time=58.058..86.504 rows=207350 loops=1)
         Merge Cond: ((t.a = r.a) AND (s.b = r.b))
         ->  Sort  (cost=766.33..768.83 rows=1000 width=12) (actual time=52.254..57.460 rows=102000 loops=1)
               Sort Key: t.a, s.b
               Sort Method: external merge  Disk: 2208kB
               ->  Hash Join  (cost=473.00..716.50 rows=1000 width=12) (actual time=6.532..18.692 rows=102000 loops=1)
                     Hash Cond: (s.c = t.c)
                     -> 

In [168]:
# TS semi join (with index)
!psql postgres -c "explain analyze SELECT DISTINCT t.a, s.b, t.c FROM t JOIN s ON t.c = s.c WHERE EXISTS (SELECT 1 FROM r WHERE r.a = t.a AND r.b = s.b);"

                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=3060.60..3560.60 rows=50000 width=12) (actual time=109.195..110.473 rows=20432 loops=1)
   Group Key: t.a, s.b, t.c
   Batches: 1  Memory Usage: 2577kB
   ->  Merge Join  (cost=2523.10..2685.60 rows=50000 width=12) (actual time=41.692..77.796 rows=308658 loops=1)
         Merge Cond: ((r.a = t.a) AND (s.c = t.c))
         ->  Sort  (cost=766.33..768.83 rows=1000 width=12) (actual time=37.198..39.891 rows=32130 loops=1)
               Sort Key: r.a, s.c
               Sort Method: quicksort  Memory: 4020kB
               ->  Hash Join  (cost=473.00..716.50 rows=1000 width=12) (actual time=7.628..17.852 rows=63568 loops=1)
                     Hash Cond: (s.b = r.b)
                     ->  Seq 

## can we force $(R \bowtie S) \ltimes T$ to use a "hash semi join"?

we can force it by setting:

- `enable_hashjoin` to `1` (with `enable_mergejoin` and `enable_nestloop` set to `0`): so that hash join is the only enabled join strategy
- `cpu_tuple_cost` to `0`: ...
- `cpu_operator_cost` to `0`: ...

here are the results:

- $(R \bowtie S) \ltimes T$

    - execution time without index: `65.001 ms` → strategy: hash join + hash join
        - first we group $t$ by $t.a$ and $t.c$ through a hash aggregate which we turn into a hash table
        - in the following steps we just hash-join $t$ with $s$ and then the result with $r$
    - execution time with index: `110.995 ms` → strategy: hash join + merge join
        - same order as before, but the optimizer chooses to sort and merge join the result with $r$ in the last step

- $(R \bowtie S) \ltimes T$ (using hash semijoin)

    - execution time without index: `108.930 ms` → strategy: hash join + hash semi join
        - ...
    - execution time with index: `89.381 ms` → strategy: hash join + hash semi join
        - same as above ...

In [169]:
!chmod +x ./reset.sh && ./reset.sh > /dev/null
!psql postgres -c "reset all;"

Did not find any relations.
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL:  database "sueszli" does not exist
ERROR:  cannot drop the currently open database
ERROR:  current user cannot be dropped


In [217]:
config1 = "set enable_hashjoin=1; set enable_mergejoin=0; set enable_nestloop=0;"
config2 = "set cpu_tuple_cost = 0; set cpu_operator_cost = 0;"
full_query = f"{config1} {config2} explain (analyze, costs false) SELECT DISTINCT r.a, r.b, s.c FROM r JOIN s ON r.b = s.b WHERE EXISTS (SELECT 1 FROM t WHERE t.a = r.a AND t.c = s.c);"
!psql postgres -c "{full_query}"


SET
SET
SET
SET
SET
                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 HashAggregate (actual time=106.908..108.152 rows=20833 loops=1)
   Group Key: r.a, r.b, s.c
   Batches: 1  Memory Usage: 2577kB
   ->  Hash Semi Join (actual time=9.271..77.846 rows=210785 loops=1)
         Hash Cond: ((r.a = t.a) AND (s.c = t.c))
         ->  Hash Join (actual time=0.500..32.476 rows=415072 loops=1)
               Hash Cond: (r.b = s.b)
               ->  Seq Scan on r (actual time=0.008..1.048 rows=20000 loops=1)
               ->  Hash (actual time=0.484..0.484 rows=2000 loops=1)
                     Buckets: 2048  Batches: 1  Memory Usage: 95kB
                     ->  Seq Scan on s (actual time=0.009..0.247 rows=2000 loops=1)
         ->  Hash (actual time=8.697..8.697 rows=20000 loops=1)
               Buckets: 32768  Batches: 1  Memory Usage: 1038kB
               ->  Seq Sc

In [218]:
# add index
!psql postgres -c "CREATE INDEX r_a ON r(a);"
!psql postgres -c "CREATE INDEX s_b ON s(b);"
!psql postgres -c "CREATE INDEX t_c ON t(c);"

CREATE INDEX
CREATE INDEX
CREATE INDEX


In [219]:
config1 = "set enable_hashjoin=1; set enable_mergejoin=0; set enable_nestloop=0;"
config2 = "set cpu_tuple_cost = 0; set cpu_operator_cost = 0;"
full_query = f"{config1} {config2} explain (analyze, costs false) SELECT DISTINCT r.a, r.b, s.c FROM r JOIN s ON r.b = s.b WHERE EXISTS (SELECT 1 FROM t WHERE t.a = r.a AND t.c = s.c);"
!psql postgres -c "{full_query}"


SET
SET
SET
SET
SET
                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 HashAggregate (actual time=87.248..88.588 rows=20833 loops=1)
   Group Key: r.a, r.b, s.c
   Batches: 1  Memory Usage: 2577kB
   ->  Hash Semi Join (actual time=5.183..61.647 rows=210785 loops=1)
         Hash Cond: ((r.a = t.a) AND (s.c = t.c))
         ->  Hash Join (actual time=0.600..26.250 rows=415072 loops=1)
               Hash Cond: (r.b = s.b)
               ->  Seq Scan on r (actual time=0.011..0.941 rows=20000 loops=1)
               ->  Hash (actual time=0.569..0.569 rows=2000 loops=1)
                     Buckets: 2048  Batches: 1  Memory Usage: 95kB
                     ->  Seq Scan on s (actual time=0.010..0.258 rows=2000 loops=1)
         ->  Hash (actual time=4.509..4.509 rows=20000 loops=1)
               Buckets: 32768  Batches: 1  Memory Usage: 1038kB
               ->  Seq Scan