### Cost Analysis

#### General

##### Cost Types
- The start-up cost is the cost expended before the first tuple is fetched. For example, the start-up cost of the index scan node is the cost of reading index pages to access the first tuple in the target table.
- The run cost is the cost of fetching all tuples.
- The total cost is the sum of the start-up and run costs.
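
`EXPLAIN` prints each node's estimate as `cost=<start-up>..<total>`, so the run cost is the difference between the two numbers. A minimal sketch of pulling the three values apart; the helper name `split_explain_cost` and the sample plan line are illustrative assumptions, not output captured from this notebook's database:

```python
import re

def split_explain_cost(plan_line):
    """Return (startup, run, total) parsed from one EXPLAIN plan line."""
    match = re.search(r"cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?)", plan_line)
    if match is None:
        raise ValueError("no cost=... field found in plan line")
    startup = float(match.group(1))
    total = float(match.group(2))
    # EXPLAIN shows start-up and total; run cost is what remains.
    return startup, total - startup, total

# Illustrative plan line (assumed values).
sample = "Seq Scan on tbl  (cost=0.00..170.00 rows=8000 width=8)"
startup, run, total = split_explain_cost(sample)
print(startup, run, total)
```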

##### Important Notes
- PostgreSQL's cost model does not consider whether a scanned page is already in the shared buffers.

##### Configurable Costs
Name                    |   Description                                                     |   Default Value   |   Note
----                    |   -----------                                                     |   -------------   |   ----
cpu_tuple_cost          |   Processing each row                                             |   0.01            |   
cpu_operator_cost       |   Processing each operator or function                            |   0.0025          |
cpu_index_tuple_cost    |   Processing each index entry                                     |   0.005           |
seq_page_cost           |   Disk page fetch that is part of a series of sequential fetches  |   1.0             |   Serves as the reference point for the other costs, hence 1
random_page_cost        |   Non-sequentially-fetched disk page                              |   4.0             |   Lowering it relative to seq_page_cost makes the planner choose index scans more often. On mechanical disks random access is in practice much slower than 4x (around 40x), but about 90% of random reads are assumed to be served from cache. On SSDs the gap between random and sequential reads is smaller still.

#### Cost Helpers and Terms

- Selectivity - The proportion of the search range that satisfies the WHERE clause, expressed as a floating-point number from 0 to 1 (explained in detail in statistics).
- Index Correlation - The correlation between the physical order of the index and the physical order of the table. It helps estimate how likely it is that a tuple's table page can be predicted from its place in the index.

Variables

- Nind-tup - Number of index tuples
- Nind-page - Number of index pages
- Ntup - Number of table tuples
- Hind - Height of index tree

##### Create Example

In [None]:
# Create Example
export PGHOST=db
export PGUSER=postgres
export PGDATABASE=postgres

psql < ./helpers/create-example.sql

#### Sequential Scan


##### Cost Calculation

`startup_cost` = 0

`run_cost` = `CPU Run Cost` + `Disk Run Cost`

`CPU Run Cost` = (`cpu_tuple_cost` + `cpu_operator_cost`) * Ntup

`Disk Run Cost` = `seq_page_cost` * *Npa*

Variables Explained:

- Ntup - Number of table tuples
- Npa - Number of table pages
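
The formula above can be sketched as a small function. The function name and the sample table size (10,000 tuples in 45 pages) are assumptions for illustration; the real values for `tbl` come from `pg_class`, and the defaults are the ones listed under Configurable Costs:

```python
def seq_scan_cost(n_tuples, n_pages,
                  cpu_tuple_cost=0.01,
                  cpu_operator_cost=0.0025,
                  seq_page_cost=1.0):
    """Return (startup_cost, run_cost, total_cost) for a sequential scan."""
    startup_cost = 0.0
    # Every tuple is processed and checked against one operator.
    cpu_run_cost = (cpu_tuple_cost + cpu_operator_cost) * n_tuples
    # Every page of the table is read sequentially.
    disk_run_cost = seq_page_cost * n_pages
    run_cost = cpu_run_cost + disk_run_cost
    return startup_cost, run_cost, startup_cost + run_cost

# Assumed sizes: Ntup = 10000, Npa = 45.
startup, run, total = seq_scan_cost(n_tuples=10000, n_pages=45)
print(f"startup={startup:.2f} run={run:.2f} total={total:.2f}")
```

With these assumed sizes the CPU part is about 125 and the disk part is 45, so the total comes to roughly 170.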

##### Calculate

In [None]:
# Table statistics (pages, tuples) used by the formula, then the planner's estimate
psql -c "SELECT relpages, reltuples FROM pg_class WHERE relname = 'tbl';"
psql -c "EXPLAIN SELECT * FROM tbl WHERE id < 8000;"

#### Index Scan (BTREE)

##### Startup Cost
*startup_cost* - The start-up cost of the index scan is the cost of reading the index pages needed to reach the first tuple in the target table, that is, the cost of descending the B-tree from the root to a leaf page.

`startup_cost` = (`Comparison` + `Entering Pages`) * `cpu_operator_cost`

`Comparison` = ceil(log2(*Nind-tup*))

`Entering Pages` = (*Hind* + `Leaf Page Enter`) * 50

`Leaf Page Enter` = 1

[Link to source](https://github.com/postgres/postgres/blob/a29834beb1deeb0aa06742dd77ba1d21b444ca44/src/backend/utils/adt/selfuncs.c#L5777)
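
A minimal sketch of the start-up formula above. The function name and the sample index size (10,000 index tuples, tree height 1) are assumptions for illustration only:

```python
import math

def btree_startup_cost(n_index_tuples, tree_height, cpu_operator_cost=0.0025):
    """Start-up cost of a B-tree index scan, following the formula above."""
    # Binary-search comparisons needed to locate the first matching entry.
    comparisons = math.ceil(math.log2(n_index_tuples))
    # Each page entered on the way down (Hind levels plus the leaf) counts as 50 operators.
    entering_pages = (tree_height + 1) * 50
    return (comparisons + entering_pages) * cpu_operator_cost

# Assumed index: Nind-tup = 10000, Hind = 1.
print(round(btree_startup_cost(10000, 1), 4))
```

For these assumed values there are 14 comparisons and 100 operator-equivalents for entering pages, which gives roughly 0.285.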

##### Run Cost
TODO - Draw a chart from this

TODO - Summarize that, what is important to remember

`run_cost` = `Index CPU Cost` + `Table CPU Cost` + `Index I/O Cost` + `Table I/O Cost`

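`Index CPU Cost` = Selectivity * Nind-tup * (`cpu_index_tuple_cost` + `qual_op_cost`)

`qual_op_cost` is the cost of evaluating the index condition. For a single operator it equals `cpu_operator_cost`, 0.0025 by default.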

`Table CPU Cost` = Selectivity * Ntup * `cpu_tuple_cost`

`Index I/O Cost` = ceil(Selectivity * Nind-page) * `random_page_cost`

`Table I/O Cost` = `Max I/O Cost` + (IndexCorrelation)^2 * (`Min I/O Cost` - `Max I/O Cost`)

`Max I/O Cost` = Npa * `random_page_cost` (worst case: every table page is fetched with a random read)

`Min I/O Cost` = 1 * `random_page_cost` + (ceil(Selectivity * Npa) - 1) * `seq_page_cost` (best case: one random read for the first page, then sequential reads for the rest)
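
The run-cost terms above can be combined in one function to see how index correlation moves the table I/O cost between its best and worst case. This is a sketch under assumptions: the function name, the parameter `qual_op_cost` (standing for the 0.0025 term in the index CPU formula), and the sample sizes are illustrative, not taken from this notebook's database:

```python
import math

def index_scan_run_cost(selectivity, n_index_tuples, n_index_pages,
                        n_tuples, n_pages, index_correlation,
                        cpu_tuple_cost=0.01, cpu_index_tuple_cost=0.005,
                        qual_op_cost=0.0025,
                        seq_page_cost=1.0, random_page_cost=4.0):
    """Run cost of a B-tree index scan, following the formulas above."""
    index_cpu = selectivity * n_index_tuples * (cpu_index_tuple_cost + qual_op_cost)
    table_cpu = selectivity * n_tuples * cpu_tuple_cost
    index_io = math.ceil(selectivity * n_index_pages) * random_page_cost
    # Worst case: every table page is read randomly.
    max_io = n_pages * random_page_cost
    # Best case: one random read, then sequential reads for the remaining pages.
    min_io = random_page_cost + (math.ceil(selectivity * n_pages) - 1) * seq_page_cost
    # Squared correlation interpolates between the worst and best case.
    table_io = max_io + index_correlation ** 2 * (min_io - max_io)
    return index_cpu + table_cpu + index_io + table_io

# Assumed sizes: 10000 tuples, 45 table pages, 30 index pages, selectivity 0.024.
sizes = dict(selectivity=0.024, n_index_tuples=10000, n_index_pages=30,
             n_tuples=10000, n_pages=45)
for corr in (1.0, 0.5, 0.0):
    print(corr, round(index_scan_run_cost(index_correlation=corr, **sizes), 2))
```

With a correlation of 1 the table I/O term collapses to the minimum cost; with a correlation of 0 it is the full maximum cost, which dominates everything else in the sum.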

##### Calculate

In [None]:
# Index statistics (pages, tuples) used by the formula, then the planner's estimate
psql -c "SELECT relpages, reltuples FROM pg_class WHERE relname = 'tbl_data_idx';"
psql -c "EXPLAIN SELECT id, data FROM tbl WHERE data < 240;"