<div style="width:100%; background-color: #000041"><a target="_blank" href="http://university.yugabyte.com"><img src="assets/YBU_Logo.png" /></a></div><br>

> **YugabyteDB YSQL Development**
>
> Enroll for free at [Yugabyte University](https://university.yugabyte.com/courses/yugabytedb-ysql-development).
>
<br>
This notebook file is:

`04_Anatomy_of_an_Index.ipynb`


# Anatomy of an Index
In the `03_Demystifying_table_sharding_tablets_and_data_distribution.ipynb` notebook, you discovered how YugabyteDB stores data in tablets. You also learned that YugabyteDB and YSQL support two sharding strategies for tables: hash and range sharding.

An index in YugabyteDB is also distributed. Just like a table, YugabyteDB stores and distributes the data for an index in one or more tablets. An index can employ either a hash or range sharding strategy.

> Note
> 
> In this regard, a tablet represents a shard of data which contains a set of rows for a logical entity. Each tablet is a customized RocksDB instance. A tablet leader has a peer group known as tablet followers, and this group of tablet peers exists as a Raft consensus group. YugabyteDB calls this distributed document store, DocDB.

In this notebook, using Explain Plans, built-in functions, and custom utilities for YB-TServer metrics, you will learn how YugabyteDB stores data for an index in one or more tablets. You will also learn how YugabyteDB reads tablet data for an index during query execution.


## 🛠️ Requirements
Here are the requirements for this notebook:
- ✅ Create the notebook variables in `01_Lab_Setup.ipynb`, which you previously did
- ✅ Create the `db_ybu` database, which you previously did
- ☑️ Import the notebook variables, *which you must do next*
- ☑️ Connect to the `db_ybu` database, *which you must do next*
- ☑️ Complete the following sections
  -  Create a secondary index using range sharding
  -  Create a secondary index using hash sharding
  -  Review of secondary index sharding
  -  Expression index
  -  Expression with include index
  -  Partial index
  -  Hints and other features



### Select your notebook kernel
- In the Notebook toolbar, click **Select Kernel**.
<br>
<img width=50% src="assets/01_01_Select_Kernel_Toolbar.png" />

- Next, in the dropdown, select **Python 3.12** or higher.
<br>
<img width=50% src="assets/01_02_Select_Kernel_Dropdown.png" />

That's it!

## ⛑️ Getting help
The best way to get help from the Yugabyte University team is to post your question on YugabyteDB Community Slack in the #training or #yb-university channels. To sign up, visit [https://communityinviter.com/apps/yugabyte-db/register](https://communityinviter.com/apps/yugabyte-db/register).


## 👣 Setup steps
Here are the steps to set up this lab:
- Import the notebook variables
- Connect to `db_ybu` database
- Load the SQL Magic extension for the connection
- Create the prepared statements

### 👇 Import the notebook variables and style the notebook

> 👉 **IMPORTANT!** 👈
> 
> Do **NOT** skip running the following cells. 
>
> The following Python cell reads the stored variables created in the `01_Lab_Setup.ipynb` notebook. To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell.  The cell after that styles the notebook.

👇 👇 👇 

In [None]:
# Use %store -r to read 01_Lab_Setup variables
%store -r
%config SqlMagic.autopandas=False
# %config SqlMagic.named_parameters=True
%config SqlMagic.displaylimit=30
%config SqlMagic.displaycon=False

**Update the styling of the notebook**.

In [None]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()

## Connect to the `db_ybu` database
Run all the cells in this section:
- Connect using Python and PostgreSQL driver
- Load the SQL magic extension
- Create the prepared statements
- View the listener
- View the DDL for tbl_cities

In [None]:
# Connect using Python 3.12.1+
import psycopg2
import sqlalchemy as alc
from sqlalchemy import create_engine

db_host = NB_HOST_IPv4_01
db_name = NB_DB_NAME

connection_str = 'postgresql+psycopg2://yugabyte@'+db_host+':5433/'+db_name

engine = create_engine(connection_str)

### Load the SQL magic extension

In [None]:
%reload_ext sql

# SQL magic for python connection string
%sql {connection_str}

### Create the prepared statements

> 👉 **IMPORTANT!** 👈
> 
>   
> In order to create the prepared statements for the SQL magic connection, you must run the following cell!!!
> 
> Do not skip this step.
> 

In [None]:
#%% python, but prepared statements as sql magic

if (NB_YB_MASTER_HOST_GITPOD_URL is None):
    stmt = %sql select fn_yb_create_stmts()
else:
    stmt = %sql select fn_yb_create_stmts(:NB_YB_MASTER_HOST_GITPOD_URL )
print(stmt)

Confirm that the following query returns a count of 3 (for three prepared statements).

In [None]:
%%sql 
select count(*) from pg_prepared_statements where 1=1 and name in ('stmt_util_metrics_snap_tablet','stmt_util_metrics_snap_table','stmt_util_metrics_snap_reset')

### View the listener address
Run the following cell to view the host for your client connection

In [None]:
%%sql /* confirm listener */
show listen_addresses;

### View DDL for tbl_cities

Run the following cell to execute the describe table command, `\d+`.

In [None]:
%%bash -s "$NB_YB_PATH_BIN" "$NB_DB_NAME"  # \d+
YB_PATH_BIN=${1}
DB_NAME=${2}

cd $YB_PATH_BIN

./ysqlsh -d ${DB_NAME} -c "\d+ tbl_cities"

# ./ysqlsh -d ${DB_NAME} -c "\d+ idx_cities_city_name_range"

> Need help?
> 
> If you can't find tbl_cities, please go back to `01_Lab_Setup.ipynb`.

---
## q1 | Create a secondary index using range sharding
When you create an index for a table with a sort order (`asc` or `desc`), YugabyteDB will create the index using a range sharding strategy.
  

In [None]:
%%sql
drop index if exists idx_cities_city_name_range;
drop index if exists idx_cities_city_name_hash;

select pg_sleep(1);

create index idx_cities_city_name_range on tbl_cities (city_name asc);

Review the DDL for the index in the following cell:

In [None]:
%%sql
select pg_get_indexdef('idx_cities_city_name_range':: regclass);


Alternatively, use the `\d+` command:

In [None]:
%%bash -s "$NB_YB_PATH_BIN" "$NB_DB_NAME"  # \d+
YB_PATH_BIN=${1}
DB_NAME=${2}

cd $YB_PATH_BIN

./ysqlsh -d ${DB_NAME} -c "\d+ idx_cities_city_name_range"

`USING lsm` indicates that the index is of the type Log-Structured Merge tree (LSM). The `ASC` sort order in the index definition indicates that the index employs range sharding. An LSM tree is not a single tree structure, but rather a design that converts discrete random write requests into batched sequential write requests. To improve write performance for the LSM tree, RocksDB utilizes a Write-Ahead Log (WAL) and a memtable (a skiplist that lives in memory). The sorted contents of the memtable persist to disk as Sorted String Table (SSTable or SST) files.

Both hash sharding and range sharding use LSM. The difference is that the LSM tree sorts the hash-encoded values for hash sharding and the specified key values for range sharding.
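
The write path described above can be sketched with a toy model. This is a minimal illustration, not how RocksDB or DocDB is implemented: the `ToyLSM` class and its `memtable_limit` parameter are hypothetical, a plain dictionary stands in for the skiplist, and the WAL is omitted.

```python
import bisect


class ToyLSM:
    """Toy LSM store: an in-memory memtable plus immutable sorted runs (SSTs)."""

    def __init__(self, memtable_limit=3):
        self.memtable = {}      # real engines use a skiplist; a dict is enough here
        self.sst_files = []     # each SST is a sorted list of (key, value); newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch memory until the memtable is full.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # One sequential write: the whole memtable becomes a sorted, immutable run.
        if self.memtable:
            self.sst_files.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check the memtable first, then the SSTs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sst in reversed(self.sst_files):
            keys = [k for k, _ in sst]
            pos = bisect.bisect_left(keys, key)
            if pos < len(keys) and keys[pos] == key:
                return sst[pos][1]
        return None


lsm = ToyLSM(memtable_limit=3)
for name in ["Oakland", "Alameda", "Berkeley", "Alamo"]:
    lsm.put(name, f"row-for-{name}")

print(lsm.sst_files)   # one flushed run, sorted by key
print(lsm.memtable)    # the fourth write is still in memory
```

The first three writes arrive in random key order but leave memory as a single sorted run, which is the property that lets an index seek to a key and then read forward.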

> Important!
> 
> In Data Definition Language statements, YugabyteDB will interpret the `BTREE` keyword as `LSM`. YugabyteDB does not support `BTREE` indexes.

### View the index details in the YB-Master web UI
You can view the details of the `idx_cities_city_name_range` index in the YB-Master web UI. Run the cell below and open the URL in your web browser.

In [None]:
#%% python, but prepared statements as sql magic
THIS_INDEX_NAME = 'idx_cities_city_name_range'
THIS_SCHEMA_NAME = 'public'
DB_NAME = NB_DB_NAME

## Comment out if local
view_url = %sql select fn_get_table_id_url(:NB_YB_MASTER_HOST_GITPOD_URL,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_url 

## Uncomment if local
# view_url = %sql select fn_get_table_id_url(:NB_HOST_IPv4_01,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_url

# get for ResultSet dict the key value as a string
s = str(view_url.dict().get('view_url'))
url = s.replace("('", "").replace("',)", "")
print(url)

#### Review the Column section
The column section shows details about each column in the index. Here is the section for  `idx_cities_city_name_range`:


| Column | ID	| Type |
|--------|------|------|
| city_name    | 0	| string NOT NULL NOT A PARTITION KEY | 
| ybidxbasectid	   | 1	| binary NOT NULL NOT A PARTITION KEY | 


<br/>

> Important!
>  
>  YugabyteDB creates an internal, hidden column, `ybidxbasectid`, for the indexed row. `ybidxbasectid` is similar to the internal, hidden column, `ybctid`, for a row of a table. Both `ybctid` and  `ybidxbasectid` are virtual columns that represent the
>  DocDB-encoded key for the tuple. 
> 
> Using  `\d` or `\d+` will not show the `ybidxbasectid` column. It is also not possible to query the `ybidxbasectid` value.

#### Review the Tablet section
The Tablet section shows the details for the existing tablets. Here is the section for  `idx_cities_city_name_range`: 

| Tablet ID |	Partition	| SplitDepth	| State	| Hidden	| Message	| RaftConfig|
|--|--|--|--|--|--|--|
| some_uuid_1<br>`1e2c3ef228534d3cbbf59c9fa6968d88	` |	`range: [<start>, <end>)` |	0	| Running|	false| Tablet reported with an active leader	|<li>FOLLOWER: 127.0.0.1 <li>FOLLOWER: 127.0.0.3<li>LEADER: 127.0.0.1 |

YugabyteDB will automatically split this tablet based on the size of the table on disk. The following global flags determine this behavior:

```
--tablet_force_split_threshold_bytes=107374182400 --> 102400 MB
--tablet_split_high_phase_shard_count_per_node=24
--tablet_split_high_phase_size_threshold_bytes=10737418240 --> 10240 MB
--tablet_split_low_phase_shard_count_per_node=8
--tablet_split_low_phase_size_threshold_bytes=536870912 --> 512 MB
--tablet_split_size_threshold_bytes=0
```

The low-phase flags set the threshold for the initial splits of a tablet. As the number of tablets per node grows, the size threshold increases from 512 MB to 10 GB.
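
As a rough sketch of how these flags might combine, the following helper picks a size threshold from the number of shards per node. The flag values come from the list above; the phase-selection logic and the `split_threshold_bytes` function are assumptions based on the flag names, not YugabyteDB source code.

```python
MB = 1024 * 1024

# Flag values copied from the list above.
LOW_PHASE_SHARD_COUNT_PER_NODE = 8
LOW_PHASE_SIZE_THRESHOLD_BYTES = 536870912
HIGH_PHASE_SHARD_COUNT_PER_NODE = 24
HIGH_PHASE_SIZE_THRESHOLD_BYTES = 10737418240
FORCE_SPLIT_THRESHOLD_BYTES = 107374182400


def split_threshold_bytes(shards_per_node):
    """Return the size threshold assumed to apply for a given shard count per node."""
    if shards_per_node < LOW_PHASE_SHARD_COUNT_PER_NODE:
        return LOW_PHASE_SIZE_THRESHOLD_BYTES
    if shards_per_node < HIGH_PHASE_SHARD_COUNT_PER_NODE:
        return HIGH_PHASE_SIZE_THRESHOLD_BYTES
    return FORCE_SPLIT_THRESHOLD_BYTES


for shards in (1, 8, 24):
    print(shards, "shards per node ->", split_threshold_bytes(shards) // MB, "MB")
```

Dividing each byte value by 1024 * 1024 is also a quick way to check the MB figures for the flags.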

### q1a | Range index in equality predicate
To generate an Explain Plan, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 , *
 --, city_id
 --, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q1a | Explain Plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Index Scan using idx_cities_city_name_range on tbl_cities (cost=0.00..5.22 rows=10 width=2170) (actual time=3.611..3.615 rows=2 loops=1)
  Index Cond: ((city_name)::text = 'Alameda'::text)
  Storage Table Read Requests: 1
  Storage Table Read Execution Time: 1.756 ms
  Storage Index Read Requests: 1
  Storage Index Read Execution Time: 1.706 ms
Planning Time: 0.109 ms
Execution Time: 3.659 ms
Storage Read Requests: 2
Storage Read Execution Time: 3.462 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.782 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 4.244 ms
Peak Memory Usage: 8 kB
```
</details>

The Explain Plan shows that this query uses the index and reads 2 rows.
- `Index Scan using idx_cities_city_name_range on public.tbl_cities (actual time=6.392..6.397 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name)::text = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q1a | Metrics (above ^^)


In the initial query, the `Index Scan` accesses the index tablet:

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1| 	2 |

There are two results for the predicate expression, resulting in one seek of the index tablet offset, and then two reads.
Because the query returns all columns, the query also reads from one of the tablet leaders for `tbl_cities`, a table with hash sharding.


| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu tbl_cities link_table_id tablet_id_unq_1	| 2 | 30 |


The Metrics report reveals that even though the query uses the range index, the query must still seek 2 offsets from one of the tablets for `tbl_cities`.
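
The two-hop pattern of an `Index Scan` can be sketched in plain Python. This is a toy model under stated assumptions: the sample data is hypothetical (not the real contents of `tbl_cities`), the `index_scan` function is illustrative, and its counters only loosely mirror the seek and next metrics rather than reproducing RocksDB's accounting.

```python
import bisect

# Hypothetical sample data, not the real contents of tbl_cities.
# Index entries are (city_name, city_id) pairs kept in city_name order.
index_entries = sorted([
    ("Alameda", 1), ("Alameda", 2), ("Alamo", 3), ("Berkeley", 4), ("Oakland", 5),
])

# Table rows are keyed by the primary key (city_id).
table_rows = {cid: {"city_id": cid, "city_name": name} for name, cid in index_entries}


def index_scan(city_name):
    """Return (rows, seeks, nexts) for an equality lookup through the index."""
    seeks, nexts = 1, 0
    # One seek positions the cursor on the first candidate entry.
    pos = bisect.bisect_left(index_entries, (city_name,))
    city_ids = []
    while pos < len(index_entries) and index_entries[pos][0] == city_name:
        city_ids.append(index_entries[pos][1])  # read the entry, then step forward
        pos += 1
        nexts += 1
    # Second hop: fetch the full rows from the table by primary key.
    rows = [table_rows[cid] for cid in city_ids]
    return rows, seeks, nexts


rows, seeks, nexts = index_scan("Alameda")
print(len(rows), "rows,", seeks, "seek,", nexts, "nexts")
```

The second hop is the part that a covering index avoids.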



###  q1b | Covering index with an equality predicate
When an index contains all the columns that a query requires, the index "covers the query" and is known as a "covering index".

To view the Explain Plan, run the following cell: 

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 --, *
-- , city_id
 , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q1b | Explain plan (above ^^)
<details>
      <summary>
      View Example Output
      </summary>

```sql
Index Only Scan using idx_cities_city_name_range on tbl_cities (cost=0.00..5.12 rows=10 width=548) (actual time=3.043..3.046 rows=2 loops=1)
  Index Cond: (city_name = 'Alameda'::text)
  Heap Fetches: 0
  Storage Index Read Requests: 1
  Storage Index Read Execution Time: 2.956 ms
Planning Time: 0.105 ms
Execution Time: 3.077 ms
Storage Read Requests: 1
Storage Read Execution Time: 2.956 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.864 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 3.820 ms
Peak Memory Usage: 8 kB
```
</details>

The Explain Plan shows that this query uses only the index (`Index Only Scan`) and reads 2 rows.
- `Index Only Scan using idx_cities_city_name_range on public.tbl_cities (actual time=4.964..4.968 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name)::text = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q1b | Metrics (above ^^)

In this query, the `Index Only Scan` accesses the index tablet:

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1| 	2 |

There are two results for the predicate expression, resulting in one seek of the index tablet offset, and then two reads.
Because the query only returns the city name, the index "covers" the query and is a "covering index". There is no need for the query to access the tablets for `tbl_cities`.
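
The covering rule reduces to a subset check. The helper below is a hypothetical illustration, not a YugabyteDB API: an index covers a query when every column in the select list and the predicate is stored in the index.

```python
def index_covers_query(index_columns, select_columns, predicate_columns):
    """True when every column the query touches is stored in the index."""
    needed = set(select_columns) | set(predicate_columns)
    return needed.issubset(index_columns)


# q1b: select city_name ... where city_name = 'Alameda'
print(index_covers_query(["city_name"], ["city_name"], ["city_name"]))

# q1a-style query that also needs other columns of tbl_cities
print(index_covers_query(["city_name"], ["city_name", "country_code"], ["city_name"]))
```

The first call returns `True` and the second returns `False`, which corresponds to the difference between an `Index Only Scan` and an `Index Scan` in the plans above.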


#### q1b | Experiment

Question: What happens when the query includes the PK column, `city_id`,  in the select command? 

Answer:
  - The Explain Plan becomes an `Index Scan`
  - The query reads from a tablet for `tbl_cities`
  - The reason? The tuple for the index, `ybidxbasectid`, encodes the PK pointer. YugabyteDB at this time is unable to use this PK encoding in results. This is a known issue.

Question: What happens when the query includes a PK equality clause in the predicate?

Answer:
- The query will use the PK index, `tbl_cities_pkey`, and not the secondary index.



### q1c | Range index with a range predicate
Run the following cell to view the query results:

In [None]:
%%sql
-- to see results
select '' _
   , *
-- , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name BETWEEN 'Alameda' AND  'Alamo'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- and yb_hash_code(city_name) = yb_hash_code(city_name::text)
limit 30
;

To view the Explain Plan, run the following cell:

In [None]:
%%sql

execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select ''_ 
    , *
 -- , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name BETWEEN 'Alameda' AND  'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- and yb_hash_code(city_name) = yb_hash_code(city_name::text)
-- limit 100
;

#### q1c | Explain plan (above ^^)
<details>
      <summary>
      View Example Output
      </summary>

```sql
Index Scan using idx_cities_city_name_range on tbl_cities (cost=0.00..5.25 rows=10 width=2170) (actual time=19.205..19.257 rows=16 loops=1)
  Index Cond: (((city_name)::text >= 'Alameda'::text) AND ((city_name)::text <= 'Alamo'::text))
  Storage Table Read Requests: 1
  Storage Table Read Execution Time: 5.441 ms
  Storage Index Read Requests: 1
  Storage Index Read Execution Time: 12.614 ms
Planning Time: 0.127 ms
Execution Time: 19.333 ms
Storage Read Requests: 2
Storage Read Execution Time: 18.055 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.801 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 18.856 ms
Peak Memory Usage: 8 kB
```
</details>

The Explain Plan shows that this query uses an `Index Scan` on the range index and reads 16 rows.
- `Index Scan using idx_cities_city_name_range on public.tbl_cities (actual time=3.192..3.239 rows=16 loops=1)`

The Index Condition reflects the query predicate range.

- `Index Cond: (((tbl_cities.city_name)::text >= 'Alameda'::text) AND ((tbl_cities.city_name)::text <= 'Alamo'::text))`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q1c | Metrics (above ^^)

In the initial query, the `Index Scan` accesses the index tablet:

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1 | 	0 |

Because the query returns all the columns, the query accesses data on the three tablet leaders for `tbl_cities`. Retrieving the column data for the query requires additional seeks and reads on each tablet.


### q1d | Covering index with a range predicate
To view the Explain Plan, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 --, *
 -- , city_id
 , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name BETWEEN 'Alameda' AND 'Alamo'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q1d | Explain Plan (above ^^)
<details>
      <summary>
      View Example Output
      </summary>

```sql
Index Only Scan using idx_cities_city_name_range on tbl_cities (cost=0.00..5.15 rows=10 width=548) (actual time=1.777..1.777 rows=0 loops=1)
  Index Cond: ((city_name >= 'alameda'::text) AND (city_name <= 'alamo'::text))
  Heap Fetches: 0
  Storage Index Read Requests: 1
  Storage Index Read Execution Time: 1.685 ms
Planning Time: 0.105 ms
Execution Time: 1.823 ms
Storage Read Requests: 1
Storage Read Execution Time: 1.685 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.956 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 2.641 ms
Peak Memory Usage: 8 kB
```
</details>

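The Explain Plan shows that this query uses only the range index (`Index Only Scan`). With the predicate values `'Alameda'` and `'Alamo'`, the query retrieves 16 rows. String comparison is case-sensitive, so lowercase values such as `'alameda'` and `'alamo'` match no rows, which is why the example output above reports `rows=0`.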
- `Index Only Scan using idx_cities_city_name_range on public.tbl_cities (actual time=0.883..0.891 rows=16 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name >= 'Alameda'::text) AND (tbl_cities.city_name <= 'Alamo'::text))`

To view the tablet metrics, run the following cell:

In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q1d | Metrics (above ^^)

The range index covers the query and serves as a covering index. An `Index Only Scan` in the Explain Plan indicates that the index covers the query.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1 | 	2 |



---
## q2 | Create a secondary index using hash sharding
In this series of queries, you will explore using a hash index. 

To begin, create an index with hash sharding. In the DDL for the index, specify the `hash` keyword as follows:

In [None]:
%%sql 
drop index if exists idx_cities_city_name_range;

drop index if exists idx_cities_city_name_hash;

select pg_sleep(1);

create index idx_cities_city_name_hash on tbl_cities (city_name hash);

To view the DDL, run the following cell:

In [None]:
%%sql
select pg_get_indexdef('idx_cities_city_name_hash':: regclass);

### View the index details in the YB-Master web UI
You can view the details of the `idx_cities_city_name_hash` index in the YB-Master web UI. Run the cell below and open the URL in your web browser.

In [None]:
#%% python, but prepared statements as sql magic
THIS_INDEX_NAME = 'idx_cities_city_name_hash'
THIS_SCHEMA_NAME = 'public'
DB_NAME = NB_DB_NAME

## Comment out if local
view_url = %sql select fn_get_table_id_url(:NB_YB_MASTER_HOST_GITPOD_URL,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_url 

## Uncomment if local
# view_url = %sql select fn_get_table_id_url(:NB_HOST_IPv4_01,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_url

# get for ResultSet dict the key value as a string
s = str(view_url.dict().get('view_url'))
url = s.replace("('", "").replace("',)", "")
print(url)

#### Review the Column section
The column section shows details about each column in the index. Here is the section for  `idx_cities_city_name_hash`:


| Column | ID	| Type |
|--------|------|------|
| city_name    | 0	| string NOT NULL PARTITION KEY| 
| ybidxbasectid	   | 1	| binary NOT NULL NOT A PARTITION KEY | 


<br/>

> Important!
>  
>  YugabyteDB creates an internal, hidden column, `ybidxbasectid`, for the indexed row. `ybidxbasectid` is similar to the internal, hidden column, `ybctid`, for a row of a table. Both `ybctid` and  `ybidxbasectid` are virtual columns that represent the
>  DocDB-encoded key for the tuple. 
> 
> Using  `\d` or `\d+` will not show the `ybidxbasectid` column. It is also not possible to query the `ybidxbasectid` value.

##### Partition Key
YugabyteDB uses the shard key (shown as `PARTITION KEY`) to distribute the data among the tablet leaders for the index. 

With consistent hash sharding, a partitioning algorithm distributes data evenly and randomly across shards. By computing a consistent hash on the partition key (or keys) of a given row, YugabyteDB knows where to insert the row among the tablet leaders.



#### Review the Tablet section
For the given index, the Tablet section shows the details for the existing tablets. Of particular interest are the number of tablet leaders and the partition strategy. Here is an example of the Tablet section for `idx_cities_city_name_hash`:

| Tablet ID |	Partition	| SplitDepth	| State	| Hidden	| Message	| RaftConfig|
|--|--|--|--|--|--|--|
| some_uuid_1 |	`hash_split: [0x5555, 0xAAAA)` |	0	| Running|	false| Tablet reported with an active leader	|<li>FOLLOWER: 127.0.0.1 <li>FOLLOWER: 127.0.0.3<li>LEADER: 127.0.0.2  |
| some_uuid_2	| `hash_split: [0xAAAA, 0xFFFF)`	| 0 |  Running |false |	Tablet reported with an active leader |<li>FOLLOWER: 127.0.0.1 <li>LEADER: 127.0.0.3 <li>FOLLOWER: 127.0.0.2 |
| some_uuid_3 <br>(tablet leader where the row lives) |	`hash_split: [0x0000, 0x5555)` |	0 |	Running | 	false	| Tablet reported with an active leader |	<li>LEADER: 127.0.0.1<li>FOLLOWER: 127.0.0.3<li>FOLLOWER: 127.0.0.2 |
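
The mapping from a key to a tablet can be sketched as a range lookup over the hash space. This is a simplified illustration: the partition bounds come from the example Tablet section above, while `toy_hash_code` is a hypothetical stand-in built on CRC32 and is not the algorithm behind `yb_hash_code()`.

```python
import zlib

# Partition bounds from the example Tablet section, as [start, end) per tablet.
TABLETS = [
    ("some_uuid_3", 0x0000, 0x5555),
    ("some_uuid_1", 0x5555, 0xAAAA),
    ("some_uuid_2", 0xAAAA, 0x10000),  # assumption: upper bound widened so 0xFFFF is included
]


def toy_hash_code(value):
    """Stand-in 16-bit hash; NOT the algorithm used by yb_hash_code()."""
    return zlib.crc32(value.encode("utf-8")) & 0xFFFF


def tablet_for(value):
    """Return (tablet_id, hash_code) for the tablet whose range contains the hash."""
    code = toy_hash_code(value)
    for tablet_id, start, end in TABLETS:
        if start <= code < end:
            return tablet_id, code
    raise ValueError(f"no tablet covers hash code {code:#06x}")


for name in ["Alameda", "Alamo", "Oakland"]:
    tablet_id, code = tablet_for(name)
    print(f"{name}: hash {code:#06x} -> {tablet_id}")
```

The same value always hashes to the same tablet, which is why an equality predicate needs only one tablet. Neighboring values such as `Alameda` and `Alamo` have unrelated hash codes, so a range predicate cannot be narrowed to a single tablet this way.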

### q2a | Hash index with an equality predicate
To view the Explain Plan for the query, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select 0
 , *
 -- , city_id
 -- , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2a | Explain Plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Index Scan using idx_cities_city_name_hash on tbl_cities (cost=0.00..5.22 rows=10 width=2142) (actual time=2.517..2.522 rows=2 loops=1)
  Index Cond: ((city_name)::text = 'Alameda'::text)
  Storage Table Read Requests: 1
  Storage Table Read Execution Time: 0.726 ms
  Storage Index Read Requests: 1
  Storage Index Read Execution Time: 0.447 ms
Planning Time: 2.953 ms
Execution Time: 2.570 ms
Storage Read Requests: 2
Storage Read Execution Time: 1.173 ms
Storage Write Requests: 0
Catalog Read Requests: 2
Catalog Read Execution Time: 2.314 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 3.487 ms
Peak Memory Usage: 8 kB
```
</details>

The Explain Plan shows that this query uses the hash index and reads 2 rows.
- `Index Scan using idx_cities_city_name_hash on public.tbl_cities (actual time=3.539..3.552 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name)::text = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q2a | Metrics (above ^^)

The query uses the hash index and accesses one of the tablet leaders for `idx_cities_city_name_hash`.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_hash link_table_id tablet_id_unq_1 leader	 | 1 | 	2 |

Because the query requires more than the columns in the index, the query accesses one of the tablet leaders for `tbl_cities`.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu tbl_cities link_table_id tablet_id_unq_1 leader	 | 2 | 	30 |

If the tablet leaders are spread evenly across the three nodes, there is roughly a one-in-three chance that the tablet leaders for the index and the table are on the same host, and a two-in-three chance that they are on different hosts. If you are running this notebook locally or in Gitpod, the host is the same machine with the YB-TServer processes running on different ports using host aliases for localhost (127.0.0.1) such as 127.0.0.2 and 127.0.0.3.


### q2b | Covering index with a hash index and equality predicate
Run the following cell to generate an Explain Plan:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 -- , *
 --, city_id
, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2b | Explain Plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Index Only Scan using idx_cities_city_name_hash on tbl_cities (cost=0.00..5.12 rows=10 width=548) (actual time=1.340..1.344 rows=2 loops=1)
  Index Cond: (city_name = 'Alameda'::text)
  Heap Fetches: 0
  Storage Index Read Requests: 1
  Storage Index Read Execution Time: 1.267 ms
Planning Time: 0.078 ms
Execution Time: 1.371 ms
Storage Read Requests: 1
Storage Read Execution Time: 1.267 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.810 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 2.077 ms
Peak Memory Usage: 8 kB
```
</details>

The Explain Plan shows that the index is a covering index and returns 2 rows.
- `Index Only Scan using idx_cities_city_name_hash on public.tbl_cities (actual time=0.805..0.808 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: (tbl_cities.city_name = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q2b | Metrics (above ^^)

As an Index Only Scan query, the query uses the hash index and accesses one of the tablet leaders for `idx_cities_city_name_hash` and reads 2 rows.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_hash link_table_id tablet_id_unq_1 leader	 | 1 | 	2 |


### q2c | Hash index with a range predicate
To view the Explain Plan, run the following:

In [None]:
%%sql
SET yb_enable_expression_pushdown=off;
SHOW yb_enable_expression_pushdown;

execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
    , *
-- , city_id
 --, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2c | Explain plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Seq Scan on tbl_cities (cost=0.00..105.00 rows=1000 width=2170) (actual time=57.441..543.533 rows=16 loops=1)
  Filter: (((city_name)::text >= 'Alameda'::text) AND ((city_name)::text <= 'Alamo'::text))
  Rows Removed by Filter: 148250
  Storage Table Read Requests: 146
  Storage Table Read Execution Time: 361.278 ms
Planning Time: 0.071 ms
Execution Time: 543.596 ms
Storage Read Requests: 146
Storage Read Execution Time: 361.278 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.709 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 361.987 ms
Peak Memory Usage: 8 kB
```
</details>


As expected, the optimizer generates an Explain Plan that does not use the hash index. Instead, the Explain Plan shows a full table scan of the table itself.
- `Seq Scan on tbl_cities (cost=0.00..105.00 rows=1000 width=2170) (actual time=57.441..543.533 rows=16 loops=1)`

The YB-TServer that serves the client connection removes the rows:
  - `Filter: (((tbl_cities.city_name)::text >= 'Alameda'::text) AND ((tbl_cities.city_name)::text <= 'Alamo'::text))`
  - `Rows Removed by Filter: 148250`

To view the Metrics report for the query, run the following cell:


In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q2c | Metrics (above ^^)
The `Seq Scan` requires that the query access all three tablets for `tbl_cities`, requiring 50 offset seeks and 600K reads per tablet.

However, this query does not use the **pushdown** optimization, because the cell sets `yb_enable_expression_pushdown` to `off`. With a pushdown, the query layer sends the filter expression to the tablet. The tablet (RocksDB) applies the filter, known as a `Remote Filter`, and then returns only the matching rows.

To see the improvement, run the same query with the pushdown optimization enabled.

### q2d | Hash index with a range predicate with pushdown
To view the Explain Plan, run the following:

In [None]:
%%sql
SET yb_enable_expression_pushdown=on;
SHOW yb_enable_expression_pushdown;

execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
    , *
-- , city_id
 --, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2d | Explain plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Seq Scan on tbl_cities (cost=0.00..105.00 rows=1000 width=2170) (actual time=270.262..270.312 rows=16 loops=1)
  Remote Filter: (((city_name)::text >= 'Alameda'::text) AND ((city_name)::text <= 'Alamo'::text))
  Storage Table Read Requests: 1
  Storage Table Read Execution Time: 270.131 ms
Planning Time: 0.070 ms
Execution Time: 270.371 ms
Storage Read Requests: 1
Storage Read Execution Time: 270.131 ms
Storage Write Requests: 0
Catalog Read Requests: 1
Catalog Read Execution Time: 0.783 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 270.914 ms
Peak Memory Usage: 8 kB
```

</details>

With the pushdown enabled, the tablet filters the rows before it returns them. The Explain Plan shows this as the `Remote Filter`.
  - `Remote Filter: (((city_name)::text >= 'Alameda'::text) AND ((city_name)::text <= 'Alamo'::text))`
  - In the example outputs, the scan completes in about 270 ms, compared to about 544 ms for the non-pushdown query.
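
To make the effect of a pushdown concrete, here is a minimal Python sketch. It is a toy model, not YugabyteDB code: the `Tablet` class, its methods, and the sample city names are all hypothetical. The sketch only counts how many rows leave the tablets when the filter runs on the query layer versus on the tablet.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Tablet:
    """Toy stand-in for a tablet (hypothetical, not a YugabyteDB API)."""
    rows: List[str] = field(default_factory=list)

    def scan(self) -> List[str]:
        # No pushdown: the tablet returns every row it stores.
        return list(self.rows)

    def scan_with_filter(self, predicate: Callable[[str], bool]) -> List[str]:
        # Pushdown: the tablet applies the predicate before returning rows.
        return [row for row in self.rows if predicate(row)]


def in_range(city_name: str) -> bool:
    # Mirrors: city_name BETWEEN 'Alameda' AND 'Alamo'
    # (Python compares by code point; a database collation may differ.)
    return "Alameda" <= city_name <= "Alamo"


def query_without_pushdown(tablets: List[Tablet]) -> Tuple[List[str], int]:
    shipped = [row for t in tablets for row in t.scan()]
    matches = [row for row in shipped if in_range(row)]  # filter after transfer
    return matches, len(shipped)


def query_with_pushdown(tablets: List[Tablet]) -> Tuple[List[str], int]:
    shipped = [row for t in tablets for row in t.scan_with_filter(in_range)]
    return shipped, len(shipped)  # only matching rows were transferred


tablets = [
    Tablet(["Alameda", "Boston", "Chicago"]),
    Tablet(["Alamo", "Denver", "Albany"]),
    Tablet(["Alamitos", "Zurich", "Alamogordo"]),
]

matches_a, shipped_a = query_without_pushdown(tablets)
matches_b, shipped_b = query_with_pushdown(tablets)
print(f"without pushdown: {shipped_a} rows shipped, {len(matches_a)} kept")
print(f"with pushdown:    {shipped_b} rows shipped, {len(matches_b)} kept")
```

Both variants return the same rows. In this model, the only difference is how many rows cross the network, which is the cost that the `Remote Filter` avoids.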


To view the Metrics report for the query, run the following cell:


In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q2d | Metrics (above ^^)
The `Seq Scan` requires that the query access all three tablets for `tbl_cities`, requiring 1 offset seek per tablet and over 600K reads per tablet. 



---
## Review of secondary index sharding

A primary key (PK) in the predicate takes precedence, and the query does NOT use the secondary index.
- You can't drop or alter a PK in place. To change it, you need to rename the table, create a new table (or similar), and use `select into` to copy the data.
- A PK can be composite (hash and range ordering), and this behavior can differ in that case [not covered here].

Secondary index as Range 
- Equality predicates perform well, but after range index tablets split, a range index may not perform as well as a hash index for equality
- Ideal for range and comparison predicates
- For both equality and comparison predicates:
  - When the index column is the sole column in the select list and the sole column in the predicate (no PK), the range index is the covering index, which results in an `Index Only Scan`

Secondary index as Hash
- Ideal for equality predicates
  - When the index column is the sole column in the select list and the sole column in the predicate (no PK), the hash index is the covering index, which results in an `Index Only Scan`
- A range or comparison predicate ignores the hash index and results in a costly `Seq Scan`

YugabyteDB vs PostgreSQL
- YugabyteDB only updates the indexes on columns whose key values or subkey values change.
- An `Index Only Scan` reads from the index tablet.
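
The following Python sketch illustrates why a hash index suits equality predicates while a range index suits range predicates. It is a simplified model under stated assumptions: `toy_hash_code` is a stand-in built on MD5 and is NOT YugabyteDB's `yb_hash_code()`, and the three-tablet layout and city names are made up for illustration.

```python
import bisect
import hashlib

NUM_TABLETS = 3  # assumed tablet count for this sketch


def toy_hash_code(key: str) -> int:
    """Deterministic 16-bit hash; a stand-in, NOT yb_hash_code()."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")  # value in 0..65535


def hash_tablet(key: str) -> int:
    # Each tablet owns an equal slice of the 16-bit hash space.
    return toy_hash_code(key) * NUM_TABLETS // 65536


cities = sorted(["Alameda", "Alamo", "Albany", "Boston", "Chicago", "Denver"])

# Hash index, equality predicate: one hash value, so exactly one target tablet.
equality_targets = {hash_tablet("Alameda")}

# Hash index, range predicate: hashing destroys key order, so neighboring keys
# can live in any tablet and every tablet is a candidate.
range_targets_hash = set(range(NUM_TABLETS))

# Range index, range predicate: entries are stored in sorted order, so the
# predicate maps to one contiguous slice of the index.
lo = bisect.bisect_left(cities, "Alameda")
hi = bisect.bisect_right(cities, "Alamo")
range_matches = cities[lo:hi]

print("hash tablets per city:", {c: hash_tablet(c) for c in cities})
print("range slice:", range_matches)
```

The printed hash placement shows no relationship between alphabetical order and tablet number, which is why a `BETWEEN` predicate cannot narrow the search on a hash index.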

---
## q3 | Expression index
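An expression index stores the result of an expression instead of a plain column value. The following index uses the expression `UPPER(COALESCE(city_name, city_name_alt))` with range sharding (`asc`).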

In [None]:
%%sql
drop index if exists idx_cities_city_name_exp;

create index idx_cities_city_name_exp on tbl_cities (UPPER (COALESCE(city_name, city_name_alt) ) asc);

### q3a | Expression query

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 -- , *
, city_id
, city_name
, city_name_alt
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and UPPER (COALESCE(city_name, city_name_alt) ) like 'A%'
 -- and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
limit 100
;

#### q3a | Explain plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Limit (cost=0.00..5.30 rows=10 width=1068) (actual time=7.750..7.970 rows=100 loops=1)
  -> Index Scan using idx_cities_city_name_exp on tbl_cities (cost=0.00..5.30 rows=10 width=1068) (actual time=7.748..7.960 rows=100 loops=1)
        Index Cond: ((upper((COALESCE(city_name, city_name_alt))::text) >= 'A'::text) AND (upper((COALESCE(city_name, city_name_alt))::text) < 'B'::text))
        Filter: (upper((COALESCE(city_name, city_name_alt))::text) ~~ 'A%'::text)
        Storage Table Read Requests: 1
        Storage Table Read Execution Time: 4.327 ms
        Storage Index Read Requests: 1
        Storage Index Read Execution Time: 1.533 ms
Planning Time: 10.659 ms
Execution Time: 8.055 ms
Storage Read Requests: 2
Storage Read Execution Time: 5.860 ms
Storage Write Requests: 0
Catalog Read Requests: 16
Catalog Read Execution Time: 12.911 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 18.770 ms
Peak Memory Usage: 24 kB
```

</details>

Some observations about this query plan:
- Because the query uses the index, there is no `Seq Scan` of the table.
- The index utilizes range sharding
- The `Index Cond` shows a range boundary for the `like` predicate, `>= 'A' AND < 'B'`
- The `Filter` line does not report any removed rows, which suggests that it removes nothing and simply rechecks the `like` condition that the `Index Cond` already enforces.
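
The way a `like 'A%'` predicate turns into the range `>= 'A' AND < 'B'` can be sketched in a few lines of Python. This is an illustrative simplification, not the planner's actual code: the `prefix_to_range` function is hypothetical, it handles only a literal prefix followed by one trailing `%`, and it assumes byte-wise string ordering.

```python
from typing import Tuple


def prefix_to_range(pattern: str) -> Tuple[str, str]:
    """Return (lower, upper) so that lower <= s < upper for any s matching the pattern.

    Hypothetical helper: supports only 'prefix%' patterns.
    """
    if len(pattern) < 2 or not pattern.endswith("%"):
        raise ValueError("expected a pattern such as 'A%'")
    prefix = pattern[:-1]
    if "%" in prefix or "_" in prefix:
        raise ValueError("wildcards inside the prefix are not handled in this sketch")
    # The exclusive upper bound is the prefix with its last character incremented.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


lower, upper = prefix_to_range("A%")
print(lower, upper)

# Values as the expression index would store them: UPPER(COALESCE(...)).
indexed_values = ["A", "ALAMEDA", "AUSTIN", "BOSTON", "ZURICH"]
in_range = [v for v in indexed_values if lower <= v < upper]
print(in_range)
```

Because the range is computed from the prefix alone, the index can seek directly to the first entry at or after the lower bound and stop at the upper bound.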




In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

####  q3a | Metrics (above ^^)

A couple of observations about the metrics snapshot:
- There are two offset seeks into the index tablet, with about 2K reads 
- All other tablets are also read, as expected, because the index does not contain the selected table columns.

---
## q4 | Expression with include index
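An expression index can also carry extra columns in an `include` clause so that the index covers the query. The following index adds `city_id`, `city_name`, and `city_name_alt` to the expression index.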

In [None]:
%%sql
drop index if exists idx_cities_city_name_covering;

create index idx_cities_city_name_covering on tbl_cities (UPPER (COALESCE(city_name, city_name_alt) ) asc) include (city_id, city_name, city_name_alt);

### q4a | Query using expression and include columns

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 -- , *
, city_id
, city_name
, city_name_alt
-- , state_id
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and UPPER (COALESCE(city_name, city_name_alt) ) like 'A%'
 -- and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
limit 100
;

#### q4a | Explain plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Limit (cost=0.00..5.20 rows=10 width=1068) (actual time=2.682..2.733 rows=100 loops=1)
  -> Index Only Scan using idx_cities_city_name_covering on tbl_cities (cost=0.00..5.20 rows=10 width=1068) (actual time=2.681..2.725 rows=100 loops=1)
        Index Cond: (((upper((COALESCE(city_name, city_name_alt))::text)) >= 'A'::text) AND ((upper((COALESCE(city_name, city_name_alt))::text)) < 'B'::text))
        Filter: ((upper((COALESCE(city_name, city_name_alt))::text)) ~~ 'A%'::text)
        Heap Fetches: 0
        Storage Index Read Requests: 1
        Storage Index Read Execution Time: 2.558 ms
Planning Time: 3.608 ms
Execution Time: 4.732 ms
Storage Read Requests: 1
Storage Read Execution Time: 2.558 ms
Storage Write Requests: 0
Catalog Read Requests: 6
Catalog Read Execution Time: 5.345 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 7.903 ms
Peak Memory Usage: 24 kB
```

</details>

Some observations about this query plan:
- The index utilizes range sharding
- The plan utilizes a covering index, as `Index Only Scan` indicates
- The `Index Cond` shows a range boundary for the `like` predicate, `>= 'A' AND < 'B'`
- The `Filter` line does not report any removed rows, which suggests that it removes nothing and simply rechecks the `like` condition that the `Index Cond` already enforces.
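
As a rough mental model, an index covers a query when it stores every value the query needs. The Python sketch below is a simplified assumption about that decision, not the planner's real logic: `choose_scan` and the `EXPR` label are hypothetical names used only for illustration.

```python
# Hypothetical label for the indexed expression; any consistent string works.
EXPR = "upper(coalesce(city_name, city_name_alt))"


def choose_scan(index_provides: frozenset, query_needs: frozenset) -> str:
    """Return the scan type this simplified model would pick."""
    if query_needs <= index_provides:
        return "Index Only Scan"  # every needed value is stored in the index
    return "Index Scan"  # remaining columns must be fetched from the table


# The query filters on the expression and selects three columns.
query_needs = frozenset({EXPR, "city_id", "city_name", "city_name_alt"})

# Expression-only index (q3) versus expression index with include columns (q4).
idx_exp = frozenset({EXPR})
idx_covering = frozenset({EXPR, "city_id", "city_name", "city_name_alt"})

scan_q3a = choose_scan(idx_exp, query_needs)
scan_q4a = choose_scan(idx_covering, query_needs)
print("q3a:", scan_q3a)
print("q4a:", scan_q4a)
```

If you uncomment `state_id` in the query's select list, this model predicts that the covering index no longer covers the query, because `state_id` is not among the `include` columns.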




In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q4a | Metrics (above ^^)
A couple of observations about the metrics snapshot:
- Only the index tablet is accessed
- There are two offset seeks into the index tablet, with about 8K reads. The additional reads are for the additional columns in the index.

There are some considerations for covering indexes:
- Every `insert`, `update`, and `delete` statement must also maintain the index, which can impact performance for write-heavy workloads
- Additional indexes mean more resource consumption because YugabyteDB stores each index in its own tablets


---
## q5 | Partial index
A partial index uses a `where` clause in its definition. This is helpful for creating an index that leaves out rows that are irrelevant for most queries, such as soft-deleted, archived, or historical rows. For example, suppose you need only the rows where `country_code` is equal to `US`.


### q5a | Create secondary index using expression and include and where

In [None]:
%%sql
drop index if exists idx_cities_city_name_covering_US;

create index idx_cities_city_name_covering_US
on tbl_cities (UPPER (COALESCE(city_name, city_name_alt) ) asc) 
include (city_id, city_name, city_name_alt) 
where country_code='US';

View the explain plan.

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (analyze, dist) 
select '' _
 -- , *
, city_id
, city_name
, city_name_alt
from tbl_cities 
where 1=1 
 -- and country_id = 233
 and country_code = 'US'
 --  and country_code = 'UK'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and UPPER (COALESCE(city_name, city_name_alt) ) like 'A%'
 -- and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
limit 100
;

#### q5 | Explain plan (above ^^)

<details>
      <summary>
      View Example Output
      </summary>

```sql
Limit (cost=0.00..4.98 rows=10 width=1068) (actual time=2.454..2.504 rows=100 loops=1)
  -> Index Only Scan using idx_cities_city_name_covering_us on tbl_cities (cost=0.00..4.98 rows=10 width=1068) (actual time=2.453..2.496 rows=100 loops=1)
        Index Cond: (((upper((COALESCE(city_name, city_name_alt))::text)) >= 'A'::text) AND ((upper((COALESCE(city_name, city_name_alt))::text)) < 'B'::text))
        Filter: ((upper((COALESCE(city_name, city_name_alt))::text)) ~~ 'A%'::text)
        Heap Fetches: 0
        Storage Index Read Requests: 1
        Storage Index Read Execution Time: 1.374 ms
Planning Time: 10.972 ms
Execution Time: 2.551 ms
Storage Read Requests: 1
Storage Read Execution Time: 1.374 ms
Storage Write Requests: 0
Catalog Read Requests: 16
Catalog Read Execution Time: 13.286 ms
Catalog Write Requests: 0
Storage Flush Requests: 0
Storage Execution Time: 14.660 ms
Peak Memory Usage: 24 kB
```
</details>

Some observations about this query plan:
- The plan uses the `idx_cities_city_name_covering_us` index, which utilizes range sharding
- The plan utilizes a covering index, as `Index Only Scan` indicates
- The `Index Cond` shows a range boundary for the `like` predicate, `>= 'A' AND < 'B'`
- The `Filter` line does not report any removed rows, which suggests that it removes nothing and simply rechecks the `like` condition that the `Index Cond` already enforces.
- The `country_code = 'US'` predicate from the query does not appear in the Explain Plan, because the `where` clause of the partial index already guarantees it.

What happens if the query predicate does not match the `where` clause of the partial index? Change the country code from `US` to `UK` in the query above and rerun it.
- The plan uses `idx_cities_city_name_covering`, so `idx_cities_city_name_covering_us` is not used
- A `Remote Filter` indicates a pushdown for the predicate, `country_code = 'UK'`
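
The behavior above follows from what a partial index contains. The Python sketch below models it under simplifying assumptions: the sample rows are invented, `build_partial_index` and `can_use_partial_index` are hypothetical helpers, and the real planner proves that the query predicate implies the index predicate rather than comparing two values.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]

# Invented sample data; not taken from tbl_cities.
rows: List[Row] = [
    {"city_name": "Alameda", "country_code": "US"},
    {"city_name": "Aberdeen", "country_code": "UK"},
    {"city_name": "Austin", "country_code": "US"},
    {"city_name": "Bath", "country_code": "UK"},
]


def build_partial_index(table: List[Row], column: str, value: str) -> List[Tuple[str, Row]]:
    """Index only the rows where row[column] == value, ordered by UPPER(city_name)."""
    entries = [(r["city_name"].upper(), r) for r in table if r[column] == value]
    return sorted(entries, key=lambda entry: entry[0])


def can_use_partial_index(query_country: Optional[str], index_country: str) -> bool:
    # Simplified implication check: the query must restrict country_code to the
    # same value as the index definition. None means no country_code predicate.
    return query_country == index_country


us_index = build_partial_index(rows, "country_code", "US")
print([key for key, _ in us_index])
print("US query can use it:", can_use_partial_index("US", "US"))
print("UK query can use it:", can_use_partial_index("UK", "US"))
```

The `UK` rows never enter the index, so a query for `country_code = 'UK'` has nothing to find there and the planner must choose another access path.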

In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q5 | Metrics (above ^^)
A couple of observations about the metrics snapshot:
- Only the index tablet is accessed
- There is just 1 offset seek into the index tablet, with about 3K reads

---
## Review of secondary indexes and sharding

A primary key (PK) in the predicate takes precedence, and the query does NOT use the secondary index.
- You can't drop or alter a PK in place. To change it, you need to rename the table, create a new table (or similar), and use `select into` to copy the data.
- A PK can be composite (hash and range ordering), and this behavior can differ in that case [not covered here].

Secondary index as Range 
- Equality predicates perform well, but after range index tablets split, a range index may not perform as well as a hash index for equality
- Ideal for range and comparison predicates
- For both equality and comparison predicates:
  - When the index column is the sole column in the select list and the sole column in the predicate (no PK), the range index is the covering index, which results in an `Index Only Scan`

Secondary index as Hash
- Ideal for equality predicates
  - When the index column is the sole column in the select list and the sole column in the predicate (no PK), the hash index is the covering index, which results in an `Index Only Scan`
- A range or comparison predicate ignores the hash index and results in a costly `Seq Scan`

YugabyteDB vs PostgreSQL
- YugabyteDB only updates the indexes on columns whose key values or subkey values change.
- An `Index Only Scan` reads from the index tablet.

---
# 🌟🌟🌟🌟  All  done! 
In this notebook, you completed the following:

- Created indexes using both hash and range sharding
- Viewed Explain Plans and metrics reports for various queries


## 😊 Next up!
Continue your learning by opening the next notebook, `05_Using_GIN_Indexes.ipynb`. 

Or, to open the notebook from GitPod, run the following:

In [None]:
%%bash
gp open '05_Using_GIN_Indexes.ipynb'