<div style="width:100%; background-color: #000041"><a target="_blank" href="http://university.yugabyte.com"><img src="assets/YBU_Logo.png" /></a></div><br>

> **YugabyteDB YSQL Development**
>
> Enroll for free at [Yugabyte University](https://university.yugabyte.com/courses/yugabytedb-ysql-development).
>
<br>
This notebook file is:

`04_Anatomy_of_an_Index.ipynb`


# Anatomy of an Index
In the `03_Demystifying_table_sharding_tablets_and_data_distribution.ipynb` notebook, you discovered how YugabyteDB stores data in tablets. You also learned that YugabyteDB and YSQL support two sharding strategies for tables: hash and range sharding.

An index in YugabyteDB is also distributed. Just like a table, YugabyteDB stores and distributes the data for an index in one or more tablets. An index can employ either a hash or range sharding strategy.

> Note
> 
> In this regard, a tablet represents a shard of data which contains a set of rows for a logical entity. Each tablet is a customized RocksDB instance. A tablet leader has a peer group known as tablet followers, and this group of tablet peers exists as a Raft consensus group. YugabyteDB calls this distributed document store, DocDB.

In this notebook, using Explain Plans, built-in functions, and custom utilities for YB-TServer metrics, you will learn how YugabyteDB stores data for an index in one or more tablets. You will also learn how YugabyteDB reads tablet data for an index during query execution.


## 🛠️ Requirements
Here are the requirements for this notebook:
- ✅ Create the notebook variables in `01_Lab_Setup.ipynb`, which you previously did
- ✅ Create the `db_ybu` database, which you previously did
- ☑️ Import the notebook variables, *which you must do next*
- ☑️ Connect to the `db_ybu` database, *which you must do next*
- ☑️ Complete the following sections
  -  Create a secondary index using range sharding
  -  Create a secondary index using hash sharding
  -  Review of secondary index sharding
  -  Expression index
  -  Expression with include index
  -  Partial index
  -  Hints and other features



### Select your notebook kernel
- In the Notebook toolbar, click **Select Kernel**.
<br>
<img width=50% src="assets/01_01_Select_Kernel_Toolbar.png" />

- Next, in the dropdown, select **Python 3.12** or higher.
<br>
<img width=50% src="assets/01_02_Select_Kernel_Dropdown.png" />

That's it!

## ⛑️ Getting help
The best way to get help from the Yugabyte University team is to post your question on YugabyteDB Community Slack in the #training or #yb-university channels. To sign up, visit [https://communityinviter.com/apps/yugabyte-db/register](https://communityinviter.com/apps/yugabyte-db/register).


## 👣 Setup steps
Here are the steps to set up this lab:
- Import the notebook variables
- Connect to the `db_ybu` database
- Load the SQL Magic extension for the connection
- Create the prepared statements

### 👇 Import the notebook variables

> 👉 IMPORTANT! 👈
> 
> Do **NOT** skip running the following cell. 
> 

The following Python cell reads the stored variables created in the `01_Lab_Setup.ipynb` notebook. To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell. 

👇 👇 👇 

In [1]:
# Use %store -r to read 01_Lab_Setup variables

%store -r MY_YB_PATH
%store -r MY_YB_PATH_DATA
%store -r MY_GITPOD_WORKSPACE_URL

%store -r MY_DB_NAME
%store -r MY_DB_PORT

%store -r MY_HOST_IPv4_01
%store -r MY_HOST_IPv4_02
%store -r MY_HOST_IPv4_03

%store -r MY_MASTER_WEB_PORT
%store -r MY_TSERVER_WEBSERVER_PORT
%store -r MY_YUGABYTED_WEB_UI_PORT

%store -r MY_YB_MASTER_HOST_GITPOD_URL
%store -r MY_YB_TSERVER_HOST_GITPOD_URL
%store -r MY_YUGABYTED_UI_HOST_GITPOD_URL

%store -r MY_NOTEBOOK_DIR
%store -r MY_NOTEBOOK_DATA_FOLDER
%store -r MY_NOTEBOOK_UTILS_FOLDER

%store -r MY_DATA_DDL_FILE_0
%store -r MY_DATA_DML_FILE_0
%store -r MY_DATA_DDL_FILE_1
%store -r MY_DATA_DML_FILE_1
%store -r MY_DATA_DDL_FILE_2
%store -r MY_DATA_DML_FILE_2
%store -r MY_DATA_DDL_FILE_3
%store -r MY_DATA_DML_FILE_3

%store -r MY_JEOPARDY_DATA_FILE
%store -r MY_GIN_EXAMPLES
%store -r MY_GITHUB_DATA_FILE

%store -r MY_UTIL_FUNCTIONS_FILE
%store -r MY_UTIL_YBTSERVER_METRICS_FILE

## Connect to the `db_ybu` database
Run all the cells in this section:
- Connect using Python and PostgreSQL driver
- Load the SQL magic extension
- Create the prepared statements
- View the listener
- View the DDL for tbl_cities

In [2]:
# Connect using Python 3.7.9+
import psycopg2
import sqlalchemy as alc
from sqlalchemy import create_engine

db_host=MY_HOST_IPv4_01
db_name=MY_DB_NAME

connection_str='postgresql+psycopg2://yugabyte@'+db_host+':5433/'+db_name

# engine = create_engine(connection_str)

### Load the SQL magic extension

In [3]:
%reload_ext sql

# SQL magic for python connection string
%sql {connection_str}

### Create the prepared statements

> IMPORTANT!
>   
> In order to create the prepared statements for the SQL magic connection, you must run the following cell!!!
> 
> Do not skip this step.
> 

In [4]:
#%% python, but prepared statements as sql magic

if (MY_YB_MASTER_HOST_GITPOD_URL is None):
    a = %sql select fn_yb_create_stmts()
else:
    a = %sql select fn_yb_create_stmts(:MY_YB_MASTER_HOST_GITPOD_URL )
print (a)

 * postgresql+psycopg2://yugabyte@127.0.0.1:5433/db_ybu
1 rows affected.
+----------------------------------+
|        fn_yb_create_stmts        |
+----------------------------------+
| 2023-05-24 10:55:23.343679-07:00 |
+----------------------------------+


Confirm that the following query returns a count of 3 (for three prepared statements).

In [5]:
%%sql 
select count(*) from pg_prepared_statements where 1=1 and name in ('stmt_util_metrics_snap_tablet','stmt_util_metrics_snap_table','stmt_util_metrics_snap_reset')

 * postgresql+psycopg2://yugabyte@127.0.0.1:5433/db_ybu
1 rows affected.


count
3


### View the listener address
Run the following cell to view the host for your client connection.

In [None]:
%%sql /* confirm listener */
show listen_addresses;

### View DDL for tbl_cities

Run the following cell to execute the describe table command, `\d+`.

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"  # \d+
YB_PATH=${1}
DB_NAME=${2}

cd $YB_PATH

./bin/ysqlsh -d ${DB_NAME} -c "\d+ tbl_cities"

# ./bin/ysqlsh -d ${DB_NAME} -c "\d+ idx_cities_city_name_range"

> Need help?
> 
> If you can't find tbl_cities, please go back to `01_Lab_Setup.ipynb`.

---
## q1 | Create a secondary index using range sharding
When you create an index for a table with a sort order (`asc` or `desc`), YugabyteDB will create the index using a range sharding strategy.
  

In [None]:
%%sql
drop index if exists idx_cities_city_name_range;
drop index if exists idx_cities_city_name_hash;

select pg_sleep(1);

create index idx_cities_city_name_range on tbl_cities (city_name asc);

Review the DDL for the index in the following cell:

In [None]:
%%sql
select pg_get_indexdef('idx_cities_city_name_range':: regclass);


Alternatively, use the `\d+` command:

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"  # \d+
YB_PATH=${1}
DB_NAME=${2}

cd $YB_PATH

./bin/ysqlsh -d ${DB_NAME} -c "\d+ idx_cities_city_name_range"

`USING lsm` indicates that the index is of the type Log-Structured Merge tree (LSM), and the `ASC` sort order on `city_name` indicates that the index employs range sharding. An LSM tree is not a tree structure, but rather, a complex algorithm that converts discrete random write requests into batch sequential write requests. To improve write performance for the LSM tree, RocksDB utilizes a Write-Ahead Log (WAL) and a memtable (a skiplist that lives in memory). The sorted writes in the memtable persist to disk as Sorted String Table (SSTable or SST) files.

Both hash sharding and range sharding use LSM. The difference of course is that the LSM tree sorts the hash coded values for hash sharding and the specified key values for range sharding.
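
The write path described above can be pictured with a toy store. This is only a minimal sketch of the general LSM idea, not how RocksDB or DocDB implement it: the class `TinyLSM`, its `memtable_limit` parameter, and the use of a Python dict in place of a skiplist are all assumptions made for illustration.

```python
import bisect


class TinyLSM:
    """Toy LSM store: a write-ahead log, a memtable, and sorted runs."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # hypothetical flush trigger
        self.wal = []        # append-only log of writes (sequential)
        self.memtable = {}   # newest writes, held in memory
        self.sstables = []   # flushed runs, oldest first, each sorted by key

    def put(self, key, value):
        self.wal.append((key, value))   # 1. log the write sequentially
        self.memtable[key] = value      # 2. apply the write in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as one sorted, immutable run.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.wal = []   # the log is not needed once the run is on disk

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):   # check the newest run first
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


store = TinyLSM(memtable_limit=3)
for name in ["Oakland", "Alameda", "Berkeley", "Alamo"]:
    store.put(name, len(name))

print(store.sstables)   # one flushed run, sorted by key
print(store.memtable)   # the latest write, not yet flushed
```

For a range index, the sort key would be the indexed value itself; for a hash index, it would be the hash code of that value.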

> Important!
> 
> In Data Definition Language statements, YugabyteDB will interpret the `BTREE` keyword as `LSM`. YugabyteDB does not support `BTREE` indexes.

### View the Index details in the YB-Master web UI
You can view the details of the `idx_cities_city_name_range` index in the YB-Master web UI. Run the cell below and open the URL in your web browser.

In [None]:
#%% python, but prepared statements as sql magic
THIS_INDEX_NAME = 'idx_cities_city_name_range'
THIS_SCHEMA_NAME = 'public'
DB_NAME = MY_DB_NAME

## Comment out if local
view_gitpod_url = %sql select fn_get_table_id_url(:MY_YB_MASTER_HOST_GITPOD_URL,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_gitpod_url
print (view_gitpod_url)

## Uncomment if local
# view_local_url = %sql select fn_get_table_id_url(:MY_HOST_IPv4_01,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_local_url
# print (view_local_url)

#### Review the Column section
The column section shows details about each column in the index. Here is the section for  `idx_cities_city_name_range`:


| Column | ID	| Type |
|--------|------|------|
| city_name    | 0	| string NOT NULL NOT A PARTITION KEY | 
| ybidxbasectid	   | 1	| binary NOT NULL NOT A PARTITION KEY | 


<br/>

> Important!
>  
>  YugabyteDB creates an internal, hidden column, `ybidxbasectid`, for the indexed row. `ybidxbasectid` is similar to the internal, hidden column, `ybctid`, for a row of a table. Both `ybctid` and `ybidxbasectid` are virtual columns that represent the DocDB-encoded key for the tuple.
> 
> Using  `\d` or `\d+` will not show the `ybidxbasectid` column. It is also not possible to query the `ybidxbasectid` value.

#### Review the Tablet section
The Tablet section shows the details for the existing tablets. Here is the section for  `idx_cities_city_name_range`: 

| Tablet ID |	Partition	| SplitDepth	| State	| Hidden	| Message	| RaftConfig|
|--|--|--|--|--|--|--|
| some_uuid_1<br>`1e2c3ef228534d3cbbf59c9fa6968d88` | `range: [<start>, <end>)` | 0 | Running | false | Tablet reported with an active leader |<li>FOLLOWER: 127.0.0.2 <li>FOLLOWER: 127.0.0.3<li>LEADER: 127.0.0.1 |

YugabyteDB will automatically split this tablet based on the size of the table on disk. The following global flags determine this behavior:

```
--tablet_force_split_threshold_bytes=107374182400 --> 102400 MB
--tablet_split_high_phase_shard_count_per_node=24
--tablet_split_high_phase_size_threshold_bytes=10737418240 --> 10240 MB
--tablet_split_low_phase_shard_count_per_node=8
--tablet_split_low_phase_size_threshold_bytes=536870912 --> 512 MB
--tablet_split_size_threshold_bytes=0
```

The low phase indicates the threshold for the initial splits of the tablet. As the number of tablets per node grows, the threshold increases from 512 MB in the low phase to 10 GB in the high phase.
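
To make the phases concrete, here is a small sketch that picks a split threshold from the number of shards per node. The byte values come from the flags listed above; the selection logic in `split_threshold_bytes` (including the exact comparison operators) is an assumption based on the flag names, not a copy of the YB-Master implementation.

```python
MIB = 1024 * 1024

# Values copied from the flags listed above.
LOW_PHASE_SHARDS_PER_NODE = 8
LOW_PHASE_THRESHOLD_BYTES = 536870912
HIGH_PHASE_SHARDS_PER_NODE = 24
HIGH_PHASE_THRESHOLD_BYTES = 10737418240
FORCE_SPLIT_THRESHOLD_BYTES = 107374182400


def split_threshold_bytes(shards_per_node):
    """Return the tablet size at which a split is considered (assumed logic)."""
    if shards_per_node < LOW_PHASE_SHARDS_PER_NODE:
        return LOW_PHASE_THRESHOLD_BYTES      # low phase
    if shards_per_node < HIGH_PHASE_SHARDS_PER_NODE:
        return HIGH_PHASE_THRESHOLD_BYTES     # high phase
    return FORCE_SPLIT_THRESHOLD_BYTES        # beyond the high phase


for shards in (1, 8, 24):
    print(shards, "shards per node ->", split_threshold_bytes(shards) // MIB, "MB")
```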

### q1a | Range index in equality predicate
To generate an Explain Plan, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
 , *
 --, city_id
 --, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q1a | Explain Plan (above ^^)

The Explain Plan shows that this query uses the index and reads 2 rows
- `Index Scan using idx_cities_city_name_range on public.tbl_cities (actual time=6.392..6.397 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name)::text = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q1a | Metrics (above ^^)

In the initial query, the `Index Scan` accesses the index tablet:

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1| 	2 |

There are two results for the predicate expression, resulting in one seek to the matching offset in the index tablet, and then two reads.
Because the query returns all columns, the query also reads from one of the tablet leaders for `tbl_cities`, a table with hash sharding.


| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu tbl_cities link_table_id tablet_id_unq_1	| 4 | 28 |


What the Metrics report reveals is that even though the query uses the range index, the query must still seek 4 offsets from one of the tablets for `tbl_cities`.
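
The index tablet counts reported above (one seek, two nexts) can be pictured with a sorted list and a binary search. This is only an illustration: the entries and `city_id` values in `index_entries` are hypothetical, and the way this sketch counts a "next" is an assumption that may not match the RocksDB counters exactly.

```python
import bisect

# Hypothetical index entries, sorted by (city_name, primary key),
# the way a range index keeps them.
index_entries = [
    ("Alamance", 110968),
    ("Alameda", 111088),
    ("Alameda", 143123),
    ("Alamo", 111090),
]


def equality_scan(entries, city_name):
    """Return the matching primary keys plus the seek and next counts."""
    seeks, nexts, matches = 0, 0, []
    # One seek positions the cursor at the first candidate entry.
    pos = bisect.bisect_left(entries, (city_name,))
    seeks += 1
    # Each matching entry is then read with a sequential next.
    while pos < len(entries) and entries[pos][0] == city_name:
        matches.append(entries[pos][1])
        nexts += 1
        pos += 1
    return matches, seeks, nexts


print(equality_scan(index_entries, "Alameda"))
```

The primary keys returned by the index are what the query then uses to look up the remaining columns in the tablets for `tbl_cities`.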



###  q1b | Covering index with an equality predicate
When an index contains all the columns that a query needs, the index "covers the query" and is known as a "covering index".

To view the Explain Plan, run the following cell: 

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze, verbose) 
select '' _
 --, *
-- , city_id
 , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q1b | Explain plan (above ^^)
The Explain Plan shows that this query uses only the index (`Index Only Scan`) and reads 2 rows.
- `Index Only Scan using idx_cities_city_name_range on public.tbl_cities (actual time=4.964..4.968 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name)::text = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q1b | Metrics (above ^^)

In this query, the `Index Only Scan` accesses the index tablet:

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1| 	2 |

There are two results for the predicate expression, resulting in one seek to the matching offset in the index tablet, and then two reads.
Because the query only returns the city name, the index "covers" the query and is a "covering index". There is no need for the query to access the tablets for `tbl_cities`.
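
One way to think about the check is as a subset test: the index covers the query when every column the query needs is stored in the index. The helper `is_covering` below is a hypothetical illustration of that idea, not the logic the query planner actually runs.

```python
def is_covering(index_columns, query_columns):
    """True when every column the query needs is stored in the index."""
    return set(query_columns) <= set(index_columns)


index_columns = ["city_name"]   # idx_cities_city_name_range

# select city_name ... where city_name = 'Alameda'
print(is_covering(index_columns, ["city_name"]))

# select city_name, state_id ... where city_name = 'Alameda'
print(is_covering(index_columns, ["city_name", "state_id"]))
```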


#### q1b | Experiment

Question: What happens when the query includes the PK column, `city_id`,  in the select command? 

Answer:
  - The Explain Plan becomes an `Index Scan`
  - The query reads from a tablet for `tbl_cities`
  - The reason? The tuple for the index, `ybidxbasectid`, encodes the PK pointer. YugabyteDB at this time is unable to use this PK encoding in results. This is a known issue.

Question: What happens when the query includes a PK equality clause in the predicate?

Answer:
- The query will use the PK index, `tbl_cities_pkey`, and not the secondary index.



### q1c | Range index with a range predicate
Run the following cell to view the query results:

In [None]:
%%sql
-- to see results
select '' _
   , *
-- , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name BETWEEN 'Alameda' AND  'Alamo'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- and yb_hash_code(city_name) = yb_hash_code(city_name::text)
limit 30
;

To view the Explain Plan, run the following cell:

In [None]:
%%sql

execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze, verbose) 
select ''_ 
    , *
 -- , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name BETWEEN 'Alameda' AND  'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- and yb_hash_code(city_name) = yb_hash_code(city_name::text)
-- limit 100
;

#### q1c | Explain plan (above ^^)
The Explain Plan shows that this query uses the index (`Index Scan`) and reads 16 rows.
- `Index Scan using idx_cities_city_name_range on public.tbl_cities (actual time=3.192..3.239 rows=16 loops=1)`

The Index Condition reflects the query predicate range.

- `Index Cond: (((tbl_cities.city_name)::text >= 'Alameda'::text) AND ((tbl_cities.city_name)::text <= 'Alamo'::text))`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q1c | Metrics (above ^^)

In the initial query, the `Index Scan` accesses the index tablet:

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1 | 	16 |

Because the query returns all the columns, the query accesses data on the three tablet leaders for `tbl_cities`. Retrieving the column data for the query requires additional seeks and reads on each tablet.


### q1d | Covering index with a range predicate
To view the Explain Plan, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
 --, *
 -- , city_id
 , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name between 'Alameda' and 'Alamo'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q1d | Explain Plan (above ^^)

The Explain Plan shows that this query only uses the range index and in doing so, retrieves 16 rows.
- `Index Only Scan using idx_cities_city_name_range on public.tbl_cities (actual time=0.883..0.891 rows=16 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name >= 'Alameda'::text) AND (tbl_cities.city_name <= 'Alamo'::text))`

To view the tablet metrics, run the following cell:

In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q1d | Metrics (above ^^)

The range index covers the query and serves as a covering index, which is why the Explain Plan shows an `Index Only Scan`.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_range link_table_id tablet_id_unq_1 leader	 | 1 | 	16 |



---
## q2 | Create a secondary index using hash sharding
In this series of queries, you will explore using a hash index. 

To begin, create an index with hash sharding. In the DDL for the index, specify the `hash` keyword as follows:

In [None]:
%%sql 
drop index if exists idx_cities_city_name_range;

drop index if exists idx_cities_city_name_hash;

select pg_sleep(1);

create index idx_cities_city_name_hash on tbl_cities (city_name hash);

To view the DDL, run the following cell:

In [None]:
%%sql
select pg_get_indexdef('idx_cities_city_name_hash':: regclass);

### View the Index details in the YB-Master web UI
You can view the details of the `idx_cities_city_name_hash` index in the YB-Master web UI. Run the cell below and open the URL in your web browser.

In [None]:
#%% python, but prepared statements as sql magic
THIS_INDEX_NAME = 'idx_cities_city_name_hash'
THIS_SCHEMA_NAME = 'public'
DB_NAME = MY_DB_NAME

## Comment out if local
view_gitpod_url = %sql select fn_get_table_id_url(:MY_YB_MASTER_HOST_GITPOD_URL,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_gitpod_url
print (view_gitpod_url)

## Uncomment if local
# view_local_url = %sql select fn_get_table_id_url(:MY_HOST_IPv4_01,7000,:DB_NAME,:THIS_SCHEMA_NAME,:THIS_INDEX_NAME ) as view_local_url
# print (view_local_url)

#### Review the Column section
The column section shows details about each column in the index. Here is the section for  `idx_cities_city_name_hash`:


| Column | ID	| Type |
|--------|------|------|
| city_name    | 0	| string NOT NULL PARTITION KEY| 
| ybidxbasectid	   | 1	| binary NOT NULL NOT A PARTITION KEY | 


<br/>

> Important!
>  
>  YugabyteDB creates an internal, hidden column, `ybidxbasectid`, for the indexed row. `ybidxbasectid` is similar to the internal, hidden column, `ybctid`, for a row of a table. Both `ybctid` and `ybidxbasectid` are virtual columns that represent the DocDB-encoded key for the tuple.
> 
> Using  `\d` or `\d+` will not show the `ybidxbasectid` column. It is also not possible to query the `ybidxbasectid` value.

##### Partition Key
YugabyteDB uses the shard key (shown as `PARTITION KEY`) to distribute the data among the tablet leaders for the index. 

With consistent hash sharding, a partitioning algorithm distributes data evenly and randomly across shards. By computing a consistent hash on the partition key (or keys) of a given row, YugabyteDB knows where to insert the row among the tablet leaders.



#### Review the Tablet section
For the given index, the Tablet section shows the details for the existing tablets. Of particular interest are the number of tablet leaders and the partition strategy. Here is an example of the Tablet section for `idx_cities_city_name_hash`:

| Tablet ID |	Partition	| SplitDepth	| State	| Hidden	| Message	| RaftConfig|
|--|--|--|--|--|--|--|
| some_uuid_1 |	`hash_split: [0x5555, 0xAAAA)` |	0	| Running|	false| Tablet reported with an active leader	|<li>FOLLOWER: 127.0.0.1 <li>FOLLOWER: 127.0.0.3<li>LEADER: 127.0.0.2  |
| some_uuid_2	| `hash_split: [0xAAAA, 0xFFFF)`	| 0 |  Running |false |	Tablet reported with an active leader |<li>FOLLOWER: 127.0.0.1 <li>LEADER: 127.0.0.3 <li>FOLLOWER: 127.0.0.2 |
| some_uuid_3 <br>(tablet leader where the row lives) |	`hash_split: [0x0000, 0x5555)` |	0 |	Running | 	false	| Tablet reported with an active leader |	<li>LEADER: 127.0.0.1<li>FOLLOWER: 127.0.0.3<li>FOLLOWER: 127.0.0.2 |
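
The routing step can be sketched as follows: hash the partition key into the 16-bit space, then find the tablet whose range contains the hash code. The ranges mirror the example Tablet section above. The function `toy_hash_code` is a stand-in built on CRC32 purely for illustration; it is NOT the hash function that YugabyteDB uses, so the tablet it picks for a given city will not match a real cluster.

```python
import zlib

# Hash ranges from the example Tablet section; the last range is
# treated as including 0xFFFF in this sketch.
TABLETS = [
    ("some_uuid_3", 0x0000, 0x5555),
    ("some_uuid_1", 0x5555, 0xAAAA),
    ("some_uuid_2", 0xAAAA, 0x10000),
]


def toy_hash_code(value):
    """Stand-in 16-bit hash; NOT the function YugabyteDB uses."""
    return zlib.crc32(value.encode("utf-8")) & 0xFFFF


def tablet_for(value):
    """Return the tablet whose hash range contains the value's hash code."""
    code = toy_hash_code(value)
    for tablet_id, start, end in TABLETS:
        if start <= code < end:
            return tablet_id, code
    raise ValueError("hash code outside the 16-bit space")


for name in ("Alameda", "Alamo", "Oakland"):
    tablet_id, code = tablet_for(name)
    print(f"{name}: hash 0x{code:04X} -> {tablet_id}")
```

Because the same key always produces the same hash code, an equality predicate on the partition key can go straight to a single tablet.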

### q2a | Hash index with an equality predicate
To view the Explain Plan for the query, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select 0
 , *
 -- , city_id
 -- , city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2a | Explain Plan (above ^^)

The Explain Plan shows that this query uses the hash index and reads 2 rows.
- `Index Scan using idx_cities_city_name_hash on public.tbl_cities (actual time=3.539..3.552 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: ((tbl_cities.city_name)::text = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q2a | Metrics (above ^^)

The query uses the hash index and accesses one of the tablet leaders for `idx_cities_city_name_hash`.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_hash link_table_id tablet_id_unq_1 leader	 | 1 | 	2 |

Because the query requires more than the columns in the index, the query accesses one of the tablet leaders for `tbl_cities`.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu tbl_cities link_table_id tablet_id_unq_1 leader	 | 4 | 	28 |

Assuming the tablet leaders are spread evenly across the three YB-TServers, there's roughly a 67% chance that the tablet leaders for the index and the table are on different hosts (and a 33% chance that they are on the same host). If you are running this notebook locally or in Gitpod, the host is the same machine with the YB-TServer processes running on different ports using host aliases for localhost (127.0.0.1) such as 127.0.0.2 and 127.0.0.3.


### q2b | Covering index with a hash index and equality predicate
Run the following cell to generate an Explain Plan:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
 -- , *
 --, city_id
, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2b | Explain Plan (above ^^)

The Explain Plan shows that the index is a covering index and returns 2 rows.
- `Index Only Scan using idx_cities_city_name_hash on public.tbl_cities (actual time=0.805..0.808 rows=2 loops=1)`

The Index Condition reflects the query predicate.

- `Index Cond: (tbl_cities.city_name = 'Alameda'::text)`

To view the tablet metrics, run the following cell:

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q2b | Metrics (above ^^)

As an Index Only Scan query, the query uses the hash index and accesses one of the tablet leaders for `idx_cities_city_name_hash` and reads 2 rows.

| row_name| 	rocksdb_number_db_seek | 	rocksdb_number_db_next | 
|--|--|--|
| db_ybu idx_cities_city_name_hash link_table_id tablet_id_unq_1 leader	 | 1 | 	2 |


### q2c | Hash index with a range predicate
To view the Explain Plan, run the following:

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
    , *
-- , city_id
 --, city_name
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
-- limit 100
;

#### q2c | Explain plan (above ^^)
As expected, the optimizer generates an Explain Plan that does not use the hash index. Instead, the Explain Plan shows a full table scan of the table itself.
- `Seq Scan on public.tbl_cities (actual time=366.921..2929.900 rows=16 loops=1)`

The YB-TServer that serves the client connection removes the rows that do not match the filter:
  - `Filter: (((tbl_cities.city_name)::text >= 'Alameda'::text) AND ((tbl_cities.city_name)::text <= 'Alamo'::text))`
  - `Rows Removed by Filter: 148250`

To view the Metrics report for the query, run the following cell:


In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q2c | Metrics (above ^^)
The `Seq Scan` requires that the query access all three tablets for `tbl_cities`, requiring 30K+ seeks and 600K reads per tablet.
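
The reason a hash index cannot serve a range predicate is that hashing discards the sort order of the values. The sketch below sorts a few city names both ways. The name list is hypothetical, and `toy_hash_code` is a CRC32-based stand-in, NOT the hash function that YugabyteDB uses.

```python
import zlib

# Hypothetical sample of city names.
names = ["Alameda", "Alamo", "Albany", "Berkeley", "Alamosa", "Oakland"]


def toy_hash_code(value):
    """Stand-in 16-bit hash; NOT the function YugabyteDB uses."""
    return zlib.crc32(value.encode("utf-8")) & 0xFFFF


def in_range(name):
    """Equivalent of: city_name BETWEEN 'Alameda' AND 'Alamo'."""
    return "Alameda" <= name <= "Alamo"


def match_positions(ordered):
    """Positions of the matching names within a given ordering."""
    return [i for i, name in enumerate(ordered) if in_range(name)]


range_order = sorted(names)                      # range index order
hash_order = sorted(names, key=toy_hash_code)    # hash index order

print("range order:", range_order, match_positions(range_order))
print("hash order: ", hash_order, match_positions(hash_order))
```

In range order, the matches always sit next to each other, so one seek followed by sequential reads finds them all. In hash order, the matches can land anywhere, so there is no single start and end point to scan between.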

---
## Review of secondary index sharding

A PK in the predicate takes precedence, which results in the secondary index NOT being used...
- Can't drop or alter PK, need to rename, create new table (or similar), and select into to make change.
- PK can be composite (hash, and range order), and this behavior can be different if that is the case [not covered here]

Secondary index as Range 
- Equality is performant, but when range index tablets split, may not be as great as hash index for equality
- Ideal for range and comparison predicates
- For both Equality and Comparison predicate...
  - When the index column is the sole column in the select command and the sole column in the predicate (no PK), the range index is the covering index, and results in Index Only Scan
 
Secondary index as Hash
- Equality is ideal
  -  When the index column is the sole column in the select command and the sole column in the predicate (no PK), the hash index is the covering index, and results in Index Only Scan
- A range or comparison predicate ignores the hash index and results in a costly Seq Scan

YugabyteDB vs PostgreSQL
- YugabyteDB only updates the indexes of columns where the key values or subkey values change.
- An Index Only Scan reads from the index tablet.

---
## q3 | Expression index
An expression index indexes the result of an expression instead of a column value.

In [None]:
%%sql
drop index if exists idx_cities_city_name_exp;

create index idx_cities_city_name_exp on tbl_cities (UPPER (COALESCE(city_name, city_name_alt) ) asc);

### q3a | Expression query

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
 -- , *
, city_id
, city_name
, city_name_alt
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and UPPER (COALESCE(city_name, city_name_alt) ) like 'A%'
 -- and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
limit 100
;

#### q3a | Explain plan (above ^^)
- Index is Range
- Index Scan
- Expression in Index Cond --> This shows the boundary >='A' <'B'
- Filter not doing anything
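
The boundary in the Index Cond comes from rewriting the prefix pattern as a range: every value that starts with `A` sorts at or after `'A'` and before `'B'`. The helper `prefix_to_range` below is a hypothetical sketch of that rewrite. It assumes plain code-point ordering, and it does not handle a prefix whose last character is already the highest code point.

```python
def prefix_to_range(prefix):
    """Return (lower, upper) so that lower <= value < upper matches the prefix."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Increment the last character to get the exclusive upper bound.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


lower, upper = prefix_to_range("A")
print(lower, upper)

# Hypothetical upper-cased values, as the expression index would store them.
values = ["ALAMEDA", "ALAMO", "BERKELEY", "AZUSA", "9TH"]
matches = sorted(v for v in values if lower <= v < upper)
print(matches)
```

Because the range index keeps the expression values in sorted order, the scan can seek to the lower bound and stop at the upper bound.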



In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q3a | Metrics (above ^^)
- Reads from one index tablet, which uses range sharding
- Also accesses the tablets for `tbl_cities`, because the index does not store the selected columns

---
## q4 | Expression with include index
An index can combine an expression with an `include` clause so that the index also stores the columns that the query selects.

In [None]:
%%sql
drop index if exists idx_cities_city_name_covering;

create index idx_cities_city_name_covering on tbl_cities (UPPER (COALESCE(city_name, city_name_alt) ) asc) include (city_id, city_name, city_name_alt);

### q4a | Query using expression and include columns

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
 -- , *
, city_id
, city_name
, city_name_alt
-- , state_id
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and UPPER (COALESCE(city_name, city_name_alt) ) like 'A%'
 -- and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
limit 100
;

#### q4a | Explain plan (above ^^)
- Index is Range
- Index Only Scan
- Expression in Index Cond --> This shows the boundary >='A' <'B'
- Filter not doing anything



In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q4a | Metrics (above ^^)
- Only index tablet means 1 rpc
- Trade off is data storage
- Non-DQL access patterns, such as DML [insert, update, delete]... Impact comes if source table is highly touched


---
## q5 | Partial index
A partial index uses a `where` clause in its definition. This is helpful for creating an index that leaves out rows that are irrelevant for most queries, such as soft-deleted, archived, or historical rows. For example, suppose you need only the rows where `country_code` is equal to `US`.
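
As a rough mental model, a partial index is built by filtering the rows first and indexing only what remains. The sketch below uses hypothetical rows and a hypothetical helper, `build_partial_index`. The `index_key` function mirrors the `UPPER(COALESCE(city_name, city_name_alt))` expression used in the earlier sections.

```python
# Hypothetical rows from tbl_cities.
rows = [
    {"city_id": 1, "city_name": "Alameda", "city_name_alt": None, "country_code": "US"},
    {"city_id": 2, "city_name": None, "city_name_alt": "Alamo", "country_code": "US"},
    {"city_id": 3, "city_name": "Alicante", "city_name_alt": None, "country_code": "ES"},
]


def index_key(row):
    """Mirror of UPPER(COALESCE(city_name, city_name_alt))."""
    name = row["city_name"] if row["city_name"] is not None else row["city_name_alt"]
    return name.upper()


def build_partial_index(rows, predicate):
    """Index only the rows that satisfy the predicate, sorted by the key."""
    return sorted((index_key(r), r["city_id"]) for r in rows if predicate(r))


us_index = build_partial_index(rows, lambda r: r["country_code"] == "US")
full_index = build_partial_index(rows, lambda r: True)

print(us_index)     # entries for US rows only
print(full_index)   # entries for every row
```

The partial index holds fewer entries, so it takes less storage and needs fewer updates when rows outside the predicate change.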


### q5a | Create secondary index using expression and include and where

In [None]:
%%sql
drop index if exists idx_cities_city_name_covering_US;

create index idx_cities_city_name_covering_US
on tbl_cities (UPPER (COALESCE(city_name, city_name_alt) ) asc) 
include (city_id, city_name, city_name_alt) 
where country_code='US';

View the explain plan.

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;

explain (costs off, analyze,verbose) 
select '' _
 -- , *
, city_id
, city_name
, city_name_alt
from tbl_cities 
where 1=1 
 -- and country_id = 233
 and country_code = 'US'
-- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and UPPER (COALESCE(city_name, city_name_alt) ) like 'A%'
 -- and city_name = 'Alameda'
 -- and city_name BETWEEN 'Alameda' AND 'Alamo'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
 -- and yb_hash_code(city_name) = yb_hash_code('Alameda'::text)
limit 100
;

#### q5 | Explain plan (above ^^)
- Index is Range
- Index Only Scan
- Expression in Index Cond
  - boundary >='A' <'B'
  - where is not shown
- Filter not doing anything, as index cond handles


In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

#### q5 | Metrics (above ^^)
- Only index tablet means 1 rpc
- Trade off is data storage
- Non-DQL access patterns, such as DML [insert, update, delete]... Impact comes if source table is highly touched


---
## q6 | Hints and other features

### q6a | hints

`/*+ SeqScan(tbl_cities) */`

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;
explain (costs off, analyze, verbose) 
/*+ SeqScan(tbl_cities) */
select '' _
    , *
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
and city_name = 'Alameda'
-- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- limit 100
;

#### q6a | Explain plan (above ^^)
- A Seq Scan is a full scan of the table
- The Filter removes Y rows out of X, for example, `Rows Removed by Filter: 148264`


In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

#### q6a | Metrics (above ^^)
- Reads go to all tablet leaders
- Seek in all tablet offsets
  - Next to read values after a seek
  - Seek again for another value
  - The seek/next ratio is around 1:19+ (narrowing the selected columns can reduce it)
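
The ratio is simple arithmetic on the seek and next counts in the metrics report. The sketch below uses hypothetical counts for three tablets, chosen only to illustrate a 1:19 ratio. Substitute the values from your own report.

```python
def nexts_per_seek(seeks, nexts):
    """Return how many next operations occur for each seek."""
    if seeks == 0:
        raise ValueError("no seeks recorded")
    return nexts / seeks

# Hypothetical (seeks, nexts) counts per tablet leader, not real output.
tablet_metrics = {
    "tablet_1": (2600, 49400),
    "tablet_2": (2550, 48450),
    "tablet_3": (2650, 50350),
}

for tablet, (seeks, nexts) in tablet_metrics.items():
    print(f"{tablet}: 1:{nexts_per_seek(seeks, nexts):.0f}")
```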

### q6b | No PK or IDX, but New Distributed Features
- Seq Scan
- Enable Pushdown
- Follower reads
- Combo

In [None]:
%%sql
drop index if exists idx_cities_city_name;
drop index if exists idx_cities_city_name_range;
drop index if exists idx_cities_city_name_hash;
drop index if exists idx_cities_city_name_covering_US;
drop index if exists idx_cities_city_name_exp;

#### Enable Pushdown
```
name        | yb_enable_expression_pushdown
setting     | off
description | Push supported expressions down to DocDB for evaluation.
```
Setting the value to `true` is the same as `on`, and setting it to `false` is the same as `off`.

In [None]:
%%sql
SET yb_enable_expression_pushdown=on;
SHOW yb_enable_expression_pushdown;

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;
explain (costs off, analyze, verbose) 
select '' _
    , *
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name = 'Alameda' 
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- limit 100
;

##### Pushdown | Explain plan (above ^^)
- `Remote Filter: ((tbl_cities.city_name)::text = 'Alameda'::text)` shows the pushdown
- The pushdown filter runs in DocDB (remember, each tablet is a customized RocksDB instance in DocDB, and its tablet peers form a Raft consensus group)
- Pushdown is faster because, without it, the YSQL query layer (which reuses the PostgreSQL query layer) must receive all rows and then remove the ones that do not match
- The stats are better
  - Planning Time: 0.062 ms
  - Execution Time: 966.344 ms
  - Peak Memory Usage: 8 kB
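
The difference between filtering in the query layer and filtering in the storage layer can be sketched with a toy model. The `tablets` data and both function names below are hypothetical. The sketch only counts how many rows would travel from the tablets to the query layer and does not model DocDB itself.

```python
# Hypothetical city_name values spread across three tablets.
tablets = [
    ["Alameda", "Albany", "Austin"],
    ["Boston", "Alameda", "Chicago"],
    ["Denver", "Dallas", "Eugene"],
]

def scan_without_pushdown(tablets, wanted):
    """Every row travels to the query layer, which then filters."""
    shipped = [row for tablet in tablets for row in tablet]
    matches = [row for row in shipped if row == wanted]
    return matches, len(shipped)

def scan_with_pushdown(tablets, wanted):
    """Each tablet filters locally and ships only the matching rows."""
    shipped = [row for tablet in tablets for row in tablet if row == wanted]
    return shipped, len(shipped)

matches_a, shipped_a = scan_without_pushdown(tablets, "Alameda")
matches_b, shipped_b = scan_with_pushdown(tablets, "Alameda")

print("without pushdown:", matches_a, "rows shipped:", shipped_a)
print("with pushdown:   ", matches_b, "rows shipped:", shipped_b)
```

Both scans touch every row in every tablet and return the same matches. Only the number of rows sent back to the query layer changes.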
  

In [None]:
%%sql
execute  stmt_util_metrics_snap_table;

##### Pushdown | Metrics (above ^^)
- Reads go to all tablet leaders
- Seek in all tablet offsets
  - Next to read values after a seek
  - Seek again for another value
  - The seek/next ratio is around 1:19+
- The performance benefit of the pushdown filter shows in the Explain Plan stats

##### Pushdown | Cleanup

In [None]:
%%sql
SET yb_enable_expression_pushdown=off;
SHOW yb_enable_expression_pushdown;

#### Enable Follower Reads
```
name        | yb_read_from_followers
setting     | off
description | Allow any statement that generates a read request to go to any node.

name        | yb_follower_read_staleness_ms
setting     | 30000
description | Sets the staleness (in ms) to be used for performing follower reads.
```
Setting the value to `true` is the same as `on`, and setting it to `false` is the same as `off`.

This notebook also uses other configurations, such as:
- `set session characteristics as transaction read write;`
- hint, `/*+ Set(transaction_read_only on) */`

For more details about this feature, review:

https://docs.yugabyte.com/preview/explore/ysql-language-features/going-beyond-sql/follower-reads-ysql/#examples


In [None]:
%%sql
set session characteristics as transaction read write;
set yb_read_from_followers=true;
SHOW yb_read_from_followers;

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;
explain (costs off, analyze, verbose) 
/*+ Set(transaction_read_only on) */
select '' _
    , *
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name = 'Alameda'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- limit 100
;


##### Follower-reads | Explain plan (above ^^)
- Execution time would be better in a real-world cluster (one that does not rely on host aliases, as this lab does)
  - Planning Time: 0.110 ms
  - Execution Time: 2613.345 ms
  - Peak Memory Usage: 8 kB

In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

##### Follower-reads | Metrics (above ^^)
- Reads come only from the session host, `127.0.0.1` (you can confirm this by rerunning with a connection to a different host)
- Uses the tablet leaders and followers on that host (this works for RF=3 with a 3-node cluster, where every node holds a peer of every tablet)
- Seek in all tablet offsets
  - Next to read values after a seek
  - Seek again for another value
  - The seek/next ratio is around 1:19+
- The performance of follower reads shows in the Explain Plan stats

#####  Rerun
- Restart notebook
- Clear output
- Change db_host from `127.0.0.1` to other, e.g. `127.0.0.3`
- Confirm session host
- Rerun all of q3c and review
- Restart notebook
- Clear output
- Change db_host from `127.0.0.3` to other, e.g. `127.0.0.1`

##### Follower-reads | Plan and Metrics
- The first run has a high cost
- Is the second run faster?

##### Follower-reads | Cleanup

In [None]:
%%sql
set yb_read_from_followers=false;
SHOW yb_read_from_followers;

#### Pushdown + Follower

In [None]:
%%sql
SET yb_enable_expression_pushdown=true;
SHOW yb_enable_expression_pushdown;

In [None]:
%%sql
SET session characteristics as transaction read write;
SET yb_read_from_followers=true;
SHOW yb_read_from_followers;

In [None]:
%%sql
execute  stmt_util_metrics_snap_reset;
explain (costs off, analyze, verbose) 
/*+ Set(transaction_read_only on) */
select *
from tbl_cities 
where 1=1 
 -- and country_id = 233
 -- and country_code = 'US'
 -- and state_id = 1416
 -- and state_code = 'CA'
 and city_name = 'Alameda'
 -- and city_id = 111088
 -- and yb_hash_code(city_id) = yb_hash_code(111088)
-- limit 100
;


##### Pushdown + Follower | Explain plan (above ^^)
- The pushdown benefit appears as `Remote Filter: ((tbl_cities.city_name)::text = 'Alameda'::text)`
- A little faster than pushdown alone

In [None]:
%%sql

execute  stmt_util_metrics_snap_table;

##### Pushdown + Follower | Metrics (above ^^)
- Reads come only from the session host, with the same seeks and nexts as above

##### Pushdown + Follower | Cleanup

In [None]:
%%sql
SET yb_enable_expression_pushdown=off;
SHOW yb_enable_expression_pushdown;

In [None]:
%%sql
SET session characteristics as transaction read write;
SET yb_read_from_followers=off;
SHOW yb_read_from_followers;

---
# 🌟🌟🌟🌟  All  done! 
In this notebook, you completed the following:

- Created indexes using both hash and range sharding
- Viewed Explain Plans and metrics reports for various queries


## 😊 Next up!
Continue your learning by opening the next notebook, `05_Using_GIN_Indexes.ipynb`. 

Or, to open the notebook from GitPod, run the following:

In [None]:
%%bash
gp open '05_Using_GIN_Indexes.ipynb'