<div style="width:100%; background-color: #000041"><a target="_blank" href="http://university.yugabyte.com"><img src="assets/YBU_Logo.webp" /></a></div><br>

> **YugabyteDB YCQL Development**
>
> Enroll for free at [Yugabyte University](https://university.yugabyte.com/courses/yugabytedb-ycql-development).
>

# Query-driven data model: Secondary indexes
In this notebook, you will learn how to create secondary indexes to not only improve query performance, but also remove unnecessary tables from the data model.

### Import the notebook variables 

> Requirements:
>
> You must first create the variables in the `01_Setup.ipynb` notebook.
>

The following Python cell reads the stored variables created in the `01_Setup.ipynb` notebook. 

- To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell.

In [None]:
%store -r MY_DB_NAME
%store -r MY_YB_PATH
%store -r MY_YB_PATH_DATA
%store -r MY_GITPOD_WORKSPACE_URL
%store -r MY_HOST_IPv4_01
%store -r MY_HOST_IPv4_02
%store -r MY_HOST_IPv4_03
%store -r MY_NOTEBOOK_DIR
%store -r MY_TSERVER_WEBSERVER_PORT
%store -r MY_NOTEBOOK_DATA_FOLDER
%store -r MY_YB_MASTER_HOST_GITPOD_URL
%store -r MY_YB_TSERVER_HOST_GITPOD_URL
%store -r MY_DATA_DDL_FILE
%store -r MY_DATA_DML_FILE

#### JSON

JSONB is generally considered the best way to store complex data structures in YCQL because its contents are searchable, which is not true of the other YCQL collection types. A JSON document can also contain collections of its own, such as arrays. Run the following cell and note the `->` and `->>` operators used to read values from a JSON object stored in a YCQL table.

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"   # Query plan: Sequential scan  
YB_PATH=${1}
DB_NAME=${2}  
cd $YB_PATH/bin

# DB_NAME=ks_ybu
./ycqlsh -r -k $DB_NAME -e "
  select category, product_name, product_id, price, 
    (sku_details->'tags'->>0) as tag1, 
    (sku_details->'tags'->>1) as tag2,
    (sku_details->'tags'->>2) as tag3,
    (sku_details->'tags'->>3) as tag4,
    (sku_details->'tags'->>4) as tag5,
    (sku_details->'tags'->>5) as tag6,
    (sku_details->'colors'->>0) as color1, 
    (sku_details->'colors'->>1) as color2,
    (sku_details->'colors'->>2) as color3,
    (sku_details->'colors'->>3) as color4,
    (sku_details->'colors'->>4) as color5,
    (sku_details->'colors'->>5) as color6,
    (sku_details->'colors'->>6) as color7
  from tbl_products_by_category
  where (sku_details->>'kid_friendly')='true' 
  ;
"
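As a rough mental model of the two operators used in the preceding cell, `->` steps into a JSON object or array and keeps the result as JSON, while `->>` returns the result as text. The following Python sketch imitates that behavior on a made-up `sku_details` document. The sample values and the helper names `arrow` and `arrow_text` are assumptions for illustration only; they are not part of YCQL or the course data.

```python
import json

# Hypothetical sku_details value; the real rows come from the course's DML script.
sku_details = json.loads(
    '{"kid_friendly": true, "country": "US",'
    ' "tags": ["water", "bottle"], "colors": ["blue", "red"]}'
)

def arrow(doc, key):
    """Imitate ->: step into an object key or array index and keep the JSON value."""
    try:
        return doc[key]
    except (KeyError, IndexError, TypeError):
        return None  # a missing key or an out-of-range index yields no value

def arrow_text(doc, key):
    """Imitate ->>: like ->, but return the value as text."""
    value = arrow(doc, key)
    if value is None:
        return None
    return value if isinstance(value, str) else json.dumps(value)

first_color = arrow_text(arrow(sku_details, "colors"), 0)   # 'blue'
kid_friendly = arrow_text(sku_details, "kid_friendly")      # 'true' (text, not a boolean)
sixth_tag = arrow_text(arrow(sku_details, "tags"), 5)       # None, the array has two items
print(first_color, kid_friendly, sixth_tag)
```

This is why the `where` clause compares `sku_details->>'kid_friendly'` with the string `'true'`: the `->>` operator hands back text even when the stored JSON value is a boolean.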

### Update JSONB

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"   # Update a JSONB attribute
YB_PATH=${1}
DB_NAME=${2}  
cd $YB_PATH/bin

# DB_NAME=ks_ybu
./ycqlsh -r -k $DB_NAME -e "
 update tbl_products_by_category
   set sku_details->'kid_friendly' = 'false'
   where category ='H20'
     and product_name = 'Drip 5'
     and product_id = 23484 ;

  update tbl_products_by_category
   set sku_details->'kid_friendly' = 'false'
   where category ='H20'
     and product_name = 'Aqua 5'
     and product_id = 45693 ;
 "
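The `update` statements in the preceding cell replace a single attribute inside the JSONB document and leave the rest of the document untouched. The following is a minimal Python sketch of that effect, assuming the right-hand side of the assignment is read as a JSON literal (so `'false'` becomes the boolean `false`). The sample document and the helper `set_attribute` are hypothetical.

```python
import json

def set_attribute(document_text, key, json_literal):
    """Return the document with one top-level attribute replaced."""
    document = json.loads(document_text)
    # The new value is parsed as JSON: 'false' -> False, '"US"' -> "US".
    document[key] = json.loads(json_literal)
    return json.dumps(document, sort_keys=True)

before = '{"country": "US", "kid_friendly": true}'
after = set_attribute(before, "kid_friendly", "false")
print(after)  # {"country": "US", "kid_friendly": false}
```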

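### Query plan

The saved output below was captured after the JSONB index in the next section had already been created (compare the cell execution counts), which is why it reports an index-only scan rather than a scan of the whole table.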

In [84]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"   # Query plan: Sequential scan  
YB_PATH=${1}
DB_NAME=${2}  
cd $YB_PATH/bin

# DB_NAME=ks_ybu
./ycqlsh -r -k $DB_NAME -e "
  explain select  category, product_name, product_id, brand, description, price, gtin
  from tbl_products_by_category
  where (sku_details->>'kid_friendly')='true' 
    and (sku_details->>'country')='US' 
    and (sku_details->'colors'->>0)='blue'
  ;
"


 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using ks_ybu.idx_products_by_catgegory_jsonb on ks_ybu.tbl_products_by_category                                          
   Key Conditions: (sku_details->>'country' = 'US') AND (sku_details->>'kid_friendly' = 'true') AND (sku_details->'colors'->>'0' = 'blue')



#### JSONB Index

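An index on JSONB attributes lets YCQL answer the preceding query from the index instead of reading every row of the table. The following cells drop the index if it exists, then create it on the three attributes used in the `where` clause and include the remaining selected columns.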

In [79]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"   # Drop the JSONB index
YB_PATH=${1}
DB_NAME=${2}  
cd $YB_PATH/bin

# DB_NAME=ks_ybu
./ycqlsh -r -k $DB_NAME -e "
  drop index if exists idx_products_by_catgegory_jsonb
  ;
"

In [82]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"   # Create the JSONB index
YB_PATH=${1}
DB_NAME=${2}  
cd $YB_PATH/bin

# DB_NAME=ks_ybu
./ycqlsh -r -k $DB_NAME -e "
  create index if not exists idx_products_by_catgegory_jsonb
  on tbl_products_by_category ( (
     sku_details->>'country', 
     sku_details->>'kid_friendly',
     sku_details->'colors'->>0
    ) )
  include (
    brand,
    price,
    description,
    gtin
    )
  ;
"
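One way to picture the index created in the preceding cell: each row is filed under the tuple of its three indexed expressions, and the `include` columns are stored next to that key, so a query that supplies all three values and selects only key or included columns does not need to visit the table. The Python sketch below models this with a dictionary. The sample rows and helper names are assumptions, and the primary key columns are inferred from the `where` clauses of the earlier `update` statements.

```python
from collections import defaultdict

# Hypothetical rows; the real table is loaded by the course's DML script.
rows = [
    {"category": "A", "product_name": "Widget", "product_id": 1,
     "brand": "Acme", "price": 9.99, "description": "demo", "gtin": "000",
     "sku_details": {"country": "US", "kid_friendly": True, "colors": ["blue", "red"]}},
    {"category": "B", "product_name": "Gadget", "product_id": 2,
     "brand": "Acme", "price": 19.99, "description": "demo", "gtin": "001",
     "sku_details": {"country": "CA", "kid_friendly": False, "colors": ["green"]}},
]

PRIMARY_KEY = ("category", "product_name", "product_id")  # assumed from the update statements
INCLUDED = ("brand", "price", "description", "gtin")      # the include (...) list

def index_key(row):
    """Text values of the three indexed expressions, in index order."""
    details = row["sku_details"]
    return (details["country"],
            "true" if details["kid_friendly"] else "false",
            details["colors"][0])

def build_index(table_rows):
    """Map each index key to entries holding only primary-key and included columns."""
    index = defaultdict(list)
    for row in table_rows:
        entry = {name: row[name] for name in PRIMARY_KEY + INCLUDED}
        index[index_key(row)].append(entry)
    return index

index = build_index(rows)
# Equivalent of the where clause: country = 'US', kid_friendly = 'true', colors[0] = 'blue'.
matches = index.get(("US", "true", "blue"), [])
print(matches)
```

Because the lookup key is the full tuple, a query that omits one of the three conditions cannot be answered by this dictionary lookup, which mirrors why the query in this notebook filters on all three expressions.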

Query plan with the JSONB index in place:

In [83]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"   # Query plan: Index-only scan
YB_PATH=${1}
DB_NAME=${2}  
cd $YB_PATH/bin

# DB_NAME=ks_ybu
./ycqlsh -r -k $DB_NAME -e "
  explain select  category, product_name, product_id, brand, description, price, gtin
  from tbl_products_by_category
  where (sku_details->>'kid_friendly')='true' 
    and (sku_details->>'country')='US' 
    and (sku_details->'colors'->>0)='blue'
  ;
"


 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using ks_ybu.idx_products_by_catgegory_jsonb on ks_ybu.tbl_products_by_category                                          
   Key Conditions: (sku_details->>'country' = 'US') AND (sku_details->>'kid_friendly' = 'true') AND (sku_details->'colors'->>'0' = 'blue')



In the preceding cells, note the syntax required to create an index on JSON keys and the syntax used to search by a key in a JSON object.
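If you want to confirm from Python that a plan uses the index rather than reading it by eye, a small text check is enough. The sketch below parses plan text shaped like the output shown in this notebook; the helpers `scan_node` and `uses_index` are hypothetical, and the parsing assumes the first non-header line names the scan.

```python
def scan_node(plan_text):
    """Return the first plan line that is neither the header nor the dashed rule."""
    for line in plan_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped == "QUERY PLAN" or set(stripped) <= {"-"}:
            continue
        return stripped
    return None

def uses_index(plan_text, index_name):
    """True when the top plan node is an index scan that names the given index."""
    node = scan_node(plan_text)
    return node is not None and node.startswith("Index") and index_name in node

# Plan text copied from the saved output in this notebook.
plan = """
 QUERY PLAN
-----------------------------------------------------------------
 Index Only Scan using ks_ybu.idx_products_by_catgegory_jsonb on ks_ybu.tbl_products_by_category
   Key Conditions: (sku_details->>'country' = 'US') AND (sku_details->>'kid_friendly' = 'true') AND (sku_details->'colors'->>'0' = 'blue')
"""
print(uses_index(plan, "idx_products_by_catgegory_jsonb"))  # True
```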


---
# All done!
In this lab, you completed the following:

- Setup
  - Created the `ks_ybu` database with `ycqlsh`
  - Created tables and loaded data using DDL and DML scripts