# Lab Requirements and Setup

This lab consists of several Jupyter notebooks. The notebooks run in Gitpod. [Gitpod](https://www.gitpod.io/) facilitates runtime environments where a YugabyteDB database cluster can be deployed. Read the following instructions for requirements and setup of the Gitpod environment.

## About Jupyter notebooks
You will use Jupyter notebooks in this lab to run commands that assign environment variables and to run Yugabyte Cloud Query Language (YCQL) commands. YCQL is based on the Cassandra Query Language (CQL).

There are two types of cells: markdown and code. This is a markdown cell.

You run a code cell by simply selecting the play icon in the cell's left gutter. For code cells, you can modify the code for execution. Certain labs contain challenges or experiments that require you to do just that - modify a code cell and re-run it!

### Requirements
Here are the requirements for this lab:
- Launch the lab in a Gitpod workspace
- Run a three-node YugabyteDB cluster using `yb-ctl`

> Note
>  
> Although a three-node cluster is up and running, Gitpod does not support visiting loopback addresses over a web UI, even if exposed on a different port.
> The web user interfaces are available only for 127.0.0.1. To see all available ports in Gitpod, run `gp ports list` in the terminal.

#### Notebook keyboard shortcuts
The Jupyter extension for Gitpod supports the following keyboard shortcuts:
| Keystroke | Description |
|--|--|
| ESC | Change the cell mode |
| A | Add a cell above |
| B | Add a cell below |
| J or down arrow key | Move the selection to the cell below |
| K or up arrow key | Move the selection to the cell above |
| Ctrl+Enter | Run the currently selected cell |
| Shift+Enter | Run the currently selected cell and insert a new cell immediately below (focus moves to new cell) |
| Alt+Enter | Run the currently selected cell and insert a new cell immediately below (focus remains on current cell) |
| dd | Delete a selected cell |
| z | Undo the last change | 
| M | Switch the cell type to Markdown |
| Y | Switch the cell type to code |
| L | Enable/Disable line numbers |


## Setup steps
Here are the steps to set up this lab:
- Install missing dependencies and restart the notebook
- Create the notebook variables
- Create the `db_ybu` keyspace

### Install missing dependencies and restart the notebook
Run the following cell to ensure that the notebook dependencies are available to the notebook. 

### Create the notebook variables 

> IMPORTANT!
> 
> Do NOT skip running this cell. 
> 

The following Python cell creates and stores variables that all the notebooks in this lab will use. You can view these variables in the Jupyter tab.

- To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell.
- Verify the accuracy of the output values.

In [None]:
# Env variables for Notebook
import os

# read env_vars.env
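env_vars = !cat env_vars.env
for var in env_vars:
    # Skip blank lines, comments, and lines without a KEY=VALUE pair
    if not var.strip() or var.lstrip().startswith('#') or '=' not in var:
        continue
    # Split on the first '=' only, so values may themselves contain '='
    key, value = var.split('=', 1)
    os.environ[key.strip()] = value.strip()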
 

# Local settings (commented out; uncomment when running outside Gitpod)
# MY_YB_PATH=os.environ.get('MY_YB_PATH_LOCAL')
# MY_GITPOD_WORKSPACE_URL=os.environ.get('MY_GITPOD_WORKSPACE_URL_LOCAL')
# MY_SUDO=os.environ.get('MY_SUDO')

# Gitpod specific
MY_YB_PATH=os.environ.get('MY_YB_PATH')
MY_GITPOD_WORKSPACE_URL=os.environ.get('GITPOD_WORKSPACE_URL')

# env_vars defines the following
MY_DB_NAME=os.environ.get('MY_DB_NAME')
MY_HOST_IPv4_01=os.environ.get('MY_HOST_IPv4_01')
MY_HOST_IPv4_02=os.environ.get('MY_HOST_IPv4_02')
MY_HOST_IPv4_03=os.environ.get('MY_HOST_IPv4_03')
MY_TSERVER_WEBSERVER_PORT=os.environ.get('MY_TSERVER_WEBSERVER_PORT')
MY_DATA_DDL_FILE=os.environ.get("MY_DATA_DDL_FILE")
MY_DATA_DML_FILE=os.environ.get("MY_DATA_DML_FILE")
print(MY_DATA_DDL_FILE, MY_DATA_DML_FILE)
MY_UTIL_FUNCTIONS_FILE=os.environ.get("MY_UTIL_FUNCTIONS_FILE")
MY_UTIL_YBTSERVER_METRICS_FILE=os.environ.get("MY_UTIL_YBTSERVER_METRICS_FILE")

# Current directory of project and related child folders
MY_NOTEBOOK_DIR=os.getcwd()
MY_NOTEBOOK_DATA_FOLDER=MY_NOTEBOOK_DIR +'/data'
MY_NOTEBOOK_UTILS_FOLDER=MY_NOTEBOOK_DIR + '/utils'

print(MY_NOTEBOOK_DATA_FOLDER, MY_NOTEBOOK_UTILS_FOLDER)
# Store the notebook values for other notebooks to use

%store MY_DB_NAME
%store MY_YB_PATH
%store MY_GITPOD_WORKSPACE_URL
%store MY_HOST_IPv4_01
%store MY_HOST_IPv4_02
%store MY_HOST_IPv4_03
%store MY_NOTEBOOK_DIR
%store MY_TSERVER_WEBSERVER_PORT
%store MY_NOTEBOOK_DATA_FOLDER
%store MY_NOTEBOOK_UTILS_FOLDER
%store MY_DATA_DDL_FILE
%store MY_DATA_DML_FILE
%store MY_UTIL_FUNCTIONS_FILE
%store MY_UTIL_YBTSERVER_METRICS_FILE
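
The cell above reads `env_vars.env` through a shell escape. As a rough illustration of the same idea without IPython magics, the sketch below parses `KEY=VALUE` lines in plain Python. The function name `parse_env_lines` and the sample lines are hypothetical; the real contents of `env_vars.env` may differ.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines into a dict, skipping blanks and comments."""
    parsed = {}
    for line in lines:
        stripped = line.strip()
        # Ignore empty lines, comment lines, and lines without '='
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        # Split on the first '=' only, so values may contain '='
        key, value = stripped.split("=", 1)
        parsed[key.strip()] = value.strip()
    return parsed


# Hypothetical sample content, for illustration only
sample_lines = [
    "# settings for the lab",
    "",
    "MY_DB_NAME=db_ybu",
    "MY_EXAMPLE_URL=http://127.0.0.1:7000/?a=b",
]
settings = parse_env_lines(sample_lines)
print(settings)
```

Splitting on the first `=` only is the main design choice here: a value such as a URL with a query string stays intact.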

#### YCQL Shell Commands
The following cell runs YCQL shell commands through `ycqlsh`, the YCQL shell, which connects to the cluster. Work down the list of commands, uncommenting only one line at a time under the YCQL Shell Commands comment.

In [None]:
%%bash -s "$MY_YB_PATH"   # Shell Commands
YB_PATH=${1}
cd $YB_PATH

# YCQL Shell Commands
./bin/ycqlsh --execute "HELP SHOW"  
# ./bin/ycqlsh --execute "SHOW VERSION"  
# ./bin/ycqlsh --execute "SHOW HOST"
# ./bin/ycqlsh --execute "HELP DESCRIBE"
# ./bin/ycqlsh --execute "DESC KEYSPACES"
# ./bin/ycqlsh --execute "DESC TABLES"
# ./bin/ycqlsh --execute "HELP LOGIN"
# ./bin/ycqlsh --execute "LOGIN cassandra"
# ./bin/ycqlsh -u cassandra -p cassandra --execute "SHOW HOST"
# ./bin/ycqlsh --execute "EXIT"
# ./bin/ycqlsh --execute "SELECT * FROM system_auth.roles;"  

The results of these commands show how to display information about the database environment and how to select roles, which is important for role-based security. In the preceding commands, note that the `ycqlsh` script runs each command passed with `--execute`. You will use this script to send YCQL commands to the database.
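If you prefer to call `ycqlsh` from a Python cell instead of a `%%bash` cell, one possible approach is sketched below. It relies only on the `--execute` option shown above. The helper names `build_ycqlsh_command` and `run_ycqlsh` and the path `/opt/yugabyte` are hypothetical placeholders, not part of the lab.

```python
import os
import subprocess


def build_ycqlsh_command(yb_path, statement):
    """Return the argument list for running one statement through ycqlsh."""
    return [os.path.join(yb_path, "bin", "ycqlsh"), "--execute", statement]


def run_ycqlsh(yb_path, statement):
    """Run the statement and return ycqlsh's standard output as text.

    Raises subprocess.CalledProcessError if ycqlsh exits with an error.
    """
    result = subprocess.run(
        build_ycqlsh_command(yb_path, statement),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# "/opt/yugabyte" is a placeholder; in the lab you would pass MY_YB_PATH
command = build_ycqlsh_command("/opt/yugabyte", "SHOW VERSION")
print(command)
```

Passing the statement as a single list element avoids shell quoting issues, because no shell interprets the string.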

### Create the `mydemo` keyspace and `employees` table with `ycqlsh`
Run the following cell to connect to the YugabyteDB cluster using `ycqlsh`, then create the `mydemo` keyspace and `employees` table.

In [None]:
%%bash -s "$MY_YB_PATH"   # Create the keyspace, mydemo, and table, employees
YB_PATH=${1}
cd $YB_PATH


./bin/ycqlsh --execute "CREATE KEYSPACE IF NOT EXISTS mydemo;"

./bin/ycqlsh --execute "
  CREATE TABLE IF NOT EXISTS mydemo.employees ( 
      id INT, 
      name TEXT, 
      email TEXT,
      PRIMARY KEY (id)
  );
"

./bin/ycqlsh --execute "DESC keyspaces"


#### Write to the table employees
Seed the employees table with employee data.

In [None]:
%%bash -s "$MY_YB_PATH"   # Populate the employees table
YB_PATH=${1}
cd $YB_PATH

./bin/ycqlsh --execute "
  INSERT INTO mydemo.employees (id, name, email) VALUES 
    (1, 'Bruce Wayne', 'batman@yb.com');
 "

./bin/ycqlsh --execute "
  INSERT INTO mydemo.employees (id, name, email) VALUES 
      (2, 'Clark Kent', 'superman@yb.com');
"

./bin/ycqlsh --execute "
  INSERT INTO mydemo.employees (id, name, email) VALUES 
      (3, 'Peter Parker', 'spiderman@yb.com');
"
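
The three INSERT statements above differ only in their values. As a small sketch, the statements could be generated from a list of rows in Python. The helper names `cql_text` and `build_insert` are hypothetical, and the quoting assumes the CQL convention of doubling a single quote inside a text literal.

```python
def cql_text(value):
    """Quote a Python string as a CQL text literal (single quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(row):
    """Build one INSERT statement for the mydemo.employees table."""
    emp_id, name, email = row
    return (
        "INSERT INTO mydemo.employees (id, name, email) "
        f"VALUES ({int(emp_id)}, {cql_text(name)}, {cql_text(email)});"
    )


# The same rows that the cell above inserts
employees = [
    (1, "Bruce Wayne", "batman@yb.com"),
    (2, "Clark Kent", "superman@yb.com"),
    (3, "Peter Parker", "spiderman@yb.com"),
]
statements = [build_insert(row) for row in employees]
for statement in statements:
    print(statement)
```

Each generated string can then be passed to `ycqlsh --execute`, one statement per call, as in the cell above.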

#### Read from Table 
Verify that the table was seeded with data. In this section, run the following cells to see the difference between searching for all the rows, a particular value, or a range of values by changing the WHERE clause predicate.

In [None]:
%%bash -s "$MY_YB_PATH"   # Data search
YB_PATH=${1}
cd $YB_PATH

# Read all columns from the employees table
./bin/ycqlsh --execute "SELECT * FROM mydemo.employees;"

In [None]:
%%bash -s "$MY_YB_PATH"   
YB_PATH=${1}
cd $YB_PATH

# Find the employee with id = 3
./bin/ycqlsh --execute "
  SELECT id as company_id, name as full_name FROM mydemo.employees WHERE id=3;
"

#### Range Query
To select a range of values, combine a lower-limit predicate and an upper-limit predicate with the AND operator.

In [None]:
%%bash -s "$MY_YB_PATH"   
YB_PATH=${1}
cd $YB_PATH

# Search a range of values
./bin/ycqlsh --execute "
  SELECT * FROM mydemo.employees WHERE name>='B' AND name<'D';
"
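
To see which rows the range predicate above should match, the sketch below applies the same half-open range, `name >= 'B' AND name < 'D'`, to the three seeded rows in plain Python. It assumes that text comparison follows the same ordering as Python string comparison for these ASCII names. The order of rows returned by the database may differ from the list order here.

```python
# The rows seeded earlier in this lab
employees = [
    {"id": 1, "name": "Bruce Wayne", "email": "batman@yb.com"},
    {"id": 2, "name": "Clark Kent", "email": "superman@yb.com"},
    {"id": 3, "name": "Peter Parker", "email": "spiderman@yb.com"},
]


def in_range(value, lower, upper):
    """Return True when lower <= value < upper (lower inclusive, upper exclusive)."""
    return lower <= value < upper


matches = [row["name"] for row in employees if in_range(row["name"], "B", "D")]
print(matches)  # ['Bruce Wayne', 'Clark Kent']
```

`Peter Parker` is excluded because `'P'` sorts after the upper limit `'D'`.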

## Import data from SQL scripts
In this section, you will create a new keyspace, `db_ybu`, into which you will import SQL scripts. Run the next cell to create the `db_ybu` keyspace.

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME" 
YB_PATH=${1}
DB_NAME=${2}

cd $YB_PATH

# Drop the keyspace if it exists
./bin/ycqlsh --execute "
  DROP KEYSPACE IF EXISTS ${DB_NAME};
"
 
# Create the keyspace, db_ybu
./bin/ycqlsh --execute "
  CREATE KEYSPACE ${DB_NAME};
" 

# Switch to the keyspace, db_ybu (applies only within this ycqlsh invocation)
./bin/ycqlsh --execute "USE ${DB_NAME};"

# Describe the keyspace to validate its creation
./bin/ycqlsh --execute "DESCRIBE KEYSPACE ${DB_NAME}"

To verify that the keyspace, db_ybu, was created successfully, you can use the DESCRIBE shell command, also abbreviated as DESC.
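
Because the keyspace name is interpolated into the statements, it can help to check the name before building them. The following sketch is one way to do that in Python. The function `keyspace_statements` is hypothetical, and the regular expression is a deliberately conservative assumption about unquoted identifiers, not the full YCQL grammar.

```python
import re

# Conservative assumption: a letter followed by letters, digits, or underscores
IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def keyspace_statements(name):
    """Return the drop-and-create statements for a keyspace name."""
    if not IDENTIFIER.match(name):
        raise ValueError(f"Unexpected keyspace name: {name!r}")
    return [
        f"DROP KEYSPACE IF EXISTS {name};",
        f"CREATE KEYSPACE {name};",
    ]


statements = keyspace_statements("db_ybu")
for statement in statements:
    print(statement)
```

Rejecting unexpected names up front keeps a stray space or semicolon in the variable from changing the statement that reaches the database.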

### Load DDL and Data from File
The following cell assembles the file paths of the DDL and DML SQL scripts and runs both scripts. The DDL file creates the tables, and then the DML file populates the tables with data.

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME" "$MY_NOTEBOOK_DATA_FOLDER" "$MY_DATA_DDL_FILE" "$MY_DATA_DML_FILE"   
# Wishlist
YB_PATH=${1}
DB_NAME=${2}
DATA_FOLDER=${3}
DATA_DDL_FILE=${4}
DATA_DML_FILE=${5}

# ls $DATA_FOLDER

WISHLIST_DDL_PATH=${DATA_FOLDER}/${DATA_DDL_FILE}
WISHLIST_DML_PATH=${DATA_FOLDER}/${DATA_DML_FILE}

cd $YB_PATH

# DDL file
./bin/ycqlsh -k "${DB_NAME}" -f "${WISHLIST_DDL_PATH}"
sleep 1

# DML file
./bin/ycqlsh -k "${DB_NAME}" -f "${WISHLIST_DML_PATH}"
sleep 1

# List the tables
./bin/ycqlsh --execute "DESCRIBE TABLES"

In the preceding import commands, you ran `ycqlsh` with the `-k` option to connect to the `db_ybu` keyspace and the `-f` option to execute the respective SQL scripts. Note that the DESCRIBE TABLES shell command lists the names of the tables rather than their full schemas. You can also use the DESCRIBE shell command to describe keyspaces, tables, and indexes. For more information about the use of DESCRIBE, use the HELP DESCRIBE command.
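
The import cell joins the data folder with each script name before calling `ycqlsh`. As a sketch of the same path assembly in Python, the function below also checks that both files exist before anything is run. The function name `resolve_script_paths` and the file names `example_ddl.cql` and `example_dml.cql` are hypothetical; the lab takes the real names from `MY_DATA_DDL_FILE` and `MY_DATA_DML_FILE`.

```python
import os
import tempfile


def resolve_script_paths(data_folder, ddl_file, dml_file):
    """Join the folder with each file name and verify that both files exist."""
    ddl_path = os.path.join(data_folder, ddl_file)
    dml_path = os.path.join(data_folder, dml_file)
    missing = [path for path in (ddl_path, dml_path) if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError("Missing script(s): " + ", ".join(missing))
    return ddl_path, dml_path


# Demonstration with placeholder files in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    for name in ("example_ddl.cql", "example_dml.cql"):
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("-- placeholder\n")
    ddl_path, dml_path = resolve_script_paths(
        folder, "example_ddl.cql", "example_dml.cql"
    )
    demo_names = (os.path.basename(ddl_path), os.path.basename(dml_path))

print(demo_names)
```

Failing early with a clear message is easier to diagnose than an error from `ycqlsh -f` about a file it cannot open.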

### Verify the tables were seeded
Run the following cells to verify that each of the three tables was populated.

In [None]:
%%bash -s "$MY_YB_PATH"   # Verify data population
YB_PATH=${1}
cd $YB_PATH

# Read data in the tbl_products_by_category
./bin/ycqlsh --execute "
  SELECT * FROM db_ybu.tbl_products_by_category;
"  

In [None]:
%%bash -s "$MY_YB_PATH"  
YB_PATH=${1}
cd $YB_PATH

# Read data in the tbl_wishlists_by_user 
./bin/ycqlsh --execute "
  SELECT * FROM db_ybu.tbl_wishlists_by_user;
" 

In [None]:
%%bash -s "$MY_YB_PATH"   
YB_PATH=${1}
cd $YB_PATH

# Read data in the tbl_products_by_wishlist
./bin/ycqlsh --execute "
  SELECT * FROM db_ybu.tbl_products_by_wishlist;
"  

In [None]:
%%bash -s "$MY_YB_PATH"   # Drop tables
YB_PATH=${1}
cd $YB_PATH

./bin/ycqlsh --execute "
  DROP TABLE IF EXISTS db_ybu.tbl_products_by_wishlist;
" 

./bin/ycqlsh --execute "
  DROP TABLE IF EXISTS db_ybu.tbl_products_by_category;
" 

./bin/ycqlsh --execute "
  DROP TABLE IF EXISTS db_ybu.tbl_wishlists_by_user;
" 

./bin/ycqlsh --execute "
  DROP KEYSPACE IF EXISTS db_ybu;
" 

---
# All done!
In this lab, you completed the following:

- Setup
  - Created the `mydemo` keyspace with `ycqlsh`
  - Created tables and loaded data using DDL and DML scripts

In [None]:
%%bash
gp open 02_YCQL_Intermediate.ipynb