# Lab Requirements and Setup

This lab consists of several Jupyter notebooks and runs in Gitpod using VS Code.  Follow the instructions for requirements and setup.

## About Jupyter notebooks
A notebook consists of one or more cells. In VS Code, notebook cells are editable.

There are two types of cells: markdown and code. This is a markdown cell.

You run a code cell by simply selecting the play icon in the cell's left gutter. For code cells, you can modify the code for execution. Certain labs contain challenges or experiments that require you to do just that - modify a code cell and re-run it!
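
To make the cell model concrete, here is a minimal sketch that counts cells by type in a notebook document. It assumes the standard `.ipynb` JSON layout, where a top-level `cells` list holds objects with a `cell_type` field. The sample notebook is hand-written for illustration and is not one of the lab notebooks.

```python
import json
from collections import Counter


def count_cell_types(notebook_json: str) -> Counter:
    """Count cells by type in a notebook given as a JSON string."""
    notebook = json.loads(notebook_json)
    return Counter(cell["cell_type"] for cell in notebook.get("cells", []))


# A tiny, hand-written notebook document (hypothetical content).
sample_notebook = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hi')"],
         "outputs": [], "execution_count": None},
    ],
})

counts = count_cell_types(sample_notebook)
print(counts["markdown"], counts["code"])
```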

### Requirements
Here are the requirements for this lab:
- Launch the lab in a Gitpod workspace
- Run a three-node YugabyteDB cluster using `yb-ctl`

> Note
>
> Although a three-node cluster is up and running, Gitpod does not support visiting loopback addresses through a web UI, even when they are exposed on a different port.
> 127.0.0.1 is the only address whose web user interface you can reach. To see all available ports in Gitpod, run `gp ports list` in the terminal.
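
As a rough illustration of how a forwarded port maps to a browser URL, the sketch below prefixes the workspace host with the port number. This naming pattern is an assumption about Gitpod, and the workspace URL and port shown are hypothetical, so confirm the real URLs with `gp ports list`.

```python
from urllib.parse import urlsplit, urlunsplit


def gitpod_port_url(workspace_url: str, port: int) -> str:
    """Prefix the workspace host with '<port>-' (assumed Gitpod URL pattern)."""
    parts = urlsplit(workspace_url)
    return urlunsplit((parts.scheme, f"{port}-{parts.netloc}", parts.path, "", ""))


# Hypothetical workspace URL and port, for illustration only.
example_url = gitpod_port_url("https://example-workspace.ws-us1.gitpod.io", 7000)
print(example_url)
```

In the lab, the first argument would come from `MY_GITPOD_WORKSPACE_URL` and the port from a variable such as `MY_TSERVER_WEBSERVER_PORT`.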

#### Notebook keyboard shortcuts
The Jupyter extension for Gitpod supports the following keyboard shortcuts:

| Keystroke | Description |
|--|--|
| ESC | Change the cell mode |
| A | Add a cell above |
| B | Add a cell below |
| J or down arrow key | Move to the cell below |
| K or up arrow key | Move to the cell above |
| Ctrl+Enter | Run the currently selected cell |
| Shift+Enter | Run the currently selected cell and insert a new cell immediately below (focus moves to new cell) |
| Alt+Enter | Run the currently selected cell and insert a new cell immediately below (focus remains on current cell) |
| dd | Delete a selected cell |
| z | Undo the last change |
| M | Switch the cell type to Markdown |
| Y | Switch the cell type to code |
| L | Enable or disable line numbers |


## Setup steps
Here are the steps to set up this lab:
- Install missing dependencies and restart the notebook
- Create the notebook variables
- Create the `db_ybu` keyspace

### Install missing dependencies and restart the notebook
Run the following cell to ensure that the notebook dependencies are available to the notebook. 

### Create the notebook variables 

> IMPORTANT!
> 
> Do NOT skip running this cell. 
> 

The following Python cell creates and stores variables that all the notebooks in this lab will use. You can view these variables in the Jupyter tab.

- To run the script, select Execute Cell (the play arrow) in the left gutter of the cell.
- Verify the accuracy of the output values.

In [2]:
# Env variables for Notebook
import os

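# Read env_vars.env and export each KEY=VALUE pair
env_vars = !cat env_vars.env
for var in env_vars:
    # Skip blank lines and comments; split on the first '=' only
    if not var.strip() or var.lstrip().startswith('#'):
        continue
    key, value = var.split('=', 1)
    os.environ[key] = value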
 

# Comment out Local
# MY_YB_PATH=os.environ.get('MY_YB_PATH_LOCAL')
# MY_GITPOD_WORKSPACE_URL=os.environ.get('MY_GITPOD_WORKSPACE_URL_LOCAL')
# MY_SUDO=os.environ.get('MY_SUDO')

# Gitpod specific
MY_YB_PATH=os.environ.get('MY_YB_PATH')
MY_GITPOD_WORKSPACE_URL=os.environ.get('GITPOD_WORKSPACE_URL')

# env_vars defines the following
MY_DB_NAME=os.environ.get('MY_DB_NAME')
MY_HOST_IPv4_01=os.environ.get('MY_HOST_IPv4_01')
MY_HOST_IPv4_02=os.environ.get('MY_HOST_IPv4_02')
MY_HOST_IPv4_03=os.environ.get('MY_HOST_IPv4_03')
MY_TSERVER_WEBSERVER_PORT=os.environ.get('MY_TSERVER_WEBSERVER_PORT')
MY_DATA_DDL_FILE=os.environ.get("MY_DATA_DDL_FILE")
MY_DATA_DML_FILE=os.environ.get("MY_DATA_DML_FILE")
print(MY_DATA_DDL_FILE, MY_DATA_DML_FILE)
MY_UTIL_FUNCTIONS_FILE=os.environ.get("MY_UTIL_FUNCTIONS_FILE")
MY_UTIL_YBTSERVER_METRICS_FILE=os.environ.get("MY_UTIL_YBTSERVER_METRICS_FILE")

# Current directory of project and related child folders
MY_NOTEBOOK_DIR=os.getcwd()
MY_NOTEBOOK_DATA_FOLDER=MY_NOTEBOOK_DIR +'/data'
MY_NOTEBOOK_UTILS_FOLDER=MY_NOTEBOOK_DIR + '/utils'

print(MY_NOTEBOOK_DATA_FOLDER, MY_NOTEBOOK_UTILS_FOLDER)
# Store the notebook values for other notebooks to use

%store MY_DB_NAME
%store MY_YB_PATH
%store MY_GITPOD_WORKSPACE_URL
%store MY_HOST_IPv4_01
%store MY_HOST_IPv4_02
%store MY_HOST_IPv4_03
%store MY_NOTEBOOK_DIR
%store MY_TSERVER_WEBSERVER_PORT
%store MY_NOTEBOOK_DATA_FOLDER
%store MY_NOTEBOOK_UTILS_FOLDER
%store MY_DATA_DDL_FILE
%store MY_DATA_DML_FILE
%store MY_UTIL_FUNCTIONS_FILE
%store MY_UTIL_YBTSERVER_METRICS_FILE

wishlist_ddl.sql wishlist_dml.sql
/Users/markkim/Documents/YBU_repos/jupyter/YCQL-01/data /Users/markkim/Documents/YBU_repos/jupyter/YCQL-01/utils
Stored 'MY_DB_NAME' (str)
Stored 'MY_YB_PATH' (str)
Stored 'MY_GITPOD_WORKSPACE_URL' (str)
Stored 'MY_HOST_IPv4_01' (str)
Stored 'MY_HOST_IPv4_02' (str)
Stored 'MY_HOST_IPv4_03' (str)
Stored 'MY_NOTEBOOK_DIR' (str)
Stored 'MY_TSERVER_WEBSERVER_PORT' (str)
Stored 'MY_NOTEBOOK_DATA_FOLDER' (str)
Stored 'MY_NOTEBOOK_UTILS_FOLDER' (str)
Stored 'MY_DATA_DDL_FILE' (str)
Stored 'MY_DATA_DML_FILE' (str)
Stored 'MY_UTIL_FUNCTIONS_FILE' (str)
Stored 'MY_UTIL_YBTSERVER_METRICS_FILE' (str)
Stored 'MY_SUDO' (str)
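
The env-file parsing step in the cell above can also be sketched in plain Python, without the `!cat` shell escape. This is a minimal sketch that assumes the file holds simple `KEY=VALUE` lines. The helper name `parse_env_lines` and the `MY_URL` entry are hypothetical and exist only for illustration.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    parsed = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first '=' only
        parsed[key.strip()] = value.strip()
    return parsed


# Sample lines standing in for the contents of env_vars.env.
sample = ["MY_DB_NAME=db_ybu", "", "# comment", "MY_URL=https://example.test/?a=b"]
env = parse_env_lines(sample)
print(env)
```

Splitting on the first `=` only keeps values that themselves contain `=`, such as URLs with query strings, intact.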


#### YCQL Shell Commands
The following commands are YCQL shell commands that use `ycqlsh` to connect to the YugabyteDB cluster.

In [5]:
%%bash -s "$MY_YB_PATH"   # READ TABLES
YB_PATH=${1}
cd $YB_PATH

# YCQL Commands
./bin/ycqlsh --execute "HELP SHOW"  
# ./bin/ycqlsh --execute "SHOW VERSION"  
# ./bin/ycqlsh --execute "SHOW HOST"
# ./bin/ycqlsh --execute "HELP DESCRIBE"
# ./bin/ycqlsh --execute "DESC KEYSPACES"
# ./bin/ycqlsh --execute "DESC TABLES"
# ./bin/ycqlsh --execute "HELP LOGIN"
# ./bin/ycqlsh --execute "LOGIN cassandra"
# ./bin/ycqlsh -u cassandra -p cassandra --execute "SHOW HOST"
# ./bin/ycqlsh --execute "EXIT"
# ./bin/ycqlsh --execute "SELECT * FROM system_auth.roles;"  


        SHOW [ycqlsh only]

          Displays information about the current ycqlsh session. Can be called in
          the following ways:

        SHOW VERSION

          Shows the version and build of the connected Cassandra instance, as
          well as the versions of the CQL spec and the Thrift protocol that
          the connected Cassandra instance understands.

        SHOW HOST

          Shows where ycqlsh is currently connected.

        SHOW SESSION <sessionid>

          Pretty-prints the requested tracing session.
        


The results of these commands show how to display information about the database environment and how to select roles, which is important for role-based security.
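
If you prefer to drive `ycqlsh` from Python instead of a `%%bash` cell, one option is to build the argument list first and inspect it before running anything. The sketch below only builds the command and never executes it. The helper name `ycqlsh_command` and the `/opt/yugabyte` path are hypothetical, while the `-k` and `--execute` flags are the ones used in this lab's cells.

```python
import shlex


def ycqlsh_command(yb_path, statement, keyspace=None):
    """Return the argv list for one `ycqlsh --execute` call."""
    argv = [f"{yb_path.rstrip('/')}/bin/ycqlsh"]
    if keyspace:
        argv += ["-k", keyspace]
    argv += ["--execute", statement]
    return argv


# Hypothetical install path; in the lab this would be MY_YB_PATH.
cmd = ycqlsh_command("/opt/yugabyte", "SHOW VERSION")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would then run it without going through a shell, which avoids quoting problems in the statement text.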

### Create the `mydemo` keyspace and `employees` table with `ycqlsh`
Run the following cell to connect to the YugabyteDB cluster using `ycqlsh`, then create the `mydemo` keyspace and `employees` table.

In [11]:
%%bash -s "$MY_YB_PATH"   # READ TABLES
YB_PATH=${1}
cd $YB_PATH


# Create the keyspace, then the table inside it
./bin/ycqlsh --execute "CREATE KEYSPACE IF NOT EXISTS mydemo;"
./bin/ycqlsh --execute "CREATE TABLE IF NOT EXISTS mydemo.employees (
    id INT,
    name TEXT,
    email TEXT,
    PRIMARY KEY (id)
);"

# List the keyspaces to confirm that mydemo exists
./bin/ycqlsh --execute "DESC keyspaces"



system_auth  mydemo  system_schema  system



#### Write to the table employees
Seed the employees table with employee data.

In [7]:
%%bash -s "$MY_YB_PATH"   # READ TABLES
YB_PATH=${1}
cd $YB_PATH

./bin/ycqlsh --execute "
  INSERT INTO mydemo.employees (id, name, email) VALUES 
    (1, 'Bruce Wayne', 'batman@yb.com');
 "

./bin/ycqlsh --execute "
INSERT INTO mydemo.employees (id, name, email) VALUES 
    (2, 'Clark Kent', 'superman@yb.com');
"

./bin/ycqlsh --execute "
INSERT INTO mydemo.employees (id, name, email) VALUES 
    (3, 'Peter Parker', 'spiderman@yb.com');
"

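The three `INSERT` statements follow one pattern, so they can be generated from a list of rows. The sketch below builds the statement text only. The helper names `cql_text` and `employee_insert` are hypothetical, and doubling single quotes is the usual CQL way to escape text literals.

```python
def cql_text(value: str) -> str:
    """Quote a Python string as a CQL text literal (single quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def employee_insert(emp_id: int, name: str, email: str) -> str:
    """Build one INSERT statement for the mydemo.employees table."""
    return (
        "INSERT INTO mydemo.employees (id, name, email) "
        f"VALUES ({int(emp_id)}, {cql_text(name)}, {cql_text(email)});"
    )


employees = [
    (1, "Bruce Wayne", "batman@yb.com"),
    (2, "Clark Kent", "superman@yb.com"),
    (3, "Peter Parker", "spiderman@yb.com"),
]
statements = [employee_insert(*row) for row in employees]
print("\n".join(statements))
```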

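#### Read from Table
Verify that the `employees` table was seeded with data. The sample output below comes from the commented-out query that filters on `id=3`; uncomment that query to reproduce it.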

In [39]:
%%bash -s "$MY_YB_PATH"   # READ TABLES
YB_PATH=${1}
cd $YB_PATH

# # Read
./bin/ycqlsh --execute "SELECT * FROM mydemo.employees;"
# ./bin/ycqlsh --execute "SELECT id as company_id, name as full_name FROM mydemo.employees WHERE id=3;"
# ./bin/ycqlsh --execute "SELECT * FROM mydemo.employees WHERE name >= 'B' AND name < 'D';"


 company_id | full_name
------------+--------------
          3 | Peter Parker

(1 rows)
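
The commented-out range query (`name >= 'B' AND name < 'D'`) selects names that sort between `B` and `D`. As a small sketch of which seeded rows fall in that range, the same comparison can be done in Python. This assumes the database orders these ASCII names the same way Python's string comparison does.

```python
# Names seeded into mydemo.employees earlier in this lab.
names = ["Bruce Wayne", "Clark Kent", "Peter Parker"]

# Keep names that sort at or after "B" and strictly before "D".
in_range = [name for name in names if "B" <= name < "D"]
print(in_range)
```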


#### Data Load
In this section, you populate a keyspace using SQL script files. First you create a new keyspace, then you load the DDL and DML files into YugabyteDB.

In [12]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"  # create database
YB_PATH=${1}
DB_NAME=${2}

cd $YB_PATH

# Drop and re-create the keyspace
./bin/ycqlsh --execute "DROP KEYSPACE IF EXISTS ${DB_NAME};"
# 127.0.0.1

./bin/ycqlsh --execute "CREATE KEYSPACE ${DB_NAME}
  WITH REPLICATION =
  { 'class': 'SimpleStrategy', 'replication_factor': 1 };"
# 127.0.0.1

# USE applies only to this single ycqlsh session; later cells pass -k or a qualified table name
./bin/ycqlsh --execute "USE ${DB_NAME};"

# List the tables in every keyspace to validate that the new keyspace exists
./bin/ycqlsh --execute "DESCRIBE TABLES"


Keyspace db_ybu
---------------
<empty>

Keyspace system_auth
--------------------
roles  role_permissions  resource_role_permissions_index

Keyspace mydemo
---------------
employees

Keyspace system_schema
----------------------
aggregates  types  columns  functions  keyspaces
indexes     views  tables   triggers 

Keyspace system
---------------
local  size_estimates  peers  partitions
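
Because the keyspace name is interpolated directly into the statements, it can help to validate it first. The sketch below builds the same drop and create statements after checking the name against a conservative identifier pattern. Both the pattern and the helper name `keyspace_statements` are assumptions made for this example, not rules taken from the lab.

```python
import re

# Conservative rule (an assumption): a letter, then letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def keyspace_statements(name: str):
    """Return the DROP and CREATE statements for a validated keyspace name."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe keyspace name: {name!r}")
    return [
        f"DROP KEYSPACE IF EXISTS {name};",
        f"CREATE KEYSPACE {name} WITH REPLICATION = "
        "{ 'class': 'SimpleStrategy', 'replication_factor': 1 };",
    ]


drop_stmt, create_stmt = keyspace_statements("db_ybu")
print(drop_stmt)
print(create_stmt)
```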



## Load DDL and Data from File
Run the following cell to load the DDL and DML statements from the script files into the keyspace named by `MY_DB_NAME`.

In [16]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME" "$MY_NOTEBOOK_DATA_FOLDER" "$MY_DATA_DDL_FILE" "$MY_DATA_DML_FILE"   
# Wishlist
YB_PATH=${1}
DB_NAME=${2}
DATA_FOLDER=${3}
DATA_DDL_FILE=${4}
DATA_DML_FILE=${5}

# ls $DATA_FOLDER

WISHLIST_DDL_PATH=${DATA_FOLDER}/${DATA_DDL_FILE}
WISHLIST_DML_PATH=${DATA_FOLDER}/${DATA_DML_FILE}

cd $YB_PATH

# DDL file
./bin/ycqlsh -k "${DB_NAME}" -f "${WISHLIST_DDL_PATH}"
sleep 1

# DML file
./bin/ycqlsh -k "${DB_NAME}" -f "${WISHLIST_DML_PATH}"
sleep 1

# Describe tables
./bin/ycqlsh --execute "DESCRIBE TABLES"


Keyspace db_ybu
---------------
tbl_products_by_category  tbl_products_by_wishlist  tbl_wishlists_by_user

Keyspace system_auth
--------------------
roles  role_permissions  resource_role_permissions_index

Keyspace mydemo
---------------
employees

Keyspace system_schema
----------------------
aggregates  types  columns  functions  keyspaces
indexes     views  tables   triggers 

Keyspace system
---------------
local  size_estimates  peers  partitions
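
Before handing the script files to `ycqlsh -f`, it can be useful to confirm that both files exist. This minimal sketch resolves the two paths and fails early if either is missing. The helper name `resolve_script_paths` is hypothetical, and a temporary folder with placeholder files stands in for `MY_NOTEBOOK_DATA_FOLDER`.

```python
import tempfile
from pathlib import Path


def resolve_script_paths(data_folder, ddl_file, dml_file):
    """Return (ddl_path, dml_path); raise FileNotFoundError if either is missing."""
    paths = [Path(data_folder) / ddl_file, Path(data_folder) / dml_file]
    for path in paths:
        if not path.is_file():
            raise FileNotFoundError(path)
    return tuple(paths)


# Demonstrate with a throwaway folder and placeholder script files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("wishlist_ddl.sql", "wishlist_dml.sql"):
        (Path(folder) / name).write_text("-- placeholder\n")
    ddl_path, dml_path = resolve_script_paths(
        folder, "wishlist_ddl.sql", "wishlist_dml.sql"
    )
    print(ddl_path.name, dml_path.name)
```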



In [1]:
%%bash -s "$MY_YB_PATH"   # READ TABLES
YB_PATH=${1}
cd $YB_PATH

# Read the loaded tables
./bin/ycqlsh --execute "SELECT * FROM db_ybu.tbl_products_by_category;"
# ./bin/ycqlsh --execute "SELECT * FROM db_ybu.tbl_wishlists_by_user;"
# ./bin/ycqlsh --execute "SELECT * FROM db_ybu.tbl_products_by_wishlist;"

# Describe the tables
# ./bin/ycqlsh --execute "DESCRIBE db_ybu.tbl_products_by_category"
# ./bin/ycqlsh --execute "DESCRIBE db_ybu.tbl_wishlists_by_user"
# ./bin/ycqlsh --execute "DESCRIBE db_ybu.tbl_products_by_wishlist"

> Note
>
> If this cell fails with `bash: line 2: cd: $MY_YB_PATH: No such file or directory` followed by `./bin/ycqlsh: No such file or directory` and exit status 127, the `MY_YB_PATH` variable is not defined in the current kernel, so the literal text `$MY_YB_PATH` was passed to bash. Rerun the cell under "Create the notebook variables", or restore the stored value with `%store -r MY_YB_PATH`, and then rerun this cell.

---
# All done!
In this lab, you completed the following:

- Setup
  - Created the `mydemo` keyspace with `ycqlsh`
  - Created tables and loaded data using DDL and DML scripts