<div style="width:100%; background-color: #000041"><a target="_blank" href="http://university.yugabyte.com"><img src="assets/YBU_Logo.png" /></a></div>

> **YugabyteDB YSQL Development**
>
> Enroll for free at [Yugabyte University](https://university.yugabyte.com/).
>

# About these notebook labs
The labs for this course consist of several Jupyter notebooks that run in Gitpod. [Gitpod](https://www.gitpod.io/) provides runtime environments using Ubuntu and VS Code in a Chromium (browser) environment. 

A Gitpod instance belongs to your Gitpod workspace. The instance automatically runs a three-node YugabyteDB cluster. YugabyteDB is the world's #1 open source, distributed SQL database. 

> Although a three-node cluster is up and running, Gitpod does not support visiting the other loopback addresses through a web UI, even if they are exposed on a different port.
> The web user interface on 127.0.0.1 is the only one that is exposed. To see all available ports in Gitpod, run `gp ports list` in the Terminal tab.
> 
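
The setup cell later in this notebook derives the Gitpod URL for a web UI by prefixing the port number to the workspace host name. The sketch below shows that idea in isolation. The helper name `gitpod_port_url` and the example workspace URL are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def gitpod_port_url(workspace_url: str, port: int) -> str:
    """Return the workspace URL with '<port>-' prefixed to the host name."""
    parts = urlsplit(workspace_url)
    # Only the host part changes; scheme, path, query, and fragment are kept
    host = f"{port}-{parts.netloc}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# Hypothetical workspace URL, for illustration only
example_url = "https://my-workspace.ws-us1.gitpod.io"
print(gitpod_port_url(example_url, 7000))
```

In the notebook itself, the workspace URL comes from the `GITPOD_WORKSPACE_URL` environment variable rather than a hard-coded string.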

## 🛠️ Requirements
Here are the requirements for this lab:
- ✅ Launch using this Gitpod workspace, which you are already doing
- ✅ A running, standalone MySQL database, which is already running
- ✅ A running, 3 node YugabyteDB cluster, which is already running
- ☑️ A Python 3.8.13 or higher kernel for the notebook, *which you need to select right now!!!*

### Select your notebook kernel
- In the Notebook toolbar, click **Select Kernel**.
<br>
<img width=50% src="assets/01_01_Select_Kernel_Toolbar.png" />

- Next, in the dropdown, select **Python 3.8.13** or higher.
<br>
<img width=50% src="assets/01_02_Select_Kernel_Dropdown.png" />

That's it!

## About Jupyter notebooks

A Jupyter notebook is a runtime environment for executing code. A notebook consists of one or more cells. There are three types of cells in a notebook: Markdown, Code, and Output. This is a Markdown cell.

A cell has two modes: read-mode and edit-mode. By default, the cells in a notebook are in read-mode. You can run a code cell while it is in read-mode. To execute a code cell, select the play icon in the cell's left gutter. 

The execution of a code cell often results in Output. An Output cell shows the `stdout` of the previously executed code cell. You can delete an Output cell. You can also collapse the cell. In addition, you can select `Clear Outputs of All Cells`.

To modify a cell, press `Enter` or `Return`. In edit-mode, you can modify the contents of a cell. Certain labs contain challenges or experiments that require you to do just that: modify a code cell and re-run it! To switch from edit-mode to read-mode, use the `Shift+Enter` or `Shift+Return` keystroke.

#### Notebook keyboard shortcuts
The Jupyter extension for Gitpod supports the following keyboard shortcuts:
| Keystroke | Description |
|--|--|
| ESC | Change the cell mode |
| A | Add a cell above |
| B | Add a cell below |
| J or down arrow key | Move the selection to the cell below |
| K or up arrow key | Move the selection to the cell above |
| Ctrl+Enter | Run the currently selected cell |
| Shift+Enter | Run the currently selected cell and insert a new cell immediately below (focus moves to new cell) |
| Alt+Enter | Run the currently selected cell and insert a new cell immediately below (focus remains on current cell) |
| dd | Delete the selected cell |
| z | Undo the last change |
| M | Switch the cell type to Markdown |
| Y | Switch the cell type to Code |
| L | Enable/Disable line numbers |


## ⛑️ Getting help
The best way to get help from the Yugabyte University team is to post your question on YugabyteDB Community Slack in the #training or #yb-university channels. To sign up, visit [https://communityinviter.com/apps/yugabyte-db/register](https://communityinviter.com/apps/yugabyte-db/register).

---
# 👣 Setup steps
Here are the steps to set up this lab:
- Install missing dependencies and restart the notebook
- Create the notebook variables
- Create the `db_ybu` database

### Install missing dependencies and restart the notebook
Run the following cell to ensure that the notebook dependencies are available to the notebook. 

### Create the notebook variables 

> IMPORTANT!
> 
> Do NOT skip running this cell. 
> 

The following Python cell creates and stores variables that all the notebooks in this lab will use. You can view these variables in the Jupyter tab.

- To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell.
- Verify the accuracy of the output values

In [None]:
# Env variables for Notebook
import os

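# Read env_vars.env and copy each KEY=VALUE pair into the environment
env_vars = !cat env_vars.env
for var in env_vars:
    # Skip blank lines and comments
    if not var.strip() or var.lstrip().startswith('#'):
        continue
    # Split on the first '=' only, so that values may contain '='
    key, value = var.split('=', 1)
    os.environ[key.strip()] = value.strip()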

### Begin: Gitpod or local
#
### Uncomment Gitpod specific
#
MY_YB_PATH=os.environ.get('MY_YB_PATH_GITPOD')
MY_YB_PATH_DATA=os.environ.get('MY_YB_PATH_DATA_GITPOD')
MY_GITPOD_WORKSPACE_URL=os.environ.get('GITPOD_WORKSPACE_URL')
#


# env_vars defines the following
MY_DB_NAME=os.environ.get('MY_DB_NAME')
MY_MYSQL_DB_NAME=os.environ.get('MY_MYSQL_DB_NAME')

MY_HOST_IPv4_01=os.environ.get('MY_HOST_IPv4_01')
MY_HOST_IPv4_02=os.environ.get('MY_HOST_IPv4_02')
MY_HOST_IPv4_03=os.environ.get('MY_HOST_IPv4_03')

MY_TSERVER_WEBSERVER_PORT=os.environ.get('MY_TSERVER_WEBSERVER_PORT')
MY_MYSQL_PORT=os.environ.get('MY_MYSQL_PORT')

# Gitpod URLS
MY_YB_MASTER_HOST_GITPOD_URL = MY_GITPOD_WORKSPACE_URL.replace('https://','https://7000-')
MY_YB_TSERVER_HOST_GITPOD_URL = MY_GITPOD_WORKSPACE_URL.replace('https://','https://'+MY_TSERVER_WEBSERVER_PORT+'-')

# Current directory of project and related child folders
MY_NOTEBOOK_DIR=os.getcwd()
MY_NOTEBOOK_DATA_FOLDER=MY_NOTEBOOK_DIR +'/data' 
MY_NOTEBOOK_UTILS_FOLDER=MY_NOTEBOOK_DIR + '/utils'


# Store the notebook values for other notebooks to use
%store MY_YB_PATH
%store MY_YB_PATH_DATA

%store MY_GITPOD_WORKSPACE_URL

%store MY_DB_NAME
%store MY_MYSQL_DB_NAME

%store MY_HOST_IPv4_01
%store MY_HOST_IPv4_02
%store MY_HOST_IPv4_03

%store MY_TSERVER_WEBSERVER_PORT
%store MY_MYSQL_PORT

%store MY_YB_MASTER_HOST_GITPOD_URL
%store MY_YB_TSERVER_HOST_GITPOD_URL

%store MY_NOTEBOOK_DIR
%store MY_NOTEBOOK_DATA_FOLDER
%store MY_NOTEBOOK_UTILS_FOLDER
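
The cell above reads `KEY=VALUE` lines from `env_vars.env`. As a minimal sketch of that parsing step outside the notebook (no `!cat` magic), the function below turns a list of lines into a dictionary. The name `parse_env_lines` and the sample lines are hypothetical. The real contents of `env_vars.env` are not shown in this notebook.

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    result = {}
    for raw in lines:
        line = raw.strip()
        # Ignore empty lines, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first '=' only
        result[key.strip()] = value.strip()
    return result


# Hypothetical file contents, for illustration only
sample = ["# database names", "MY_DB_NAME=db_ybu", "", "MY_URL=http://host/?a=b"]
print(parse_env_lines(sample))
```

Splitting on the first `=` only keeps values such as URLs with query strings intact.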


---
# Verify the MySQL database for the migration

Here is the related schema for the Sakila database.

<div style="width:100%; background-color: #000041"><a target="_blank" href="https://dev.mysql.com/doc/sakila/en/sakila-structure.html"><img src="assets/sakila-schema.png" /></a></div>

The following commands are MySQL shell commands. Using `mysql`, you can execute MySQL statements. 

To review the **tables** in the MySQL database, run the following cell:

In [None]:
%%bash -s "$MY_MYSQL_DB_NAME"  # list tables
DB_NAME=${1}

# List the tables in the source database
mysql -h 0 -u root ${DB_NAME} -e "SHOW TABLES;"

To review the **views** in the MySQL database, run the following cell:

In [None]:
%%bash -s "$MY_MYSQL_DB_NAME"  # list views
DB_NAME=${1}

# List the views in the source database
mysql -h 0 -u root ${DB_NAME} -e "SHOW FULL TABLES WHERE Table_Type LIKE 'VIEW';"


To review the **user-defined functions** in the MySQL database, run the following cell:

In [None]:
%%bash -s "$MY_MYSQL_DB_NAME"  # list functions
DB_NAME=${1}

# List the user-defined functions that belong to the source database
mysql -h 0 -u root ${DB_NAME} -e "SHOW FUNCTION STATUS WHERE db = '${DB_NAME}';"

To review the **triggers** in the MySQL database, run the following cell:

In [None]:
%%bash -s "$MY_MYSQL_DB_NAME"  # list triggers
DB_NAME=${1}

# List the triggers in the source database
mysql -h 0 -u root ${DB_NAME} -e "SHOW TRIGGERS;"

To review the **stored procedures** in the MySQL database, run the following cell:

In [None]:
%%bash -s "$MY_MYSQL_DB_NAME"  # list stored procedures
DB_NAME=${1}

# List the stored procedures that belong to the source database
mysql -h 0 -u root ${DB_NAME} -e "SHOW PROCEDURE STATUS WHERE db = '${DB_NAME}';"
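
The five review cells above differ only in the statement they pass to `mysql`. As a rough sketch, the same statements can be collected in one Python dictionary, for example to loop over them with a database cursor. The function name `build_review_queries` and the identifier check are assumptions for illustration. They are not part of the lab.

```python
def build_review_queries(db_name: str) -> dict:
    """Return the review statements used above, keyed by object type."""
    # Basic safeguard: the name is interpolated into SQL text, so accept
    # only letters, digits, and underscores
    if not db_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected database name: {db_name!r}")
    return {
        "tables": "SHOW TABLES",
        "views": "SHOW FULL TABLES WHERE Table_Type LIKE 'VIEW'",
        "functions": f"SHOW FUNCTION STATUS WHERE db = '{db_name}'",
        "triggers": "SHOW TRIGGERS",
        "procedures": f"SHOW PROCEDURE STATUS WHERE db = '{db_name}'",
    }


for object_type, statement in build_review_queries("sakila").items():
    print(f"{object_type}: {statement}")
```

With an open cursor, each statement could be passed to `cursor.execute(statement)` in turn.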


## Create the required user and roles in the source database

### Verify the Voyager Installation

You can learn more about Voyager in the YugabyteDB Voyager documentation. Run the following cell to verify that the `yb-voyager` command is installed.


In [None]:
%%bash
yb-voyager version

### Create the `ybvoyager` user with the necessary grants

In [None]:
from __future__ import print_function
from datetime import date, datetime, timedelta
import mysql.connector

# env_var.env
db_host=MY_HOST_IPv4_01
db_name=MY_MYSQL_DB_NAME
db_port=MY_MYSQL_PORT

# sql statements for grants for ybvoyager user
SQL_STATEMENTS = {} 

SQL_STATEMENTS['create_user_1'] = ("CREATE USER 'ybvoyager'@'127.0.0.1' IDENTIFIED WITH mysql_native_password BY 'Yugabyte#1'")
SQL_STATEMENTS['create_user_2'] = ("CREATE USER 'ybvoyager'@'localhost' IDENTIFIED WITH mysql_native_password BY 'Yugabyte#1'")

SQL_STATEMENTS['grant_process_1'] = ("GRANT PROCESS ON *.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_process_2'] = ("GRANT PROCESS ON *.* TO 'ybvoyager'@'localhost'")

SQL_STATEMENTS['grant_select_1'] = ("GRANT SELECT ON sakila.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_select_2'] = ("GRANT SELECT ON sakila.* TO 'ybvoyager'@'localhost'")

SQL_STATEMENTS['grant_show_view_1'] = ("GRANT SHOW VIEW ON sakila.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_show_view_2'] = ("GRANT SHOW VIEW ON sakila.* TO 'ybvoyager'@'localhost'")

SQL_STATEMENTS['grant_trigger_1'] = ("GRANT TRIGGER ON sakila.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_trigger_2'] = ("GRANT TRIGGER ON sakila.* TO 'ybvoyager'@'localhost'")

SQL_STATEMENTS['grant_flush_1'] = ("GRANT FLUSH_TABLES ON *.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_flush_2'] = ("GRANT FLUSH_TABLES ON *.* TO 'ybvoyager'@'localhost'")

SQL_STATEMENTS['grant_replication_1'] = ("GRANT REPLICATION CLIENT ON *.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_replication_2'] = ("GRANT REPLICATION CLIENT ON *.* TO 'ybvoyager'@'localhost'")

SQL_STATEMENTS['grant_routine_1'] = ("GRANT SHOW_ROUTINE ON *.* TO 'ybvoyager'@'127.0.0.1'")
SQL_STATEMENTS['grant_routine_2'] = ("GRANT SHOW_ROUTINE ON *.* TO 'ybvoyager'@'localhost'")


# Create connection
cnx = mysql.connector.connect(user='root',
                              host='127.0.0.1',
                              database=db_name)
cursor = cnx.cursor()

# Execute statements
for sql_name in SQL_STATEMENTS:
    sql_statement = SQL_STATEMENTS[sql_name]
    try:
        print("Running sql statement  {}: ".format(sql_name), end='')
        cursor.execute(sql_statement)
    except mysql.connector.Error as err:
        print(err.msg)
    else:
        print("OK")

cursor.close()
cnx.close()
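
The grant statements in the cell above repeat the same privileges for two host names, which makes typos easy to introduce. One possible alternative, shown here only as a sketch, is to generate the statements from lists. The privilege lists mirror the statements above. The function name `build_grant_statements` is hypothetical.

```python
# Privileges taken from the GRANT statements in the cell above
GLOBAL_PRIVILEGES = ["PROCESS", "FLUSH_TABLES", "REPLICATION CLIENT", "SHOW_ROUTINE"]
SCHEMA_PRIVILEGES = ["SELECT", "SHOW VIEW", "TRIGGER"]


def build_grant_statements(user, hosts, schema):
    """Build one GRANT statement per privilege and host."""
    statements = []
    for host in hosts:
        account = f"'{user}'@'{host}'"
        # Global privileges apply to every database (*.*)
        for privilege in GLOBAL_PRIVILEGES:
            statements.append(f"GRANT {privilege} ON *.* TO {account}")
        # Schema privileges apply only to the migrated schema
        for privilege in SCHEMA_PRIVILEGES:
            statements.append(f"GRANT {privilege} ON {schema}.* TO {account}")
    return statements


for statement in build_grant_statements("ybvoyager", ["127.0.0.1", "localhost"], "sakila"):
    print(statement)
```

Each generated statement could then be run with `cursor.execute()` in the same loop pattern that the cell above uses.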

##### Create tables, load data, and review relations
Run the following cell to execute the DDL and DML scripts using `ysqlsh`.

In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME" "$MY_NOTEBOOK_DATA_FOLDER" "$MY_DATA_DDL_FILE_0" "$MY_DATA_DML_FILE_0"   # World Cities
YB_PATH=${1}
DB_NAME=${2}
DATA_FOLDER=${3}
DATA_DDL_FILE=${4}
DATA_DML_FILE=${5}

#ls $DATA_FOLDER

WORLD_DDL_PATH=${DATA_FOLDER}/${DATA_DDL_FILE}
WORLD_DML_PATH=${DATA_FOLDER}/${DATA_DML_FILE}

cd ${YB_PATH}

# DDL file
./bin/ysqlsh -d ${DB_NAME} -f ${WORLD_DDL_PATH} >&/dev/null
sleep 1;

# DML file
./bin/ysqlsh -d ${DB_NAME} -f ${WORLD_DML_PATH} >&/dev/null
sleep 1;

# Describe relations
./bin/ysqlsh -d ${DB_NAME} -c "\d public.*"

##### View DDL for tbl_countries
Run the following cell using `ysqlsh` to view a table definition.

> Note
> 
> SQL magic does not support PostgreSQL `psql` commands. In order to execute `psql` commands, the notebook uses bash and `ysqlsh`.



In [None]:
%%bash -s "$MY_YB_PATH" "$MY_DB_NAME"  # \d tbl_countries

YB_PATH=${1}
DB_NAME=${2}

cd $YB_PATH


./bin/ysqlsh -d ${DB_NAME} -c "\d tbl_countries"
#./bin/ysqlsh -d ${DB_NAME} -c "\d tbl_states"


## Connect to MySQL using the MySQL Driver for Python
The following cell requires:
- Python 3.8+, SQLAlchemy, and MySQL Connector/Python (`mysql.connector`)

In [None]:
# Connect to the MySQL source database
import sqlalchemy as alc
from sqlalchemy import create_engine

# env_var.env
db_host=MY_HOST_IPv4_01
db_name=MY_MYSQL_DB_NAME
db_port=MY_MYSQL_PORT

# Use the MySQL dialect with the MySQL Connector/Python driver
connection_str='mysql+mysqlconnector://root@'+db_host+':'+str(db_port)+'/'+db_name
print(connection_str)
engine = create_engine(connection_str)

## Connect to YugabyteDB using the PostgreSQL Driver for Python
The following cell requires:
- Python 3.8+ and psycopg2

In [None]:
# Connect to db_ybu
# Inspiration from https://medium.com/analytics-vidhya/postgresql-integration-with-jupyter-notebook-deb97579a38d
import psycopg2
import sqlalchemy as alc
from sqlalchemy import create_engine

# env_var.env
db_host=MY_HOST_IPv4_01
db_name=MY_DB_NAME

connection_str='postgresql://yugabyte@'+db_host+':5433/'+db_name
print(connection_str)
engine = create_engine(connection_str)
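
The connection string above has no password. When a password contains characters that are special in a URL, such as the `#` in `Yugabyte#1`, it generally needs to be percent-encoded before it is placed in a SQLAlchemy URL. The sketch below builds such a string with the standard library only. The helper name `build_db_url` is hypothetical, and the host `127.0.0.1` is an assumed value for illustration.

```python
from urllib.parse import quote_plus


def build_db_url(dialect, user, host, port, database, password=None):
    """Build a '<dialect>://user[:password]@host:port/database' URL string."""
    credentials = quote_plus(user)
    if password is not None:
        # Percent-encode the password so that '#', '@', ':' and '/' are safe
        credentials += ":" + quote_plus(password)
    return f"{dialect}://{credentials}@{host}:{port}/{database}"


# Without a password, as in the cell above (host value is an assumption)
print(build_db_url("postgresql", "yugabyte", "127.0.0.1", 5433, "db_ybu"))

# With a password that contains '#'
print(build_db_url("mysql+mysqlconnector", "ybvoyager", "127.0.0.1", 3306,
                   "sakila", password="Yugabyte#1"))
```

The resulting string can be passed to `create_engine()` in the same way as `connection_str`.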

#### Load SQL magic extension
>IMPORTANT!
>
> To use SQL magic, you must run the following cell that loads the notebook extension.

In [None]:
%reload_ext sql
# creates connection for sql magic
%sql {connection_str}

Optional: run the following cell to view the current SQL magic configuration.

In [None]:
%config SqlMagic

#### Create Voyager user

In [None]:
%%sql 

CREATE USER ybvoyager SUPERUSER PASSWORD 'password';

---
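Create the export directory for Voyager.

In [None]:
%%bash 
# -p avoids an error if the directory already exists
mkdir -p $HOME/export-dir
# Note: this variable is visible only inside this cell's shell
export EXPORT_DIR=$HOME/export-dir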
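
A variable exported inside a `%%bash` cell does not outlive that cell, because each `%%bash` cell runs in its own shell process. One alternative, sketched here as an assumption and not as part of the lab, is to create the directory from Python and set the variable in the kernel's environment, which later `%%bash` cells inherit. The function name `ensure_export_dir` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def ensure_export_dir(base_dir):
    """Create '<base_dir>/export-dir' if needed and publish it as EXPORT_DIR."""
    export_dir = Path(base_dir) / "export-dir"
    export_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    os.environ["EXPORT_DIR"] = str(export_dir)     # inherited by child processes
    return export_dir


# Demonstration in a temporary directory; in the notebook, Path.home()
# would match the $HOME/export-dir location used above
demo_dir = ensure_export_dir(tempfile.mkdtemp())
print(demo_dir, demo_dir.is_dir())
```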



---
## Export the schema from the MySQL source database

In [None]:
%%bash
# Replace the argument values with those applicable for your migration.

yb-voyager export schema --export-dir $HOME/export-dir \
        --source-db-type "MYSQL" \
        --source-db-host "127.0.0.1" \
        --source-db-port "3306" \
        --source-db-name "sakila" \
        --source-db-user "ybvoyager" \
        --source-db-password "Yugabyte#1"
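
For readers who prefer to drive Voyager from Python, the sketch below assembles the same command as an argument list, which avoids shell quoting issues such as the `#` in the password. It uses only the flags that appear in the cell above. The function name `build_export_schema_command` is hypothetical, and the sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_export_schema_command(export_dir, host, port, db_name, user, password,
                                db_type="MYSQL"):
    """Return the 'yb-voyager export schema' command as an argument list."""
    return [
        "yb-voyager", "export", "schema",
        "--export-dir", export_dir,
        "--source-db-type", db_type,
        "--source-db-host", host,
        "--source-db-port", str(port),
        "--source-db-name", db_name,
        "--source-db-user", user,
        "--source-db-password", password,
    ]


command = build_export_schema_command(
    "/home/gitpod/export-dir",  # assumed path, for illustration only
    "127.0.0.1", 3306, "sakila", "ybvoyager", "Yugabyte#1",
)
# shlex.join shows the command with shell-safe quoting
print(shlex.join(command))
```

The list form can be handed to `subprocess.run(command, check=True)` when the command should actually be executed.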


---
# 🌟 Well done! 
In this notebook, you completed the following:
- Setup steps
- Connect to YugabyteDB using ysqlsh
  - Created the `db_ybu` database with `ysqlsh`
  - Created utils
  - Created tables and loaded data using DDL and DML scripts
  - Connected to the database using a PostgreSQL driver for Python
  - Executed queries with SQL Magic


## 😊 Next up!
Continue your learning by opening the next notebook, `02_Demystifying_table_sharding_tablets_and_data_distribution`. 

Or, if using Gitpod, run the following cell:

In [None]:
%%bash
gp open '02_Demystifying_table_sharding_tablets_and_data_distribution'