<div style="width:100%; background-color: #000041"><a target="_blank" href="http://university.yugabyte.com"><img src="assets/YBU_Logo.png" /></a></div><br>

> **YugabyteDB Fundamentals**
>
> Enroll for free at [Yugabyte University](https://university.yugabyte.com/courses/yugabytedb-fundamentals).
>

<br>
This notebook file is:

`02_TPCC.ipynb`

---
## About the Yugabyte TPCC benchmark

The <a href="https://github.com/yugabyte/tpcc" rel="noopener noreferrer" target="_blank">Yugabyte TPCC benchmark</a> application is a fork of the popular <a href="https://github.com/oltpbenchmark/oltpbench" rel="noopener noreferrer" target="_blank">OLTPBench</a> benchmark tool.

Like the original OLTPBench, the Yugabyte TPCC benchmark is a multi-threaded load generator that can produce a variety of workloads, including variations in rate and transaction type. The benchmark also collects data that you can analyze to determine key metrics such as transactions per second (TPS) and latency per transaction type. **TPMC**, the number of New-Order transactions completed per minute, remains the main metric for summarizing the benchmark.


## 🛠️ Requirements
Here are the requirements for this notebook:
- ✅ Create the notebook variables in `01_Lab_Setup.ipynb`, which you previously did
- ✅ Create the `db_tpcc` database, which you previously did
- ☑️ Import the notebook variables, *which you must do next*
- ☑️ Connect to the `db_tpcc` database, *which you must do next*
- ☑️ Run the TPCC benchmark
  -  Create the benchmark schema
  -  Load the benchmark data
  -  Run the benchmark and review the results


### Select your notebook kernel
- In the Notebook toolbar, click **Select Kernel**.
<br>
<img width=50% src="assets/01_01_Select_Kernel_Toolbar.png" />

- Next, in the dropdown, select **Python 3.12** or higher.
<br>
<img width=50% src="assets/01_02_Select_Kernel_Dropdown.png" />

That's it!

## ⛑️ Getting help
The best way to get help from the Yugabyte University team is to post your question on YugabyteDB Community Slack in the #training or #yb-university channels. To sign up, visit [https://communityinviter.com/apps/yugabyte-db/register](https://communityinviter.com/apps/yugabyte-db/register).


## 👣 Setup steps
Here are the steps to set up this lab:
- Import the notebook variables
- Run the TPCC Benchmark

### 👇 Import the notebook variables

> 👉 IMPORTANT! 👈
> 
> Do **NOT** skip running the following cell. 
> 

The following Python cell reads the stored variables created in the `01_Lab_Setup.ipynb` notebook. To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell. 

üëá üëá üëá 

In [None]:
%store -r MY_YB_PATH
%store -r MY_YB_PATH_DATA
%store -r MY_GITPOD_WORKSPACE_URL

%store -r MY_DB_NAME
%store -r MY_DB_PORT

%store -r MY_HOST_IPv4_01
%store -r MY_HOST_IPv4_02
%store -r MY_HOST_IPv4_03

%store -r MY_MASTER_WEB_PORT
%store -r MY_TSERVER_WEBSERVER_PORT
%store -r MY_YUGABYTED_WEB_UI_PORT

%store -r MY_YB_MASTER_HOST_GITPOD_URL
%store -r MY_YB_TSERVER_HOST_GITPOD_URL
%store -r MY_YUGABYTED_UI_HOST_GITPOD_URL

%store -r MY_NOTEBOOK_DIR
%store -r MY_NOTEBOOK_UTILS_FOLDER

%store -r MY_TPCC_PATH
%store -r MY_TPCC_WORKLOAD_FILE
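
Before moving on, it can help to confirm that the variables were actually restored. The following is a minimal sketch and not part of the original lab: `find_missing` is a hypothetical helper, and the list of names is only the subset of stored variables that the benchmark cells in this notebook use.

```python
# Hypothetical helper: report which expected notebook variables are not defined.
def find_missing(names, namespace):
    """Return the names that are absent from the given namespace dict."""
    return [name for name in names if name not in namespace]

# Variables used by the benchmark cells in this notebook.
REQUIRED = [
    "MY_TPCC_PATH",
    "MY_TPCC_WORKLOAD_FILE",
    "MY_DB_NAME",
    "MY_HOST_IPv4_01",
    "MY_HOST_IPv4_02",
    "MY_HOST_IPv4_03",
]

missing = find_missing(REQUIRED, globals())
if missing:
    print("Missing variables, rerun 01_Lab_Setup.ipynb:", ", ".join(missing))
else:
    print("All required variables are available.")
```

If any names are reported as missing, rerunning `01_Lab_Setup.ipynb` and then the import cell above is the likely fix.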

---


## Run the TPCC Benchmark

There are 4 basic steps to running the benchmark:
- create the schema
- load the data
- run the benchmark
- review the results

To run the TPCC benchmark, you use the `tpccbenchmark` utility script, which supports the following arguments.

| Argument | Description |
|-|:-|
| `-c,--config`|  `[required]` Workload configuration file|
| `--clear` | Clear all records in the database for this benchmark |
| `--create` |  Initialize the database for this benchmark |
| `--execute` |  Execute the benchmark workload |
| `-h,--help` |  Print this help |
| `--histograms` |  Print txn histograms |
| `--load` |  Load data using the benchmark's data loader |
| `-o,--output` |  Output file (default System.out) |
| `--runscript` |  Run an SQL script |
| `-s,--sample` |  Sampling window |
| `-v,--verbose` |  Display Messages |

The default benchmark values are:
- `warehouses=10`
- `terminals=100`
- `dbConnections=10`
- `loaderThreads=10`


A <a href="https://github.com/yugabyte/tpcc/blob/master/config/workload_all.xml" rel="noopener noreferrer" target="_blank">`config/workload_all.xml`</a> file provides an example of how to describe and configure a workload.

For more configuration options, review the sample <a href="https://github.com/oltpbenchmark/oltpbench/blob/master/config/sample_tpcc_config.xml" rel="noopener noreferrer" target="_blank">`config.xml`</a> from the original OLTPBench project.

The Yugabyte TPCC benchmark also supports multi-region cluster topologies with row-level geo-partitioning. To see how, review the <a href="https://github.com/yugabyte/tpcc/blob/master/config/geopartitioned_workload.xml" rel="noopener noreferrer" target="_blank">`geopartitioned_workload.xml`</a> file, which illustrates how to specify tablespaces with specific placement policies.

See the following for more details:
- <a href="https://github.com/yugabyte/tpcc" rel="noopener noreferrer" target="_blank">https://github.com/yugabyte/tpcc</a>
- <a href="https://docs.yugabyte.com/latest/benchmark/tpcc-ysql/" rel="noopener noreferrer" target="_blank">https://docs.yugabyte.com/latest/benchmark/tpcc-ysql/</a>



### Create the benchmark schema
Before starting your benchmark workload, you first need to create the *TPCC* data model and then load data.

Create the TPCC data model using these arguments:
- `--config`
- `--create`
- `--nodes` 

In [None]:
%%bash -s "$MY_TPCC_PATH" "$MY_DB_NAME" "$MY_TPCC_WORKLOAD_FILE" "$MY_HOST_IPv4_01" "$MY_HOST_IPv4_02" "$MY_HOST_IPv4_03"

# positional arguments passed in from the notebook variables
TPCC_PATH=${1}
DB_NAME=${2}
TPCC_WORKLOAD_FILE=${3}
YB_NODE_01=${4}
YB_NODE_02=${5}
YB_NODE_03=${6}


cd "${TPCC_PATH}"

# create the TPCC schema
./tpccbenchmark \
  --config="${TPCC_WORKLOAD_FILE}" \
  --create=true \
  --nodes="${YB_NODE_01},${YB_NODE_02},${YB_NODE_03}"


### Load the benchmark data
Load the data for the TPCC database using the following arguments:
- `--config`
- `--load`
- `--nodes`
- `--warehouses`
- `--loaderthreads`, set to the total number of vCPUs in your cluster. For example, `--loaderthreads=12` suits a 3-node cluster with 4 vCPUs per node.

> Note:
> 
> Depending on the vCPU of the nodes in your cluster and scale factor, the load time may be more than 10 minutes. 
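
As a quick illustration of the sizing rule above, the sketch below computes a `--loaderthreads` value from the cluster shape. The function name `loader_threads` is an assumption made for this example; it is not part of the benchmark tooling.

```python
def loader_threads(node_count: int, vcpus_per_node: int) -> int:
    """Total vCPUs in the cluster, used here as the --loaderthreads value."""
    if node_count < 1 or vcpus_per_node < 1:
        raise ValueError("node_count and vcpus_per_node must be positive")
    return node_count * vcpus_per_node

# The example from the text: a 3-node cluster with 4 vCPUs per node.
print(f"--loaderthreads={loader_threads(3, 4)}")  # prints --loaderthreads=12
```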

In [None]:
%%bash -s "$MY_TPCC_PATH" "$MY_DB_NAME" "$MY_TPCC_WORKLOAD_FILE" "$MY_HOST_IPv4_01" "$MY_HOST_IPv4_02" "$MY_HOST_IPv4_03"

# positional arguments passed in from the notebook variables
TPCC_PATH=${1}
DB_NAME=${2}
TPCC_WORKLOAD_FILE=${3}
YB_NODE_01=${4}
YB_NODE_02=${5}
YB_NODE_03=${6}


cd "${TPCC_PATH}"

# load the benchmark data
./tpccbenchmark \
  --config="${TPCC_WORKLOAD_FILE}" \
  --load=true \
  --nodes="${YB_NODE_01},${YB_NODE_02},${YB_NODE_03}" \
  --warehouses=1 \
  --loaderthreads=2


### Run the benchmark

After creating the schema and loading the data, you can now run the benchmark using the following arguments:
- `--config`
- `--execute`
- `--nodes`
- `--warehouses`
- `--histograms`



In [None]:
%%bash -s "$MY_TPCC_PATH" "$MY_DB_NAME" "$MY_TPCC_WORKLOAD_FILE" "$MY_HOST_IPv4_01" "$MY_HOST_IPv4_02" "$MY_HOST_IPv4_03"

# positional arguments passed in from the notebook variables
TPCC_PATH=${1}
DB_NAME=${2}
TPCC_WORKLOAD_FILE=${3}
YB_NODE_01=${4}
YB_NODE_02=${5}
YB_NODE_03=${6}


cd "${TPCC_PATH}"

# execute the benchmark workload
./tpccbenchmark \
  --config="${TPCC_WORKLOAD_FILE}" \
  --execute=true \
  --nodes="${YB_NODE_01},${YB_NODE_02},${YB_NODE_03}" \
  --warehouses=1 \
  --histograms
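
The create, load, and execute cells differ only in the phase flag and a few options. As a recap, here is a minimal Python sketch that assembles the same command lines. It uses only the flags that appear in the cells above; the helper name `build_tpcc_command` and the node addresses are hypothetical, and the function only builds the argument list without running anything.

```python
def build_tpcc_command(phase, config, nodes, warehouses=None,
                       loader_threads=None, histograms=False):
    """Assemble the tpccbenchmark argument list for one phase.

    phase is one of "create", "load", or "execute", matching the
    --create, --load, and --execute flags used in the cells above.
    """
    if phase not in ("create", "load", "execute"):
        raise ValueError(f"unknown phase: {phase}")
    cmd = [
        "./tpccbenchmark",
        f"--config={config}",
        f"--{phase}=true",
        f"--nodes={','.join(nodes)}",
    ]
    if warehouses is not None:
        cmd.append(f"--warehouses={warehouses}")
    if loader_threads is not None:
        cmd.append(f"--loaderthreads={loader_threads}")
    if histograms:
        cmd.append("--histograms")
    return cmd

# Placeholder node addresses, for illustration only.
nodes = ["127.0.0.1", "127.0.0.2", "127.0.0.3"]
for phase, extra in [
    ("create", {}),
    ("load", {"warehouses": 1, "loader_threads": 2}),
    ("execute", {"warehouses": 1, "histograms": True}),
]:
    print(" ".join(build_tpcc_command(phase, "config/workload_all.xml", nodes, **extra)))
```

Returning a list rather than a single string keeps each argument separate, so the result could be passed to `subprocess.run` without shell quoting concerns.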


### Review the benchmark results
When the benchmark completes, you will be able to review the results. Here's an example:

```
14:22:21,014 (DBWorkload.java:522) INFO  -
================RESULTS================
             TPM-C |             126.73
        Efficiency |             98.55%
Throughput (req/s) |               4.75

14:22:21,036 (DBWorkload.java:689) INFO  -
======================LATENCIES (INCLUDE RETRY ATTEMPTS)=====================
 Transaction |  Count   | Avg. Latency | P99 Latency | Connection Acq Latency
    NewOrder |     3802 |        19.19 |       48.66 |                   2.89
     Payment |     3742 |        11.98 |       29.36 |                   0.95
 OrderStatus |      326 |         6.29 |       25.85 |                   1.57
    Delivery |      343 |        63.33 |      184.60 |                   1.78
  StockLevel |      338 |        20.83 |       85.70 |                   0.24
        All  |     8551 |        17.38 |       95.43 |                   1.84

14:22:21,074 (DBWorkload.java:633) INFO  -
=======================WORKER TASK LATENCIES=======================
 Transaction |     Task     |  Count   | Avg. Latency | P99 Latency
    NewOrder |   Fetch Work |     3802 |         0.07 |        3.79
    NewOrder |       Keying |     3802 |     18003.74 |    18005.02
    NewOrder |Op With Retry |     3802 |        22.31 |      228.44
    NewOrder |     Thinking |     3802 |     11945.93 |    57887.91
     Payment |   Fetch Work |     3776 |         0.06 |        1.27
     Payment |       Keying |     3776 |      3003.73 |     3005.03
     Payment |Op With Retry |     3776 |        14.06 |       85.91
     Payment |     Thinking |     3776 |     11837.72 |    55752.92
 OrderStatus |   Fetch Work |      326 |         0.06 |        0.04
 OrderStatus |       Keying |      326 |      2003.78 |     2005.04
 OrderStatus |Op With Retry |      326 |         7.91 |       26.15
 OrderStatus |     Thinking |      326 |      9702.81 |    52510.19
    Delivery |   Fetch Work |      343 |         0.07 |        2.45
    Delivery |       Keying |      343 |      2003.80 |     2005.08
    Delivery |Op With Retry |      343 |        65.14 |      278.49
    Delivery |     Thinking |      343 |      4411.94 |    19129.31
  StockLevel |   Fetch Work |      338 |         0.02 |        0.03
  StockLevel |       Keying |      338 |      2003.65 |     2005.02
  StockLevel |Op With Retry |      338 |        21.09 |       85.76
  StockLevel |     Thinking |      338 |      5186.73 |    24646.16
        All  |   Fetch Work |     8585 |         0.06 |        2.45
        All  |       Keying |     8585 |      9529.42 |    18004.98
        All  |Op with Retry |     8585 |        19.80 |      145.40
        All  |     Thinking |     8585 |     11246.03 |    55505.03
        All  |          All |     8585 |     20795.31 |    68765.58
```
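
To make the summary numbers easier to interpret, the sketch below re-derives them from the per-transaction counts in the sample output. It rests on assumptions that the output does not state: a 30-minute (1,800-second) measurement window and 10 warehouses, both chosen because they reproduce the printed values, and the commonly cited TPC-C ceiling of 12.86 New-Order transactions per minute per warehouse as the basis for Efficiency. If those assumptions hold, the sample likely came from a run with the default 10 warehouses rather than `--warehouses=1`.

```python
# Counts and average latencies copied from the sample LATENCIES table.
transactions = {
    "NewOrder":    (3802, 19.19),
    "Payment":     (3742, 11.98),
    "OrderStatus": (326, 6.29),
    "Delivery":    (343, 63.33),
    "StockLevel":  (338, 20.83),
}

# Assumptions (not printed in the output): run length and scale factor.
RUN_SECONDS = 1800
WAREHOUSES = 10
MAX_TPMC_PER_WAREHOUSE = 12.86  # assumed theoretical ceiling per warehouse

total_count = sum(count for count, _ in transactions.values())
new_orders = transactions["NewOrder"][0]

# TPM-C counts only New-Order transactions, per minute.
tpmc = new_orders / (RUN_SECONDS / 60)
# Efficiency compares TPM-C with the ceiling for this many warehouses.
efficiency = 100 * tpmc / (MAX_TPMC_PER_WAREHOUSE * WAREHOUSES)
# Throughput counts all transaction types, per second.
throughput = total_count / RUN_SECONDS
# The "All" latency is the count-weighted mean of the per-type averages.
avg_latency = sum(c * lat for c, lat in transactions.values()) / total_count

print(f"TPM-C              {tpmc:.2f}")
print(f"Efficiency         {efficiency:.2f}%")
print(f"Throughput (req/s) {throughput:.2f}")
print(f"Avg. latency (All) {avg_latency:.2f}")
```

Under those assumptions, the script reproduces the values in the sample: 126.73, 98.55%, 4.75, and 17.38.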

In addition to the terminal output, there are two files that you can also review:
- `output.json`, contains the results in `JSON` format
- `results/oltpbench.csv`, contains the results in `CSV` format
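
One way to inspect these files from the notebook is sketched below. The paths are assumed to be relative to the directory the benchmark ran in, and because the exact layout of each file is not described here, the code loads them generically rather than relying on particular keys or column names. The helper names are hypothetical.

```python
import csv
import json
from pathlib import Path

def load_json_results(path):
    """Parse a JSON results file and return the decoded object."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

def load_csv_rows(path, limit=5):
    """Return up to `limit` rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for _, row in zip(range(limit), csv.reader(handle))]

# Assumed locations, relative to the directory the benchmark ran in.
json_path = Path("output.json")
csv_path = Path("results/oltpbench.csv")

if json_path.exists():
    print(json.dumps(load_json_results(json_path), indent=2))
if csv_path.exists():
    for row in load_csv_rows(csv_path):
        print(row)
```

If neither file exists in the current working directory, the script prints nothing; in that case, change to the TPCC directory first or adjust the two paths.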




---
# üåüüåü Well done! 
In this notebook, you completed the following:
- Ran the TPCC benchmark
  - Created the benchmark schema
  - Loaded the benchmark data
  - Ran the benchmark and reviewed the results


## 😊 Next up!
Continue your learning by opening the next notebook, `03_Demystifying_table_sharding_tablets_and_data_distribution.ipynb`. 

Or, to open the notebook from Gitpod, run the following:

In [None]:
%%bash
gp open '03_Demystifying_table_sharding_tablets_and_data_distribution.ipynb'