<div style="width:100%; background-color: #000041"><a target="_blank" href="http://university.yugabyte.com"><img src="assets/YBU_Logo.png" /></a></div><br>

> **YugabyteDB Fundamentals**
>
> Enroll for free at [Yugabyte University](https://university.yugabyte.com/courses/yugabytedb-fundamentals).
>

<br>
This notebook file is:

`02_TPCC_Benchmark.ipynb`

---
## About the Yugabyte TPCC benchmark

The <a href="https://github.com/yugabyte/tpcc" rel="noopener noreferrer" target="_blank">Yugabyte TPCC benchmark</a> application is a fork of the popular <a href="https://github.com/oltpbenchmark/oltpbench" rel="noopener noreferrer" target="_blank">OLTPBench</a> benchmark tool.

Like the original OLTPBench, the Yugabyte TPCC benchmark is a multi-threaded load generator that can produce a variety of workloads, including variations in rate and transaction type. The benchmark also collects data that you can analyze to determine key metrics such as Transactions per Second (TPS) and Latency per Transaction Type. **TPMC** remains the main metric for summarizing the benchmark.


## 🛠️ Requirements
Here are the requirements for this notebook:
- ✅ Create the notebook variables in `01_Lab_Setup.ipynb`, which you previously did
- ✅ Create the `db_tpcc` database, which you previously did
- ☑️ Import the notebook variables, *which you must do next*
- ☑️ Run the TPCC Benchmark, *which you must do next*


### Select your notebook kernel
- In the Notebook toolbar, click **Select Kernel**.
<br>
<img width=50% src="assets/01_01_Select_Kernel_Toolbar.png" />

- Next, in the dropdown, select **Python 3.12** or higher.
<br>
<img width=50% src="assets/01_02_Select_Kernel_Dropdown.png" />

That's it!

## ⛑️ Getting help
The best way to get help from the Yugabyte University team is to post your question on YugabyteDB Community Slack in the #training or #yb-university channels. To sign up, visit [https://communityinviter.com/apps/yugabyte-db/register](https://communityinviter.com/apps/yugabyte-db/register).


## 👣 Setup steps
Here are the steps to set up this lab:
- Import the notebook variables
- Run the TPCC Benchmark

### 👇 Import the notebook variables

> 👉 IMPORTANT! 👈
> 
> Do **NOT** skip running the following cell. 
> 

The following Python cell reads the stored variables created in the `01_Lab_Setup.ipynb` notebook. To run the script, select Execute Cell (Play Arrow) in the left gutter of the cell. 

👇 👇 👇 

In [None]:
%store -r
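
If a later cell fails because an argument is empty, one quick check is whether the stored variables were actually restored. The following is a minimal sketch and not part of the original lab: the helper name `find_missing` is hypothetical, and the variable names are the ones passed to the `%%bash` cells in this notebook.

```python
# Variable names passed to the %%bash cells in this notebook.
REQUIRED_VARS = [
    "MY_TPCC_PATH",
    "MY_DB_NAME",
    "MY_TPCC_WORKLOAD_FILE",
    "MY_HOST_IPv4_01",
    "MY_HOST_IPv4_02",
    "MY_HOST_IPv4_03",
]


def find_missing(names, namespace):
    """Return the names that are absent from the given namespace dict."""
    return [name for name in names if name not in namespace]


# In a notebook cell, globals() is the user namespace that %store -r restores into.
missing = find_missing(REQUIRED_VARS, globals())
if missing:
    print("Missing variables; re-run 01_Lab_Setup.ipynb:", ", ".join(missing))
else:
    print("All notebook variables are available.")
```

Run this in a Python cell after `%store -r`. If any names are reported missing, return to `01_Lab_Setup.ipynb` before continuing.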

---


## Run the TPCC Benchmark

There are 4 basic steps to running the benchmark:
- create the schema
- load the data
- run the benchmark
- review the results

To run the TPCC benchmark, you use a utility script, `tpccbenchmark`, which supports the following arguments.

| Argument | Description |
|-|:-|
| `-c,--config`|  `[required]` Workload configuration file|
| `--clear` | Clear all records in the database for this benchmark |
| `--create` |  Initialize the database for this benchmark |
| `--execute` |  Execute the benchmark workload |
| `-h,--help` |  Print this help |
| `--histograms` |  Print txn histograms |
| `--load` |  Load data using the benchmark's data loader |
| `-o,--output` |  Output file (default System.out) |
| `--runscript` |  Run an SQL script |
| `-s,--sample` |  Sampling window |
| `-v,--verbose` |  Display Messages |

The default benchmark values are:
- `warehouses=10`
- `terminals=100`
- `dbConnections=10`
- `loaderThreads=10`


A <a href="https://github.com/yugabyte/tpcc/blob/master/config/workload_all.xml" rel="noopener noreferrer" target="_blank">`config/workload_all.xml`</a> file provides an example of how to describe and configure a workload.

For more configurations, review the sample <a href="https://github.com/oltpbenchmark/oltpbench/blob/master/config/sample_tpcc_config.xml" rel="noopener noreferrer" target="_blank">`config.xml`</a> from the original OLTPBench project.

The Yugabyte TPCC benchmark also supports multi-region cluster topologies with row-level geo-partitioning. To see how, review the <a href="https://github.com/yugabyte/tpcc/blob/master/config/geopartitioned_workload.xml" rel="noopener noreferrer" target="_blank">`geopartitioned_workload.xml`</a> file, which illustrates how to specify tablespaces with specific placement policies.

See the following for more details:
- <a href="https://github.com/yugabyte/tpcc" rel="noopener noreferrer" target="_blank">https://github.com/yugabyte/tpcc</a>
- <a href="https://docs.yugabyte.com/latest/benchmark/tpcc-ysql/" rel="noopener noreferrer" target="_blank">https://docs.yugabyte.com/latest/benchmark/tpcc-ysql/</a>



### Create the benchmark schema
Before starting your benchmark workload, you first need to create the *TPCC* data model and then load data.

Create the TPCC data model using these arguments:
- `--config`
- `--create`
- `--nodes` 

In [None]:
%%bash -s "$MY_TPCC_PATH" "$MY_DB_NAME" "$MY_TPCC_WORKLOAD_FILE" "$MY_HOST_IPv4_01" "$MY_HOST_IPv4_02" "$MY_HOST_IPv4_03"
TPCC_PATH=${1}
DB_NAME=${2}
TPCC_WORKLOAD_FILE=${3}
YB_NODE_01=${4}
YB_NODE_02=${5}
YB_NODE_03=${6}

cd  $TPCC_PATH

cp  $TPCC_WORKLOAD_FILE  $TPCC_PATH/config/my_workload_all.xml
cat $TPCC_PATH/config/my_workload_all.xml
# terminate connections, drop, and create
./tpccbenchmark \
  --config=config/my_workload_all.xml \
  --create=true \
  --nodes=${YB_NODE_01},${YB_NODE_02},${YB_NODE_03} 


### Load the benchmark data
Load the data for the TPCC database using the following arguments:
- `--config`
- `--load`
- `--nodes`
- `--warehouses`
- `--loaderthreads` represents the total number of vCPUs in your cluster. For example, `--loaderthreads=12` is for a 3-node cluster with 4 vCPUs per node.
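
The `--loaderthreads` rule above is a simple product of node count and vCPUs per node. As a small illustration (the function name `loader_threads` is hypothetical and not part of the benchmark tooling):

```python
def loader_threads(node_count: int, vcpus_per_node: int) -> int:
    """Total vCPUs in the cluster, used as the --loaderthreads value."""
    if node_count < 1 or vcpus_per_node < 1:
        raise ValueError("node_count and vcpus_per_node must be positive")
    return node_count * vcpus_per_node


# The example from the text: a 3-node cluster with 4 vCPUs per node.
print(f"--loaderthreads={loader_threads(3, 4)}")  # prints --loaderthreads=12
```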

> Note:
> 
> Depending on the number of vCPUs per node in your cluster and the scale factor, the load time may be more than 10 minutes.
>

While the load runs in the cell below, visit the tab with the `yugabyted` Web UI in your Chromium browser. Try to identify the slow queries during this load.

> Hint
>
> - yugabyted Web UI > Performance > YSQL Slow Queries
>


In [None]:
%%bash -s "$MY_TPCC_PATH" "$MY_DB_NAME" "$MY_TPCC_WORKLOAD_FILE" "$MY_HOST_IPv4_01" "$MY_HOST_IPv4_02" "$MY_HOST_IPv4_03"

TPCC_PATH=${1}
DB_NAME=${2}
TPCC_WORKLOAD_FILE=${3}
YB_NODE_01=${4}
YB_NODE_02=${5}
YB_NODE_03=${6}


cd $TPCC_PATH

# load the benchmark data into the schema created in the previous step
./tpccbenchmark \
  --config=config/my_workload_all.xml \
  --load=true \
  --nodes=${YB_NODE_01},${YB_NODE_02},${YB_NODE_03} \
  --warehouses=1 \
  --loaderthreads=2


### Run the benchmark

After creating the schema and loading the data, you can now run the benchmark using the following arguments:
- `--config`
- `--execute`
- `--nodes`
- `--warehouses`
- `--histograms`
- `--warmup-time-secs`



While the benchmark runs in the cell below, visit the tab with the `yugabyted` Web UI in your Chromium browser. Try to identify the slow queries during this run.

> Hint
>
> - yugabyted Web UI > Performance > Metrics
> - yugabyted Web UI > Performance > Live Queries
>
> Remember to `Refresh`

This may take about **10 minutes** to complete.

In [None]:
%%bash -s "$MY_TPCC_PATH" "$MY_DB_NAME" "$MY_TPCC_WORKLOAD_FILE" "$MY_HOST_IPv4_01" "$MY_HOST_IPv4_02" "$MY_HOST_IPv4_03"

TPCC_PATH=${1}
DB_NAME=${2}
TPCC_WORKLOAD_FILE=${3}
YB_NODE_01=${4}
YB_NODE_02=${5}
YB_NODE_03=${6}


cd $TPCC_PATH

# execute the benchmark workload and print transaction histograms
./tpccbenchmark \
  --config=config/my_workload_all.xml \
  --execute=true \
  --warmup-time-secs=30 \
  --nodes=${YB_NODE_01},${YB_NODE_02},${YB_NODE_03} \
  --warehouses=1 \
  --histograms
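
The create, load, and execute cells in this notebook differ only in the phase flag and a few extra options. The following sketch assembles the three command lines in Python to make that pattern explicit. It only prints the commands and does not run them. The helper `build_tpcc_command` and the node addresses are hypothetical placeholders; the flags are the ones used in this notebook's cells, and the sketch assumes flag order does not matter to `tpccbenchmark`.

```python
import shlex

# Phase flags exactly as used in this notebook's %%bash cells.
PHASE_FLAGS = {
    "create": "--create=true",
    "load": "--load=true",
    "execute": "--execute=true",
}


def build_tpcc_command(phase, nodes, config="config/my_workload_all.xml", extra=()):
    """Return the argument list for one benchmark phase (hypothetical helper)."""
    if phase not in PHASE_FLAGS:
        raise ValueError(f"unknown phase: {phase}")
    if not nodes:
        raise ValueError("at least one node address is required")
    return [
        "./tpccbenchmark",
        f"--config={config}",
        PHASE_FLAGS[phase],
        f"--nodes={','.join(nodes)}",
        *extra,
    ]


# Placeholder addresses; in the lab these come from MY_HOST_IPv4_01..03.
nodes = ["127.0.0.1", "127.0.0.2", "127.0.0.3"]
steps = [
    ("create", []),
    ("load", ["--warehouses=1", "--loaderthreads=2"]),
    ("execute", ["--warmup-time-secs=30", "--warehouses=1", "--histograms"]),
]
for phase, extra in steps:
    print(shlex.join(build_tpcc_command(phase, nodes, extra=extra)))
```

Returning a list rather than a single string keeps each argument separate, which is the form `subprocess.run` expects if you later choose to launch the script from Python.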


### Review the benchmark results
When the benchmark completes, you will be able to review the results. Here's an example:

```
================RESULTS================
             TPM-C |              11.60
        Efficiency |             90.20%
Throughput (req/s) |               0.49

05:13:32,998 (DBWorkload.java:696) INFO  - 
======================LATENCIES (INCLUDE RETRY ATTEMPTS)=====================
 Transaction |  Count   | Avg. Latency | P99 Latency | Connection Acq Latency
    NewOrder |      116 |        25.04 |       66.41 |                   0.73
     Payment |      138 |        15.35 |       52.56 |                   0.80
 OrderStatus |       14 |         7.16 |       19.82 |                   0.50
    Delivery |       14 |        93.91 |      164.13 |                   0.63
  StockLevel |       12 |         9.70 |       47.36 |                   0.65
        All  |      294 |        22.29 |      136.17 |                   0.75

05:13:33,006 (DBWorkload.java:640) INFO  - 
=======================WORKER TASK LATENCIES=======================
 Transaction |     Task     |  Count   | Avg. Latency | P99 Latency
    NewOrder |   Fetch Work |      116 |         0.38 |       10.06
    NewOrder |       Keying |      116 |     18000.14 |    18000.95
    NewOrder |Op With Retry |      116 |        25.86 |       66.99
    NewOrder |     Thinking |      116 |     12253.64 |    72840.10
     Payment |   Fetch Work |      138 |         0.00 |        0.03
     Payment |       Keying |      138 |      3000.19 |     3003.18
     Payment |Op With Retry |      138 |        16.22 |       53.02
     Payment |     Thinking |      138 |     11609.32 |    60768.14
 OrderStatus |   Fetch Work |       14 |         0.01 |        0.12
 OrderStatus |       Keying |       14 |      2000.11 |     2000.14
 OrderStatus |Op With Retry |       14 |         7.71 |       20.46
 OrderStatus |     Thinking |       14 |     15752.29 |    37918.09
    Delivery |   Fetch Work |       14 |         0.00 |        0.01
    Delivery |       Keying |       14 |      2000.10 |     2000.13
    Delivery |Op With Retry |       14 |        94.61 |      164.70
    Delivery |     Thinking |       14 |      5344.52 |    14846.08
  StockLevel |   Fetch Work |       12 |         0.00 |        0.01
  StockLevel |       Keying |       12 |      2000.11 |     2000.15
  StockLevel |Op With Retry |       12 |        10.40 |       47.85
  StockLevel |     Thinking |       12 |      3920.01 |     9980.15
        All  |   Fetch Work |      294 |         0.15 |        0.12
        All  |       Keying |      294 |      8782.47 |    18000.18
        All  |Op with Retry |      294 |        23.11 |      136.79
        All  |     Thinking |      294 |     11448.65 |    62973.09
        All  |          All |      294 |     20254.39 |    75767.52
```
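
The summary figures in this sample can be reproduced from the transaction counts. TPM-C counts NewOrder transactions per minute, and the TPC-C specification caps throughput at 12.86 tpmC per warehouse, so efficiency is the measured TPM-C relative to that ceiling. The sketch below assumes a 600-second measurement window and one warehouse. The window is inferred from the sample numbers rather than stated in the output, and the function name `summarize` is hypothetical.

```python
# Counts taken from the sample output above.
NEW_ORDER_COUNT = 116
TOTAL_COUNT = 294

# Assumptions: one warehouse (as in the run cell) and a 600 s measurement window.
WAREHOUSES = 1
RUN_SECONDS = 600

# Theoretical maximum tpmC per warehouse in the TPC-C specification.
MAX_TPMC_PER_WAREHOUSE = 12.86


def summarize(new_order_count, total_count, warehouses, run_seconds):
    """Return (tpmc, efficiency_percent, throughput_per_second)."""
    tpmc = new_order_count / (run_seconds / 60)
    efficiency = 100 * tpmc / (MAX_TPMC_PER_WAREHOUSE * warehouses)
    throughput = total_count / run_seconds
    return tpmc, efficiency, throughput


tpmc, efficiency, throughput = summarize(
    NEW_ORDER_COUNT, TOTAL_COUNT, WAREHOUSES, RUN_SECONDS
)
print(f"TPM-C              | {tpmc:.2f}")        # 11.60
print(f"Efficiency         | {efficiency:.2f}%")  # 90.20%
print(f"Throughput (req/s) | {throughput:.2f}")  # 0.49
```

Because efficiency is bounded by the per-warehouse ceiling, raising TPM-C substantially requires loading and running with more warehouses, not only a faster cluster.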

In addition to the terminal output, there are two files that you can also review:
- `output.json` contains the results in `JSON` format
- `results/oltpbench.csv` contains the results in `CSV` format



View the `output.json` file.

In [None]:
%%bash
gp open '/home/gitpod/tpcc/results/json/output.json'

View the `oltpbench.csv` file.

In [None]:
%%bash
gp open '/home/gitpod/tpcc/results/oltpbench.csv'
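
As an alternative to opening the files in the editor, you can load them in a Python cell. This is a minimal sketch that makes no assumptions about the fields inside either file: it only reports the top-level JSON keys and the first CSV row. The helper names are hypothetical, and the paths are the ones used in the cells above.

```python
import csv
import json
from pathlib import Path


def load_json_results(path):
    """Parse a JSON results file and return the decoded object."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def load_csv_results(path):
    """Read a CSV results file and return its rows as lists of strings."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


JSON_PATH = "/home/gitpod/tpcc/results/json/output.json"
CSV_PATH = "/home/gitpod/tpcc/results/oltpbench.csv"

if Path(JSON_PATH).exists():
    data = load_json_results(JSON_PATH)
    # Only report the structure; the field names are not assumed here.
    keys = list(data) if isinstance(data, dict) else type(data).__name__
    print("output.json top level:", keys)
else:
    print("Not found:", JSON_PATH)

if Path(CSV_PATH).exists():
    rows = load_csv_results(CSV_PATH)
    print("oltpbench.csv rows:", len(rows))
    if rows:
        print("first row:", rows[0])
else:
    print("Not found:", CSV_PATH)
```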

---
# 🌟🌟 Well done! 
In this lab, you completed the following:
- Setup steps
- Run the TPCC Benchmark