Commit a2ab975

V0.3.11 (#34)
* Prepare next release
* K8s: Some demo yml files for deployments and services
* TPC-H: Some demo queries and init scripts
* TPC-H: Some demo configs and scripts
* Docs: TPC-H example
* Prepare next release 0.4.0
1 parent 12eb26e commit a2ab975

39 files changed, +3620 -163 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,8 @@ This tool supports AWS and kubernetes (k8s) based clusters.
 
 This documentation
 * illustrates the [concepts](docs/Concept.md)
-* provides [basic examples](docs/Examples.md)
+* provides a basic [TPC-H like example](docs/Example-TPC-H.md)
+* provides [more detailed examples](docs/Examples.md)
 * [Example: TPC-H Benchmark for 3 DBMS on 1 Virtual Machine](docs/Examples.md#example-tpc-h-benchmark-for-3-dbms-on-1-virtual-machine)
 * [Example: TPC-H Benchmark for 1 DBMS on 3 Virtual Machines](docs/Examples.md#example-tpc-h-benchmark-for-1-dbms-on-3-virtual-machines)
 * defines [how to configure an experiment setup](docs/Config.md)

demo-tpch-k8s.py

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
+"""
+Demo for bexhoma
+This compares MonetDB and PostgreSQL performing some TPC-H queries.
+The cluster is managed using Kubernetes.
+Copyright (C) 2020 Patrick Erdelt
+
+This program is free software: you can redistribute it and/or modify
+it under the terms of the GNU Affero General Public License as
+published by the Free Software Foundation, either version 3 of the
+License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License
+along with this program. If not, see <https://www.gnu.org/licenses/>.
+"""
+from bexhoma import *
+import logging
+import urllib3
+import gc
+
+urllib3.disable_warnings()
+logging.basicConfig(level=logging.ERROR)
+
+# continue previous experiment?
+code=None
+# pick query file
+queryfile = 'queries-tpch.config'
+# pick scaling factor
+SF = '1'
+# number of repetitions
+numExperiments = 1
+# pick hardware
+cpu = "4000m"
+memory = '16Gi'
+cpu_type = 'epyc-7542'
+
+# set basic config
+cluster = masterK8s.testdesign(
+    clusterconfig = 'cluster.config',
+    yamlfolder = 'k8s/',
+    configfolder = 'experiments/tpch',
+    queryfile = queryfile)
+
+# remove existing pods
+cluster.cleanExperiment()
+
+# set data volume
+cluster.set_experiment(volume='tpch')
+
+# set DDL scripts
+cluster.set_experiment(script='1s-SF'+SF+'-index')
+
+# continue previous experiment?
+cluster.set_code(code=code)
+
+# set workload parameters - this overwrites infos given in the query file
+cluster.set_workload(
+    name = 'TPC-H Queries',
+    info = 'This experiment compares instances of different DBMS on different machines.'
+    )
+
+# set connection parameters - this overwrites infos given in the query file
+cluster.set_connectionmanagement(
+    numProcesses = 1,
+    runsPerConnection = 0,
+    timeout = 600,
+    singleConnection = False)
+
+# set query parameters - this overwrites infos given in the query file
+cluster.set_querymanagement(numRun = 1)
+
+# set hardware requests and limits
+cluster.set_resources(
+    requests = {
+        'cpu': cpu,
+        'memory': memory
+    },
+    limits = {
+        'cpu': 0,
+        'memory': 0
+    },
+    nodeSelector = {
+        'cpu': cpu_type,
+    })
+
+
+# function to capture recurring parts of the workflow
+def run_experiments(docker, alias):
+    cluster.set_experiment(docker=docker)
+    cluster.set_experiment(instance=cpu+"-"+memory)
+    cluster.prepareExperiment(delay=60)
+    cluster.startExperiment(delay=60)
+    for i in range(1,numExperiments+1):
+        connection = cluster.getConnectionName()
+        cluster.runBenchmarks(connection=connection+"-"+str(i), alias=alias+'-'+str(i))
+    cluster.stopExperiment()
+    cluster.cleanExperiment()
+    del gc.garbage[:]
+
+
+# run experiments
+run_experiments(docker='MonetDB', alias='DBMS-A')
+run_experiments(docker='PostgreSQL', alias='DBMS-B')
+
+# run reporting
+cluster.runReporting()
+
+exit()
+

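Note on the hardware settings in `demo-tpch-k8s.py`: the values `cpu = "4000m"` and `memory = '16Gi'` follow the Kubernetes resource quantity notation, where `m` means thousandths and `Gi` means 2^30. The following is a minimal sketch, not part of this commit or of bexhoma, that converts such strings to plain numbers. The helper name `parse_quantity` is hypothetical, and the suffix table covers only the common cases.

```python
from decimal import Decimal

# Suffixes of the Kubernetes quantity notation (simplified, assumed subset).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
}


def parse_quantity(text):
    """Convert a quantity string such as '4000m' or '16Gi' to a Decimal."""
    # Try two-letter suffixes first so that 'Gi' is not mistaken for 'G'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    # No suffix: the string is a plain number.
    return Decimal(text)


cpu_cores = parse_quantity("4000m")    # CPU cores requested by the demo
memory_bytes = parse_quantity("16Gi")  # memory requested by the demo, in bytes
```

Under these assumptions the demo requests 4 CPU cores and 16 GiB of memory per DBMS pod, and the instance label built by `cpu+"-"+memory` is `4000m-16Gi`.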
docs/Example-TPC-H.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+# Example: TPC-H
+
+This example shows how to benchmark the 22 read-only queries Q1-Q22 derived from TPC-H in MonetDB and PostgreSQL.
+
+> The query file is derived from TPC-H. Its results are not comparable to published TPC-H results, because they do not comply with the TPC-H Specification.
+
+Official TPC-H benchmark - http://www.tpc.org/tpch
+
+**Content**:
+* [Prerequisites](#prerequisites)
+* [Perform Benchmark](#perform-benchmark)
+* [Evaluate Results in Dashboard](#evaluate-results-in-dashboard)
+
+## Prerequisites
+
+We need a configuration file containing the following information in a predefined format, cf. [demo file](../k8s-cluster.config).
+We may adjust the configuration to match the actual environment.
+The demo also includes the necessary settings for some DBMS: MariaDB, MonetDB, MySQL, OmniSci and PostgreSQL.
+
+For basic execution of benchmarking we need
+* a Kubernetes (K8s) cluster
+* a namespace `mynamespace`
+* `kubectl` usable, i.e. access token stored in a default vault like `~/.kube`
+* a persistent volume named `vol-benchmarking` containing the raw TPC-H data in `/data/tpch/SF1/`
+* JDBC drivers `./monetdb-jdbc-2.29.jar` and `./postgresql-42.2.5.jar`
+* a folder `/benchmarks` for the results
+
+
+To also enable monitoring we need
+* a monitoring instance Prometheus / Grafana that scrapes metrics from `localhost:9300`
+* an access token and URL for asking Grafana for metrics
+https://grafana.com/docs/grafana/latest/http_api/auth/#create-api-token
+
+
+## Perform Benchmark
+
+To perform the experiment, we can run the [demo file](../demo-tpch-k8s.py).
+
+The actual benchmarking is done by
+```
+# run experiments
+run_experiments(docker='MonetDB', alias='DBMS-A')
+run_experiments(docker='PostgreSQL', alias='DBMS-B')
+```
+
+### Adjust Parameters
+
+You may want to adjust some of the parameters that are set in the file.
+
+The hardware requirements are set via
+```
+# pick hardware
+cpu = "4000m"
+memory = '16Gi'
+cpu_type = 'epyc-7542'
+```
+
+The number of executions of each query can be adjusted here
+```
+# set query parameters - this overwrites infos given in the query file
+cluster.set_querymanagement(numRun = 1)
+```
+
+## Evaluate Results in Dashboard
+
+Evaluation is done using DBMSBenchmarker: https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Dashboard.md
+

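Note on the prerequisites listed in `docs/Example-TPC-H.md`: some of them can be checked locally before starting the demo. The sketch below is not part of this commit. It assumes the JDBC driver file names and the `/benchmarks` folder exactly as the document states them, and the function name `check_prerequisites` is hypothetical. It does not verify the cluster, the namespace, or the persistent volume.

```python
import shutil
from pathlib import Path

# Driver file names as listed in the prerequisites.
JDBC_DRIVERS = ("monetdb-jdbc-2.29.jar", "postgresql-42.2.5.jar")


def check_prerequisites(driver_dir=".", result_dir="/benchmarks", tools=("kubectl",)):
    """Return a dict mapping each local prerequisite to True (found) or False."""
    status = {}
    # Command-line tools must be resolvable on PATH.
    for tool in tools:
        status[tool] = shutil.which(tool) is not None
    # JDBC drivers are expected as regular files in driver_dir.
    for jar in JDBC_DRIVERS:
        status[jar] = (Path(driver_dir) / jar).is_file()
    # The result folder must exist as a directory.
    status[str(result_dir)] = Path(result_dir).is_dir()
    return status
```

Calling `check_prerequisites()` from the repository root returns one entry per item, so any entry that is `False` points at a prerequisite that still needs to be set up.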
experiment-example-AWS.py

Lines changed: 0 additions & 91 deletions
This file was deleted.

experiment-example-k8s.py

Lines changed: 0 additions & 65 deletions
This file was deleted.
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+-- sccsid: @(#)dss.ri 2.1.8.1
+-- tpcd benchmark version 8.0
+
+-- for table nation
+alter table tpch.nation
+add foreign key (n_regionkey) references tpch.region(r_regionkey);
+
+-- for table supplier
+alter table tpch.supplier
+add foreign key (s_nationkey) references tpch.nation(n_nationkey);
+
+-- for table customer
+alter table tpch.customer
+add foreign key (c_nationkey) references tpch.nation(n_nationkey);
+
+-- for table partsupp
+alter table tpch.partsupp
+add foreign key (ps_suppkey) references tpch.supplier(s_suppkey);
+
+alter table tpch.partsupp
+add foreign key (ps_partkey) references tpch.part(p_partkey);
+
+-- for table orders
+alter table tpch.orders
+add foreign key (o_custkey) references tpch.customer(c_custkey);
+
+-- for table lineitem
+alter table tpch.lineitem
+add foreign key (l_orderkey) references tpch.orders(o_orderkey);
+
+alter table tpch.lineitem
+add foreign key (l_partkey,l_suppkey) references
+tpch.partsupp(ps_partkey,ps_suppkey);
+
+
+

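Note on the foreign keys above: they define a dependency graph between the TPC-H tables. If the constraints were already in place while loading data, referenced tables would have to be filled before the tables that reference them. The sketch below is an illustration only and not part of this commit. It derives one valid order with Python's standard `graphlib` module (Python 3.9 or later), and the dictionary simply restates the `references` clauses of the script.

```python
from graphlib import TopologicalSorter

# table -> set of tables it references via foreign keys (taken from the script)
REFERENCES = {
    "nation": {"region"},
    "supplier": {"nation"},
    "customer": {"nation"},
    "partsupp": {"supplier", "part"},
    "orders": {"customer"},
    "lineitem": {"orders", "partsupp"},
}

# static_order() yields every table after all tables it depends on.
load_order = list(TopologicalSorter(REFERENCES).static_order())

for table in load_order:
    print(table)
```

Several orders are valid, so the exact output may differ between runs or Python versions. The only guarantee is that each table appears after every table it references, for example `region` before `nation` and `partsupp` before `lineitem`.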