V0.5.17 (#95)
* Prepare next release

* Docs: Badges

* Docs: Entry page

* Docs: References

* Docs: More (old) details

* Docs: More (new) details

* Docs: References
perdelt committed Jan 20, 2022
1 parent 83f09e4 commit f9cd3d6
Showing 12 changed files with 69 additions and 60 deletions.
13 changes: 9 additions & 4 deletions README.md
@@ -1,5 +1,5 @@
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/Beuth-Erdelt/DBMS-Benchmarker/graphs/commit-activity)
[![GitHub release](https://img.shields.io/github/release/Beuth-Erdelt/DBMS-Benchmarker.svg)](https://GitHub.com/Beuth-Erdelt/DBMS-Benchmarker/releases/)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/graphs/commit-activity)
[![GitHub release](https://img.shields.io/github/release/Beuth-Erdelt/Benchmark-Experiment-Host-Manager.svg)](https://GitHub.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/releases/)

# Benchmark Experiment Host Manager
This Python tool helps **managing benchmark experiments of Database Management Systems (DBMS) in a Kubernetes-based High-Performance-Computing (HPC) cluster environment**.
@@ -49,8 +49,8 @@ The repository contains a [tool](experiments/tpch/) for running TPC-H (reading)

## More Information

For full power, use this tool as an orchestrator as in [2]. It also starts a monitoring container using [Prometheus](https://prometheus.io/) and a metrics collector container using [cAdvisor](https://github.com/google/cadvisor). It also uses the Python package [dbmsbenchmarker](https://github.com/Beuth-Erdelt/DBMS-Benchmarker) as query executor [2] and evaluator [1].
See the [images](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/tree/master/images/) folder for more details.

This module has been tested with Brytlyt, Citus, Clickhouse, DB2, Exasol, Kinetica, MariaDB, MariaDB Columnstore, MemSQL, MonetDB, MySQL, OmniSci, Oracle DB, PostgreSQL, SingleStore, SQL Server and SAP HANA.

Expand All @@ -66,5 +66,10 @@ This module has been tested with Brytlyt, Citus, Clickhouse, DB2, Exasol, Kineti

[2] [Orchestrating DBMS Benchmarking in the Cloud with Kubernetes](https://www.researchgate.net/publication/353236865_Orchestrating_DBMS_Benchmarking_in_the_Cloud_with_Kubernetes)
> Erdelt P.K. (2022)
> Orchestrating DBMS Benchmarking in the Cloud with Kubernetes.
> In: Nambiar R., Poess M. (eds) Performance Evaluation and Benchmarking. TPCTC 2021.
> Lecture Notes in Computer Science, vol 13169. Springer, Cham.
> https://doi.org/10.1007/978-3-030-94437-7_6
(old, slightly outdated [docs](docs/Docs_old.md))
22 changes: 11 additions & 11 deletions docs/API.md
@@ -71,7 +71,7 @@ cluster.set_connectionmanagement(
* `numProcesses`: Number of parallel client processes. Default is 1.
* `runsPerConnection`: Number of runs performed before connection is closed. Default is None, i.e. no limit.

These values are handed over to the [benchmarker](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#extended-query-file).
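A minimal sketch of such a call (values are illustrative assumptions; `cluster` is assumed to be an already configured testbed object, and only the two parameters documented above are used):

```python
# Hypothetical sketch: collect the connection-management settings described
# above as plain keyword arguments. Values are illustrative assumptions.
connection_settings = dict(
    numProcesses=2,        # run two parallel client processes (default is 1)
    runsPerConnection=5,   # close and reopen the connection after 5 runs
)

# The actual call would then be:
# cluster.set_connectionmanagement(**connection_settings)
```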

## Set Query Management

Expand All @@ -83,11 +83,11 @@ cluster.set_querymanagement(numRun = 1)

* `numRun`: Number of runs each query is run for benchmarking

These values are handed over to the [benchmarker](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#extended-query-file), cf. for more options.

## Set Resources

Specify details about the following experiment. This overwrites infos given in the instance description (YAML) in [deployments](Deployments.md) for Kubernetes.
Specify details about the following experiment. This overwrites infos given in the instance description (YAML) in [deployments](Deployments.html) for Kubernetes.
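As a hedged sketch of a full call (the snippet that follows is truncated in this excerpt; the keyword names `requests`, `limits` and `nodeSelector` are assumptions, with values borrowed from the TPC-H example elsewhere in this commit):

```python
# Hypothetical sketch; the keyword names are assumptions, not confirmed here.
# The cpu/memory/cpu_type values match the TPC-H example in this commit.
resource_settings = dict(
    requests={"cpu": "4000m", "memory": "16Gi"},  # reserved per DBMS container
    limits={"cpu": 0, "memory": 0},               # 0 = unlimited (assumption)
    nodeSelector={"cpu": "epyc-7542"},            # pin pods to a CPU type
)

# The actual call would then be:
# cluster.set_resources(**resource_settings)
```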

```
cluster.set_resources(
@@ -118,7 +118,7 @@ All occurrences of `{shard_count}` in the DDL scripts of the following experimen
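The `{shard_count}` substitution described above behaves like plain Python string formatting; a sketch with an invented DDL fragment:

```python
# Invented DDL fragment for illustration; bexhoma replaces {shard_count}
# in the DDL scripts of the following experiments in a comparable way.
ddl_template = "CREATE TABLE lineitem (l_orderkey INT) PARTITION INTO {shard_count} SHARDS"
ddl = ddl_template.format(shard_count=8)
assert "{shard_count}" not in ddl
```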
## Run Experiment

<p align="center">
<img src="run-experiment.png" width="160">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/run-experiment.png" width="160">
</p>

The command `cluster.runExperiment()` is short for:
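The expansion itself is truncated in this excerpt; as a rough sketch, the phases follow the section headings of this page (Prepare, Start, Run Benchmarks, Stop, Clean — the exact decomposition and order are assumptions):

```python
# Rough sketch of the orchestration flow; method names follow this page's
# section headings, and the phase order is an assumption.
class ClusterSketch:
    def __init__(self):
        self.phases = []

    def prepareExperiment(self):
        self.phases.append("prepare")

    def startExperiment(self):
        self.phases.append("start")

    def runBenchmarks(self):
        self.phases.append("benchmark")

    def stopExperiment(self):
        self.phases.append("stop")

    def cleanExperiment(self):
        self.phases.append("clean")

    def runExperiment(self):
        # "short for" the individual phases
        self.prepareExperiment()
        self.startExperiment()
        self.runBenchmarks()
        self.stopExperiment()
        self.cleanExperiment()

cluster = ClusterSketch()
cluster.runExperiment()
# cluster.phases is now ['prepare', 'start', 'benchmark', 'stop', 'clean']
```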
Expand All @@ -136,7 +136,7 @@ In a k8s cluster, this also starts the DBMS.


<p align="center">
<img src="prepare-experiment.png" width="320">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/prepare-experiment.png" width="320">
</p>

### On K8s
Expand All @@ -150,7 +150,7 @@ cluster.startPortforwarding()
* `cluster.createDeployment()`: Creates a deployment (pod and services) of Docker images to k8s
* Setup Network `cluster.startPortforwarding()`: Forwards the port of the DBMS in the pod to localhost:fixedport (same for all containers)

See the documentation for more information about [deployments](Deployments.md).
See the documentation for more information about [deployments](Deployments.html).

### On AWS

@@ -179,7 +179,7 @@ cluster.mountVolume()
This yields a fully loaded DBMS with a fixed port on the virtual machine in a Docker container with the fixed name `benchmark` (AWS) or a pod with the fixed label `app=` (k8s), respectively.

<p align="center">
<img src="start-experiment.png" width="320">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/start-experiment.png" width="320">
</p>

### On K8s
@@ -228,7 +228,7 @@ cluster.loadData()
The command `cluster.runBenchmarks()` runs an [external benchmark tool](https://github.com/Beuth-Erdelt/GEO-GPU-DBMS-Benchmarks).

<p align="center">
<img src="https://github.com/Beuth-Erdelt/DBMS-Benchmarker/raw/master/docs/Concept-Benchmarking.png" width="320">
<img src="https://github.com/Beuth-Erdelt/DBMS-Benchmarker/raw/master/docs/Concept-Benchmarking.png" width="320">
</p>

### Connectionname and Client Configurations
@@ -288,7 +288,7 @@ The result folder also contains

**Note: this means it stores confidential information.**

Results are best inspected using the [dashboard](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Dashboard.md).

### Collect Host Information

@@ -331,7 +331,7 @@ This generates reports about all experiments that have been stored in the same c
This yields the virtual machine in (almost) the same state as if it was just prepared without restarting it.

<p align="center">
<img src="stop-experiment.png" width="320">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/stop-experiment.png" width="320">
</p>

### On K8s
@@ -369,7 +369,7 @@ cluster.cleanDocker()
This removes everything from the virtual machine that is related to the experiment (except for results) and shuts it down.

<p align="center">
<img src="clean-experiment.png" width="320">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/clean-experiment.png" width="320">
</p>

### On K8s
12 changes: 6 additions & 6 deletions docs/Concept.md
@@ -17,13 +17,13 @@ A **benchmark setting** consists of
## Workflow

The **management** roughly means
* [configure](Config.md#how-to-configure-an-experiment-setup), [set up](Config.md#example-setup-different-dbms-on-same-instance) and [start](API.md#prepare-experiment) a virtual machine environment
* [start](API.md#start-experiment) a DBMS and load raw data
* [run](API.md#run-benchmarks) some benchmarks, fetch metrics and do reporting
* [shut](API.md#stop-experiment) down environment and [clean up](API.md#clean-experiment)
* [configure](Config.html#how-to-configure-an-experiment-setup), [set up](Config.html#example-setup-different-dbms-on-same-instance) and [start](API.html#prepare-experiment) a virtual machine environment
* [start](API.html#start-experiment) a DBMS and load raw data
* [run](API.html#run-benchmarks) some benchmarks, fetch metrics and do reporting
* [shut](API.html#stop-experiment) down environment and [clean up](API.html#clean-experiment)

<p align="center">
<img src="architecture.png" width="640">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/architecture.png" width="640">
</p>

In more detail this means
@@ -64,4 +64,4 @@ This tool relies on
* [paramiko](http://www.paramiko.org/) for SSH handling
* [scp](https://pypi.org/project/scp/) for SCP handling
* [kubernetes](https://github.com/kubernetes-client/python) for k8s management
* and some more [python libraries](https://github.com/perdelt/kubecluster/blob/master/requirements.txt)
* and some more [python libraries](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/blob/master/requirements.txt)
4 changes: 2 additions & 2 deletions docs/Config.md
@@ -2,7 +2,7 @@

We need
* a [config file](#clusterconfig) containing cluster information, say `cluster.config`
* a [config folder](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#config-folder) for the benchmark tool, say `experiments/tpch/`, containing a config file `queries.config` for the [queries](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#query-file)
* some additional data depending on whether it is an [AWS](#on-aws) or a [k8s](#on-k8s) cluster
* a python script managing the experimental workflow, say `experiment-tpch.py`

@@ -208,4 +208,4 @@ Monitoring requires
* A dict of exporters given as docker commands
* Will be installed and activated automatically at each instance when `cluster.prepareExperiment()` is invoked.

More information can be found [here](Monitoring.md)
More information can be found [here](Monitoring.html)
4 changes: 2 additions & 2 deletions docs/DBMS.md
@@ -20,7 +20,7 @@ This document contains examples for

### Deployment

See documentation of [deployments](Deployments.md).
See documentation of [deployments](Deployments.html).

### Configuration

Expand All @@ -47,7 +47,7 @@ See documentation of [deployments](Deployments.md).
```
This has
* a base name for the DBMS
* a placeholder `template` for the [benchmark tool](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#connection-file)
* the JDBC driver jar locally available
* a command `loadData` for running the init scripts with `{scriptname}` as a placeholder for the script name inside the container
* `{serverip}` as a placeholder for the host address (localhost for k8s, an Elastic IP for AWS)
4 changes: 2 additions & 2 deletions docs/Deployments.md
@@ -29,7 +29,7 @@ To generate a file `'deployment-'+docker+'-'+instance+'.yml'` from this

This repository includes some templates at https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/tree/master/k8s

[DBMS](DBMS.md) included are:
[DBMS](DBMS.html) included are:
* MariaDB (10.4.6)
* MonetDB (11.31.7)
* OmniSci (v5.4.0)
@@ -81,4 +81,4 @@ cluster.set_resources(
})
```

For further information and options see the [documentation](API.md#set-resources).
For further information and options see the [documentation](API.html#set-resources).
38 changes: 13 additions & 25 deletions docs/Example-TPC-H.md
@@ -6,18 +6,13 @@ This example shows how to benchmark 22 reading queries Q1-Q22 derived from TPC-H
Official TPC-H benchmark - http://www.tpc.org/tpch

**Content**:
* [Prerequisites](#prerequisites)
* [Perform Benchmark](#perform-benchmark)
* [Evaluate Results in Dashboard](#evaluate-results-in-dashboard)

## Prerequisites

We need a configuration file containing the following information in a predefined format, cf. [demo file](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/tree/master/k8s-cluster.config).
The demo also includes the necessary settings for some [DBMS](DBMS.html): MariaDB, MonetDB, MySQL, OmniSci and PostgreSQL.

We may adjust the configuration to match the actual environment.
This in particular holds for `imagePullSecrets`, `tolerations` and `nodeSelector` in the [YAML files](Deployments.html).

For basic execution of benchmarking we need
* a Kubernetes (K8s) cluster
Expand All @@ -36,13 +31,12 @@ For also enabling monitoring we need

## Perform Benchmark

For performing the experiment we can run the [demo file](../demo-tpch-k8s.py).
To perform the experiment, we can run the [demo file](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/blob/master/tpch.py).

The actual benchmarking is done by
The actual configurations to benchmark are added by
```
# run experiments
run_experiments(docker='MonetDB', alias='DBMS-A')
run_experiments(docker='PostgreSQL', alias='DBMS-B')
config = configurations.default(experiment=experiment, docker='MonetDB', configuration='MonetDB-{}'.format(cluster_name), alias='DBMS A')
config = configurations.default(experiment=experiment, docker='PostgreSQL', configuration='PostgreSQL-{}'.format(cluster_name), alias='DBMS D')
```

### Adjust Parameter
Expand All @@ -52,18 +46,12 @@ You maybe want to adjust some of the parameters that are set in the file.
The hardware requirements are set via
```
# pick hardware
cpu = "4000m"
memory = '16Gi'
cpu_type = 'epyc-7542'
```

The number of executions of each query can be adjusted here
```
# set query parameters - this overwrites infos given in the query file
cluster.set_querymanagement(numRun = 1)
cpu = str(args.request_cpu)
memory = str(args.request_ram)
cpu_type = str(args.request_cpu_type)
```

### Evaluate Results in Dashboard
## Evaluate Results in Dashboard

Evaluation is done using DBMSBenchmarker: https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Dashboard.md
Evaluation is done using DBMSBenchmarker: https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Dashboard.html

8 changes: 4 additions & 4 deletions docs/Monitoring.md
@@ -13,7 +13,7 @@ This document contains information about the
## Concept

<p align="center">
<img src="architecture.png" width="640">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/master/docs/architecture.png" width="640">
</p>

There is
Expand All @@ -27,7 +27,7 @@ To be documented

### Kubernetes

* Experiment Host: Exporters are part of the [deployments](Deployments.md)
* Experiment Host: Exporters are part of the [deployments](Deployments.html)
* Monitor: Servers are deployed using Docker images, fixed on a separate monitoring instance
* Manager: See [configuration](#configuration)

Expand All @@ -49,11 +49,11 @@ We insert information about
* metrics definitions

into the cluster configuration.
This is handed over to the [DBMS configuration](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#connection-file) of the [benchmarker](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Concept.md#monitoring-hardware-metrics) in a [monitoring section](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/docs/Options.md#monitoring).
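A minimal sketch of such a monitoring section as Python data — the key names (`prometheus_url`, `metrics`, `query`, `title`) are assumptions modeled on the linked benchmarker docs, and the PromQL expression is illustrative (it reuses the `container_name="dbms"` label from the example in this document):

```python
# Assumed shape of the monitoring configuration handed to the benchmarker's
# connection file; key names and the PromQL expression are illustrative.
monitoring = {
    "prometheus_url": "http://localhost:9090/api/v1/",
    "metrics": {
        "total_cpu_memory": {
            "query": 'container_memory_working_set_bytes{container_name="dbms"}',
            "title": "CPU Memory [MiB]",
        },
    },
}
```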

### Example

The details of the metrics correspond to the YAML configuration of the [deployments](Deployments.md):
The details of the metrics correspond to the YAML configuration of the [deployments](Deployments.html):
* `job="monitor-node"`
* `container_name="dbms"`

19 changes: 17 additions & 2 deletions docs/index.rst
@@ -1,9 +1,24 @@
======================
Bexhoma Python Package
======================

----------------------------------------------------------------------------------------------
Benchmark Experiment Host Manager for Orchestration of DBMS Benchmarking Experiments in Clouds
----------------------------------------------------------------------------------------------

.. toctree::
:maxdepth: 2
:caption: Table of Contents:
:hidden:

README.md
../README.md
./Example-TPC-H.md
./Concept.md
./Config.md
./DBMS.md
./Deployments.md
./Monitoring.md
...

.. mdinclude:: ../README.md

2 changes: 1 addition & 1 deletion images/README.md
@@ -1,6 +1,6 @@
# Benchmark Experiment Host Manager

In this folder is a collection of useful images.
This folder contains a collection of useful Docker images.

## Orchestration of Benchmarking Experiments

1 change: 1 addition & 0 deletions requirements.txt
@@ -7,3 +7,4 @@ kubernetes>=9.0.0
psutil>=5.6.1
dbmsbenchmarker>=0.11.4
m2r2
myst_parser
2 changes: 1 addition & 1 deletion setup.py
@@ -8,7 +8,7 @@

setuptools.setup(
name="bexhoma",
version="0.5.16",
version="0.5.17",
author="Patrick Erdelt",
author_email="perdelt@beuth-hochschule.de",
description="This Python tool helps managing DBMS benchmarking experiments in a Kubernetes-based HPC cluster environment. It enables users to configure hardware / software setups for easily repeating tests over varying configurations.",
