Docs: Typos
perdelt committed Jan 4, 2022
1 parent 21abfd6 commit b7f714a
Showing 4 changed files with 15 additions and 15 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -17,6 +17,7 @@ See the [homepage](https://github.com/Beuth-Erdelt/DBMS-Benchmarker) and the [do
## Key Features

DBMS-Benchmarker

* is Python3-based
* helps to **benchmark DBMS**
* connects to all DBMS having a JDBC interface - including GPU-enhanced DBMS
@@ -52,6 +53,7 @@ As a result we obtain an interactive dashboard to inspect timing aspects.
### Configuration

We need to provide

* a [DBMS configuration file](Options.html#connection-file), e.g. in `./config/connections.config`
```
[
```
@@ -112,9 +114,7 @@ Run the command: `dbmsdashboard`
This will start the evaluation dashboard at `localhost:8050`.
Visit the address in a browser and select the experiment `<code>`.

This is equivalent to `python dashboard.py`.

Alternatively you may use a [Jupyter notebooks](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/Evaluation-Demo.ipynb), see a [rendered example](https://beuth-erdelt.github.io/DBMS-Benchmarker/Evaluation-Demo.html).
Alternatively you may use a [Jupyter notebook](https://github.com/Beuth-Erdelt/DBMS-Benchmarker/blob/master/Evaluation-Demo.ipynb), see a [rendered example](https://beuth-erdelt.github.io/DBMS-Benchmarker/Evaluation-Demo.html).
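
For programmatic access outside the dashboard or notebook, a minimal sketch along these lines is conceivable; the result folder, experiment code, and API names (`inspector.inspector`, `load_experiment`) are assumptions taken from the project's inspection examples and should be verified against the current documentation.

```
# Hedged sketch: load a finished experiment for inspection in Python.
# Folder path and experiment code are placeholders; the inspector API
# names are assumptions based on the project's inspection examples.
from dbmsbenchmarker import inspector

resultfolder = "./results"      # local folder containing result folders
code = "1234512345"             # code of the experiment to inspect

evaluate = inspector.inspector(resultfolder)
evaluate.load_experiment(code)
# evaluate now gives access to the collected timers and result sets of that experiment
```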

## Benchmarking in a Kubernetes Cloud

6 changes: 2 additions & 4 deletions docs/Concept.md
@@ -96,7 +96,7 @@ In the end we have
* Per DBMS and Query:
* Time per session
* Time per run
* Time per run, split up in: connection / execution / data transfer
* Time per run, split up into: connection / execution / data transfer
* Latency and Throughputs per run
* Latency and Throughputs per session
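
As a rough illustration of these quantities (not the benchmarker's internal code), a single run's total time can be split at the DB-API boundaries, and latency and throughput follow from the list of per-run times; the `connect` callable and the simple formulas below are assumptions made for the sketch.

```
# Illustrative only: split one run into connection / execution / data transfer
# and derive latency and throughput from a list of per-run times.
import time
import statistics

def timed_run(connect, sql):
    t0 = time.perf_counter()
    connection = connect()            # establish the connection
    t1 = time.perf_counter()
    cursor = connection.cursor()
    cursor.execute(sql)               # execute the query
    t2 = time.perf_counter()
    rows = cursor.fetchall()          # transfer the result set
    t3 = time.perf_counter()
    connection.close()
    return {"connection": t1 - t0, "execution": t2 - t1,
            "datatransfer": t3 - t2, "run": t3 - t0}

def latency_throughput(run_times):
    latency = statistics.mean(run_times)          # mean seconds per run
    throughput = len(run_times) / sum(run_times)  # runs per second (sequential runs)
    return latency, throughput
```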

@@ -113,10 +113,9 @@ Each query will be sent to every DBMS in the same number of runs.
</p>

This also respects randomization, i.e. every DBMS receives exactly the same versions of the query in the same order.

We assume all DBMS will give us the same result sets.
Without randomization, each run should yield the same result set.
This tool automatically can check these assumptions by **comparison**.
This tool can check these assumptions automatically by **comparison**.
The resulting data table is handled as a list of lists and treated by this:
```
# restrict precision
@@ -130,7 +129,6 @@ columnnames = [[i[0].upper() for i in connection.cursor.description]]
hashed = columnnames + [[hashlib.sha224(pickle.dumps(data)).hexdigest()]]
```
Result sets of different runs (not randomized) and different DBMS can be compared by their sorted table (small data sets) or their hash value or size (bigger data sets).

In order to do so, result sets (or their hash value or size) are stored as lists of lists and additionally can be saved as csv files or pickled pandas dataframes.
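
The fenced snippet above is shortened by the diff view; as a self-contained illustration of the same comparison idea (restrict precision, sort, then hash the pickled table together with upper-cased column names), a sketch might look as follows — it is not the tool's exact code and the helper names are assumptions.

```
# Illustrative sketch of fingerprinting a result set for comparison.
# Helper names are assumptions; only the hashing idea mirrors the snippet above.
import hashlib
import pickle

def normalize(rows, precision=4):
    # round floats and sort rows so equivalent result sets normalize identically
    rounded = [[round(v, precision) if isinstance(v, float) else v for v in row]
               for row in rows]
    return sorted(map(tuple, rounded))

def fingerprint(columnnames, rows, precision=4):
    data = normalize(rows, precision)
    header = [[name.upper() for name in columnnames]]
    return header + [[hashlib.sha224(pickle.dumps(data)).hexdigest()]]

# two runs (or two DBMS) are considered equal if their fingerprints match:
# fingerprint(cols_a, rows_a) == fingerprint(cols_b, rows_b)
```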

## Monitoring Hardware Metrics
2 changes: 2 additions & 0 deletions docs/Options.md
@@ -17,6 +17,7 @@ Basically this can be done running `dbmsbenchmarker run` or `dbmsbenchmarker con
The lists of [DBMS](#connection-file) and [queries](#query-file) are given in config files in dict format.

Benchmarks can be [parametrized](#query-file) by

* number of benchmark runs: *Is performance stable across time?*
* number of benchmark runs per connection: *How does reusing a connection affect performance?*
* number of warmup and cooldown runs, if any: *How does (re)establishing a connection affect performance?*
@@ -26,6 +27,7 @@ Benchmarks can be [parametrized](#query-file) by
* optional [comparison](#results-and-comparison) of result sets: *Do I always receive the same result sets?*

Benchmarks can be [randomized](#randomized-query-file) (optionally with specified [seeds](#random-seed) for reproducible results) to avoid caching side effects and to increase variety of queries by taking samples of arbitrary size from a

* list of elements
* dict of elements (one-to-many relations)
* range of integers
16 changes: 8 additions & 8 deletions paper.md
@@ -70,7 +70,7 @@ As a result we obtain an interactive dashboard to inspect timing aspects.
### Configuration

We need to provide
* a [DBMS configuration file](Options.html#connection-file), e.g. in `./config/connections.config`
* a [DBMS configuration file](#connection-file), e.g. in `./config/connections.config`
```
[
{
@@ -86,7 +86,7 @@ We need to provide
]
```
* the required JDBC driver, e.g. `mysql-connector-java-8.0.13.jar`
* a [Queries configuration file](Options.html#query-file), e.g. in `./config/queries.config`
* a [Queries configuration file](#query-file), e.g. in `./config/queries.config`
```
{
'name': 'Some simple queries',
```
@@ -266,7 +266,7 @@ In order to do so, result sets (or their hash value or size) are stored as lists

## Monitoring Hardware Metrics

To make hardware metrics available, we must [provide](Options.html#connection-file) an API URL for a Prometheus Server.
To make hardware metrics available, we must [provide](#connection-file) an API URL for a Prometheus Server.
The tool collects metrics from the Prometheus server with a step size of 1 second.
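As an illustration of what such a collection step involves (not the tool's internal code), the standard Prometheus HTTP API can be queried at 1-second resolution; the server URL and the PromQL expression below are placeholders.

```
# Hedged sketch: fetch a hardware metric from Prometheus at 1-second steps.
# Server URL and PromQL expression are placeholders.
import time
import requests

PROMETHEUS_API = "http://localhost:9090/api/v1"

def fetch_metric(promql, duration_s):
    end = time.time()
    start = end - duration_s
    response = requests.get(f"{PROMETHEUS_API}/query_range",
                            params={"query": promql, "start": start,
                                    "end": end, "step": 1})
    response.raise_for_status()
    # each returned series carries a list of [timestamp, value] pairs
    return response.json()["data"]["result"]

# e.g. CPU usage of the DBMS containers during the last five minutes
series = fetch_metric("sum(rate(container_cpu_usage_seconds_total[1m]))", 300)
```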

![Caption for example figure.\label{fig:Concept-Monitoring}](docs/Concept-Monitoring.png){ width=320 }
@@ -914,7 +914,7 @@ optional arguments:

It has two options:
* `--result-folder`: Path of a local folder containing result folders. This parameter is the same as for `benchmark.py`
* `--anonymize`: If this flag is set, all DBMS are anonymized following the parameters in their [configuration](Options.html#connection-file).
* `--anonymize`: If this flag is set, all DBMS are anonymized following the parameters in their [configuration](#connection-file).

When you start the dashboard it is available at `localhost:8050`.

@@ -926,7 +926,7 @@ Optionally you can activate to have some default panels that will be included at

## Concept

The dashboard analyzes the data in [three dimensions](Concept.html#evaluation) using various [aggregation functions](Concept.html#aggregation-functions):
The dashboard analyzes the data in [three dimensions](#evaluation) using various [aggregation functions](#aggregation-functions):
<p align="center">
<img src="https://raw.githubusercontent.com/Beuth-Erdelt/DBMS-Benchmarker/master/docs/Evaluation-Cubes.png">
</p>
@@ -1006,7 +1006,7 @@ In the settings panel you can select the

* [Kind of measure](#data) you want to inspect (kind, name)
* [Type](#graph-panels) of plot (graph type, x-axis, annotate)
* [Aggregation functions](Concept.html#aggregation-functions).
* [Aggregation functions](#aggregation-functions).
The order of aggregation is
1. Query (run dimension)
1. Total (query dimension)
@@ -1032,11 +1032,11 @@ In the filter panel you can
* single queries
* receive details about
* the connections (configurations)
* [Configuration](Options.html#connection-file)
* [Configuration](#connection-file)
* DBMS
* Resources
* and the queries like
* [Configuration](Options.html#query-file)
* [Configuration](#query-file)
* Number of runs
* Result sets

