V0.6.7 (#201)
* Masterscript: ENVs for statefulsets

* Masterscript: ENVs for statefulsets only adding

* Masterscript: ENVs for sut-job

* Masterscript: VolumeMounts optional in sut deployment

* Masterscript: Volumes optional in sut deployment

* Masterscript: service_name for loading scripts as argument

* Masterscript: volumeClaimTemplates for statefulset reads storage requests

* Masterscript: volumeClaimTemplates as a list

* Masterscript: SUT's services can have component names different from selector

* Masterscript: Catch more exceptions in get_host_diskspace_used_data()

* Masterscript: storage_parameter in connection infos

* Tool: Show worker volumes

* Update README.md

JOSS draft badge

* Masterscript: Also remove worker storage after experiment

* Masterscript: Label DBMS in job

* Masterscript: Label DBMS in monitoring

* Docs: CockroachDB tested successfully

* Masterscript: Do not retry delete_pvc() if pvc cannot be found

* Masterscript: Name of worker storage contains experiment code

* Masterscript: Find worker pvc by labels

* Masterscript: benchmarking_parameters per experiment and configuration

* Masterscript: Wait 5s before (re)checking status of workers

* Masterscript: Remove worker pv if not wanted

* Masterscript: Loading waits 60 secs for all pods

* Masterscript: Benchbase similar to ycsb

* Masterscript: Benchbase convert results to df

* Masterscript: Benchbase collect results into df

* Masterscript: Benchbase collect results into df and set index

* Masterscript: Benchbase uses name_format for results

* Masterscript: Fix container to dashboard when copying to result component

* Masterscript: Accept successful pods as completed

* Masterscript: Benchbase all collect results into single df at end of benchmark

* Masterscript: Benchbase no results for loading phase and benchmarker results per job only

* Masterscript: YCSB all collect results into single df at end of benchmark

* Masterscript: Debug messages about evaluation

* Masterscript: BEXHOMA_CONNECTION set to connection for benchmarker, to configuration otherwise

* Masterscript: HammerDB merge results into df

* Masterscript: HammerDB merge results into dfm, differ between connection and configuration

* Masterscript: BEXHOMA_CONNECTION test for benchbase

* Masterscript: Benchbase merge results into dfm, differ between connection and configuration

* Masterscript: Debug messages about evaluation for YCSB collected dfs

* Masterscript: HammerDB extract pod name

* Masterscript: YCSB extract pod name

* Masterscript: YCSB dump more information

* Masterscript: HammerDB extract pod name

* Masterscript: YCSB extract pod name

* Masterscript: YCSB evaluation improved

* Masterscript: Benchbase evaluation improved

* Masterscript: BEXHOMA meta data in job envs

* Masterscript: HammerDB concat dbms infos

* Masterscript: Fetch metrics for specific connection

* Masterscript: HammerDB also keep config file for single connection

* Masterscript: Benchbase requests schema file

* Masterscript: All experiments keep config file for single connection

* Masterscript: Allow all job ENVs to be overwritten

* Masterscript: BEXHOMA_CLIENT set to number of benchmarker client

* Masterscript: Fetch metrics for specific connection for all benchmarker

* Masterscript: BEXHOMA_CLIENT set to number of benchmarker client, thus is 0 during loading

* Masterscript: Show list of open benchmarks per configuration

* Masterscript: NEVER rerun, only one connection in config for detached - all benchmarker collect all dbms in one connection file

* Masterscript: Copy connection file for connection specified by name

* Masterscript: Copy connection file for connection specified by name

* Masterscript: fetch_metrics_loading() for connection

* Masterscript: fetch_metrics_loading() for connection, run in dashboard pod

* Masterscript: set connection file name

* Masterscript: fetch_metrics_loading() after loading, dump results

* Build script for Docker images

* Python 3.11.5 instead of 3.12 because of bug in setuptools

* Benchmarker: Less output

* DBMS: YugabyteDB dummy deployment

* Docs: YCSB at entry page

* Docs: scaled-out drivers at entry page

* Docs: TPC-C at entry page

* Docs: scaled-out drivers at entry page

* Docs: Example: Run a custom SQL workload

* # Conflicts:
#	README.md
#	bexhoma/configurations.py
#	bexhoma/experiments.py

* requirements: no nbconvert

* requirements: python 3.11.15

* Docs: .readthedocs.yaml

* requirements: no m2r2

* requirements: sphinx

* Docs: Example: Run a custom SQL workload

* Docs: Formatting

* YCSB: scaling-factor-operations

* YugabyteDB dummy less resources

* fix: requirements.txt to reduce vulnerabilities

* fix: requirements.txt to reduce vulnerabilities - Werkzeug>=3.0.1

* Require only Python 3.10.2

* v0.6.7 prerelease
perdelt committed Nov 15, 2023
1 parent bb6c399 commit 23e570b
Showing 6 changed files with 22 additions and 21 deletions.
25 changes: 10 additions & 15 deletions README.md
@@ -23,7 +23,7 @@ The basic workflow is [1,2]: start a containerized version of the DBMS, install
A more advanced workflow is: Plan a sequence of such experiments, run plan as a batch and join results for comparison.

It is also possible to scale-out drivers for generating and loading data and for benchmarking to simulate cloud-native environments as in [4].
See [example](TPCTC23/README.md) results as presented in [A Cloud-Native Adoption of Classical DBMS Performance Benchmarks and Tools](http://dx.doi.org/10.13140/RG.2.2.29866.18880) and how they are generated.
See [example](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/TPCTC23/README.md) results as presented in [A Cloud-Native Adoption of Classical DBMS Performance Benchmarks and Tools](http://dx.doi.org/10.13140/RG.2.2.29866.18880) and how they are generated.

See the [homepage](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager) and the [documentation](https://bexhoma.readthedocs.io/en/latest/).

@@ -33,33 +33,28 @@ If you encounter any issues, please report them to our [Github issue tracker](ht

1. Download the repository: https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager
1. Install the package `pip install bexhoma`
1. Make sure you have a working `kubectl` installed
(Also make sure to have access to a running Kubernetes cluster - for example [Minikube](https://minikube.sigs.k8s.io/docs/start/))
(Also make sure, you can create PV via PVC and dynamic provisioning)
1. Make sure you have a working `kubectl` installed.
* (Also make sure to have access to a running Kubernetes cluster - for example [Minikube](https://minikube.sigs.k8s.io/docs/start/))
* (Also make sure you can create PVs via PVCs and dynamic provisioning)
1. Adjust [configuration](https://bexhoma.readthedocs.io/en/latest/Config.html)
1. Rename `k8s-cluster.config` to `cluster.config`
1. Set name of context, namespace and name of cluster in that file
1. Install result folder
Run `kubectl create -f k8s/pvc-bexhoma-results.yml`
1. Install result folder: Run `kubectl create -f k8s/pvc-bexhoma-results.yml`


## Quickstart


1. Run `python ycsb.py -ms 1 -dbms PostgreSQL -workload a run`.
This installs PostgreSQL and runs YCSB workload A with varying target.
The driver is monolithic with 64 threads.
The experiments runs a second time with the driver scaled out to 8 instances each having 8 threads.
1. You can watch status using `bexperiments status` while running.
This is equivalent to `python cluster.py status`.
1. After benchmarking has finished, run `bexperiments dashboard` to connect to a dashboard. You can open dashboard in browser at `http://localhost:8050`.
This is equivalent to `python cluster.py dashboard`
Alternatively you can open a Jupyter notebook at `http://localhost:8888`.
1. Run `python ycsb.py -ms 1 -dbms PostgreSQL -workload a run`. This installs PostgreSQL and runs YCSB workload A with a varying target. The driver is monolithic with 64 threads. The experiment runs a second time with the driver scaled out to 8 instances, each having 8 threads.
1. You can watch status using `bexperiments status` while running. This is equivalent to `python cluster.py status`.
1. After benchmarking has finished, run `bexperiments dashboard` to connect to a dashboard. You can open the dashboard in a browser at `http://localhost:8050`. This is equivalent to `python cluster.py dashboard`. Alternatively you can open a Jupyter notebook at `http://localhost:8888`.
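The "varying target" in step 1 can be sketched as follows. This is an illustrative snippet, not bexhoma code: per the `ycsb.py` options in this commit, `--list-target-factors` is a comma-separated list of factors of a 16384 ops/s base, and each factor yields one benchmark run at that target rate.

```python
# Sketch (not the actual bexhoma code) of how benchmark targets are derived
# from the --list-target-factors option described in ycsb.py.

TARGET_BASE = 16384  # ops/s, per the ycsb.py argument help text

def targets_from_factors(list_target_factors="1,2,3,4,5,6,7,8", base=TARGET_BASE):
    """Turn a comma-separated factor list like '1,2,4' into target rates."""
    factors = [int(f) for f in list_target_factors.split(",")]
    return [f * base for f in factors]

print(targets_from_factors("1,2,4"))  # -> [16384, 32768, 65536]
```

With the default list `"1,2,3,4,5,6,7,8"` this produces eight runs, ramping the target from 16384 up to 131072 ops/s.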


## More Information

For full power, use this tool as an orchestrator as in [2]. It also starts a monitoring container using [Prometheus](https://prometheus.io/) and a metrics collector container using [cAdvisor](https://github.com/google/cadvisor). For analytical use cases, the Python package [dbmsbenchmarker](https://github.com/Beuth-Erdelt/DBMS-Benchmarker), [3], is used as query executor and evaluator as in [1,2].
For transactional use cases, HammerDB's TPC-C, Benchbase's TPC-C and YCSB are used as drivers for generating and loading data and for running the workload as in [4].

See the [images](https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/tree/master/images/) folder for more details.

<p align="center">
2 changes: 1 addition & 1 deletion docs/Example-custom.md
@@ -2,7 +2,7 @@

## Preparation

* clone repository, branch v0.6.5
* clone repository
* pip install requirements
* rename `k8s-cluster.config` to `cluster.config`
* replace inside that file where to store the results locally
4 changes: 2 additions & 2 deletions k8s/deploymenttemplate-YugabyteDB.yml
@@ -63,8 +63,8 @@ spec:
#- {containerPort: 5433}
#- {containerPort: 9042}
resources:
limits: {cpu: 16000m, memory: 128Gi}
requests: {cpu: 16000m, memory: 128Gi}
limits: {cpu: 1000m, memory: 1Gi}
requests: {cpu: 1000m, memory: 1Gi}
#, ephemeral-storage: "1536Gi"}
volumeMounts:
- {mountPath: /data, name: benchmark-data-volume}
2 changes: 2 additions & 0 deletions requirements.txt
@@ -13,3 +13,5 @@ redis
mistune>=2.0.3 # not directly required, pinned by Snyk to avoid a vulnerability
numpy>=1.22.2 # not directly required, pinned by Snyk to avoid a vulnerability
setuptools>=65.5.1 # not directly required, pinned by Snyk to avoid a vulnerability
pillow>=10.0.1 # not directly required, pinned by Snyk to avoid a vulnerability
Werkzeug>=3.0.1
4 changes: 2 additions & 2 deletions setup.py
@@ -8,7 +8,7 @@

setuptools.setup(
name="bexhoma",
version="0.6.6",
version="0.6.7",
author="Patrick Erdelt",
author_email="perdelt@beuth-hochschule.de",
description="This Python tool helps manage DBMS benchmarking experiments in a Kubernetes-based HPC cluster environment. It enables users to configure hardware / software setups for easily repeating tests over varying configurations.",
@@ -22,7 +22,7 @@
"Operating System :: OS Independent",
],
license="GNU Affero General Public License v3",
python_requires='>=3.11.5',
python_requires='>=3.10.2',
include_package_data=True,
install_requires=requirements,
package_dir={'bexhoma': 'bexhoma'},
6 changes: 5 additions & 1 deletion ycsb.py
@@ -44,6 +44,7 @@
parser.add_argument('-nl', '--num-loading', help='number of parallel loaders per configuration', default=1)
parser.add_argument('-nlp', '--num-loading-pods', help='total number of loaders per configuration', default=[1,8])
parser.add_argument('-sf', '--scaling-factor', help='scaling factor (SF) = number of rows in millions', default=1)
parser.add_argument('-sfo', '--scaling-factor-operations', help='scaling factor (SF) = number of operations in millions (=SF if not set)', default=None)
parser.add_argument('-su', '--scaling-users', help='scaling factor = number of total threads', default=64)
parser.add_argument('-sbs', '--scaling-batchsize', help='batch size', default="")
parser.add_argument('-ltf', '--list-target-factors', help='comma separated list of factors of 16384 ops as target - default range(1,9)', default="1,2,3,4,5,6,7,8")
@@ -75,6 +76,9 @@
monitoring_cluster = args.monitoring_cluster
mode = str(args.mode)
SF = str(args.scaling_factor)
SFO = args.scaling_factor_operations # keep None as-is; str(None) would be "None" and the fallback below would never trigger
if SFO is None:
SFO = SF
SU = int(args.scaling_users)
target_base = int(args.target_base)
list_target_factors = args.list_target_factors
@@ -214,7 +218,7 @@
#experiment.name_format = '{dbms}-{threads}-{pods}-{target}'
experiment.set_experiment(script='Schema')
ycsb_rows = int(SF)*1000000 # 1kb each, that is SF is size in GB
ycsb_operations = int(SF)*1000000
ycsb_operations = int(SFO)*1000000
# note more infos about experiment in workload description
experiment.workload['info'] = experiment.workload['info']+" YCSB data is loaded using several processes."
if len(args.dbms):
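The new `-sfo`/`--scaling-factor-operations` logic above can be exercised standalone. This is a sketch mirroring the ycsb.py diff (the function name is mine, not bexhoma's): SF sets the row count in millions (rows are 1 KB each, so SF is roughly the data size in GB), and the operation count falls back to SF when `-sfo` is not given.

```python
# Sketch of the SF / SFO sizing logic from the ycsb.py diff above.
# Rows are 1 KB each, so the scaling factor SF is roughly the data size in GB;
# SFO (operations in millions) defaults to SF when --scaling-factor-operations
# is not set.

def ycsb_sizing(scaling_factor, scaling_factor_operations=None):
    SF = str(scaling_factor)
    SFO = scaling_factor_operations  # keep None distinct; str(None) == "None"
    if SFO is None:
        SFO = SF
    ycsb_rows = int(SF) * 1000000        # 1 KB per row -> SF ~ size in GB
    ycsb_operations = int(SFO) * 1000000
    return ycsb_rows, ycsb_operations

print(ycsb_sizing(1))      # -> (1000000, 1000000), SFO defaults to SF
print(ycsb_sizing(1, 10))  # -> (1000000, 10000000), 1 GB of data, 10M operations
```

Note why the `None` check matters: argparse returns `None` for the unset option, and converting it with `str()` first would produce the string `"None"`, so the fallback to SF would never apply.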
