Cluster Profiler Tool

This small tool consists of two applications. The cluster-classifier-client analyses the nodes in a heterogeneous cluster and collects the following information:

CPU, RAM, IO hardware specifications
CPU single-thread and multi-thread performance benchmark
RAM benchmark
IO sequential and random read-write benchmark

The cluster-classifier-api exposes the gathered information through a REST API and provides:

Information about a single node in the cluster
Clusters inside the computing cluster, consisting of similar nodes
Labelling the nodes inside a cluster based on the cluster centroids
Labelling the nodes controlled by Kubernetes to enable fine-grained mapping of computing resources

BibTeX

@INPROCEEDINGS{bader2021tarema,
  author={Bader, Jonathan and Thamsen, Lauritz and Kulagina, Svetlana and Will, Jonathan and Meyerhenke, Henning and Kao, Odej},
  booktitle={2021 IEEE International Conference on Big Data (Big Data)}, 
  title={Tarema: Adaptive Resource Allocation for Scalable Scientific Workflows in Heterogeneous Clusters}, 
  year={2021},
  publisher={IEEE},
  pages={65-75},
  doi={10.1109/BigData52589.2021.9671519}}

Prerequisites

Install Ansible on your host machine
Add all target servers to the Ansible inventory under /etc/ansible/hosts
Replace the values in cluster-classifier-client/src/resources/application.properties with the values for your database
```
spring.datasource.url=jdbc:mysql://${MYSQL_HOST:remotehost}:3306/db_example
spring.datasource.username=yourDBusername
spring.datasource.password=yourDBpassword
```
You may also switch to a different database by removing the MySQL driver from the pom.xml and adding another one (e.g. PostgreSQL, MariaDB).
Run the following command in the cluster-classifier-client folder to build the executable jar:
```
./mvn clean package
```

Deploy cluster-classifier-client

Execute the Ansible playbook:

ansible-playbook ./ansible/run_cc.yml

Check results with the API

Run the following command in the cluster-classifier-api folder to build the executable jar:
```
./mvn clean package 
```
Then start the API:
```
java -jar cluster-classifier-api*.jar 
```
You can see a list of available REST endpoints under
```
URL:PORT/swagger-ui/index.html
```

Clustering the nodes

To cluster the nodes, we use the KMeansPlusPlusClusterer (org.apache.commons.math3.ml.clustering) and have implemented a Silhouette score to evaluate the number of clusters. The org.apache.commons.math3.ml.clustering package offers multiple clustering methods. Moreover, you can extend the Clusterer class to implement a custom clustering approach. If you want to use an evaluation metric other than the Silhouette score, you can replace our SilhouetteScore with a custom ClusterEvaluator.
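To illustrate what a Silhouette score measures, here is a minimal, dependency-free sketch. It is not the project's SilhouetteScore class: the class name SilhouetteSketch, the method signatures, and the sample points are all assumptions made for illustration. For each point it compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b) and averages (b - a) / max(a, b) over all points. Values close to 1 indicate well-separated clusters.

```java
/** Illustrative silhouette computation; names are hypothetical, not the project's API. */
final class SilhouetteSketch {

    /** Euclidean distance between two feature vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Mean silhouette coefficient over all points.
     * points[i] is a feature vector, assignment[i] its cluster index in [0, k).
     */
    static double silhouette(double[][] points, int[] assignment, int k) {
        int n = points.length;
        int[] sizes = new int[k];
        for (int c : assignment) {
            sizes[c]++;
        }

        double total = 0.0;
        for (int i = 0; i < n; i++) {
            int own = assignment[i];
            if (sizes[own] <= 1) {
                continue; // singleton cluster: coefficient is 0 by convention
            }
            // Sum of distances from point i to the members of every cluster.
            double[] sumDist = new double[k];
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    sumDist[assignment[j]] += distance(points[i], points[j]);
                }
            }
            double a = sumDist[own] / (sizes[own] - 1);
            double b = Double.POSITIVE_INFINITY;
            for (int c = 0; c < k; c++) {
                if (c != own && sizes[c] > 0) {
                    b = Math.min(b, sumDist[c] / sizes[c]);
                }
            }
            double max = Math.max(a, b);
            if (Double.isInfinite(b) || max == 0.0) {
                continue; // no other cluster, or all distances zero: contributes 0
            }
            total += (b - a) / max;
        }
        return total / n;
    }

    public static void main(String[] args) {
        // Made-up 2D points forming two obvious groups.
        double[][] points = {{0, 0}, {0, 1}, {10, 0}, {10, 1}};
        System.out.println("good split: " + silhouette(points, new int[] {0, 0, 1, 1}, 2));
        System.out.println("bad split:  " + silhouette(points, new int[] {0, 1, 0, 1}, 2));
    }
}
```

Evaluating the number of clusters then amounts to running the clusterer for several candidate values of k and keeping the one with the highest score.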

Labelling Kubernetes nodes

To disable labelling for Kubernetes go to the application.properties file under cluster-classifier-api and set kube.enable to false. Switch to true to enable Kubernetes labelling.

After clustering the nodes, our system starts labelling them.
In the first step, we compare the centroids by defining the quartiles depending on the number of node groups.

These labels are then stored in the database.
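The labelling step can be pictured with the following sketch, which is one plausible reading of the description above and not the project's actual implementation. It assumes that each node group has a single centroid value per feature (for example a CPU benchmark score) and that groups are labelled 1..k by the rank of that value. The class name CentroidLabelSketch, the method rankLabels, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative rank-based labelling of node groups; names are hypothetical. */
final class CentroidLabelSketch {

    /**
     * Returns labels[g] in 1..k for each group g, where 1 marks the group
     * with the smallest centroid value and k the group with the largest.
     */
    static int[] rankLabels(double[] centroidValues) {
        int k = centroidValues.length;
        Integer[] order = new Integer[k];
        for (int g = 0; g < k; g++) {
            order[g] = g;
        }
        // Sort group indices by their centroid value, ascending.
        Arrays.sort(order, Comparator.comparingDouble(g -> centroidValues[g]));

        int[] labels = new int[k];
        for (int rank = 0; rank < k; rank++) {
            labels[order[rank]] = rank + 1;
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up centroid values of one feature for three node groups.
        double[] cpuCentroids = {250.0, 900.0, 480.0};
        System.out.println(Arrays.toString(rankLabels(cpuCentroids)));
    }
}
```

With one such label per feature, every node inherits the labels of the group it was clustered into.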

The class KubernetesNodeLabeller handles the labelling of the Kubernetes nodes. We use a DefaultKubernetesClient, which means that the kube config file (~/.kube/config) is used to create the connection. For a more advanced setup, you might want to use the DefaultKubernetesClient constructor that accepts a custom configuration. The KubernetesNodeLabeller uses the Kubernetes API to apply the labels from the NodeRepresentations to the Kubernetes nodes.

How it works

Usage without own cluster setup

If you do not have your own compute cluster but want to test the capabilities of our tool, you can use our predefined Terraform setup. Executing Terraform sets up Google Compute Engine instances with various hardware specifications.

The current specification runs on GCP. However, AWS or other cloud providers can be used as well; please refer to the official Terraform documentation. Note that, depending on the configuration, costs may be incurred.

Install Terraform on your local host machine
Edit the ./terraform/main.tf file
Feel free to add, remove, or change instances in the ./terraform/compute-engines.tf file
Run terraform init
Run terraform apply
To destroy the setup run terraform destroy
