  1. Implementing A-Bench
  2. Getting started
  3. Additional information

Implementing A-Bench

This is my master's thesis project. The main goal is to make installing, setting up, and running the Big Data benchmark A-Bench easier and to automate the process as much as possible. Using HTML, Python 3, Flask, pandas, and a few other tools, I created a WebUI management console that simplifies setting up the infrastructure, running the benchmark, and visualizing the results in a few charts showing metrics such as CPU, memory, and file system usage.


  • Internet connection
  • Ubuntu 18.04 LTS (clean install)
  • Modern web browser like Chromium or Mozilla Firefox


The A-Bench management console relies on a number of open source projects:

  • python3 - Python Programming Language version 3.6
  • Flask - Flask is a microframework for Python based on Werkzeug, Jinja 2 and good intentions
  • pandas - pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language
  • chart.js - Simple yet flexible JavaScript charting for designers & developers

Getting started

1. Step:

  • Download the repository into home/user/; the scripts only work properly from this location
  • Go to the project folder "scripts", make the script executable, and run it as root. It installs any missing tools and downloads a GitHub repository used to create the A-Bench infrastructure. The installation requires one user action [Press ENTER to continue]
$ chmod +x
$ sudo ./
  • To start the main WebUI, run the Python script in a terminal as root:
$ sudo python3
  • Verify the deployment by navigating to your server address in your preferred browser

2. Step:

  • The homepage shows three columns of buttons on the left and, on the right, a text box with the output of the commands run from the page
  • The first set of buttons, under "Setup", checks the software prerequisites (whether everything needed is installed), deploys the A-Bench infrastructure, and, after a successful deployment, opens the Grafana/Kubernetes dashboards for monitoring it
  • The second set of buttons, under "Run", configures which queries to run and starts a sample A-Bench experiment once the queries have been selected on the "Configuration" page
  • The third set of buttons, under "Analyse", loads the results. Use it ONLY AFTER running a sample experiment as described in 3. Step.
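
The prerequisite check behind the "Setup" buttons can be pictured with a short sketch. This is only an illustration under assumptions: the function name `missing_tools` and the example tool names are hypothetical, and the project's own scripts may check different software in a different way.

```python
import shutil


def missing_tools(required, which=shutil.which):
    """Return the tools from `required` that are not found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    # Hypothetical tool list; the real prerequisites are defined by the
    # project's setup scripts, not by this README.
    tools = ["docker", "kubectl", "git"]
    missing = missing_tools(tools)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed tools are installed.")
```

A console could show the returned list in the output text box so the user knows what still has to be installed before deploying.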

3. Step:

  • Clicking the "Configuration" button under "Run" takes you to a new page
  • The page lists all 30 queries as check boxes, each with an explanation; any of them can be run as an experiment
  • Select the desired queries and click "Save config"; the selection is then shown below the list of all queries
  • The selection is stored in an environment variable, which is used when you click "Run SRE with HIVE" or "Run SRE with SPARK"
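
How a query selection might travel through an environment variable can be sketched as follows. The variable name `ABENCH_QUERIES`, the comma-separated format, and both function names are assumptions made for illustration; the README does not state what the console actually uses.

```python
import os

# Hypothetical variable name; the real one is not given in this README.
ENV_VAR = "ABENCH_QUERIES"
NUM_QUERIES = 30  # the configuration page lists 30 queries


def save_query_selection(selected, env=os.environ):
    """Validate the chosen query numbers and store them as "1,5,12"."""
    queries = sorted({int(q) for q in selected})
    invalid = [q for q in queries if not 1 <= q <= NUM_QUERIES]
    if invalid:
        raise ValueError(f"unknown query numbers: {invalid}")
    env[ENV_VAR] = ",".join(str(q) for q in queries)
    return env[ENV_VAR]


def load_query_selection(env=os.environ):
    """Read the stored selection back as a list of integers."""
    raw = env.get(ENV_VAR, "")
    return [int(q) for q in raw.split(",") if q]
```

Passing a plain dictionary as `env` keeps the sketch testable; with the default argument the value lands in the environment of the running process, where a script started by that process could read it.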

4. Step:

  • After an experiment has run successfully, the results are saved in:
$ ~/wd/abench/a-bench/results/
  • On the homepage, click "Load results" under "Analyse" to open a file explorer, navigate to ~/wd/abench/a-bench/results/, and choose the results file to load; the results are then shown as density charts
  • To load the results of a different experiment, repeat the previous step
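
As a rough idea of what a density chart needs, the sketch below reads one numeric column from a .csv table and bins it into a normalized histogram. It uses only the standard library so it runs without extra packages, although the project itself works with pandas. The column name `cpu_usage` and both function names are hypothetical; the real result files may be laid out differently.

```python
import csv
import io


def read_column(csv_file, column):
    """Read one numeric column from an open CSV file that has a header row."""
    reader = csv.DictReader(csv_file)
    return [float(row[column]) for row in reader]


def density(values, bins=10):
    """Return (bin_start, density) pairs; density times bin width sums to 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # the maximum falls in the last bin
        counts[idx] += 1
    total = len(values) * width
    return [(lo + i * width, c / total) for i, c in enumerate(counts)]


if __name__ == "__main__":
    # Made-up sample data, only to show the call sequence.
    sample = io.StringIO("cpu_usage\n10\n20\n30\n40\n")
    for start, value in density(read_column(sample, "cpu_usage"), bins=2):
        print(start, value)
```

The resulting pairs could be serialized to JSON and handed to chart.js as the x and y values of a chart.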

Additional Information

  • The folder "all_executed_exp" stores the results of all executed experiments
  • The folder "experiment_results" stores, as .csv tables, the experiment results needed for the charts
  • The folder "outputs" contains two .txt files that hold the output of all executed commands, which is shown on the homepage
  • The folder "scripts" contains all scripts needed to deploy and run the infrastructure
  • The folder "templates" contains all HTML pages
  • The folder "static" contains all .css files used to style the pages
  • The folder "~/wd" receives everything the infrastructure needs, downloaded from Michael Czaja's GitHub repository
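
The role of the .txt files in "outputs" can be illustrated with a small sketch that runs a command, writes its combined output to a text file, and reads it back for display. The function names and the way the file is used are assumptions; the console's actual code may capture output differently.

```python
import subprocess
from pathlib import Path


def run_and_log(command, log_path):
    """Run `command` (a list of arguments) and write stdout and stderr to `log_path`.

    Returns the exit code of the command.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge error messages into the same log
        universal_newlines=True,    # decode bytes to str (works on Python 3.6)
    )
    Path(log_path).write_text(result.stdout)
    return result.returncode


def read_log(log_path):
    """Return the logged output, or an empty string if nothing was logged yet."""
    path = Path(log_path)
    return path.read_text() if path.exists() else ""
```

A web page could call `read_log` on each request and place the returned text in its output box, so the user sees the result of the most recent command.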