[DEPRECATED] For read-only reference of the ALOJA Big Data Benchmarking platform: includes tools to define and deploy clusters, orchestrate benchmarking, collect and manage results, and analyze them in a Web app that includes Predictive Analytics tools. Check the website for the datasets and papers.

ALOJA Big Data benchmarking platform

### Quick start

  1. Get familiar with the Web app, browse data and views at:
  2. Check out some slides or publications as background and documentation:
  3. To experiment on a local DEV copy:
git clone
cd aloja
vagrant up
xdg-open http://localhost:8080

Note: Requires git, vagrant >= v1.6, virtualbox >= v4.2, some patience to download and import the VM, and a web browser.
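The prerequisites in the note above can be checked before running `vagrant up`. The following is a minimal sketch, not part of the ALOJA repository: the helper names `version_ge`, `extract_version`, and `check_tool` are hypothetical, and it assumes that `git`, `vagrant`, and `VBoxManage` each print a dotted version number when called with `--version`. Only the minimum versions for Vagrant and VirtualBox come from the note; no minimum is stated for git, so any version is accepted.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the quick start; not shipped with ALOJA.

# version_ge A B: succeeds when dotted version A is >= B (numeric fields only).
version_ge() {
  local IFS=.
  local -a a=($1) b=($2)
  local i x y
  for ((i = 0; i < ${#a[@]} || i < ${#b[@]}; i++)); do
    x=${a[i]:-0}
    y=${b[i]:-0}
    # 10# forces base 10 so fields such as "08" are not read as octal.
    if ((10#$x > 10#$y)); then return 0; fi
    if ((10#$x < 10#$y)); then return 1; fi
  done
  return 0
}

# extract_version TEXT: prints the first dotted number found in TEXT.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# check_tool LABEL COMMAND MIN_VERSION: reports whether COMMAND is usable.
check_tool() {
  local label=$1 cmd=$2 min=$3 ver
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "MISSING  $label ($cmd not found in PATH)"
    return 1
  fi
  ver=$(extract_version "$("$cmd" --version 2>/dev/null)")
  if [ -z "$ver" ]; then
    echo "UNKNOWN  $label (could not parse a version number)"
    return 1
  fi
  if version_ge "$ver" "$min"; then
    echo "OK       $label $ver (need >= $min)"
  else
    echo "TOO OLD  $label $ver (need >= $min)"
    return 1
  fi
}

status=0
check_tool git        git        0   || status=1   # no minimum given in the note
check_tool vagrant    vagrant    1.6 || status=1
check_tool virtualbox VBoxManage 4.2 || status=1

if [ "$status" -eq 0 ]; then
  echo "All prerequisites found."
else
  echo "Some prerequisites are missing or too old."
fi
```

Saved under a name of your choice (for example `check_prereqs.sh`, a hypothetical filename), it can be run with `bash check_prereqs.sh` before cloning. The version comparison is done in plain bash rather than with `sort -V`, so it does not depend on GNU coreutils.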


The ALOJA research project is an initiative from the Barcelona Supercomputing Center (BSC) to explore new hardware architectures for Big Data processing. One of the main goals of the project is to produce a systematic study of software and hardware configuration and deployment options, analyzing the cost-effectiveness of different cloud services (IaaS or PaaS) as well as on-premise hardware, both commodity and up-scale.

In ALOJA we have created what is currently the largest vendor-neutral repository of Hadoop benchmarks, with over 42,000 public results. We have also built several tools that cover the full cycle, from planning and executing benchmarks to data analysis, along with automated tools that produce insights to better understand system behavior and make decisions on framework and cluster design.

This repository includes the project's ongoing open-source tools, which consist of:

  • Cluster definition and automated deployment
  • Benchmark selection and iteration of configurations
  • Metrics collection, results gathering, and importing into a DB
  • Web application to manage results
  • Advanced data views for aggregate results with filters
  • Predictive Analytics (PA) aka Machine Learning tools for modeling and Knowledge Discovery

More info

The project is under constant development and in the process of being documented. Feel free to browse the site, the code, and send inquiries, feature requests or bug reports to:

Write us at:
