ALOJA Big Data Benchmarking platform: includes tools to define and deploy clusters, orchestrate benchmarking, collect and manage results, and analyze them in a Web app that includes Predictive Analytics tools

ALOJA Big Data benchmarking platform

Quick start

  1. Get familiar with the Web app, browse data and views at: http://aloja.bsc.es
  2. Check out some slides or publications for background and documentation: http://aloja.bsc.es/publications
  3. To experiment on a local DEV copy:
git clone https://github.com/Aloja/aloja.git
cd aloja
vagrant up
xdg-open http://localhost:8080

Note: Requires git, vagrant >= v1.6, virtualbox >= v4.2, some patience to download and import the VM, and a web browser.
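Before running vagrant up, the version requirements in the note can be checked with a short script. This is a minimal sketch, not part of ALOJA: the helper names version_ge and check_min are hypothetical, it assumes a sort that supports -V (GNU coreutils and recent BSD versions do), and it assumes VirtualBox exposes its command-line tool as VBoxManage on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not shipped with ALOJA.

# Succeeds if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME MINIMUM RAW_VERSION_OUTPUT
# Extracts the first dotted version number from RAW_VERSION_OUTPUT
# and compares it against MINIMUM.
check_min() {
  local name="$1" minimum="$2" raw="$3" found
  found="$(printf '%s' "$raw" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
  if [ -z "$found" ]; then
    echo "$name: not found"
    return 1
  elif version_ge "$found" "$minimum"; then
    echo "$name $found: OK (>= $minimum)"
  else
    echo "$name $found: too old (need >= $minimum)"
    return 1
  fi
}

# git has no minimum version in the note, so only its presence is checked.
if command -v git >/dev/null 2>&1; then
  echo "git: OK"
else
  echo "git: not found"
fi

# Minimum versions are taken from the note above.
check_min vagrant 1.6 "$(vagrant --version 2>/dev/null)" || true
check_min virtualbox 4.2 "$(VBoxManage --version 2>/dev/null)" || true
```

The script only reports what it finds and never aborts, so it can be run safely on a machine where some of the tools are still missing.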

About ALOJA

The ALOJA research project is an initiative of the Barcelona Supercomputing Center (BSC) to explore new hardware architectures for Big Data processing. One of the project's main goals is a systematic study of software and hardware configuration and deployment options, analyzing the cost-effectiveness of different cloud services (IaaS or PaaS) as well as on-premise hardware, both commodity and high-end.

ALOJA has created what is currently the largest vendor-neutral repository of Hadoop benchmarks, with over 42,000 public results. It also provides several tools that cover the full cycle, from planning and executing benchmarks to data analysis, along with automated tools that produce insights to better understand system behavior and make decisions on framework and cluster design.

This repository contains the project's open source tools, which are under ongoing development and cover:

  • Cluster definition and automated deployment
  • Benchmark selection and iteration of configurations
  • Metrics collection, results gathering, and importing into a DB
  • Web application to manage results
  • Advanced data views for aggregate results with filters
  • Predictive Analytics (PA) aka Machine Learning tools for modeling and Knowledge Discovery

More info

The project is under constant development and is still being documented. Feel free to browse the site and the code, and to send inquiries, feature requests, or bug reports to: hadoop@bsc.es