🐘 A local Hadoop cluster bootstrapper using Vagrant, Ansible, and Ambari.

Jumbo - A local Hadoop cluster bootstrapper

Jumbo is a tool that lets you deploy a virtualized Hadoop cluster on a local machine in minutes. It helps you quickly bootstrap development environments without struggling with node and service configuration.

Jumbo is written in Python and relies on other tools that it coordinates:

  • Vagrant, to manage the virtual machines;
  • Ansible, to configure the cluster;
  • Apache Ambari, to provision and manage the Hadoop cluster.

The distribution used for the Hadoop cluster is Hortonworks Data Platform.
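
To make the division of labor between these tools more concrete, here is a minimal sketch of how a Python program could sequence them. It is not Jumbo's actual implementation: the function names, the `playbook.yml` and `inventory` file names, and the host name are hypothetical, and the Ambari URL assumes Ambari's default port (8080) and its REST endpoint for clusters.

```python
import subprocess
from typing import List

# Hypothetical defaults, for illustration only. The file names Jumbo
# really generates are not described in this README.
DEFAULT_PLAYBOOK = "playbook.yml"
DEFAULT_INVENTORY = "inventory"


def plan_bootstrap(playbook: str = DEFAULT_PLAYBOOK,
                   inventory: str = DEFAULT_INVENTORY) -> List[List[str]]:
    """Return the shell commands, in order, for a two-step bootstrap."""
    return [
        # Step 1: Vagrant creates and boots the virtual machines.
        ["vagrant", "up"],
        # Step 2: Ansible configures the machines listed in the inventory.
        ["ansible-playbook", "-i", inventory, playbook],
    ]


def ambari_cluster_url(host: str, cluster: str, port: int = 8080) -> str:
    """Build the Ambari REST URL for a cluster (assumes the default port)."""
    return f"http://{host}:{port}/api/v1/clusters/{cluster}"


def run_plan(steps: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for step in steps:
        print("$ " + " ".join(step))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(step, check=True)


if __name__ == "__main__":
    run_plan(plan_bootstrap())  # dry run: prints the commands only
    print(ambari_cluster_url("master.example", "demo"))
```

The dry-run default keeps the sketch safe to execute on a machine without Vagrant or Ansible installed. In practice, Jumbo does this orchestration for you.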

Who can use Jumbo?

Jumbo was originally designed for developers with limited knowledge of the Hadoop deployment process, but it can be helpful to others too. Jumbo handles everything needed to create and deploy a Hadoop cluster, so it is also useful if you need several environments (e.g. for different projects or for testing).

Getting started

Complete documentation is available on the Jumbo website. Installation instructions are on the installation page.

If you want local documentation, it is also available in GitBook format in the docs/ folder.

Project roadmap

Current version: v0.4.4

  • Add Kerberos support
  • Add a -r option on addservice for automatic dependency installation
  • "Proxify" Vagrant commands into Jumbo: start, stop, status, restart, delete
  • Start HDP services on vagrant start
  • Host the documentation on a website (jumbo.adaltas.com)
  • Allow custom configurations via JSON (versions, urls...)
  • Add informative commands (info, versions, available services...)
  • Add support for all HDP services
  • Generalize HA support
  • Smart cluster topology based on available resources
  • Allow duplicating an existing cluster under a different name

Contributing

Jumbo is a very recent project. We would be happy to receive feedback, so don't hesitate to open issues or submit a PR if you need extra features!

Authors

Jumbo was developed by Gauthier Leonard and Xavier Hermand at Adaltas.

Contributors

License

Jumbo is licensed under the MIT License. See LICENSE for the full license text.