Chef recipes for Bloomberg's deployment of Hadoop and related components


Chef BACH


Overview

This is a set of Chef cookbooks to bring up Hadoop and Kafka clusters. The cookbooks also provide a number of supporting services, such as DNS, metrics, and monitoring; see below for a partial list.

Hadoop

Each Hadoop head node is specific to a Hadoop component. The roles are intended to be layered in a highly available manner. For example, multiple BCPC-Hadoop-Head-* machines will correctly build a MySQL, Zookeeper, HDFS JournalNode, etc. cluster and deploy the named component as well. Further, for components that support HA, the intention is that one can simply add the role to multiple machines and the right thing will be done to support HA (except in the case of HDFS).

To set up HDFS HA, follow this model from your Bootstrap VM:

  • Install the cluster once with a non-HA HDFS:
    • with a BCPC-Hadoop-Head-Namenode-NoHA role
    • with the following node variable [:bcpc][:hadoop][:hdfs][:HA] = false
    • ensure at least three machines are installed with BCPC-Hadoop-Head roles
    • ensure at least one machine is a datanode
    • run cluster-assign-roles.sh <Environment> Hadoop successfully
  • Re-configure the cluster with an HA HDFS:
    • change the BCPC-Hadoop-Head-Namenode-NoHA machine's role to BCPC-Hadoop-Head-Namenode
    • set the following node variable [:bcpc][:hadoop][:hdfs][:HA] = true on all nodes (e.g. in the environment)
    • run cluster-assign-roles.sh <Environment> Hadoop successfully
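
The two passes above could be scripted roughly as follows. This is a minimal sketch and not part of the repository: `BACH_ENV`, `DRY_RUN`, `run` and `ha_pass` are hypothetical names, and it assumes the script is run from the repository root on the bootstrap VM. Changing the namenode role and the `[:bcpc][:hadoop][:hdfs][:HA]` value between passes is still done by hand. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the two-pass HDFS HA bring-up; run from the repository root
# on the bootstrap VM. BACH_ENV, DRY_RUN, run and ha_pass are hypothetical
# helpers for this example, not part of chef-bach.
set -eu

BACH_ENV="${BACH_ENV:-Test-Laptop}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

ha_pass() {
  # $1: value [:bcpc][:hadoop][:hdfs][:HA] must already have for this pass
  echo "# expecting [:bcpc][:hadoop][:hdfs][:HA] = $1 in environment $BACH_ENV"
  run ./cluster-assign-roles.sh "$BACH_ENV" Hadoop
}

ha_pass false   # pass 1: namenode has the BCPC-Hadoop-Head-Namenode-NoHA role
ha_pass true    # pass 2: role changed to BCPC-Hadoop-Head-Namenode
```

Setting `DRY_RUN=0` would execute `cluster-assign-roles.sh` for real, so the role and attribute changes need to be in place before each pass.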

Setup

These recipes are currently intended for building a BACH cluster on top of Ubuntu 14.04 servers using Chef 11. When setting this up in VMs, be sure to add a few dedicated disks (for HDFS data nodes) aside from the boot volume. In addition, each machine is expected to have three separate NICs, with the following as defaults (and recommendations for VM settings):

  • eth0 - management traffic (host-only NIC in VM)
  • eth1 - reserved traffic (host-only NIC in VM)
  • eth2 - compute traffic (host-only NIC in VM)
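
A quick way to confirm that a machine exposes the three interfaces is sketched below. It assumes a Linux host where interfaces appear under `/sys/class/net` with the default names listed above; `missing_nics` is a hypothetical helper, not a script shipped with these cookbooks.

```shell
#!/bin/sh
# Report which of the expected interfaces are absent on this machine.
# missing_nics is a hypothetical helper; the optional first argument exists
# only so the check can be pointed at a directory other than /sys/class/net.
missing_nics() {
  net_dir="${1:-/sys/class/net}"
  for nic in eth0 eth1 eth2; do
    [ -e "$net_dir/$nic" ] || echo "$nic"
  done
}

missing="$(missing_nics)"
if [ -n "$missing" ]; then
  echo "missing interfaces:" $missing
else
  echo "eth0, eth1 and eth2 are all present"
fi
```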

You should look at the various settings in cookbooks/bcpc/attributes/default.rb and tweak accordingly for your setup (by adding them to an environment file).

Cluster Bootstrap

The provided scripts set up a Chef server and a Cobbler server via Vagrant, which permits imaging of the cluster via PXE.

Once the Chef server is set up, you can bootstrap any number of nodes to get them registered with the Chef server for your environment - see the next section for enrolling the nodes.

Make a cluster

To build a new BACH cluster, you have to start by building the head nodes first. (This assumes that you have already completed the bootstrap process and have a Chef server available.) Since the recipes automatically generate all passwords and keys for the new cluster, the nodes must temporarily become admins in the Chef server so that the recipes can write the generated info to a databag. The databag will be called configs and the databag item will have the same name as the environment (Test-Laptop in this example). You only need to leave the node as an admin for the first chef-client run. You can also manually create the databag and item (as per the example in data_bags/configs/Example.json) and manually upload it if you'd rather not bother with the whole admin thing for the first run.
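
For the manual route, the upload might look like the sketch below. `knife data bag create` and `knife data bag from file` are standard knife subcommands; `bag_upload_commands` is a hypothetical helper that only prints the commands, and the item file name is an assumption (a copy of data_bags/configs/Example.json whose `id` field has been changed to the environment name).

```shell
#!/bin/sh
# Print the knife commands for uploading a hand-written configs item.
# bag_upload_commands is a hypothetical helper; the item file is assumed to
# be a copy of data_bags/configs/Example.json whose "id" field has been
# changed to the environment name.
bag_upload_commands() {
  env_name="$1"
  echo "knife data bag create configs"
  echo "knife data bag from file configs data_bags/configs/${env_name}.json"
}

bag_upload_commands Test-Laptop
```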

To assign machines a role, one can update the cluster.txt file and ensure all necessary information is provided as per cluster-readme.txt.

Using the script tests/automated_install.sh, one can run through what is the expected "happy-path" install for a single machine running (by default) four VirtualBox VMs. This simple install supports only changing DNS, proxy and VM resource settings. (This is the basis of our automated build tests.)

Using the script tests/automated_install.sh on Mac OS (OS X) requires brew to be installed.

Note: To run more than one test cluster at a time with VirtualBox, export BACH_CLUSTER_PREFIX to set the desired cluster name prefix. This sets the namespace so that the clusters' virtual machines do not collide on the hypervisor, resulting in names that follow the convention:

      ${BACH_CLUSTER_PREFIX}-bcpc-bootstrap
      ${BACH_CLUSTER_PREFIX}-bcpc-vm1
      ${BACH_CLUSTER_PREFIX}-bcpc-vm2
      ${BACH_CLUSTER_PREFIX}-bcpc-vm3

Without $BACH_CLUSTER_PREFIX set, tests/automated_install.sh will not assign a cluster prefix to the cluster hosts or the bootstrap. One also needs to ensure that the management, float and storage network ranges (in the environment and cluster.txt) differ between clusters; update them to be unique. Further, each cluster's repository needs to be in a different parent directory (to keep the cluster directories from colliding).
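
The naming convention can be illustrated with a short sketch. `vm_names` is a hypothetical helper written for this example (it is not a script in this repository); it prints the four default VM names for a given prefix and the unprefixed names when no prefix is given.

```shell
#!/bin/sh
# Print the VM names a given prefix would produce, following the convention
# above. vm_names is a hypothetical helper, not a script in this repository.
vm_names() {
  prefix="${1:-}"
  for host in bcpc-bootstrap bcpc-vm1 bcpc-vm2 bcpc-vm3; do
    if [ -n "$prefix" ]; then
      echo "${prefix}-${host}"
    else
      echo "$host"
    fi
  done
}

vm_names alpha
```

A second cluster would then be started from its own clone under a different parent directory, for example with `BACH_CLUSTER_PREFIX=beta ./tests/automated_install.sh` (assuming the script is invoked from the repository root), after giving it distinct network ranges.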

Note: For man-in-the-middle proxy or local repository users: ensure that your local SSL certificate authority certificates are located on your hypervisor at /usr/local/share/ca-certificates; this will populate your bootstrap with the necessary certificates. Further, to bypass the proxy for specific hosts, set $additional_no_proxy to a comma-separated list of hosts or *-wildcard domains. (This is specifically useful for local APT, Maven or Ruby repositories.)
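
As an illustration, the list could be assembled as in the sketch below. `join_hosts` is a hypothetical helper and the host names are placeholders. Whether the variable has to be exported in the shell that launches the install, or set elsewhere, is an assumption here and should be checked against the scripts.

```shell
#!/bin/sh
# Build a comma-separated list for additional_no_proxy.
# join_hosts is a hypothetical helper and the host names are placeholders.
join_hosts() {
  list=""
  for host in "$@"; do
    list="${list:+$list,}$host"
  done
  echo "$list"
}

# Assumption: exporting the variable is how the install scripts pick it up.
additional_no_proxy="$(join_hosts apt.example.com '*.maven.example.com')"
export additional_no_proxy
echo "$additional_no_proxy"
```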

Other Deployment Flavors

In addition to the "happy-path" integration test using automated_install.sh, there are ways to deploy on OpenStack or to bare-metal hosts. Lastly, for those using test-kitchen, there are various test-kitchen suites one can run as well.

A view of the various full-cluster deployment types:

[Figure: Flow Chart of BACH Deployment Flavors (VBox, OpenStack, Vagrant Bootstrap/Baremetal, Baremetal Only)]

Using a BACH cluster

Once the nodes are configured and bootstrapped, BACH services will be accessible via the floating IP. (For the Test-Laptop environment, it is 10.0.100.5.)

For example, you can go to https://10.0.100.5:8888 for the Graphite web interface. To find the automatically-generated service credentials, look in the data bag for your environment.

ubuntu@bcpc-bootstrap:~$ knife data bag show configs Test-Laptop | grep mysql-root-password
mysql-root-password:       abcdefgh
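
To pull out a single value for use in a script, the output can be filtered as sketched here. `bag_value` is a hypothetical helper; it assumes the `key:   value` layout shown above and a value without embedded whitespace.

```shell
#!/bin/sh
# Extract one value from `knife data bag show` output read on stdin.
# bag_value is a hypothetical helper; it assumes the "key:   value" layout
# shown above and a value without embedded whitespace.
bag_value() {
  awk -v key="$1" '$1 == (key ":") { print $2; exit }'
}

# Example with the sample line from above instead of a live Chef server:
printf 'mysql-root-password:       abcdefgh\n' | bag_value mysql-root-password
```

Against a live Chef server the same helper would be fed by `knife data bag show configs Test-Laptop | bag_value mysql-root-password`.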

For example, to check on HDFS:

ubuntu@bcpc-vm1:~$ HADOOP_USER_NAME=hdfs hdfs dfsadmin -report
Configured Capacity: 40781217792 (37.98 GB)
Present Capacity: 40114298221 (37.36 GB)
DFS Remaining: 39727463789 (37.00 GB)
DFS Used: 386834432 (368.91 MB)
DFS Used%: 0.96%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 192.168.100.13:50010 (f-bcpc-vm3.bcpc.example.com)
Hostname: bcpc-vm3.bcpc.example.com
Decommission Status : Normal
Configured Capacity: 40781217792 (37.98 GB)
DFS Used: 386834432 (368.91 MB)
Non DFS Used: 666919571 (636.02 MB)
DFS Remaining: 39727463789 (37.00 GB)
DFS Used%: 0.95%
DFS Remaining%: 97.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 12
Last contact: Fri Aug 14 21:08:23 EDT 2015
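
For a scripted health check, the report can be reduced to the two numbers that usually matter first. The sketch below relies only on the line labels visible in the sample output above; `hdfs_summary` is a hypothetical helper, and other Hadoop versions may label these lines differently.

```shell
#!/bin/sh
# Summarize a `hdfs dfsadmin -report` read on stdin.
# hdfs_summary is a hypothetical helper that relies on the line labels
# shown in the sample output above.
hdfs_summary() {
  awk '
    /^Missing blocks:/   { missing = $3 }
    /^Live datanodes \(/ { live = $3; gsub(/[^0-9]/, "", live) }
    END { printf "live_datanodes=%s missing_blocks=%s\n", live, missing }
  '
}

# Example using two lines from the sample report instead of a live cluster:
printf '%s\n' 'Missing blocks: 0' '' 'Live datanodes (1):' | hdfs_summary
```

On a cluster node this would be run as `HADOOP_USER_NAME=hdfs hdfs dfsadmin -report | hdfs_summary`.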

Chef-BACH Philosophies

The philosophy behind BACH cluster operation is that no single machine is special and all services are multi-master or have sufficiently fast failover to prevent failure in application data paths and availability. Commits to the codebase should be deployable without requiring path dependence from the previous repository state. For example, a machine should be able to be PXE-booted fresh into a particular version of the code, while an existing machine should be able to simply run Chef to upgrade into a particular Chef-BACH version. Unhealthy machines should always be able to be torn down and reinstalled from scratch without disruption. Any Chef-BACH version which requires manual interaction is considered BREAKING (as a GitHub tag) and should be avoided as much as possible; our mantra is that all operations are handled automatically.

All services should be secured and kerberized as appropriate. Testing should nevertheless be done both with a kerberized VM cluster and with a non-kerberized cluster (or Test-Kitchen VM) to ensure both workflows run through. The non-kerberized path is useful to ensure others can more easily integrate by starting in a non-secure environment before going fully secure.

BACH Services

BACH currently relies upon a number of open-source packages:

Thanks to all of these communities for producing this software!

Contributing

See our contributing document for more.