hadoop-sandbox

Fully Distributed installation of hadoop ecosystem on GCP IaaS.

all playbooks are consistent within related tools

High Level Architecture

Generally,

choose which playbook do you need (hdfs, hdfs + hbase, zookeeper etc..)
design inrastructure arch. on .env files in related playbook folders.
- you can also combine your own playbook (good for trainings or POCs)
create machines on GCP, and establish passwordless ssh from master to workers
and configure products
then start servers

Create a GCP account and billing account etc..., Then

Configure your local for gcloud CLI or use gcloud-shell in gcp console, after cloning the git-reporitory.
- for local, run gcloud auth list to check active gcp account. And gcloud auth login if necessary
git clone https://github.com/tansudasli/hadoop-backbone-boilerplate.git,
- Then cd hadoop-backbone-boilerplate
- Edit .gcp.env and update (service account, project, region etc...)
Run ./create-gcp-project.sh to create project, and to link your billing account
Run ./create-firewall-rule.sh to create fw rules, so that you can reach via web consoles

and also consider

Name		Name	Last commit message	Last commit date
Latest commit History 201 Commits
doc		doc
playbook-hbase		playbook-hbase
playbook-hdfs		playbook-hdfs
playbook-zookeeper		playbook-zookeeper
.gcp.env		.gcp.env
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
create-firewall-rule.sh		create-firewall-rule.sh
create-gcp-project.sh		create-gcp-project.sh