
oferbene/one-script-deploy

 
 


One Script Deploy

One script to rule them all: CDP / CDP Private Cloud / CDH / HDP !

Given some machines, this script sets up all prerequisites, then installs and configures a fully secured cluster and loads data into it.

Requirements

Run the script requirements.sh to set up all requirements before launching the main script.

Installation

Command line tool

To install a cluster (the default is a 10-node CDP 7 cluster with Kerberos and TLS enabled):

export PAYWALL_USER=  # Your Cloudera paywall user, used to access archive.cloudera.com
export PAYWALL_PASSWORD=  # Your Cloudera paywall password, used to access archive.cloudera.com
export LICENSE_FILE=   # Your license file from Cloudera
export CLUSTER_NAME=   # A name of your choice (ex: cloudera-test)
export NODES=   # *Space*-separated list of nodes (ex: "node1 node2 node3"). You must provide as many nodes as the type of installation requires, the default being 10.
./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --nodes="${NODES}"

N.B.: This assumes that a passwordless connection exists from your machine to all your cluster nodes. If it does not, provide a password with --node-password or a private key file with --node-key.
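
Since the node list must contain as many nodes as the chosen installation type requires, it can help to check the count before launching. The following is a minimal sketch, not part of the project: the function names count_nodes and has_enough_nodes and the variable EXAMPLE_NODES are hypothetical, and the required count of 10 is the default mentioned above.

```shell
#!/bin/sh
# Count the whitespace-separated entries in a node list.
count_nodes() {
    # Unquoted on purpose: let the shell split the list on whitespace.
    set -- $1
    echo "$#"
}

# Return 0 when the list holds at least the required number of nodes.
# $1: node list, $2: required number of nodes
has_enough_nodes() {
    [ "$(count_nodes "$1")" -ge "$2" ]
}

# Hypothetical node list, only used to illustrate the check.
EXAMPLE_NODES="node1 node2 node3"
if has_enough_nodes "$EXAMPLE_NODES" 10; then
    echo "ok: enough nodes for the default installation"
else
    echo "error: $(count_nodes "$EXAMPLE_NODES") node(s) provided, 10 expected" >&2
fi
```

The same check can be run against your own NODES variable with the node count of the cluster type you plan to install.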

Configuration

Many more configurations are available, see them all with:

./setup-cluster.sh --help

Examples

!!! Special: cluster without license or paywall credentials: CDP 7 - Basic 6 nodes !!!

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --cluster-type=basic \
    --nodes-base="${NODES}"

CDP 7 - Full 10 nodes with almost all services (Kerberos / TLS)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --nodes-base="${NODES}"

CDP 7 - Basic 6 nodes (Kerberos / TLS)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=basic \
    --nodes-base="${NODES}"

CDP 7 - Basic encrypted 6 nodes (Kerberos / TLS) (You can specify 1 or 2 nodes for KTS)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=basic-enc \
    --nodes-kts=<Dedicated Node(s) for KTS> \
    --nodes-base="${NODES}"

CDP 7 - Basic 6 nodes with FreeIPA on a dedicated node (Kerberos / TLS) (All CDP clusters can have FreeIPA just by adding --free-ipa=true and providing a node with --node-ipa=)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=basic \
    --free-ipa=true \
    --node-ipa=<One node dedicated to IPA> \
    --nodes-base="${NODES}"

CDP 7 - 9 nodes with 3 dedicated for PvC with ECS (Kerberos / TLS / FreeIPA)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=pvc \
    --nodes-ecs=<Space separated list of 3 nodes> \
    --node-ipa=<One node dedicated to IPA> \
    --nodes-base="${NODES}"

CDP 7 - Basic 6 nodes for PvC with OpenShift (Experiences installed on a provided OCP cluster) (Kerberos / TLS / FreeIPA)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=pvc-oc \
    --kubeconfig-path=<Path to your kubeconfig file> \
    --oc-tar-file-path=<Path to your oc.tar file downloaded from RedHat> \
    --node-ipa=<One node dedicated to IPA> \
    --nodes-base="${NODES}"

CDP 7 - Streaming (--cluster-type=streaming)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=streaming \
    --nodes-base="${NODES}"

CDP 7 - All services with PvC (--cluster-type=all-services-pvc)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=all-services-pvc \
    --nodes-kts=<Dedicated Node for KTS> \
    --node-ipa=<Dedicated Node for IPA> \
    --kubeconfig-path=<Path to your kubeconfig file> \
    --oc-tar-file-path=<Path to your oc.tar file downloaded from RedHat> \
    --nodes-base="${NODES}"

CDP 7 - Full encrypted with PvC (--cluster-type=full-enc-pvc)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=full-enc-pvc \
    --nodes-kts=<Dedicated Node(s) for KTS> \
    --node-ipa=<Dedicated Node for IPA> \
    --kubeconfig-path=<Path to your kubeconfig file> \
    --oc-tar-file-path=<Path to your oc.tar file downloaded from RedHat> \
    --nodes-base="${NODES}"

CDP 7 - Workload XM cluster (1 WXM cluster of 5 nodes associated with a base cluster (provided in command line) ) (Kerberos / TLS)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=wxm \
    --altus-key-id=<ALTUS key ID provided by Cloudera> \
    --altus-private-key=<path to ALTUS private key provided by Cloudera> \
    --cm-base-url=<http://<CM host to connect to WXM>:<Port>> \
    --tp-host=<Host in base cluster that will have Telemetry Publisher installed> \
    --nodes-base="${NODES}"

CDP 7.1.8 - Full 10 nodes with almost all services (Kerberos / TLS)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cdh-version='7.1.8.1' \
    --cm-version='7.7.3-33365545' \
    --nodes-base="${NODES}"

CDP 7 - Unsecured (Kerberos and TLS disabled)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --kerberos=false \
    --tls=false \
    --nodes-base="${NODES}"

CDH 6 (Kerberos)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=cdh6 \
    --nodes-base="${NODES}"

CDH 5 (Kerberos)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=cdh5 \
    --nodes-base="${NODES}"

HDP 3 (Kerberos)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=hdp3 \
    --nodes-base="${NODES}"

HDP 2 (Kerberos)

./setup-cluster.sh \
    --cluster-name=${CLUSTER_NAME} \
    --license-file=${LICENSE_FILE} \
    --paywall-username=${PAYWALL_USER} \
    --paywall-password=${PAYWALL_PASSWORD} \
    --cluster-type=hdp2 \
    --nodes-base="${NODES}"

Output

CM & Ambari

At the end, CM or Ambari (depending on your installation) should be available at the first node's URL, over http or https and on the corresponding port. The protocol depends on the tls parameter, which is false by default for HDP and true by default for CDP.

During the installation, you can also follow its progress by connecting to CM or Ambari.

N.B.: It is recommended not to interfere with the cluster while the Ansible installation is running.
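
As an illustration of how the URL depends on the manager and on the tls parameter, here is a small sketch. The function name manager_url and the host name are hypothetical, and the ports are assumed to be the usual vendor defaults (7180/7183 for CM, 8080/8443 for Ambari), not values read from setup-cluster.sh; your installation may use different ones.

```shell
#!/bin/sh
# Print the expected web UI URL on the first node.
# $1: first node hostname, $2: manager ("cm" or "ambari"), $3: tls ("true" or "false")
# Ports are assumed vendor defaults, not values taken from the scripts.
manager_url() {
    host=$1
    manager=$2
    tls=$3
    case "$manager:$tls" in
        cm:true)      echo "https://$host:7183" ;;
        cm:false)     echo "http://$host:7180" ;;
        ambari:true)  echo "https://$host:8443" ;;
        ambari:false) echo "http://$host:8080" ;;
        *) echo "unknown manager or tls value: $manager $tls" >&2; return 1 ;;
    esac
}

# CDP with the default tls=true, then HDP with the default tls=false.
manager_url node1.example.com cm true
manager_url node1.example.com ambari false
```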

Users and Data

At the end of the installation, if it completed successfully, users are created on the machines along with their keytabs. The keytabs and krb5.conf are also retrieved to your local computer under /tmp/.

Moreover, random data can optionally be generated into various systems.

All default passwords are Cloudera1234

Details on Installation

This section describes in detail the steps performed during the installation, in order. Each step can be skipped, and hence launched separately.

Architecture

Once you have gathered all the previous requirements, you can launch the script. It mainly consists of 5 steps:

  • Prepare your machines

  • Launch the installation from the first node of your cluster using appropriate ansible playbook and files

  • Do post-install configuration (mainly for CDP)

  • Create users on your cluster

  • Load some data into your cluster

Each step can be skipped (see command line help).
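
To make the step selection concrete, the sketch below builds the flags for running only some of the five steps. The function name step_flags is hypothetical. The flag names (--setup, --install, --post-install, --user-creation, --data-load) are the ones described in the sections that follow, and the --flag=true|false form is assumed from the other examples in this document; check ./setup-cluster.sh --help before relying on it.

```shell
#!/bin/sh
# Print one --<step>=true|false flag per installation step.
# Arguments: the steps to run; every other step is switched off.
step_flags() {
    wanted=" $* "
    flags=""
    for step in setup install post-install user-creation data-load; do
        case "$wanted" in
            *" $step "*) value=true ;;
            *)           value=false ;;
        esac
        flags="$flags --$step=$value"
    done
    # Drop the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Only create users and load data on an already installed cluster.
step_flags user-creation data-load
```

The printed flags could then be appended to a setup-cluster.sh command line, for example with $(step_flags user-creation data-load).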

Scripts

This group of scripts, coordinated by the main script setup-cluster.sh, configures the provided machines and launches a CDP (or HDP, CDH) installation with Ansible. Finally, some extra configuration steps can be run and random data can be generated into different services.

All of this is driven from your machine only.

This script relies on Ansible scripts that must be accessible from your machine (if they are not, please set up an internal web server and provide its URL through the command line).

The Ansible scripts also rely on the Cloudera repository to access CDP, CM, HDP, Ambari, etc. (if it is not accessible, please set up an internal web server and provide its URL through the command line).

This script also relies on a GitHub repository to load data (if it is not accessible, please set up an internal web server and provide its URL through the command line).

Setup Machines

This step uses Playbook hosts_setup.

If you did not set parameter --setup to false, it prepares all machines by setting up passwordless SSH and pushing the required files to them.

N.B.: This step only needs to be done once and can then be bypassed if you reuse the same machines.

Ansible Installation

This step uses Playbook ansible_install_preparation and then runs commands directly on the host to launch the Ansible installation there.

The first playbook can be skipped by setting parameter --install to false (it is true by default).

It cleans up the first node, creates the directory ~/deployment/ansible-repo/, downloads the Ansible repository into it as a zip, and adds the files for your installation to it.

Then, the Ansible command corresponding to the installation is launched directly on the first node.

Post Installation

This step uses Playbook post_install.

If you install a CDP cluster and leave parameter --post-install set to true, it performs some extra steps, such as disabling automatic logout on CM and fixing various potential bugs.

User Creation

This step uses Playbook user_creation.

If you did not explicitly set parameter --user-creation to false and the installation completed successfully, the users defined in the extra_vars of user_creation are created.

They are present on all nodes with their /home directory containing their keytabs.

Their keytabs are also fetched into your /tmp directory along with the krb5.conf, allowing you to kinit directly from your computer.
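
A possible way to use the fetched files is sketched below. It only prints the commands instead of running them, so it can be reviewed first. The function name kinit_commands, the principal alice@EXAMPLE.COM and the keytab file name /tmp/alice.keytab are hypothetical: the actual user names, realm and keytab file names depend on your installation. KRB5_CONFIG is the environment variable MIT Kerberos reads to locate an alternative krb5.conf.

```shell
#!/bin/sh
# Print the commands needed to authenticate with a fetched keytab.
# $1: principal, $2: keytab path, $3: krb5.conf path
kinit_commands() {
    # Point the Kerberos client at the retrieved configuration file.
    printf 'export KRB5_CONFIG=%s\n' "$3"
    # Obtain a ticket from the keytab, without a password prompt.
    printf 'kinit -kt %s %s\n' "$2" "$1"
    # List the tickets to confirm that authentication worked.
    printf 'klist\n'
}

# Hypothetical principal and keytab name, for illustration only.
kinit_commands "alice@EXAMPLE.COM" "/tmp/alice.keytab" "/tmp/krb5.conf"
```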

Data Loading

This step uses Playbook data_load.

If you leave parameter --data-load set to true, a data loading step starts (currently only on CDP, HDP 2 and CDH 5) to generate data into existing services of the platform: HDFS, HBase, Hive, etc.

It is based on the random-datagen project.

Note that this step is completely extensible: you can add new files specifying how data should be generated in the folder playbooks/data_load/generate_data/models

N.B.: This step will also create Ranger required policies, and these are also extensible by adding policies in playbooks/data_load/ranger_policies/push_policies/policies

Extension

Once you are familiar with these scripts, you can easily tune them using command-line parameters to provide your own cluster files and repositories.

Cluster Definition

To provide a quick new definition of a cluster:

  1. Copy the directory ansible-cdp and give the copy a name, for example: ansible-cdp-configured

  2. Make all your modifications in the files of your copied directory

  3. Launch the script with the argument: --cluster-type=ansible-cdp-configured (It will automatically take the files under the ansible-cdp-configured/ directory)
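
The steps above can be sketched as a small helper. The function name new_cluster_definition is hypothetical and not part of the project. It only performs the copy of step 1 and refuses to overwrite an existing definition; editing the files (step 2) remains manual.

```shell
#!/bin/sh
# Copy a cluster definition directory under a new name.
# $1: source directory (for example ansible-cdp), $2: new definition directory
new_cluster_definition() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "missing source directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "already exists: $dest" >&2
        return 1
    fi
    cp -r "$src" "$dest"
}

# Usage, from the root of the repository:
#   new_cluster_definition ansible-cdp ansible-cdp-configured
#   (edit the files under ansible-cdp-configured/)
#   ./setup-cluster.sh --cluster-type=ansible-cdp-configured ...
```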

User Creation & Data Loading

These steps can be launched independently, and you can configure them to create more users or to load more or different data.

Look at the extra_vars.yml files inside the playbooks folder to learn more about the possibilities.

Private Cloud

Private Cloud setup (on ECS or OC) can also be launched independently on a running cluster.

Configuration of a Private Cloud cluster can also be launched independently (use --install-pvc=false together with --pvc=true to configure your PvC without re-installing it).

In extra_vars.yml, you can list the CDWs, CDEs and CMLs that will be provisioned for you, as well as the rights you expect your users to have.

Limitations & Known Bugs

  • TLS is not set for HDP & CDH clusters

  • Data loading is not made for HDP 3 & CDH 6 clusters

  • Free IPA is only available for CDP clusters

Please feel free to contribute and help solve and implement TODOs listed in TODOs.adoc

About

One Click Script to Deploy CDP (CDP PvC & HDP & CDH)
