
0.7 Platform Deployment Procedure


Description

Trusted Analytics Platform (TAP) can be deployed on Amazon Web Services (AWS) using an AWS CloudFormation template or on OpenStack using an OpenStack Heat Orchestration Template (HOT).

Provisioning Workflow

  +--------------------+
  | AWS CloudFormation |
  |   OpenStack HOT    |
  +---------+----------+
            |
+-----------v------------+
| User Data Shell Script |
+-----------+------------+
            |
+-----------v------------+
| cfn-init Helper Script <-+
+-----------+------------+ |
            |              |
    +-------v--------+     |
    | cfn-hup Helper +-----+
    +-------+--------+
            |
       +----v----+
       | Ansible |
       +---------+

ℹ️ Information

Provisioning should be done in Ansible playbooks. Any change to the existing user data contents causes replacement (termination) of the existing EC2 instance.

  1. The first step of the provisioning uses CloudFormation / HOT to create a base for the TAP infrastructure:

    1. get all of the environment parameters from the user:

      1. CDH worker size and the number of workers,
      2. Cloud Foundry domain, password and the number of workers,
      3. SMTP server parameters that will be used for sending invitation e-mails,
      4. Quay.io username for downloading restricted Docker images,
      5. Logsearch parameters on the platform,
      6. a keypair used for platform access,
      7. Elastic IP used for platform access.
    2. create the network infrastructure:

      1. VPC (AWS only),
      2. Subnets,
      3. Security Groups.
    3. create the machines:

      1. a jumpbox for platform operator access,
      2. CDH cluster machines,
      3. an Nginx load balancer.
  2. To start the provisioning process and provide the machines with the necessary automation tooling:

    1. a user data shell script (common to every instance/VM; see the sketch after this workflow) is used to:

      1. install pip (with get-pip.py),
      2. install Ansible's dependencies using an OS-specific package management system (apt-get for Debian/Ubuntu and yum for RHEL/CentOS),
      3. install Ansible (via pip),
      4. install CloudFormation Helper Scripts,
      5. call the cfn-init script.

      The log can be found in the /var/log/cloud-init-output.log file on each of the machines created.

    2. the cfn-init helper script is used to:

      1. create an Upstart job for running the cfn-hup daemon,
      2. create a configuration for the cfn-hup daemon,
      3. create a configuration for Ansible,
      4. create an Ansible inventory,
      5. create a hooks configuration for the cfn-hup daemon,
      6. start the cfn-hup daemon service.

      The log can be found in the /var/log/cfn-init.log file on each of the machines created.

    3. cfn-hup hooks are used to:

      1. run the cfn-init helper script,
      2. run ansible-pull,

      when the metadata of the instance is changed. The log can be found in the /var/log/cfn-hup.log file. (A configuration sketch for cfn-hup appears after this workflow.)

  3. To provision the software used by the platform, Ansible is executed on the Jump Box and Nginx machines:

    1. on the Jump Box:

      1. BOSH is installed,
      2. Cloud Foundry is provisioned.
    2. on the Nginx host:

      1. Nginx is installed and configured.

    In both cases, the ansible-playbooks repository is used (described in more detail in the repositories section). The log can be found in the /var/log/ansible.log file on both machines.

  4. To provision the CDH cluster and TAP applications:

    1. the user logs in to the Jump Box and executes the tqd.sh shell script (an example invocation appears after this workflow):

      1. the KERBEROS_ENABLED and PUSH_APPS environment variables are checked to see if the script should enable Kerberos and push the applications; PLATFORM_ANSIBLE_ARCHIVE is checked for a URL pointing to the platform-ansible archive,
      2. python and virtualenv are installed and set up on the Jump Box,
      3. a legacy version of Ansible (1.9.4) is installed in the virtualenv,
      4. the platform-ansible archive is downloaded and untarred,
      5. Java 1.8 and JCE 8 are downloaded for use with platform-ansible,
      6. the Ansible inventory generation script (ec2.py) is downloaded and set up, and platform-ansible is run.

      All logs from the script are available on screen to the user.

    2. the main steps of the platform-ansible run (described in more detail in the repositories section) are:

      1. setting up hostnames on all of the machines,
      2. setting up Consul for platform service discovery and Unbound for DNS,
      3. installing CDH (described in more detail below),
      4. using Apployer to push the applications that are a part of TAP,
      5. installing Logsearch for log aggregation.
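
As a rough illustration of step 2, a user data script of the kind described above installs the tooling and then hands control to cfn-init. The following is a minimal sketch, not the actual template contents; the stack name, resource name, region and package lists are placeholders:

```sh
#!/bin/bash
# Sketch of the user data bootstrap (placeholder stack/resource/region values).
set -e

# 1. Install pip with get-pip.py
curl -sO https://bootstrap.pypa.io/get-pip.py
python get-pip.py

# 2. Install Ansible's dependencies with the OS package manager
#    (yum on RHEL/CentOS, apt-get on Debian/Ubuntu)
yum install -y gcc python-devel libffi-devel openssl-devel || \
  apt-get install -y gcc python-dev libffi-dev libssl-dev

# 3. Install Ansible via pip
pip install ansible

# 4. Install the AWS CloudFormation helper scripts
pip install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz

# 5. Hand over to cfn-init for this instance's logical resource
cfn-init -v --stack MyTapStack --resource JumpBox --region us-east-1
```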
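
The cfn-hup daemon then watches the instance metadata and re-runs provisioning when it changes. Below is a hedged sketch of the kind of configuration cfn-init writes for it; the stack name, logical resource ID, playbook name and repository URL are placeholders, not the values used by the real templates:

```sh
# Sketch of a cfn-hup configuration and its hooks (placeholder values).
cat > /etc/cfn/cfn-hup.conf <<'EOF'
[main]
stack=MyTapStack
region=us-east-1
interval=5
EOF

# Hooks: when this instance's metadata changes, re-run cfn-init and ansible-pull.
cat > /etc/cfn/hooks.d/reprovision.conf <<'EOF'
[cfn-init-rerun]
triggers=post.update
path=Resources.JumpBox.Metadata
action=/usr/bin/cfn-init -v --stack MyTapStack --resource JumpBox --region us-east-1
runas=root

[ansible-pull-rerun]
triggers=post.update
path=Resources.JumpBox.Metadata
action=/usr/local/bin/ansible-pull -U https://example.com/ansible-playbooks.git jump_box.yml
runas=root
EOF

# In the real deployment an Upstart job is created by cfn-init; starting it manually:
service cfn-hup start
```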

Provisioning is done as the root user on the platform machines.
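
Step 4 is driven from a shell session on the Jump Box. A hedged example of a typical tqd.sh invocation follows; the archive URL is a placeholder and the accepted variable values should be checked against the script itself:

```sh
# Example invocation of tqd.sh on the Jump Box (run as root).
export KERBEROS_ENABLED=true
export PUSH_APPS=true
# URL of the platform-ansible archive to download (placeholder).
export PLATFORM_ANSIBLE_ARCHIVE=https://example.com/platform-ansible.tar.gz

./tqd.sh
```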

Resulting architecture

The image above shows the resulting AWS infrastructure, with all of the VMs used by Cloud Foundry and Cloudera.

Layers

Cloudera

     +---------+
     | Ansible |
     +----+----+
          |
+---------v--------+
| Cloudera Manager |
+------------------+

The deployment of Cloudera Manager and CDH follows Installation Path B from Cloudera Installation Overview.

  1. Ansible is used to:

    1. install Cloudera Manager Server (only on Cloudera Manager instance/VM),
    2. install Cloudera Manager Agent (on every Cloudera instance/VM),
    3. install JDK,
    4. use the Cloudera Manager API (via its Python client binding).

    The log can be found in the /var/log/ansible.log file.

  2. The Cloudera Manager API is used to deploy CDH.
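
The Cloudera Manager REST API that the Python client binding wraps can also be queried directly, which is useful for checking the result of the deployment. A hedged example with curl; the host, credentials and API version are placeholders:

```sh
# Inspect the Cloudera Manager REST API (placeholder host, credentials, API version).
CM_HOST=cloudera-manager.example.com
CM_AUTH=admin:admin

# List the clusters managed by this Cloudera Manager instance
curl -s -u "$CM_AUTH" "http://$CM_HOST:7180/api/v10/clusters"

# List the hosts known to Cloudera Manager (the CDH worker instances/VMs)
curl -s -u "$CM_AUTH" "http://$CM_HOST:7180/api/v10/hosts"
```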

Cloud Foundry

+---------+
| Ansible |
+----+----+
     |
 +---v--+
 | BOSH |
 +------+

Cloud Foundry is deployed by BOSH triggered from a Jump Box instance/VM.

  1. Ansible is used to:

    1. generate an SSH key for BOSH (also used as a technical key to access other instances/VMs),
    2. import the generated key as a new EC2 Key Pair,
    3. install bosh-init,
    4. install BOSH CLI,
    5. install Cloud Foundry CLI,
    6. install UAA CLI.

    The log can be found in the /var/log/ansible.log file.

  2. bosh-init is used to deploy the BOSH Director.

  3. BOSH Director is used to deploy Cloud Foundry and Docker Service Broker.
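
In terms of the pre-2.0 BOSH CLI that this release uses, the flow above corresponds roughly to the following commands; the manifest names, Director address and credentials are placeholders:

```sh
# Sketch of the BOSH flow used to stand up Cloud Foundry (placeholder values).

# 1. bosh-init deploys the BOSH Director from its manifest
bosh-init deploy bosh.yml

# 2. Target the freshly deployed Director and log in
bosh target https://10.0.0.6:25555
bosh login admin admin

# 3. Point the CLI at the Cloud Foundry manifest and deploy it
bosh deployment cf.yml
bosh deploy
```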

Logsearch

     +---------+
     | Ansible |
     +----+----+
          |
+---------v--------+
|    Logsearch     |
+------------------+

The deployment of Logsearch uses both Ansible and BOSH:

  1. Ansible is used to:

    1. generate a BOSH manifest from the provided template,
    2. run the BOSH CLI, initiating the Logsearch deployment.
  2. BOSH Director is used to deploy Logsearch.

  3. A BOSH errand job is used to push Kibana to the platform.
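
Expressed with the same pre-2.0 BOSH CLI, the Logsearch part of the run boils down to something like the following; the manifest and errand names are placeholders standing in for the values in the generated manifest:

```sh
# Sketch of the Logsearch deployment via the BOSH CLI (placeholder names).

# Deploy Logsearch from the manifest generated by Ansible
bosh deployment logsearch.yml
bosh deploy

# Run the errand job that pushes Kibana to the platform
bosh run errand push-kibana
```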

Logsearch exposes a syslog sink to which Cloud Foundry forwards all of the application logs. The fields collected for applications are:

  • time - The time the log line was produced
  • application name - The application name
  • level - The severity level, like INFO, WARN, etc.
  • message - The actual log line

For more information regarding using Logsearch, please consult the TAP Logsearch documentation.

Security recommendations

  1. Please make sure that all nodes of your platform installation are secured with an up-to-date anti-virus solution.

  2. Please keep all login credentials and deployment private keys in a secure store within your organization.

Repositories used and their structure

This repository contains the Troposphere TAP.py script that is used to generate the CloudFormation template used to bootstrap the platform. It won't be used by end users, who only use the generated template, which isn't a part of the repository.
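
For reference, generating the template from the Troposphere script is a single step; the output file name, and the assumption that the script prints the template to stdout, are illustrative only:

```sh
# Hypothetical example: render the CloudFormation template from the Troposphere script.
pip install troposphere
python TAP.py > TAP.template
```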

This repository contains the TAP.yaml HEAT template that is used to bootstrap the platform. As it is not generated, this is the artifact that the end user uses to bootstrap the platform.
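
Bootstrapping on OpenStack then amounts to creating a stack from that template, for example with the OpenStack client; the stack name and parameters below are placeholders standing in for the inputs listed in the Provisioning Workflow:

```sh
# Hypothetical example: create the TAP stack from the HOT template.
openstack stack create \
  --template TAP.yaml \
  --parameter key_name=tap-keypair \
  --parameter cf_system_domain=tap.example.com \
  tap-platform
```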

This repository contains all of the Ansible playbooks developed for TAP Quick Deployment with Cloud Orchestration (think CloudFormation/HOT templates) in mind. It is downloaded by ansible-pull on the non-CDH machines during the 3rd step of the Provisioning Workflow.

The repository structure follows the default Ansible directory layout, with top-level playbooks for each of the host roles (e.g. nginx.yml used for the load balancer).
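
As a hedged illustration of how such a playbook is consumed during provisioning (the repository URL is a placeholder):

```sh
# Hypothetical example: pull the ansible-playbooks repository and apply the
# load balancer playbook on the Nginx host.
ansible-pull -U https://example.com/ansible-playbooks.git nginx.yml
```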

Roles that are used in the deployment are described in the ROLES.md file inside the repository.

This repository long predates the TAP Quick Deployment, and as such has accumulated some technical debt. It wasn't written with Cloud Orchestration in mind, but rather to be run on already existing machines. It will eventually be obsoleted and incorporated into the ansible-playbooks repo.

As of now, it contains the parts that bootstrap the platform's CDH, Consul and Logsearch components. The CDH provisioning part is described in greater detail in the Layers/Cloudera section.

It is run automatically when the user runs the tqd.sh script during the 4th step of the Provisioning Workflow.
