This repository has been archived by the owner on Jul 19, 2023. It is now read-only.

Project REDstack

What is REDstack?

REDstack is a service that deploys the Hortonworks Data Platform to OpenStack. It is intended for development teams to have a place where they fully control the environment and can try new Big Data software.

REDstack guide

REDstack can be run within a Docker container on any computer that has Docker installed. Running it in a container isolates the environment and pins the versions of the software and libraries it depends on.

Install and set up Docker

Check out the Docker website for installation information: https://docs.docker.com/engine/installation/

  1. cd to the root of the project on your local file system. Execute the command docker build -t <name>:<tag> . (the trailing dot sets the build context to the current directory) to build the Docker image on your local machine.

  2. Run docker run -it <name>:<tag> /bin/sh to start the container in an interactive shell session.

Setting up REDstack

REDstack needs a few configuration settings for your environment before it is ready to run.

  1. From inside the container, navigate to /opt/redstack/redstack/conf

  2. Open an existing file under templates, or create a new one, and define a template that corresponds to the size of the cluster that you want to build

    • count: The number of nodes of this node type (usually only applies to Data nodes)
    • flavor: The OpenStack flavor to map this node type to
    • volume_size: How much volume storage to give this node for its HDFS contribution
  3. Open the rs_conf.yml file and fill in the following settings for your environment. The defaults are for the Ormuco cloud; v2 and v3 refer to the Keystone version your OpenStack project is running.

    • stack_name: "redstack": The name of the stack in Ambari
    • cluster_name: "hadoop": The name of the cluster in REDstack
    • auth_version: 3: The version of Keystone your OpenStack project is running
    • region: The region to deploy to in OpenStack
    • availability_zone: The AZ to deploy the instances in
    • openstack_auth_url: The Keystone auth URL
    • external_network_id: The UUID of the external network in OpenStack to attach to
    • subnet_cidr: "192.168.198.0/24": The CIDR used for the subnet (default is OK)
    • expose_ui_ssh: "0.0.0.0/0": The CIDR allowed to reach SSH and the web UIs in the cluster (default is all network traffic)
    • ost_username: Your OpenStack user name
    • ost_password: Your OpenStack password
    • ost_project_id: The ID of your OpenStack project, can be used in place of project name and domain
    • ost_project_name: The name of the OpenStack project
    • ost_domain: The 'domain' that your OpenStack project resides in
    • template_file: "hdpv3.yml": The filename of the template file you created or edited
    • define_custom_repos: false: Set this if you want to define custom yum repos to install from
    • ambari_password: The password that will be set for Ambari
    • fqdn_address: ".redstack.com": The FQDN to assign to the nodes in the cluster
    • kerberos_password: The password to assign to the Kerberos environment at install time
    • ambari_db_password: The database password for the Ambari PostgreSQL database
    • mysql_root_password: The default root password for the MySQL instance
  4. Note that, by default, the OpenStack network is configured to only allow traffic to the Hadoop service web pages, Ambari, and Knox

  5. To create users on the cluster backed by OpenLDAP, create JSON files in the users directory like:

     {
       "id": "redstack-admin",
       "uid": "1501",
       "keytab_principal": "redstack-admin",
       "keytab_filename": "redstack-admin.headless.keytab",
       "keytab_location": "/user_items/keytabs",
       "keytab_owner": "redstack-admin",
       "keytab_groupowner": "redstack-admin",
       "keytab_permissions": "400",
       "create_hdfs_home": "true",
       "create_ssh_key": "true",
       "regular_user": "true",
       "sudo_user": "true",
       "password": "<encrypted password>"
     }
    
    • The value for password is an encrypted password generated by openssl
    • The password is hashed with a standard Unix crypt scheme; it can be created with openssl passwd -1 password_string
    • To create a service user, set the regular_user flag to false. This is how to create a service account for something like Hue
    • Note that service accounts are not present in the LDAP server; they exist only as local accounts
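
The user file from step 5 can also be generated with a short script. The following is a minimal sketch in Python and is not part of REDstack itself: the helper names hash_password, build_user, and to_json are hypothetical, while the field names and the openssl passwd -1 command are taken from the description above. It assumes the openssl binary is on the PATH when hash_password is called, and that the keytab fields follow the user id as they do in the sample file.

```python
import json
import subprocess


def hash_password(plain: str) -> str:
    """Hash a password with `openssl passwd -1`, as described above.

    Assumption: the openssl binary is installed and on the PATH.
    """
    result = subprocess.run(
        ["openssl", "passwd", "-1", plain],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_user(user_id: str, uid: int, password_hash: str,
               regular_user: bool = True, sudo_user: bool = False) -> dict:
    """Build a user definition with the same fields as the sample file.

    Flags are written as the strings "true"/"false", matching the sample.
    """
    def flag(value: bool) -> str:
        return "true" if value else "false"

    return {
        "id": user_id,
        "uid": str(uid),
        "keytab_principal": user_id,
        "keytab_filename": f"{user_id}.headless.keytab",
        "keytab_location": "/user_items/keytabs",
        "keytab_owner": user_id,
        "keytab_groupowner": user_id,
        "keytab_permissions": "400",
        "create_hdfs_home": "true",
        "create_ssh_key": "true",
        "regular_user": flag(regular_user),
        "sudo_user": flag(sudo_user),
        "password": password_hash,
    }


def to_json(user: dict) -> str:
    """Serialize a user definition for saving in the users directory."""
    return json.dumps(user, indent=2)
```

A call such as to_json(build_user("hue", 1502, hash_password("secret"), regular_user=False)) would produce a service-account definition; the id, uid, and password here are illustrative values only.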

Launching the install

Run sh ~/run_redstack.sh to start the REDstack deployment. If it completes successfully, you should receive a link to the Ambari server on the cluster in the command line.

Security

There are a few security considerations to keep in mind when using this cluster.

  1. The cluster will, by default, expose a few ports for incoming traffic.
    • Ambari and Knox - 8443: The cluster management UI and REST endpoint (https)
    • Application History Server - 8188
    • Namenode UI - 50070
    • SSH - 22
    • Resource Manager UI - 8088
    • YARN History Server - 19888
    • Spark History Server - 18080
    • Zeppelin Notebook - 9995
    • JournalNode UI - 8480
  2. If you want to expose any other ports, you will need to manually update the security group. You can do this from the CLI inside the Docker container or from the OpenStack UI.
  3. Cluster access is firewalled off by a security group running in front of the cluster in OpenStack. When you specify expose_ui_ssh, it is important to provide a CIDR that covers only your own addresses, or those that you trust, so that no one else can reach the cluster.
  4. None of the passwords provided in the config inside the Docker container will be in plaintext on the cluster itself. Once you destroy the container after creating the cluster, they are gone.
  5. All Hadoop services are secured with Kerberos and cannot be used without authenticating with a keytab. Each user provisioned in the cluster is granted a keytab by default, located in /user_items/keytabs. All Hadoop services have their keytabs in /etc/security/keytabs. The Kerberos server is located on the rs-master node, and its password is configured in the Docker container prior to cluster install.
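
Because the default expose_ui_ssh value opens the cluster to all traffic, it can help to sanity-check the CIDR before launching. The following is a minimal sketch using Python's standard ipaddress module; the function name check_expose_cidr and the max_hosts threshold are hypothetical and are not part of REDstack.

```python
import ipaddress
from typing import List


def check_expose_cidr(cidr: str, max_hosts: int = 256) -> List[str]:
    """Return warnings for an expose_ui_ssh value; an empty list means none.

    Raises ValueError if the string is not a valid CIDR.
    max_hosts is an arbitrary illustrative threshold, not a REDstack setting.
    """
    # strict=False accepts values with host bits set, e.g. 192.168.1.5/24
    network = ipaddress.ip_network(cidr, strict=False)
    warnings = []
    if network.prefixlen == 0:
        warnings.append(f"{cidr} allows traffic from every address")
    elif network.num_addresses > max_hosts:
        warnings.append(
            f"{cidr} covers {network.num_addresses} addresses; "
            "consider a narrower range"
        )
    return warnings
```

For example, check_expose_cidr("0.0.0.0/0") returns one warning, while a single-host range such as "203.0.113.7/32" returns an empty list.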
