dmytro-lytvyn/dp-ansible-roles

Data Platform: Ansible Roles

Roles description

These Ansible roles install and configure the components of the Data Platform:

  • Data Input server with Apache NiFi for data ingestion, Confluent Schema Registry (along with Zookeeper and Kafka to store schemas) for event schema validation, Landoop Schema Registry UI for user-friendly schema editing, and Kafdrop as a simple Kafka UI.
  • (TBD) Data Streaming server with Confluent Zookeeper and Kafka to stream the ingested events for further processing.
  • (TBD) Data Processing and Storage server, with Cloudera Manager, Spark and Hadoop (or Hortonworks, to be decided later).
  • (TBD) Data Output and Presentation server, with Cloudera Hue interface for Hive and Impala for data querying, and Redash for data visualisation and dashboards.

For demonstration purposes, the components will be installed on servers named rogue-1, rogue-2, etc. They can be assigned proper domain names in DNS later.

How to use this project

Preparing a brand new host to be managed with Ansible

Generate an SSH key pair (no passphrase) on your managing computer if needed, and copy the public key to the root user of the remote server. This is only needed if your new server was not provisioned with your key already in place:

$ ssh-keygen
$ ssh-copy-id root@rogue-1 # assuming you only have root access at first

Log in to the remote server with your private key (-i key_path is needed only if it's not in a default location):

$ ssh root@rogue-1 -i ~/id_rsa
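If the key lives outside a default location, an entry in the SSH client configuration can save typing -i on every login. The following is a minimal sketch, assuming the rogue-1 address used in the inventory example of this README (123.123.123.123) and writing to a scratch file rather than your real ~/.ssh/config:

```shell
# Write a sample SSH client config entry to a scratch file.
# Assumption: a scratch file is used for safety; the usual location is ~/.ssh/config.
ssh_cfg="$(mktemp)"

cat > "$ssh_cfg" <<'EOF'
Host rogue-1
    HostName 123.123.123.123
    User root
    IdentityFile ~/id_rsa
EOF

# Show the resulting entry for review.
cat "$ssh_cfg"
```

With such an entry in place, `ssh -F "$ssh_cfg" rogue-1` (or plain `ssh rogue-1` once the entry is in ~/.ssh/config) would pick up the user and key automatically.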

Set a preferred host name if needed:

$ sudo vi /etc/hostname
$ sudo vi /etc/hosts
$ sudo reboot

Install Python:

$ sudo apt install python

From this point on, you no longer need to log in to your server and do anything manually; all further installation is done by running the Ansible playbook.

Setting up Ansible on your managing computer

Install Ansible (on Ubuntu):

$ sudo add-apt-repository -y ppa:ansible/ansible
$ sudo apt update
$ sudo apt install ansible

Install Ansible (on macOS):

# Install Homebrew:
$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

# Add Local directory to PATH in ~/.bash_profile:
$ export PATH="/usr/local/opt/python/libexec/bin:$PATH"

# Install Python with Homebrew:
$ brew install python

# Install Ansible with pip:
$ pip install --user ansible

# Also add the Python user bin directory to PATH in ~/.bash_profile ($HOME is used because ~ does not expand inside double quotes):
$ export PATH="$HOME/Library/Python/3.7/bin:/usr/local/opt/python/libexec/bin:$PATH"

Prepare custom config in a separate directory of your choice:

$ cp -R /etc/ansible ~/data-platform
$ cd ~/data-platform

Add your remote servers to the Ansible hosts file, under the data-platform group:

$ vi hosts
[data-platform]
something ansible_host=something.freemyip.com ansible_port=55522 # For a dynamic IP host, non-standard port
rogue-1 ansible_host=123.123.123.123
rogue-2 ansible_host=123.123.123.124

Set the hosts file location in the local Ansible config:

$ vi ansible.cfg
[defaults]
inventory      = hosts
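The two files above can also be created non-interactively. This is only a sketch: the scratch directory is an assumption made so that nothing in ~/data-platform is overwritten, and the host entries are the demonstration values from this README.

```shell
# Create a project-local inventory and config in a scratch directory.
# Assumption: a temporary directory stands in for ~/data-platform.
workdir="$(mktemp -d)"

# Inventory: demonstration hosts under the data-platform group.
cat > "$workdir/hosts" <<'EOF'
[data-platform]
rogue-1 ansible_host=123.123.123.123
rogue-2 ansible_host=123.123.123.124
EOF

# Local config: point Ansible at the inventory file next to it.
cat > "$workdir/ansible.cfg" <<'EOF'
[defaults]
inventory = hosts
EOF

# List what was created.
ls "$workdir"
```

Ansible picks up an ansible.cfg from the current directory, so the commands in the next section should be run from inside that directory.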

Ansible commands

List the inventory:

$ ansible --user=username --list-hosts all

Test the configuration:

$ ansible --user=username --module-name=ping all

Execute shell commands:

$ ansible --user=username --module-name=shell --args='df -h' all

Execute commands with sudo and a password:

$ ansible --user=username --become --ask-become-pass --module-name=shell --args='whoami' all

Execute commands with a passwordless sudo:

$ ansible --user=username --become --module-name=shell --args='whoami' all

Copy the sample Ansible playbook provided in the repository and update it to your needs:

$ cp data-platform-sample.yml data-platform.yml
$ vi data-platform.yml
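For orientation, a playbook that applies tagged roles to the data-platform group might look roughly like the sketch below. The role names basic and firewall are assumptions inferred from the tag names used later in this README; the sample playbook in the repository is the authoritative reference.

```shell
# Write a hypothetical minimal playbook to a scratch file for review.
# Assumption: role names mirror the "basic" and "firewall" tags; check
# data-platform-sample.yml for the real role list.
playbook="$(mktemp)"

cat > "$playbook" <<'EOF'
---
- hosts: data-platform
  become: true
  roles:
    - { role: basic, tags: ["basic"] }
    - { role: firewall, tags: ["firewall"] }
EOF

# Show the generated playbook.
cat "$playbook"
```

Tagging each role this way is what allows a run to be restricted to a subset of roles with --tags.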

Run the Ansible playbook for the first time against one of the servers as root, when only the root user exists and passwordless sudo is not yet available. This run applies only the roles selected by tags in the playbook; it adds a default user with passwordless sudo and disables root login:

$ ansible-playbook --user=root data-platform.yml --limit=rogue-1 --tags="basic,firewall"

For every subsequent run, connect as a normal user with passwordless sudo and apply all roles defined in the playbook:

$ ansible-playbook --user=username data-platform.yml

You can do a test run (without applying any changes) using the --check parameter, optionally with the --diff parameter to see the actual differences:

$ ansible-playbook --user=username data-platform.yml --check --diff

To show the debug output, use the -vvv parameter:

$ ansible-playbook --user=username data-platform.yml -vvv
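The first-run, regular, and dry-run invocations above differ only in a few flags. One way to keep them straight is a small wrapper that prints the command for review instead of executing it. The function name playbook_cmd and its mode names are hypothetical and not part of this repository:

```shell
# Hypothetical helper: prints (does not execute) the ansible-playbook
# command for a given mode, so the flags can be reviewed before running.
playbook_cmd() {
  mode="$1"
  target="$2"
  case "$mode" in
    bootstrap)
      # First run: connect as root, one host only, bootstrap tags only.
      echo "ansible-playbook --user=root data-platform.yml --limit=$target --tags=basic,firewall"
      ;;
    apply)
      # Regular run: normal user, all roles.
      echo "ansible-playbook --user=$target data-platform.yml"
      ;;
    check)
      # Dry run: report differences without applying changes.
      echo "ansible-playbook --user=$target data-platform.yml --check --diff"
      ;;
    *)
      echo "usage: playbook_cmd bootstrap HOST | apply USER | check USER" >&2
      return 1
      ;;
  esac
}

playbook_cmd bootstrap rogue-1
playbook_cmd check username
```

The second argument is a host name in bootstrap mode and a user name in the other modes. The printed line can be copied into the shell once it looks right.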

About

Ansible roles and a sample playbook to deploy a simple Data Platform
