# What is Ansible?

Ansible is an open-source IT automation tool that allows for automated management of remote systems.

![](/posts/ansible_cluster/Ansible_logo.png){fig-align="center" width="488"}

A basic Ansible environment has the following three components:

-   **Control node**: The system on which Ansible is installed, and from which Ansible commands such as `ansible-inventory` are issued. Ansible playbooks and configuration files are also stored here.

-   **Managed node**: This is a remote system that Ansible intends to manage and configure.

-   **Inventory**: A list of managed nodes, stored locally on the **control node**. It lists the IP addresses or hostnames of the remote systems being managed, along with any connection information needed.

![A flowchart demonstrating the basic architecture of an Ansible environment.](ansible_architecture.png){fig-align="center"}

# What does Ansible do?

Ansible uses a declarative, YAML-based language to describe the *desired end state* of the managed nodes. It connects to these nodes over standard protocols such as SSH and executes the tasks required to achieve that end state. These tasks are defined in YAML files called **Playbooks**.

**Playbooks** are automation blueprints written in `YAML`. A playbook contains one or more **plays**, each of which is an ordered list of tasks mapped to managed nodes in the predefined inventory.

A **Task** is a reference to a single module that defines an operation for Ansible to perform.

A **Module** is a unit of code that Ansible can run on managed nodes.

![](ansible_playbook_structure.png){fig-align="center"}
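To make this hierarchy concrete, here is a minimal playbook sketch. The group name `cluster_nodes`, the play and task names, and the choice of the `git` package are illustrative assumptions; `ansible.builtin.ping` and `ansible.builtin.package` are standard built-in modules.

```yaml
---
# A playbook file can hold several plays; this one holds a single play.
- name: Check connectivity and install a package   # the play
  hosts: cluster_nodes        # assumed inventory group name
  become: true                # assumes privilege escalation (sudo) is available
  tasks:
    # Each task calls exactly one module.
    - name: Confirm Ansible can reach the node
      ansible.builtin.ping:

    - name: Ensure git is present (illustrative package choice)
      ansible.builtin.package:
        name: git
        state: present
```

Assuming the file were saved as `playbook.yaml` (a hypothetical name) next to an inventory file called `hosts`, it could be run with `ansible-playbook -i hosts playbook.yaml`.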

The key features that make Ansible ideal for automated configuration management of remote systems are as follows:

-   **Agentless Architecture**: Ansible only needs to be installed on the control node; the managed nodes do not require Ansible to be installed.

-   **Idempotent Execution**: No matter how many times the same playbook is run, the end state of the managed nodes will be the same, regardless of their initial state.
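As a rough illustration of idempotency, the sketch below declares a state rather than an action. The group name `cluster_nodes` and the directory path `/opt/shared_data` are hypothetical. Note that idempotency is really a property of the modules used: state-based modules such as `ansible.builtin.file` behave this way, whereas free-form modules such as `ansible.builtin.command` simply run whatever they are given every time.

```yaml
---
- name: Idempotency sketch
  hosts: cluster_nodes            # assumed inventory group name
  become: true                    # assumes sudo is available, since /opt is usually root-owned
  tasks:
    - name: Ensure a directory exists
      ansible.builtin.file:
        path: /opt/shared_data    # hypothetical path
        state: directory          # describes the end state, not the steps to get there
        mode: "0755"
```

On a first run against a node without the directory, this task should report `changed`; later runs should report `ok`, because the node already matches the declared state.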

# Potential Use Case Scenario

Now, let's get into the meat of the post, which is about using Ansible to connect to a cluster and configure each node of the cluster.

Let's assume that we've been given access to a sparkling new cluster for computing purposes.

The cluster has a head node, and several individual cluster nodes.

Now, the head node has both a public and a private IP address.

The cluster nodes, however, only have private IP addresses, and can only be connected to via the head node.

We intend to use Ansible to automate the configuration of the cluster and its nodes.

Instead of downloading and installing Ansible on the head node of the cluster, we intend to use a dockerized version of Ansible to connect to the cluster.

Let's assume that the head node has the hypothetical IP address XXXX.XX.XXX.X.

Let's also assume that there are three cluster nodes, each with private IP addresses in the format of:

-   YY.Y.Y.2
-   YY.Y.Y.3
-   YY.Y.Y.4

![The example use case for Ansible.](cluster_config.png){fig-align="center"}

# The Ansible Dockerfile


```dockerfile
# Loading from a miniconda3 image
FROM continuumio/miniconda3

# Installing Mamba using Conda.
# I find Mamba to be much faster for package installation compared to Conda. 
RUN conda install -c conda-forge mamba -y

# Creating a Mamba ENV called `ansible_env`.
RUN mamba create -y -n ansible_env

# Setting the new environment as the default
RUN echo "conda activate ansible_env" >> ~/.bashrc
SHELL ["/bin/bash", "--login", "-c"]

# Installing Linux-based dependencies
RUN apt-get update && \
    apt-get install -y git nano software-properties-common gcc wget build-essential ssh less screen && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Installing Ansible using Mamba
RUN mamba install -y -c conda-forge ansible

# Generating a configuration file for Ansible.
# This will need to be placed in a particular directory.
WORKDIR /etc
RUN ansible-config init --disabled -t all > ansible.cfg

# Returning to the root directory and creating /etc/ansible,
# where the hosts file will be mounted
WORKDIR /
RUN mkdir -p /etc/ansible

# Creating the entrypoint
ENTRYPOINT ["/bin/bash"]
```


The Docker image above requires you to create a file called `hosts`, which will be mounted at `/etc/ansible` inside the container. This `hosts` file lists the managed nodes you intend to connect to, and we will use it to check that the SSH connection has been set up properly.

The `hosts` file can be in .ini or .yaml formats. I went with .ini, in the following format:

```
[head_node]
XXXX.XX.XXX.X

[cluster_nodes]
YY.Y.Y.2
YY.Y.Y.3
YY.Y.Y.4
```

Here, the cluster nodes are listed by their private IP addresses, and the head node by its public IP address.
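For comparison, the same inventory could be written in YAML. This is a sketch of the equivalent layout, using the same placeholder addresses, rather than the file I actually used:

```yaml
# YAML equivalent of the .ini inventory: each group has a `hosts` mapping,
# and each host is a key with no value.
head_node:
  hosts:
    XXXX.XX.XXX.X:
cluster_nodes:
  hosts:
    YY.Y.Y.2:
    YY.Y.Y.3:
    YY.Y.Y.4:
```

Either format can be inspected with `ansible-inventory -i <file> --list`, which prints the groups and hosts as Ansible parses them.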

We can also create an `inventory.ini` file to use as the inventory for mapping arguments to the cluster.

My `inventory.ini` file was in this format:

## Setting up the SSH connections

As stated previously, Ansible will need to be able to SSH into all the managed nodes in order to be able to perform tasks on each of them.

However, in our example setup, it is not possible to SSH directly into the cluster nodes using their private IP addresses (at least, not without some configuration).
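One common way to supply that configuration, sketched here as an assumption and not necessarily the route this post takes, is to tell Ansible to tunnel its SSH connections to the cluster nodes through the head node. The snippet below uses a YAML inventory with the `ansible_ssh_common_args` connection variable and OpenSSH's `ProxyJump` option; the username `cluster_user` is hypothetical.

```yaml
cluster_nodes:
  hosts:
    YY.Y.Y.2:
    YY.Y.Y.3:
    YY.Y.Y.4:
  vars:
    # Hypothetical login name on the cluster nodes
    ansible_user: cluster_user
    # Extra arguments appended to Ansible's ssh invocations for this group:
    # hop through the head node's public address to reach the private addresses.
    ansible_ssh_common_args: "-o ProxyJump=cluster_user@XXXX.XX.XXX.X"
```

This relies on the control node being able to authenticate to the head node, and on the SSH client supporting `ProxyJump` (OpenSSH 7.3 or later).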

First, before initializing and running a Docker container, create a directory to hold all the files and folders we will need to mount in order to run Ansible.

Once this directory is created (I called it ansible_mount)