## Practice using Ansible

Now that we have provisioned some infrastructure, we can configure and install software on it using Ansible!

Ansible is a tool for configuring systems by accessing them over SSH and running commands on them. The commands to run will be defined in advance in a series of *playbooks*, so that instead of using SSH directly and then running commands ourselves interactively, we can just execute a playbook to set up our systems.

First, let’s just practice using Ansible.

### Preliminaries

As before, let’s make sure we’ll be able to use the Ansible executables. We need to put the install directory in the `PATH` inside each new Bash session.

In [None]:
# runs in Chameleon Jupyter environment

export PATH=$HOME/.local/bin:$PATH
export PYTHONUSERBASE=$HOME/.local
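
As a quick sanity check that the `PATH` change took effect, a small helper like the one below can report whether the Ansible executables are visible in the current session. This is a minimal sketch: `check_tool` is a hypothetical name introduced here, not part of Ansible.

```bash
# Hypothetical helper (not part of Ansible): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Check the two executables used in this notebook; "|| true" keeps the cell
# from stopping if one of them is missing.
check_tool ansible || true
check_tool ansible-playbook || true
```

If either tool is reported as missing, re-run the `export` lines above in the same session before continuing.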

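If you haven’t already, make sure the `ansible.cfg` configuration file is set up for your deployment and placed in the specified location. In the version used here, `ssh_args` points SSH at an SSH config file with `-F`, so the floating IP (which you can see in the output of the Terraform command!) belongs in that SSH config file rather than in `ansible.cfg` itself.

The following cell shows the contents of `ansible.cfg` so you can double-check it - make sure the path after `-F` matches the SSH config file you edit later in this section.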

In [None]:
# runs in Chameleon Jupyter environment

cat ~/Fine-Tuning-Taiwanese-Hokkien-LLM-for-Medical-Advising/iac/ansible/ansible.cfg

[defaults]
stdout_callback = yaml
#inventory = /home/yc7690_nyu_edu/Fine-Tuning-Taiwanese-Hokkien-LLM-for-Medical-Advising/iac/ansible/inventory.yaml

[ssh_connection]
ssh_args = -F /home/yc7690_nyu_edu/.ssh/config -o StrictHostKeyChecking=off -o UserKnownHostsFile=/dev/null -o ForwardAgent=yes
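
Because `ssh_args` passes `-F` to point SSH at a specific config file, it can help to confirm that the referenced file exists before running Ansible. The following is a sketch under stated assumptions: `ssh_config_from_cfg` is a hypothetical helper, and the `sed` pattern assumes a single `ssh_args = ... -F <path> ...` line with a space after `-F`.

```bash
# Hypothetical helper: print the path given to "-F" on the ssh_args line of an
# ansible.cfg file. Assumes one "ssh_args = ... -F <path> ..." line.
ssh_config_from_cfg() {
    sed -n 's/^ssh_args *=.*-F  *\([^ ]*\).*/\1/p' "$1"
}

cfg=ansible.cfg   # run from the directory that holds ansible.cfg
if [ -f "$cfg" ]; then
    ssh_cfg=$(ssh_config_from_cfg "$cfg")
    if [ -n "$ssh_cfg" ] && [ -f "$ssh_cfg" ]; then
        echo "SSH config found: $ssh_cfg"
    else
        echo "SSH config missing or not set in $cfg: '$ssh_cfg'"
    fi
fi
```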


Finally, we’ll `cd` to that directory -

In [None]:
# runs in Chameleon Jupyter environment

cd ~/Fine-Tuning-Taiwanese-Hokkien-LLM-for-Medical-Advising/iac/ansible

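- Upload your private key `id_rsa_chameleon` to `~/work/mlops/id_rsa_chameleon` in the Jupyter environment. The next cell moves it into `~/.ssh`, restricts its permissions, and copies it to `node1`.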

In [None]:
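# runs in Chameleon Jupyter environment
mkdir -p ~/.ssh
mv ~/work/mlops/id_rsa_chameleon ~/.ssh/
chmod 600 ~/.ssh/id_rsa_chameleon
# copy the key to node1 (replace 192.5.87.178 with your node1 floating IP)
scp -i ~/.ssh/id_rsa_chameleon ~/.ssh/id_rsa_chameleon cc@192.5.87.178:/home/cc/.ssh/id_rsa_chameleon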
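
OpenSSH refuses to use a private key whose file permissions allow access by other users, which is why the `chmod 600` step matters. The check below is a minimal sketch; `key_mode_ok` is a hypothetical helper name, and the key path is the one used in this notebook.

```bash
# Hypothetical helper: succeed only if the file is readable and writable by
# its owner alone (mode 600), which is what the chmod step sets.
key_mode_ok() {
    # "ls -ld" prints the mode string first, e.g. "-rw-------".
    # Only the first 10 characters are compared, so a trailing ACL or
    # SELinux marker (+ or .) is ignored.
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "-rw-------" ]
}

key="$HOME/.ssh/id_rsa_chameleon"
if [ -e "$key" ]; then
    if key_mode_ok "$key"; then
        echo "permissions OK: $key"
    else
        echo "fix with: chmod 600 $key"
    fi
fi
```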

- **Configure SSH Access for Ansible (`3_practice_ansible.ipynb`)**  
    - To allow Ansible to access all three nodes (node1, node2, node3), configure a local SSH config file with a **ProxyCommand**.
    - This allows Ansible to SSH into `node2` and `node3` **via `node1`**, since only `node1` has a floating IP.

    - **Edit the local SSH config (in the Jupyter environment, not on `node1`)**  
      ```bash
      nano ~/.ssh/config
      ```

    - **Paste the Following Configuration**
      Replace `192.5.87.50` with the actual floating IP of `node1`, and update the key path if different.
      ```sshconfig
      Host node1
          HostName 192.5.87.50
          User cc
          IdentityFile /home/yc7690_nyu_edu/.ssh/id_rsa_chameleon
          StrictHostKeyChecking no
          UserKnownHostsFile=/dev/null
          ControlMaster auto
          ControlPersist 120s
          ControlPath ~/.ssh/ansible-%r@%h:%p

      Host node2
          HostName 192.168.1.12
          User cc
          IdentityFile /home/yc7690_nyu_edu/.ssh/id_rsa_chameleon
          ProxyCommand ssh -i /home/yc7690_nyu_edu/.ssh/id_rsa_chameleon -W %h:%p cc@192.5.87.50
          StrictHostKeyChecking no
          UserKnownHostsFile=/dev/null
          ControlMaster auto
          ControlPersist 120s
          ControlPath ~/.ssh/ansible-%r@%h:%p

      Host node3
          HostName 192.168.1.13
          User cc
          IdentityFile /home/yc7690_nyu_edu/.ssh/id_rsa_chameleon
          ProxyCommand ssh -i /home/yc7690_nyu_edu/.ssh/id_rsa_chameleon -W %h:%p cc@192.5.87.50
          StrictHostKeyChecking no
          UserKnownHostsFile=/dev/null
          ControlMaster auto
          ControlPersist 120s
          ControlPath ~/.ssh/ansible-%r@%h:%p
      ```
    - **Generate and Upload SSH Public Key**  
      To let Ansible reach `node2` and `node3` through `node1`, both nodes must accept the same key, which means its public half has to be in their `authorized_keys`.

        1. **Generate a public key from your private key**  
            This creates a `.pub` file needed for SSH key copying.
            ```bash
            ssh-keygen -y -f ~/.ssh/id_rsa_chameleon > ~/.ssh/id_rsa_chameleon.pub
            ```

        2. **Edit SSH Daemon Configuration on `node1`**
            Enable `AllowAgentForwarding` so that `node1` accepts the forwarded SSH agent (`ForwardAgent=yes` in `ansible.cfg`), and increase `MaxStartups` so that `sshd` tolerates more concurrent connection attempts (important when using Ansible with multiple hosts).
            ```bash
            sudo nano /etc/ssh/sshd_config
            ```

            Uncomment or add the following lines:
            ```conf
            AllowAgentForwarding yes
            MaxStartups 30:50:200
            ```

            Then restart the SSH service:
            ```bash
            sudo systemctl restart ssh
            ```

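            > 💡 Explanation:
            > - `AllowAgentForwarding` permits SSH agent forwarding, so the key held by the agent can be used for onward connections from `node1` to `node2` and `node3`.
            > - `MaxStartups` (`start:rate:full`) limits the number of concurrent unauthenticated connections to `sshd`. Raising it keeps connections from being dropped when Ansible opens sessions to all nodes at once. Note that "Too many authentication failures" is a different error, governed by `MaxAuthTries`.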

        3. **Distribute SSH Public Key to Other Nodes**
            Use `ssh-copy-id` to push the `.pub` key to `node2` and `node3` from `node1`.
            ```bash
            ssh-copy-id -i ~/.ssh/id_rsa_chameleon.pub cc@192.168.1.12
            ssh-copy-id -i ~/.ssh/id_rsa_chameleon.pub cc@192.168.1.13
            ```

        4. **(Optional) Test SSH Connection**
            You can verify connectivity directly:
            ```bash
            ssh -i ~/.ssh/id_rsa_chameleon cc@192.5.87.178
            ```
            > Replace the IP with your `node1` floating IP.

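The `node2` and `node3` stanzas in the SSH config above differ only in the host alias and private IP, so they can be generated from variables, which means the floating IP and key path are typed only once. This is a sketch, not part of the original setup: `proxy_host_stanza`, `FLOATING_IP`, and `KEY` are hypothetical names, the values are the ones shown above, and the output goes to the terminal rather than into `~/.ssh/config`.

```bash
# Values copied from the configuration above; replace them with your own.
FLOATING_IP=192.5.87.50
KEY=/home/yc7690_nyu_edu/.ssh/id_rsa_chameleon

# Hypothetical helper: print a Host stanza for a node that is reached
# through node1. $1 = host alias, $2 = private IP address.
proxy_host_stanza() {
    cat <<EOF
Host $1
    HostName $2
    User cc
    IdentityFile $KEY
    ProxyCommand ssh -i $KEY -W %h:%p cc@$FLOATING_IP
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
    ControlMaster auto
    ControlPersist 120s
    ControlPath ~/.ssh/ansible-%r@%h:%p

EOF
}

# Print the stanzas for review; append them to ~/.ssh/config only after checking.
proxy_host_stanza node2 192.168.1.12
proxy_host_stanza node3 192.168.1.13
```

Printing first and appending by hand keeps an existing `~/.ssh/config` from being overwritten by accident.
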

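The `MaxStartups` value set in `sshd_config` has the form `start:rate:full`. As documented for `sshd_config`, `sshd` starts refusing new unauthenticated connections with probability `rate`% once `start` of them are pending, and that probability rises linearly to 100% at `full`. The arithmetic below sketches this rule for the `30:50:200` value used above; `drop_percent` is a hypothetical helper, and integer division makes the result approximate.

```bash
# Hypothetical helper: approximate percentage of new connections sshd would
# refuse for a given number of pending unauthenticated connections.
# $1 = pending connections, $2 = start, $3 = rate, $4 = full
drop_percent() {
    if [ "$1" -lt "$2" ]; then
        echo 0
    elif [ "$1" -ge "$4" ]; then
        echo 100
    else
        # Linear interpolation from rate% at "start" to 100% at "full".
        echo $(( $3 + (100 - $3) * ($1 - $2) / ($4 - $2) ))
    fi
}

for pending in 10 30 115 200; do
    echo "pending=$pending refuse=$(drop_percent "$pending" 30 50 200)%"
done
```

For these four inputs the loop prints 0%, 50%, 75%, and 100%, so with three nodes the raised limit leaves ample headroom.
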
In [6]:
# scp -i ~/.ssh/id_rsa_chameleon \
#     -r ~/Fine-Tuning-Taiwanese-Hokkien-LLM-for-Medical-Advising/iac/ansible \
#     cc@192.5.87.148:/home/cc/

In [None]:
# scp ~/.ssh/id_rsa_chameleon cc@192.5.87.178:/home/cc/.ssh/id_rsa_chameleon

id_rsa_chameleon                              100% 2700    76.2KB/s   00:00    


In [8]:
# ssh-copy-id -i ~/.ssh/id_rsa_chameleon.pub cc@192.168.1.12
# If the .pub file has not been generated yet (step 1 above), this fails with:

/usr/bin/ssh-copy-id: ERROR: failed to open ID file '/home/yc7690_nyu_edu/.ssh/id_rsa_chameleon.pub': No such file

In [8]:
# runs in Chameleon Jupyter environment
ansible -i inventory.yml all -m ping

node1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node3 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
node2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
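
The `inventory.yml` passed with `-i` is not shown in this notebook. For orientation, a minimal YAML inventory that would yield the three host names seen in the ping output might look like the sketch below. It is an assumption, not the project's actual file: it relies on the `node1`/`node2`/`node3` aliases in the SSH config for addresses and proxying, and it is written to a temporary directory so the real inventory stays untouched.

```bash
# Write a hypothetical inventory to a temporary directory (the real
# inventory.yml in the repository may differ).
example_dir=$(mktemp -d)
example_inventory="$example_dir/inventory.yml"
cat > "$example_inventory" <<'EOF'
all:
  hosts:
    node1:
      ansible_user: cc
    node2:
      ansible_user: cc
    node3:
      ansible_user: cc
EOF

# Show the file and where it was written.
cat "$example_inventory"
echo "written to: $example_inventory"
```

With Ansible available, `ansible -i "$example_inventory" all --list-hosts` would list the hosts in the file without connecting to them.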


In [9]:
# ansible -i inventory.yml node3 -m ping -vvvv

### Run a “Hello, World” playbook

Once we have verified connectivity to the nodes in our “inventory”, we can run a *playbook*, which is a sequence of tasks organized in plays, and defined in a YAML file. Here we will run the following playbook with one “Hello world” play:

    ---
    - name: Hello, world - use Ansible to run a command on each host
      hosts: all
      gather_facts: no

      tasks:
        - name: Run hostname command
          command: hostname
          register: hostname_output

        - name: Show hostname output
          debug:
            msg: "The hostname of {{ inventory_hostname }} is {{ hostname_output.stdout }}"

The playbook connects to `all` hosts listed in the inventory, and performs two tasks: first, it runs the `hostname` command on each host and saves the result in `hostname_output`, then it prints a message showing the value of `hostname_output` (using the *debug* module).

In [10]:
# runs in Chameleon Jupyter environment
ansible-playbook -i inventory.yml general/hello_host.yml


PLAY [Hello, world - use Ansible to run a command on each host] ****************

TASK [Run hostname command] ****************************************************
changed: [node1]
changed: [node2]
changed: [node3]

TASK [Show hostname output] ****************************************************
ok: [node1] => 
  msg: The hostname of node1 is node1-mlops
ok: [node2] => 
  msg: The hostname of node2 is node2-mlops
ok: [node3] => 
  msg: The hostname of node3 is node3-mlops

PLAY RECAP *********************************************************************
node1                      : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
node2                      : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
node3                      : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
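
In the recap, `unreachable=0` and `failed=0` on every line indicate that all hosts were contacted and no task failed. When the output of a run has been saved to a file, a check along the following lines can flag problems. `recap_ok` is a hypothetical helper, and the pattern assumes the recap format shown above.

```bash
# Hypothetical helper: read ansible-playbook output on stdin and succeed only
# if no line reports a non-zero "unreachable" or "failed" count.
recap_ok() {
    ! grep -Eq '(unreachable|failed)=[1-9]'
}

# Example with two recap lines in the format shown above.
printf '%s\n' \
    'node1 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0' \
    'node2 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0' |
    recap_ok && echo "all hosts OK"
```

For a real run, the output could be captured with `ansible-playbook -i inventory.yml general/hello_host.yml | tee run.log` and checked with `recap_ok < run.log` (the log file name is illustrative). `ansible-playbook` also returns a non-zero exit status when hosts fail or are unreachable, so this check is mainly useful for saved logs.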

