Merge pull request #1088 from ajdecon/singularity-centos-8
Add Molecule testing for Singularity, plus infra for more roles
dholt committed Jan 19, 2022
2 parents afa0531 + e4991dd commit aa2dfab
Showing 12 changed files with 264 additions and 20 deletions.
33 changes: 33 additions & 0 deletions .github/workflows/molecule.yml
@@ -0,0 +1,33 @@
---
name: test ansible roles with molecule
on:
  - push
  - pull_request
jobs:
  build:
    runs-on: ubuntu-20.04
    strategy:
      max-parallel: 4
      matrix:
        deepops-role:
          - singularity_wrapper
    steps:
      - name: check out repo
        uses: actions/checkout@v2
        with:
          path: "${{ github.repository }}"
      - name: set up python
        uses: actions/setup-python@v2
        with:
          python-version: "3.9"
      - name: install dependencies
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install molecule[docker] docker ansible
      - name: run molecule test
        run: |
          cd "${{ github.repository }}/roles"
          ansible-galaxy role install --force -r ./requirements.yml
          ansible-galaxy collection install --force -r ./requirements.yml
          cd "${{ matrix.deepops-role }}"
          molecule test
76 changes: 74 additions & 2 deletions docs/deepops/testing.md
@@ -1,12 +1,13 @@
# DeepOps Testing, CI/CD, and Validation

## DeepOps Continuous Integration Testing

## DeepOps end-to-end testing

The DeepOps project leverages a private Jenkins server to run continuous integration tests. Testing is done using the [virtual](../../virtual) deployment mechanism. Several Vagrant VMs are created, the cluster is deployed, tests are executed, and then the VMs are destroyed.

The goal of the DeepOps CI is to prevent bugs from being introduced into the code base and to identify when changes in 3rd party platforms have occurred or impacted the DeepOps deployment mechanisms. In general, K8s and Slurm deployment issues are detected and resolved with urgency. Many components of DeepOps are 3rd party open source tools that may silently fail or suddenly change without notice. The team will make a best-effort to resolve these issues and include regression tests, however there may be times where a fix is unavailable. Historically, this has been an issue with Rook-Ceph and Kubeflow, and those GitHub communities are best equipped to help with resolutions.

### Testing Methodi
### Testing Method

DeepOps CI contains two types of automated tests:

@@ -63,6 +64,77 @@ A short description of the nightly testing is outlined below. The full suit of t
| MIG configuration | | | | No testing support


## DeepOps Ansible role testing

A subset of the Ansible roles in DeepOps have tests defined using [Ansible Molecule](https://molecule.readthedocs.io/en/latest/).
This testing mechanism allows the roles to be tested individually, providing additional test signal to identify issues which do not appear in the end-to-end tests.
These tests are run automatically on each push and pull request using [GitHub Actions](https://github.com/NVIDIA/deepops/actions).

Molecule testing runs the Ansible role in question inside a Docker container.
As such, not all roles will be easy to test with this mechanism.
Roles which mostly involve installing software, configuring services, or executing scripts should generally be possible to test.
Roles which rely on the presence of specific hardware (such as GPUs), which reboot the nodes they act on, or which make changes to kernel configuration are going to be harder to test with Molecule.
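
While iterating on a role, running the full `molecule test` sequence for every change can be slow, because it creates and destroys the containers each time. The following is a minimal sketch of a shorter loop built from Molecule's `converge` and `verify` subcommands. The function name `molecule_dev_loop` and the `DRY_RUN` variable are hypothetical conveniences, not part of DeepOps or Molecule.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of DeepOps or Molecule): run individual
# Molecule stages so that containers are reused between iterations.
# Set DRY_RUN=1 to print the commands instead of executing them.
molecule_dev_loop() {
    local role_dir="$1"
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    (
        cd "${role_dir}" || exit 1
        # converge creates the containers if needed and applies the role;
        # verify then runs the scenario's verify playbook.
        ${run} molecule converge && ${run} molecule verify
    )
}

# Example usage (assumes the role path exists in your checkout):
#   molecule_dev_loop deepops/roles/<your-role>
#   (cd deepops/roles/<your-role> && molecule destroy)   # clean up when done
```

Running the complete `molecule test` sequence once more before opening a pull request is still worthwhile, since that is what the CI workflow executes.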

### Defining Molecule tests for a new role

To add Molecule tests to a new role, the following procedure can be used.

1. Ensure you have Docker installed in your development environment

2. Install Ansible Molecule in your development environment

```
$ python3 -m pip install "molecule[docker,lint]"
```

3. Initialize Molecule in your new role

```
$ cd deepops/roles/<your-role>
$ molecule init scenario -r <your-role> --driver docker
```

4. In the file `molecule/default/molecule.yml`, define the list of platforms to be tested.
DeepOps currently supports operating systems based on Ubuntu 18.04, Ubuntu 20.04, EL7, and EL8.
To test these stacks, the following `platforms` stanza can be used.

```
platforms:
  - name: ubuntu-1804
    image: geerlingguy/docker-ubuntu1804-ansible
    pre_build_image: true
  - name: ubuntu-2004
    image: geerlingguy/docker-ubuntu2004-ansible
    pre_build_image: true
  - name: centos-7
    image: geerlingguy/docker-centos7-ansible
    pre_build_image: true
  - name: centos-8
    image: geerlingguy/docker-centos8-ansible
    pre_build_image: true
```

5. If you haven't already, define your role's metadata in the file `meta/main.yml`.
A sample `meta/main.yml` is shown here:

```
galaxy_info:
  role_name: <your-role>
  namespace: deepops
  author: DeepOps Team
  company: NVIDIA
  description: <your-description>
  license: 3-Clause BSD
  min_ansible_version: 2.9
```

6. Once this is done, verify that your role executes successfully in the Molecule environment by running `molecule test`. If you run into any issues, consult the [Molecule documentation](https://molecule.readthedocs.io/en/latest/index.html) for help resolving them.

7. (optional) In addition to testing successful execution, you can define further tests in the file `molecule/default/verify.yml`, which run after your role completes. This file is an Ansible playbook that runs in the same environment in which your role ran. For a simple example of such a verify playbook, see the [Enroot role](https://github.com/NVIDIA/ansible-role-enroot/blob/master/molecule/default/verify.yml).

8. Once you're confident that your new tests are all passing, add your role to the `deepops-role` section in the `.github/workflows/molecule.yml` file.
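
Before running `molecule test`, it can help to confirm that the role contains the files the steps above describe. The following is a minimal sketch, not part of DeepOps: the function name `check_molecule_ready` is hypothetical, and the file list assumes the default scenario layout (`converge.yml` is the playbook name used by the `singularity_wrapper` role).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DeepOps): report which of the files
# described in the steps above are missing from a role directory.
# Returns 0 when all files are present, 1 otherwise.
check_molecule_ready() {
    local role_dir="$1"
    local missing=0
    local f
    for f in \
        "molecule/default/molecule.yml" \
        "molecule/default/converge.yml" \
        "meta/main.yml"
    do
        if [ ! -f "${role_dir}/${f}" ]; then
            echo "missing: ${f}"
            missing=1
        fi
    done
    return "${missing}"
}

# Example usage (assumes you run it from the repository root):
#   check_molecule_ready roles/<your-role> && (cd roles/<your-role> && molecule test)
```

The check only looks for file presence; it does not validate the contents of `molecule.yml` or `meta/main.yml`.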


## DeepOps Deployment Validation

The Slurm and Kubernetes deployment guides both document cluster verification steps. These should be run during the installation process to validate a GPU workload can be executed on the cluster.
7 changes: 1 addition & 6 deletions playbooks/container/singularity.yml
@@ -1,10 +1,5 @@
---
- hosts: all
  become: yes
  pre_tasks:
    - name: create a folder for go
      file:
        path: "{{ golang_install_dir }}"
        recurse: yes
  roles:
    - lecorguille.singularity
    - singularity_wrapper
6 changes: 3 additions & 3 deletions roles/requirements.yml
@@ -64,8 +64,8 @@ roles:
- src: https://github.com/OSC/ood-ansible.git
  version: 'v2.0.3'

- src: abims_sbr.singularity
  version: 3.7.1-1

- src: gantsign.golang
  version: 2.4.0

- src: lecorguille.singularity
  version: 1.2.0
33 changes: 33 additions & 0 deletions roles/singularity_wrapper/.yamllint
@@ -0,0 +1,33 @@
---
# Based on ansible-lint config
extends: default

rules:
  braces:
    max-spaces-inside: 1
    level: error
  brackets:
    max-spaces-inside: 1
    level: error
  colons:
    max-spaces-after: -1
    level: error
  commas:
    max-spaces-after: -1
    level: error
  comments: disable
  comments-indentation: disable
  document-start: disable
  empty-lines:
    max: 3
    level: error
  hyphens:
    level: error
  indentation: disable
  key-duplicates: enable
  line-length: disable
  new-line-at-end-of-file: disable
  new-lines:
    type: unix
  trailing-spaces: disable
  truthy: disable
10 changes: 10 additions & 0 deletions roles/singularity_wrapper/defaults/main.yml
@@ -0,0 +1,10 @@
---
# vars for lecorguille.singularity
singularity_version: "3.7.3"
singularity_conf_path: "/etc/singularity/singularity.conf"
bind_paths: []

# vars for gantsign.golang
golang_version: "1.14.4"
golang_install_dir: "/opt/go/{{ golang_version }}"
golang_gopath: "/opt/go/packages"
9 changes: 9 additions & 0 deletions roles/singularity_wrapper/meta/main.yml
@@ -0,0 +1,9 @@
---
galaxy_info:
  role_name: singularity_wrapper
  namespace: deepops
  author: DeepOps Team
  company: NVIDIA
  description: Wrap lecorguille.singularity role
  license: 3-Clause BSD
  min_ansible_version: 2.9
7 changes: 7 additions & 0 deletions roles/singularity_wrapper/molecule/default/converge.yml
@@ -0,0 +1,7 @@
---
- name: Converge
  hosts: all
  tasks:
    - name: "Include singularity_wrapper"
      include_role:
        name: "singularity_wrapper"
26 changes: 26 additions & 0 deletions roles/singularity_wrapper/molecule/default/molecule.yml
@@ -0,0 +1,26 @@
---
dependency:
  name: galaxy
  options:
    requirements-file: requirements.yml
driver:
  name: docker
platforms:
  - name: ubuntu-1804
    image: geerlingguy/docker-ubuntu1804-ansible
    pre_build_image: true
  - name: ubuntu-2004
    image: geerlingguy/docker-ubuntu2004-ansible
    pre_build_image: true
  - name: centos-7
    image: geerlingguy/docker-centos7-ansible
    pre_build_image: true
  - name: centos-8
    image: geerlingguy/docker-centos8-ansible
    pre_build_image: true
provisioner:
  name: ansible
  ansible_args:
    - -vv
verifier:
  name: ansible
13 changes: 13 additions & 0 deletions roles/singularity_wrapper/molecule/default/verify.yml
@@ -0,0 +1,13 @@
---
- name: verify
  hosts: all
  tasks:
    - name: check for path to singularity
      command: which singularity
      register: which_singularity
      changed_when: which_singularity.rc != 0

    - name: verify path to singularity
      assert:
        that:
          - "'/usr/local/bin/singularity' in which_singularity.stdout"
35 changes: 35 additions & 0 deletions roles/singularity_wrapper/tasks/main.yml
@@ -0,0 +1,35 @@
---
- name: centos 8 - ensure powertools installed
  block:
    - name: ensure prereq packages installed
      yum:
        name: "dnf-plugins-core"
        state: "present"
    - name: enable powertools
      command: "yum config-manager --set-enabled powertools"
      register: enable_powertools
      changed_when: enable_powertools.rc != 0
  when: (ansible_distribution == "CentOS") and (ansible_distribution_major_version == "8")

# RHEL normally reports ansible_distribution as "RedHat"; the longer name is
# kept as well in case an Ansible release reports it.
- name: rhel 8 - ensure CRB repository is enabled
  rhsm_repository:
    name: "codeready-builder-for-rhel-8-x86_64-rpms"
  when: (ansible_distribution in ["RedHat", "Red Hat Enterprise Linux"]) and (ansible_distribution_major_version == "8")

- name: debian - ensure apt cache is up to date
  apt:
    update_cache: yes
  when: ansible_os_family == "Debian"

- name: create a folder for go
  file:
    path: "{{ golang_install_dir }}"
    recurse: yes

- name: install golang explicitly
  include_role:
    name: gantsign.golang

- name: install singularity
  include_role:
    name: abims_sbr.singularity
29 changes: 20 additions & 9 deletions scripts/setup.sh
@@ -7,10 +7,14 @@
# Can be run standalone with: curl -sL git.io/deepops | bash
# or: curl -sL git.io/deepops | bash -s -- 19.07

# Determine current directory and root directory
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
ROOT_DIR="${SCRIPT_DIR}/.."

# Configuration
ANSIBLE_VERSION="${ANSIBLE_VERSION:-2.9.27}" # Ansible version to install
ANSIBLE_TOO_NEW="${ANSIBLE_TOO_NEW:-2.10.0}" # Ansible version too new
CONFIG_DIR="${CONFIG_DIR:-./config}" # Default configuration directory location
CONFIG_DIR="${CONFIG_DIR:-${ROOT_DIR}/config}" # Default configuration directory location
DEEPOPS_TAG="${1:-master}" # DeepOps branch to set up
JINJA2_VERSION="${JINJA2_VERSION:-2.11.1}" # Jinja2 required version
PIP="${PIP:-pip3}" # Pip binary to use
@@ -21,10 +25,6 @@ VENV_DIR="${VENV_DIR:-/opt/deepops/env}" # Path to python virtual environ

# Set distro-specific variables
. /etc/os-release

SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
ROOT_DIR="${SCRIPT_DIR}/.."

DEPS_DEB=(git virtualenv python3-virtualenv sshpass wget)
DEPS_EL7=(git libselinux-python3 python-virtualenv python3-virtualenv sshpass wget)
DEPS_EL8=(git python3-libselinux python3-virtualenv sshpass wget)
@@ -146,10 +146,21 @@ fi
# Install Ansible Galaxy roles
if command -v ansible-galaxy &> /dev/null ; then
echo "Updating Ansible Galaxy roles..."
as_user ansible-galaxy collection install --force -r "${ROOT_DIR}/roles/requirements.yml" >/dev/null
as_user ansible-galaxy role install --force -r "${ROOT_DIR}/roles/requirements.yml" >/dev/null
as_user ansible-galaxy collection install --force -i -r "${ROOT_DIR}/config/requirements.yml" >/dev/null
as_user ansible-galaxy role install --force -i -r "${ROOT_DIR}/config/requirements.yml" >/dev/null
initial_dir="$(pwd)"
roles_path="${ROOT_DIR}/roles/galaxy"
collections_path="${ROOT_DIR}/collections"

cd "${ROOT_DIR}"
as_user ansible-galaxy collection install -p "${collections_path}" --force -r "roles/requirements.yml" >/dev/null
as_user ansible-galaxy role install -p "${roles_path}" --force -r "roles/requirements.yml" >/dev/null

# Install any user-defined config requirements
if [ -d "${CONFIG_DIR}" ] && [ -f "${CONFIG_DIR}/requirements.yml" ] ; then
cd "${CONFIG_DIR}"
as_user ansible-galaxy collection install -p "${collections_path}" --force -i -r "requirements.yml" >/dev/null
as_user ansible-galaxy role install -p "${roles_path}" --force -i -r "requirements.yml" >/dev/null
fi
cd "${initial_dir}"
else
echo "ERROR: Unable to install Ansible Galaxy roles, 'ansible-galaxy' command not found"
fi