Ansible project with multiple roles for installing and configuring IBM Spectrum Scale (GPFS).
Table of Contents
- Features
- Supported Versions
- Prerequisites
- Installation Instructions
- Optional Role Variables
- Available Roles
- Cluster Membership
- Limitations
- Troubleshooting
- Reporting Issues and Feedback
- Contributing Code
- Disclaimer
- Copyright and License
- Pre-built infrastructure (using a static inventory file)
- Dynamic inventory file
- Support for RHEL 7 on x86_64, PPC64 and PPC64LE
- Support for RHEL 8 on x86_64 and PPC64LE
- Disable SELinux (scale_prepare_disable_selinux: true), by default false
- Disable firewall (scale_prepare_disable_firewall: true), by default true
- Disable firewall ports
- Install and start NTP
- Create /etc/hosts mappings
- Open firewall ports
- Generate SSH key
- User must set up base OS repositories
- Install yum-utils package
- Install gcc-c++, kernel-devel, make
- Install elfutils, elfutils-devel (RHEL 8 specific)
- Install core Spectrum Scale packages on Linux nodes
- Install Spectrum Scale license packages on Linux nodes
- Compile or install pre-compiled Linux kernel extension (mmbuildgpl)
- Configure client and server license
- Assign default quorum nodes (maximum 7 quorum nodes) if the user has not defined any in the inventory
- Assign default manager nodes (all nodes will act as manager nodes) if the user has not defined any in the inventory
- Create new cluster (mmcrcluster -N /var/tmp/NodeFile -C {{ scale_cluster_clustername }})
- Create cluster with profiles
- Add new node into existing cluster
- Configure node classes
- Define configuration parameters based on node classes
- Configure NSDs and file system
- Configure NSDs without file system
- Extend NSDs and file system
- Add disks to existing file systems
- Install Spectrum Scale GUI packages on GUI designated nodes
- A maximum of 3 GUI nodes can be configured
- Install performance monitoring sensor packages on all Linux nodes
- Install performance monitoring packages on all GUI designated nodes
- Configure performance monitoring and collectors
- Configure HA federated mode collectors
- Install Spectrum Scale callhome packages on all cluster nodes
- Configure callhome
The following Ansible versions are supported:
- 2.7 and above
The following IBM Spectrum Scale versions are supported:
- 5.0.4.0
- 5.0.4.1
- 5.0.4.2
Users need a basic understanding of Ansible concepts to follow these instructions. Refer to the Ansible User Guide if this is new to you.
- Install Ansible on any machine (control node)
$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python get-pip.py --user
$ pip install --user ansible
Refer to the Ansible Installation Guide for detailed installation instructions.
- Download Spectrum Scale packages
  - A Developer Edition Free Trial is available at this site: https://www.ibm.com/account/reg/us-en/signup?formid=urx-41728
  - Customers who have previously purchased Spectrum Scale can obtain entitled versions from IBM Fix Central. Visit https://www.ibm.com/support/fixcentral and search for 'IBM Spectrum Scale (Software defined storage)'.
- Create password-less SSH keys between all Spectrum Scale nodes in the cluster
A prerequisite for installing Spectrum Scale is that password-less SSH is configured among all nodes in the cluster. It must be configured and verified with the FQDN, hostname, and IP address of every node to every node.
Example:
$ ssh-keygen
$ ssh-copy-id -oStrictHostKeyChecking=no node1.gpfs.net
$ ssh-copy-id -oStrictHostKeyChecking=no node1
$ ssh-copy-id -oStrictHostKeyChecking=no
Repeat this process for all nodes to themselves and to all other nodes.
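Because every node needs every node's FQDN, short hostname, and IP address, the number of ssh-copy-id calls grows quickly. The following is a minimal Python sketch that only lists the commands to run on each node; it does not execute anything. The node names and addresses are made-up placeholders, and it assumes a key pair already exists on each node.

```python
from itertools import product

# Hypothetical nodes as (FQDN, short hostname, IP address); replace with your own.
NODES = [
    ("node1.gpfs.net", "node1", "192.0.2.11"),
    ("node2.gpfs.net", "node2", "192.0.2.12"),
]


def ssh_copy_id_plan(nodes):
    """Return (source_host, command) pairs covering every node-to-node
    combination, including each node to itself, for all three name forms."""
    plan = []
    for source, target in product(nodes, repeat=2):
        for name in target:  # FQDN, short hostname, IP address
            command = f"ssh-copy-id -oStrictHostKeyChecking=no {name}"
            plan.append((source[1], command))
    return plan


if __name__ == "__main__":
    for host, command in ssh_copy_id_plan(NODES):
        print(f"[run on {host}] $ {command}")
```

With two nodes this yields 12 commands (2 sources x 2 targets x 3 name forms), which is a useful checklist when verifying the setup by hand.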
- Clone the ibm-spectrum-scale-install-infra repository to your Ansible control node
$ git clone https://github.com/IBM/ibm-spectrum-scale-install-infra.git
- Change working directory to ibm-spectrum-scale-install-infra/
$ cd ibm-spectrum-scale-install-infra/
- Create Ansible inventory
  - Define Spectrum Scale nodes in the Ansible inventory (e.g. ./hosts) in the following format:
# hosts:
[cluster01]
scale01  scale_cluster_quorum=true   scale_cluster_manager=true   scale_cluster_gui=false
scale02  scale_cluster_quorum=true   scale_cluster_manager=true   scale_cluster_gui=false
scale03  scale_cluster_quorum=true   scale_cluster_manager=false  scale_cluster_gui=false
scale04  scale_cluster_quorum=false  scale_cluster_manager=false  scale_cluster_gui=false
scale05  scale_cluster_quorum=false  scale_cluster_manager=false  scale_cluster_gui=false
The following Ansible variables are defined in the above inventory:
- [cluster01]: User defined host group for the Spectrum Scale cluster nodes on which the Spectrum Scale installation will take place.
- scale_cluster_quorum: User defined node designation for Spectrum Scale quorum. It can be either true or false.
- scale_cluster_manager: User defined node designation for Spectrum Scale manager. It can be either true or false.
- scale_cluster_gui: User defined node designation for Spectrum Scale GUI. It can be either true or false.
Note: Defining node roles such as scale_cluster_quorum and scale_cluster_manager is optional. If you do not specify any quorum nodes then the first seven hosts in your inventory are automatically assigned the quorum role.
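The default quorum rule can be illustrated with a small Python sketch. This is not the roles' actual implementation; it only models the behavior stated in the note, and it assumes that the default applies whenever no host sets scale_cluster_quorum to true. The function name and the inventory representation are hypothetical.

```python
MAX_DEFAULT_QUORUM = 7  # "the first seven hosts" from the note


def effective_quorum(inventory):
    """inventory: list of (hostname, host_vars dict) in inventory order.

    Returns the hosts that end up with the quorum role.
    Assumption: the default only kicks in when no host is marked as quorum.
    """
    explicit = [host for host, host_vars in inventory
                if host_vars.get("scale_cluster_quorum") is True]
    if explicit:
        return explicit
    return [host for host, _ in inventory[:MAX_DEFAULT_QUORUM]]


if __name__ == "__main__":
    undefined = [(f"scale{i:02d}", {}) for i in range(1, 10)]
    print(effective_quorum(undefined))  # the first seven of nine hosts
```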
  - To create NSDs, file systems and node classes in the cluster you'll need to provide additional information. It is recommended to use Ansible group variables (e.g. group_vars/*) as follows:
# group_vars/all:
---
scale_storage:
  - filesystem: gpfs01
    blockSize: 4M
    defaultMetadataReplicas: 2
    defaultDataReplicas: 2
    numNodes: 16
    automaticMountOption: true
    defaultMountPoint: /mnt/gpfs01
    disks:
      - device: /dev/sdb
        nsd: nsd_1
        servers: scale01
        failureGroup: 10
        usage: metadataOnly
        pool: system
      - device: /dev/sdc
        nsd: nsd_2
        servers: scale01
        failureGroup: 10
        usage: dataOnly
        pool: data
Refer to the man mmchfs and man mmchnsd man pages for a description of these storage parameters.
The filesystem parameter is mandatory, and the servers and device parameters are mandatory for each of the file system's disks. All other file system and disk parameters are optional. Hence, a minimal file system configuration would look like this:
# group_vars/all:
---
scale_storage:
  - filesystem: gpfs01
    disks:
      - device: /dev/sdb
        servers: scale01
      - device: /dev/sdc
        servers: scale01,scale02
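To catch a missing mandatory parameter before running the playbook, a check along the following lines may help. It is a rough Python sketch, not part of the roles: the function name is hypothetical, and it works on scale_storage after it has been loaded into plain Python lists and dictionaries (for example with a YAML parser of your choice).

```python
def validate_scale_storage(scale_storage):
    """Return a list of problems; an empty list means the mandatory
    parameters (filesystem, and device/servers per disk) are present."""
    problems = []
    for index, fs in enumerate(scale_storage):
        name = fs.get("filesystem")
        if not name:
            problems.append(f"entry {index}: 'filesystem' is mandatory")
        label = name or f"entry {index}"
        for disk_index, disk in enumerate(fs.get("disks", [])):
            for key in ("device", "servers"):
                if not disk.get(key):
                    problems.append(f"{label}, disk {disk_index}: '{key}' is mandatory")
    return problems


if __name__ == "__main__":
    minimal = [{
        "filesystem": "gpfs01",
        "disks": [
            {"device": "/dev/sdb", "servers": "scale01"},
            {"device": "/dev/sdc", "servers": "scale01,scale02"},
        ],
    }]
    print(validate_scale_storage(minimal) or "ok")
```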
Important: scale_storage must be defined using group variables. Do not define disk parameters using host variables or inline variables in your playbook. Doing so would apply them to all hosts in the group/play, thus defining the same disk multiple times.
Furthermore, Spectrum Scale node classes can be defined on a per-node basis by defining the scale_nodeclass variable:
# host_vars/scale01:
---
scale_nodeclass:
  - classA
  - classB

# host_vars/scale02:
---
scale_nodeclass:
  - classA
  - classC
These node classes can optionally be used to define Spectrum Scale configuration parameters. It is suggested to use group variables for that purpose:
# group_vars/all:
---
scale_config:
  - nodeclass: classA
    params:
      - pagepool: 16G
      - autoload: no
      - ignorePrefetchLUNCount: yes
Refer to the man mmchconfig man page for a list of available configuration parameters.
Note that configuration parameters can be defined as variables for any host in the play. The host for which you define the configuration parameters is irrelevant.
  - To install and configure callhome in the cluster you'll need to provide additional information. It is recommended to use group variables as follows:
# group_vars/all.yml:
---
callhome_params:
  is_enabled: true
  customer_name: abc
  customer_email: abc@abc.com
  customer_id: 12345
  customer_country: IN
  proxy_ip:
  proxy_port:
  proxy_user:
  proxy_password:
  proxy_location:
  callhome_server: host-vm1
  callhome_group1: [host-vm1,host-vm2,host-vm3,host-vm4]
  callhome_schedule: [daily,weekly]
- Create Ansible playbook
The basic Ansible playbook (e.g. ./playbook.yml) looks as follows:
# playbook.yml:
---
- hosts: cluster01
  vars:
    - scale_version: 5.0.4.0
    - scale_install_localpkg_path: /path/to/Spectrum_Scale_Standard-5.0.4.0-x86_64-Linux-install
  roles:
    - core/precheck
    - core/node
    - core/cluster
    - gui/precheck
    - gui/node
    - gui/cluster
    - zimon/precheck
    - zimon/node
    - zimon/cluster
    - callhome/precheck
    - callhome/node
    - callhome/cluster
    - callhome/postcheck
The following installation methods are available:
- Installation from (existing) YUM repository (scale_install_repository_url)
- Installation from remote installation package (scale_install_remotepkg_path)
- Installation from local installation package (scale_install_localpkg_path)
- Installation from single directory package path (scale_install_directory_pkg_path)
Note: Defining the variable scale_version is optional for the scale_install_localpkg_path and scale_install_directory_pkg_path installation methods. It is mandatory for the scale_install_repository_url and scale_install_remotepkg_path installation methods. Furthermore, you'll need to configure an installation method by defining one of the following variables:
- scale_install_repository_url (e.g. http://infraserv/gpfs_rpms/)
- scale_install_remotepkg_path (accessible on Ansible managed node)
- scale_install_localpkg_path (accessible on Ansible control machine)
- scale_install_directory_pkg_path (e.g. /opt/IBM/spectrum_scale_packages)
Important: If you are using the single directory installation method (scale_install_directory_pkg_path), you need to keep all required Spectrum Scale RPMs in a single user-provided directory.
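The rules about scale_version and the installation method variables can be summarized in a short check. The following Python sketch is only an illustration of those rules as stated in the note; the function name is hypothetical, and it takes the playbook variables as a plain dictionary.

```python
# Installation methods for which the note says scale_version is mandatory.
VERSION_MANDATORY = {"scale_install_repository_url", "scale_install_remotepkg_path"}
# Installation methods for which scale_version is optional.
VERSION_OPTIONAL = {"scale_install_localpkg_path", "scale_install_directory_pkg_path"}


def check_install_vars(play_vars):
    """Return a list of problems found in the installation variables."""
    methods = [name for name in sorted(VERSION_MANDATORY | VERSION_OPTIONAL)
               if play_vars.get(name)]
    if not methods:
        return ["no installation method variable is defined"]
    problems = []
    for name in methods:
        if name in VERSION_MANDATORY and not play_vars.get("scale_version"):
            problems.append(f"scale_version is mandatory with {name}")
    return problems


if __name__ == "__main__":
    print(check_install_vars({
        "scale_install_repository_url": "http://infraserv/gpfs_rpms/",
    }))
```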
- Run the playbook to install and configure the Spectrum Scale cluster
  - Using the ansible-playbook command:
$ ansible-playbook -i hosts playbook.yml
  - Using the automation script:
$ ./ansible.sh
Note: An advantage of using the automation script is that it will generate log files based on the date and the time in the /tmp directory.
- Playbook execution screen
Playbook execution starts here:
$ ./ansible.sh
Running #### ansible-playbook -i hosts playbook.yml

PLAY #### [cluster01]
**********************************************************************************************************

TASK #### [Gathering Facts]
**********************************************************************************************************
ok: [GPFS-vm1]
ok: [GPFS-vm2]
ok: [GPFS-vm3]

TASK [common : check | Check Spectrum Scale version]
*********************************************************************************************************
ok: [GPFS-vm1] => {
    "changed": false,
    "msg": "All assertions passed"
}
ok: [GPFS-vm2] => {
    "changed": false,
    "msg": "All assertions passed"
}
ok: [GPFS-vm3] => {
    "changed": false,
    "msg": "All assertions passed"
}
Playbook recap:
#### PLAY RECAP
***************************************************************************************************************
GPFS-vm1 : ok=0 changed=65 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
GPFS-vm2 : ok=0 changed=59 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
GPFS-vm3 : ok=0 changed=59 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
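When the run is captured in a log file, the recap lines can be checked automatically. The snippet below is a small Python sketch that assumes the recap format shown above (host name, a colon, then name=number counters); the function names are hypothetical, and other output formats or callback plugins may need a different pattern.

```python
import re

# One recap line: "<host> : ok=1 changed=2 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Map each host to its recap counters, ignoring all other lines."""
    results = {}
    for line in text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (item.split("=") for item in match.group("counters").split())
            results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def failed_hosts(recap):
    """Hosts with at least one failed task or that were unreachable."""
    return sorted(host for host, counters in recap.items()
                  if counters.get("failed", 0) or counters.get("unreachable", 0))


if __name__ == "__main__":
    sample = "GPFS-vm1 : ok=0 changed=65 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0"
    print(failed_hosts(parse_recap(sample)))
```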
Users can also define the following variables to override default values and customize the behavior:
- scale_cluster_clustername: User defined Spectrum Scale cluster name.
- scale_prepare_disable_selinux: SELinux can be disabled. It can be either true or false (default).
- scale_prepare_disable_firewall: Firewall can be disabled. It can be either true or false (default).
If you are assembling your own playbook, the following roles are available for you to reuse:
All hosts in the play are configured as nodes in the same cluster. If you want to add hosts to an existing cluster then add at least one node from that existing cluster to the play.
You can create multiple clusters by running multiple plays.
The roles in this project can (currently) be used to create new clusters or extend existing clusters. Similarly, new file systems can be created or extended. But these roles do not remove existing nodes, disks, file systems or node classes. This is done on purpose. This is also the reason why they cannot be used, for example, to change the file system pool of a disk. Changing the pool requires you to remove the disk from the file system and then re-add it, which is not currently in the scope of these roles.
Furthermore, upgrades are not currently in scope of this role. Spectrum Scale supports rolling online upgrades (by taking down one node at a time), but this requires careful planning and monitoring and might require manual intervention in case of unforeseen problems.
The roles in this project store configuration files in /var/tmp on the first host in the play. These configuration files are kept to determine if definitions have changed since the previous run, and to decide if it's necessary to run certain Spectrum Scale commands (again). When experiencing problems, you can simply delete these configuration files from /var/tmp in order to clear the cache. Doing so forces re-application of all definitions upon the next run, which automatically re-generates the cache. As a downside, the next run may take longer than expected as it might re-run unnecessary Spectrum Scale commands.
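The idea behind this cache can be pictured with a few lines of Python. This is a conceptual sketch only and does not reflect how the roles actually compare files; the function name and the file name are hypothetical. It shows why deleting the cached file forces the next run to treat the definition as changed.

```python
import tempfile
from pathlib import Path


def needs_rerun(cache_file, definition):
    """Return True if the definition differs from the cached copy (or no
    cached copy exists), and refresh the cache in that case."""
    data = definition.encode("utf-8")
    if cache_file.exists() and cache_file.read_bytes() == data:
        return False  # unchanged since the previous run: skip the command
    cache_file.write_bytes(data)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "StanzaFile.example"  # hypothetical file name
        print(needs_rerun(cache, "nsd=nsd_1"))  # first run, nothing cached
        print(needs_rerun(cache, "nsd=nsd_1"))  # same definition again
        cache.unlink()                          # clearing the cache
        print(needs_rerun(cache, "nsd=nsd_1"))  # forces re-application
```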
Please use the issue tracker to ask questions, report bugs and request features.
We welcome contributions to this project. See Contributing for more details.
Please note: all playbooks / modules / resources in this repo are released for use "AS IS" without any warranties of any kind, including, but not limited to their installation, use, or performance. We are not responsible for any damage or charges or data loss incurred with their use. You are responsible for reviewing and testing any scripts you run thoroughly before use in any production environment. This content is subject to change without notice.
Copyright IBM Corporation 2020, released under the terms of the Apache License 2.0.