Azure CLI bash script to automatically configure a Pacemaker/Corosync (PCS) cluster for an Oracle database.
This script uses an Azure shared disk.
- Premium SSD disks with maxShares > 1 are limited to certain regions, as specified HERE.
- Ultra Disks with maxShares > 1 are available in all regions where Ultra Disk is available. More info...
This Azure CLI bash script fully automates the creation of an Oracle database in an HA-LVM cluster on two Azure VMs, using an Azure shared disk as the database storage. Linux HA-LVM on Oracle Linux and Red Hat uses the open-source Pacemaker and Corosync to manage the HA-LVM cluster. The cluster is set up so that only one VM is active at a time, with full access to the Oracle database and listener. All Oracle services can be failed over to the second VM by the HA-LVM cluster.
The "cr_orapcs.sh" bash script automates the following steps...
- verify that subscription and resource group exist and are accessible
- set defaults for resource group and location
- create vnet, subnet, network security group with rules
- create proximity placement group (PPG) for the two VMs
- create the NIC, public IP, VM, and shared disk attached for the first database-server VM
- create the NIC, public IP, VM, and shared disk attached for the second database-server VM
- create the NIC, public IP, and VM for the third observer/tester VM
- display the public and private IP addresses for all VMs for later use
- create an Azure load-balancer to "front-end" the virtual IP (VIP) managed in the PCS cluster
- IP address of the PCS virtual IP is the "front-end" of the LB
- internal IP addresses of both database-server VMs as the "back-end" pool
- health probe of LB checks port 62503 held open by PCS resource azure-lb
- LB rules tie together health probe for HA of a "floating" virtual IP
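The load-balancer steps above can be sketched with Azure CLI roughly as follows. This is an illustration only, not an excerpt from the script: the resource names, resource group, and the 10.0.0.10 front-end address are assumptions (10.0.0.10 matches the script's default VIP).

```shell
# Internal Standard load balancer whose front-end IP is the PCS virtual IP
az network lb create \
    --resource-group tim-orapcs-rg \
    --name tim-orapcs-lb \
    --sku Standard \
    --vnet-name tim-orapcs-vnet \
    --subnet tim-orapcs-subnet \
    --frontend-ip-name orapcs-frontend \
    --private-ip-address 10.0.0.10 \
    --backend-pool-name orapcs-backend

# Health probe on TCP port 62503, which the PCS "azure-lb" resource holds open
# on whichever node is currently active
az network lb probe create \
    --resource-group tim-orapcs-rg \
    --lb-name tim-orapcs-lb \
    --name orapcs-probe \
    --protocol tcp \
    --port 62503

# HA-ports rule tying the front-end VIP, back-end pool, and probe together
az network lb rule create \
    --resource-group tim-orapcs-rg \
    --lb-name tim-orapcs-lb \
    --name orapcs-rule \
    --protocol All \
    --frontend-ip-name orapcs-frontend \
    --backend-pool-name orapcs-backend \
    --probe-name orapcs-probe \
    --frontend-port 0 \
    --backend-port 0 \
    --floating-ip true
```

The floating-IP (direct server return) setting is what lets the VIP "float" between the two back-end VMs as PCS moves the azure-lb resource.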
- On both database-server VMs, do the following steps...
- copy Oracle "oraInst.loc" file from inventory location to "/etc" directory
- use "yum" to install the LVM2 package
- use "yum" to install the Linux "nc" (netcat) command for use by the PCS azure-lb resource
- create directory mountpoint "/u02"
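Run via SSH on each database-server VM, the preparation steps above amount to roughly the following (the oraInst.loc source path is an assumption; on OL7/RHEL7 the "nc" command is provided by the nmap-ncat package):

```shell
# Copy the Oracle inventory pointer file into /etc
sudo cp /u01/app/oraInventory/oraInst.loc /etc/oraInst.loc

# Install LVM2 and netcat (used by the PCS azure-lb resource)
sudo yum -y install lvm2
sudo yum -y install nmap-ncat

# Create the mountpoint for the shared-disk filesystem
sudo mkdir -p /u02
```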
- On the first VM only, do the following tasks...
- partition the shared disk
- make a physical volume from the partition
- create a volume group from the physical partition
- create logical volume within the volume group
- create an EXT4 filesystem within the logical volume
- mount the filesystem on "/u02"
- create subdirectories within "/u02" for Oracle database/configuration files
- use the Oracle Database Configuration Assistant (DBCA) to create a database and a TNS listener
- create a service account for PCS within the database
- shutdown the Oracle database and stop the TNS Listener
- copy the Oracle PWDFILE and SPFILE to the shared disk and create symlinks in their place
- edit the Oracle TNS sqlnet.ora, listener.ora, and tnsnames.ora configuration files to replace the IP hostname of the first VM with the virtual IP address
- copy the Oracle TNS configuration files to the shared disk, and create symlinks in their place
- unmount the filesystem on "/u02"
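The shared-disk storage steps above might look like the following; the device name /dev/sdc, the volume-group and logical-volume names, and the hostname/VIP values in the sed example are all assumptions, not taken from the script.

```shell
# Partition the shared disk and build the LVM stack on it
sudo parted /dev/sdc --script mklabel gpt mkpart primary ext4 0% 100%
sudo pvcreate /dev/sdc1
sudo vgcreate vg_shared /dev/sdc1
sudo lvcreate -l 100%FREE -n lv_shared vg_shared

# Create and mount the EXT4 filesystem, then the Oracle subdirectories
sudo mkfs.ext4 /dev/vg_shared/lv_shared
sudo mount /dev/vg_shared/lv_shared /u02
sudo mkdir -p /u02/oradata /u02/oraconf

# Replace the first VM's hostname with the VIP in a TNS config file
sudo sed -i 's/tim-orapcs-vm01/10.0.0.10/g' \
    /u01/app/oracle/product/19.0.0/dbhome_1/network/admin/tnsnames.ora
```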
- On the second VM only, do the following tasks...
- create an entry in the "/etc/oratab" configuration file
- create adump and dpdump subdirectories
- On both VMs, do the following steps...
- use "yum" to install the PCS package and start/enable the PCSD daemon
- set a password for the PCS account used for remote access
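On both VMs, the PCS installation steps above are roughly as follows (the password shown is a placeholder; "hacluster" is the standard PCS remote-access account):

```shell
# Install PCS and Pacemaker, then start and enable the pcsd daemon
sudo yum -y install pcs pacemaker
sudo systemctl enable --now pcsd

# Set the password for the "hacluster" account used for remote access
echo 'hacluster:MyClusterPasswd' | sudo chpasswd
```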
- On the first VM only, do the following steps...
- create and start the PCS cluster, then enable it for automatic restart on node reboot
- set cluster properties to disable QUORUM for 2-node HA operation
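The cluster-creation steps above might be rendered as follows on the first VM; the node and cluster names are assumptions, and the syntax shown is for pcs 0.9 (on pcs 0.10+, "pcs cluster auth" becomes "pcs host auth" and "--name" is dropped):

```shell
# Authenticate both nodes, then create, start, and enable the cluster
sudo pcs cluster auth tim-orapcs-vm01 tim-orapcs-vm02 -u hacluster
sudo pcs cluster setup --name orapcs tim-orapcs-vm01 tim-orapcs-vm02
sudo pcs cluster start --all
sudo pcs cluster enable --all

# Relax quorum so that a single surviving node can keep running (2-node HA)
sudo pcs property set no-quorum-policy=ignore
```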
- On both VMs, disable LVMETAD daemon and reboot the VM
- On the first VM only, create the following PCS resources within a resource group...
- STONITH device for fencing on SCSI shared disk using persistent reservations
- virtual IP
- azure-lb for the virtual IP
- volume group
- filesystem
- listener
- database
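Illustrative pcs commands for the resource stack above are sketched below. The resource and group names, the shared-disk device, and the Oracle SID are assumptions; resources added to the same group start in creation order and stop in reverse, which matches the ordering in the list above.

```shell
# SCSI fencing via persistent reservations on the shared disk
sudo pcs stonith create scsi-fence fence_scsi \
    pcmk_host_list="tim-orapcs-vm01 tim-orapcs-vm02" \
    devices=/dev/sdc meta provides=unfencing

# Virtual IP and the azure-lb probe listener (port must match the LB probe)
sudo pcs resource create vip ocf:heartbeat:IPaddr2 ip=10.0.0.10 --group oragrp
sudo pcs resource create lbprobe ocf:heartbeat:azure-lb port=62503 --group oragrp

# Exclusive activation of the shared volume group, then the filesystem
sudo pcs resource create vg ocf:heartbeat:LVM volgrpname=vg_shared \
    exclusive=true --group oragrp
sudo pcs resource create fs ocf:heartbeat:Filesystem \
    device=/dev/vg_shared/lv_shared directory=/u02 fstype=ext4 --group oragrp

# TNS listener and database, managed by the stock resource agents
sudo pcs resource create lsnr ocf:heartbeat:oralsnr sid=oradb01 --group oragrp
sudo pcs resource create db ocf:heartbeat:oracle sid=oradb01 --group oragrp
```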
The "orapcs_output.txt" file in this GitHub package contains an example of the output generated by the bash script. The script also saves stdout and stderr output from all commands to a ".log" file in the present working directory for diagnostic purposes. If anything fails, look in the ".log" file for more information.
First of all, the "cr_orapcs.sh" script expects to do all of its work within a single Azure subscription, within a single Azure resource group, and within a single Azure virtual network (vnet).
The name of the Azure subscription has no default value, so the "-S" command-line switch followed by the name of the Azure subscription (possibly enclosed within double-quotes if the subscription name includes spaces) is always required...
$ ./cr_orapcs.sh -S MySubscriptionName
The name of the Azure resource group defaults to "{owner}-{project}-rg", where "{owner}" is the name of the OS account in which the script is being executed (i.e. the output of the "whoami" command) and "{project}" defaults to the string "orapcs". The "{owner}-{project}" combination is used throughout the script for naming objects such as resource groups, VMs, storage, PPGs, etc. So this minimal call syntax, in which only the name of the Azure subscription is specified, results in the script expecting a resource group named "{owner}-orapcs-rg" to exist already, where "{owner}" is the OS account name of the Azure CLI shell running the script. For example, when the author uses "https://shell.azure.com", the resulting OS account name is "tim", so with this minimal call syntax the "cr_orapcs.sh" script expects an Azure resource group named "tim-orapcs-rg" to exist already, and it will create about 14 Azure objects with a prefix of "tim-orapcs-". If you don't want the resource group to have this name, then both of these basic values can be changed from the defaults using the "-O" and "-P" command-line switches, respectively...
$ ./cr_orapcs.sh -S MySubscriptionName -O test -P foobar
As a result, the name of the resource group will be expected to be "test-foobar-rg", and all of the Azure objects created within the resource group will also be named with the prefix string of "test-foobar-". If the name of the resource group is something else (i.e. "MyResourceGroupName") but you'd like all of the objects created by the script to start with the prefix string "test-foobar-", then you can use the following call syntax...
$ ./cr_orapcs.sh -S MySubscriptionName -R MyResourceGroupName -O test -P foobar
As a result, the precreated resource group named "MyResourceGroupName" within the existing "MySubscriptionName" subscription will be populated with objects with names like "test-foobar-vm01", "test-foobar-vnet", "test-foobar-nsg", etc.
The names of all other objects in Azure follow the convention of "{owner}-{project}-{something}". Using the example where "{owner}" is "tim" and "{project}" is "orapcs", the resource group named "tim-orapcs-rg" would contain the following objects with the following names...
storage account: timorapcssa
virtual network: tim-orapcs-vnet
virtual network subnet: tim-orapcs-subnet
network security group: tim-orapcs-nsg
availability set: tim-orapcs-avset
proximity placement group: tim-orapcs-ppg
load balancer: tim-orapcs-lb
first VM: tim-orapcs-vm01
second VM: tim-orapcs-vm02
third VM: tim-orapcs-vm03
(...and so on...)
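The naming convention above can be sketched in a few lines of shell. The variable names are guesses at the script's internals (only the resulting object names are taken from this README), and the real script derives the owner from "whoami":

```shell
# Naming-convention sketch: "{owner}-{project}-{something}"
_azureOwner="tim"        # normally: _azureOwner="$(whoami)"
_azureProject="orapcs"   # overridden by the "-P" switch

_rgName="${_azureOwner}-${_azureProject}-rg"
_saName="${_azureOwner}${_azureProject}sa"   # storage accounts disallow hyphens
_vnetName="${_azureOwner}-${_azureProject}-vnet"
_vm01Name="${_azureOwner}-${_azureProject}-vm01"

echo "${_rgName} ${_saName} ${_vnetName} ${_vm01Name}"
```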
Please see the next section for a complete list of all of the command-line switches, what they control, and default values...
cr_orapcs.sh -N -M -O val -P val -S val -V val -g val -h val -i val -r val -u val -v -w val
-N skip steps to create vnet/subnet, public-IP, NSG, rules, and PPG
-M skip steps to create VMs and storage
-O owner-tag name of the owner to use in Azure tags (no default)
-P project-tag name of the project to use in Azure tags (no default)
-S subscription name of the Azure subscription (no default)
-V vip-IPaddr IP address for the virtual IP (VIP) (default: 10.0.0.10)
-g resource-group name of the Azure resource group (default: ${_azureOwner}-${_azureProject}-rg)
-h ORACLE_HOME full path of the ORACLE_HOME software (default: /u01/app/oracle/product/19.0.0/dbhome_1)
-i instance-type name of the Azure VM instance type for database nodes (default: Standard_D2s_v4)
-r region name of Azure region (default: westus2)
-u urn Azure URN for the VM from the marketplace
(default: Oracle:oracle-database-19-3:oracle-database-19-0904:19.3.1)
-v enable verbose output (default: false)
-w password Oracle SYS/SYSTEM account password (default: oracleA1)
The "-N" and "-M" switches were mainly used for initial debugging, and may well be removed in more mature versions of the script. They are intended to skip over steps that have already completed successfully when the script is rerun after a later failure.
Additional note: if you change the URN of the marketplace VM image with the "-u" switch, then you will probably need to change the path of the "$ORACLE_HOME" directory with the "-h" switch as well.
Use SSH to access each of the two VMs. From the Azure administrative account on the VM, you can use the "sudo" utility to execute all of the PCS commands as root, or just enter "sudo su -" to open a shell as root.
The command "pcs status" (or "sudo pcs status") displays the overall status of the PCS cluster, the nodes, and each of the cluster resources.
Using the fully-qualified IP hostname of the VM in place of the label "{vm}"...
$ sudo pcs node standby {vm}
...will put the specified "{vm}" into PCS "standby" mode, which means that the VM cannot host services. This will force all services to failover to the remaining node. Issue the command above, and then monitor the progress of failover using the "sudo pcs status" command.
To take the "{vm}" out of PCS standby mode and allow it to host services again, issue the following command...
$ sudo pcs node unstandby {vm}
...and then follow up with the "sudo pcs status" command to view the status, which should show no change: services do not automatically fail back to the reactivated node.
To fail the Oracle services back to the original VM, put the other VM (on which the services currently reside) into standby mode, and be sure to "unstandby" that node after the Oracle services have successfully moved off of it.
Of course, the purpose of the HA-LVM cluster is high availability in the event of failure, so killing some of the services directly is another way to test, bearing in mind that the PCS cluster polls periodically for failure. In other words, failover will not occur instantly after a failure, but 30 or 60 seconds later, when the polling discovers the failure and then retries to verify that the failure has indeed occurred.
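As one crude example of such a test (the process-name pattern is an assumption based on Oracle's standard background-process naming), kill the database's PMON process on the active node and then watch PCS detect the failure on its next monitoring interval:

```shell
# Kill the Oracle PMON background process to simulate a database crash
sudo pkill -f ora_pmon

# Refresh cluster status every 5 seconds to watch detection and recovery
watch -n 5 "sudo pcs status"
```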
Please note the official PCS documentation HERE as well as the Clusterlabs wiki HERE.
To locate Oracle images in the Azure marketplace, you can use the Azure CLI command as follows...
$ az vm image list --offer Oracle --all --publisher Oracle --output table
Offer Publisher Sku Urn Version
-------------------- ----------- ----------------------- ---------------------------------------------------------- -------------
oracle-database-19-3 Oracle oracle-database-19-0904 Oracle:oracle-database-19-3:oracle-database-19-0904:19.3.1 19.3.1
Oracle-Database-Ee Oracle 12.1.0.2 Oracle:Oracle-Database-Ee:12.1.0.2:12.1.20170220 12.1.20170220
Oracle-Database-Ee Oracle 12.2.0.1 Oracle:Oracle-Database-Ee:12.2.0.1:12.2.20180725 12.2.20180725
Oracle-Database-Ee Oracle 18.3.0.0 Oracle:Oracle-Database-Ee:18.3.0.0:18.3.20181213 18.3.20181213
Oracle-Database-Se Oracle 12.1.0.2 Oracle:Oracle-Database-Se:12.1.0.2:12.1.20170220 12.1.20170220
Oracle-Database-Se Oracle 12.2.0.1 Oracle:Oracle-Database-Se:12.2.0.1:12.2.20180725 12.2.20180725
Oracle-Database-Se Oracle 18.3.0.0 Oracle:Oracle-Database-Se:18.3.0.0:18.3.20181213 18.3.20181213
Oracle-Linux Oracle 6.10 Oracle:Oracle-Linux:6.10:6.10.00 6.10.00
Oracle-Linux Oracle 6.8 Oracle:Oracle-Linux:6.8:6.8.0 6.8.0
Oracle-Linux Oracle 6.9 Oracle:Oracle-Linux:6.9:6.9.0 6.9.0
Oracle-Linux Oracle 7.3 Oracle:Oracle-Linux:7.3:7.3.0 7.3.0
Oracle-Linux Oracle 7.3 Oracle:Oracle-Linux:7.3:7.3.20190529 7.3.20190529
Oracle-Linux Oracle 7.4 Oracle:Oracle-Linux:7.4:7.4.1 7.4.1
Oracle-Linux Oracle 7.4 Oracle:Oracle-Linux:7.4:7.4.20190529 7.4.20190529
Oracle-Linux Oracle 7.5 Oracle:Oracle-Linux:7.5:7.5.1 7.5.1
Oracle-Linux Oracle 7.5 Oracle:Oracle-Linux:7.5:7.5.2 7.5.2
Oracle-Linux Oracle 7.5 Oracle:Oracle-Linux:7.5:7.5.20181207 7.5.20181207
Oracle-Linux Oracle 7.5 Oracle:Oracle-Linux:7.5:7.5.20190529 7.5.20190529
Oracle-Linux Oracle 7.5 Oracle:Oracle-Linux:7.5:7.5.3 7.5.3
Oracle-Linux Oracle 7.6 Oracle:Oracle-Linux:7.6:7.6.2 7.6.2
Oracle-Linux Oracle 7.6 Oracle:Oracle-Linux:7.6:7.6.3 7.6.3
Oracle-Linux Oracle 7.6 Oracle:Oracle-Linux:7.6:7.6.4 7.6.4
Oracle-Linux Oracle 7.6 Oracle:Oracle-Linux:7.6:7.6.5 7.6.5
Oracle-Linux Oracle 77 Oracle:Oracle-Linux:77:7.7.1 7.7.1
Oracle-Linux Oracle 77 Oracle:Oracle-Linux:77:7.7.2 7.7.2
Oracle-Linux Oracle 77 Oracle:Oracle-Linux:77:7.7.3 7.7.3
Oracle-Linux Oracle 77 Oracle:Oracle-Linux:77:7.7.4 7.7.4
Oracle-Linux Oracle 77 Oracle:Oracle-Linux:77:7.7.5 7.7.5
Oracle-Linux Oracle 77 Oracle:Oracle-Linux:77:7.7.6 7.7.6
Oracle-Linux Oracle 77-ci Oracle:Oracle-Linux:77-ci:7.7.01 7.7.01
Oracle-Linux Oracle 77-ci Oracle:Oracle-Linux:77-ci:7.7.02 7.7.02
Oracle-Linux Oracle 77-ci Oracle:Oracle-Linux:77-ci:7.7.03 7.7.03
Oracle-Linux Oracle 78 Oracle:Oracle-Linux:78:7.8.3 7.8.3
Oracle-Linux Oracle 78 Oracle:Oracle-Linux:78:7.8.5 7.8.5
Oracle-Linux Oracle 79-gen2 Oracle:Oracle-Linux:79-gen2:7.9.11 7.9.11
Oracle-Linux Oracle 79-gen2 Oracle:Oracle-Linux:79-gen2:7.9.12 7.9.12
Oracle-Linux Oracle 79-gen2 Oracle:Oracle-Linux:79-gen2:7.9.13 7.9.13
Oracle-Linux Oracle 8 Oracle:Oracle-Linux:8:8.0.2 8.0.2
Oracle-Linux Oracle 8-ci Oracle:Oracle-Linux:8-ci:8.0.11 8.0.11
Oracle-Linux Oracle 81 Oracle:Oracle-Linux:81:8.1.0 8.1.0
Oracle-Linux Oracle 81 Oracle:Oracle-Linux:81:8.1.2 8.1.2
Oracle-Linux Oracle 81-ci Oracle:Oracle-Linux:81-ci:8.1.0 8.1.0
Oracle-Linux Oracle 81-gen2 Oracle:Oracle-Linux:81-gen2:8.1.11 8.1.11
Oracle-Linux Oracle ol77-ci-gen2 Oracle:Oracle-Linux:ol77-ci-gen2:7.7.1 7.7.1
Oracle-Linux Oracle ol77-gen2 Oracle:Oracle-Linux:ol77-gen2:7.7.01 7.7.01
Oracle-Linux Oracle ol77-gen2 Oracle:Oracle-Linux:ol77-gen2:7.7.02 7.7.02
Oracle-Linux Oracle ol77-gen2 Oracle:Oracle-Linux:ol77-gen2:7.7.03 7.7.03
Oracle-Linux Oracle ol78-gen2 Oracle:Oracle-Linux:ol78-gen2:7.8.03 7.8.03
Oracle-Linux Oracle ol78-gen2 Oracle:Oracle-Linux:ol78-gen2:7.8.05 7.8.05
Oracle-Linux Oracle ol79 Oracle:Oracle-Linux:ol79:7.9.1 7.9.1
Oracle-Linux Oracle ol79 Oracle:Oracle-Linux:ol79:7.9.2 7.9.2
Oracle-Linux Oracle ol79 Oracle:Oracle-Linux:ol79:7.9.3 7.9.3
Oracle-Linux Oracle ol79-gen2 Oracle:Oracle-Linux:ol79-gen2:7.9.11 7.9.11
Oracle-Linux Oracle ol79-lvm Oracle:Oracle-Linux:ol79-lvm:7.9.01 7.9.01
Oracle-Linux Oracle ol79-lvm-gen2 Oracle:Oracle-Linux:ol79-lvm-gen2:7.9.11 7.9.11
Oracle-Linux Oracle ol82 Oracle:Oracle-Linux:ol82:8.2.1 8.2.1
Oracle-Linux Oracle ol82 Oracle:Oracle-Linux:ol82:8.2.3 8.2.3
Oracle-Linux Oracle ol82-gen2 Oracle:Oracle-Linux:ol82-gen2:8.2.01 8.2.01
Oracle-Linux Oracle ol83-lvm Oracle:Oracle-Linux:ol83-lvm:8.3.1 8.3.1
Oracle-Linux Oracle ol83-lvm Oracle:Oracle-Linux:ol83-lvm:8.3.2 8.3.2
Oracle-Linux Oracle ol83-lvm Oracle:Oracle-Linux:ol83-lvm:8.3.3 8.3.3
Oracle-Linux Oracle ol83-lvm-gen2 Oracle:Oracle-Linux:ol83-lvm-gen2:8.3.11 8.3.11
Oracle-Linux Oracle ol83-lvm-gen2 Oracle:Oracle-Linux:ol83-lvm-gen2:8.3.12 8.3.12
Oracle-Linux Oracle ol83-lvm-gen2 Oracle:Oracle-Linux:ol83-lvm-gen2:8.3.13 8.3.13
Oracle-Linux Oracle ol8_2-gen2 Oracle:Oracle-Linux:ol8_2-gen2:8.2.13 8.2.13
oracle_virtual_esbc Oracle oracle_evsbc_8301 Oracle:oracle_virtual_esbc:oracle_evsbc_8301:8.3.1 8.3.1
If you filter out the entries for Oracle Linux standalone images, leaving only the Oracle Database images, you might see something like this...
$ az vm image list --offer Oracle-Database --all --publisher Oracle --output table
Offer Publisher Sku Urn Version
-------------------- ----------- ----------------------- ---------------------------------------------------------- -------------
oracle-database-19-3 Oracle oracle-database-19-0904 Oracle:oracle-database-19-3:oracle-database-19-0904:19.3.1 19.3.1
Oracle-Database-Ee Oracle 12.1.0.2 Oracle:Oracle-Database-Ee:12.1.0.2:12.1.20170220 12.1.20170220
Oracle-Database-Ee Oracle 12.2.0.1 Oracle:Oracle-Database-Ee:12.2.0.1:12.2.20180725 12.2.20180725
Oracle-Database-Ee Oracle 18.3.0.0 Oracle:Oracle-Database-Ee:18.3.0.0:18.3.20181213 18.3.20181213
Oracle-Database-Se Oracle 12.1.0.2 Oracle:Oracle-Database-Se:12.1.0.2:12.1.20170220 12.1.20170220
Oracle-Database-Se Oracle 12.2.0.1 Oracle:Oracle-Database-Se:12.2.0.1:12.2.20180725 12.2.20180725
Oracle-Database-Se Oracle 18.3.0.0 Oracle:Oracle-Database-Se:18.3.0.0:18.3.20181213 18.3.20181213
Reminder: the URN value is what the "cr_orapcs.sh" script expects as the value for the "-u" switch.
Also stored within the header comments of the "cr_orapcs.sh" script...
TGorman 13may20 v0.1 written
TGorman 27jul20 v0.2 added new cmd-line parameters
TGorman 14aug20 v0.3 added SCSI fencing
TGorman 17aug20 v0.4 added pause before CRM_VERIFY on both nodes
TGorman 11sep20 v0.5 added availability set for both nodes
TGorman 10nov20 v0.6 added Azure load balancer for VIP, create 3rd VM
from which to test, and change default URN value
from Oracle12c v12.2 to Oracle19c v19.3 as well
as the default ORACLE_HOME path value
TGorman 12nov20 v0.7 contains code to disable Linux firewalld; also
has code that is commented-out to enable Linux
firewall and open necessary ports
TGorman 29jan21 v0.8 set accelerated networking TRUE at NIC creation
and change default VM instance type to
"Standard_D2s_v4"
TGorman 05apr21 v0.9 correct handling of ephemeral SSD by instance type
TGorman 26apr21 v1.0 set waagent.conf to rebuild swapfile after reboot,
set default image to 19c, and perform yum updates