Allow brute-cleanup to work with RAC #67

Merged
merged 7 commits into from
Apr 27, 2021


mfielding
Member

brute-cleanup was originally designed for single-node Oracle Restart installs. On RAC clusters I ran into a few issues, so this PR makes a few fixes:

  • Making existing service shutdown and CRS deconfigure run only on non-RAC clusters (detected by the presence or absence of a cluster_name in the inventory file)
  • Adding a somewhat hacky RAC hard shutdown using the OHASD shutdown handler, plus rootcrs.sh. Why not use Oracle's deinstall? I'd prefer not to depend on a response file that may or may not reflect current reality, but rather do as generic a deconfigure/shutdown as possible
  • Adding TFA (Trace File Analyzer) to the kill list, as well as removing the initscript itself
  • I ran into a case where umount failed on /u01 but lsof showed nothing in use. A lazy unmount (umount -l) seems to resolve this by simply detaching the mount, and it fits the context of a "brute cleanup", especially since we're going to zero out the device later anyway. Ansible's mount module doesn't support the option, so we're reverting to shell here, with the unfortunate side effect that nonexistent mounts will generate error messages, which are ignored.
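
For illustration, the lazy unmount could be guarded so that nonexistent mounts are skipped rather than producing an ignored error. This is a minimal shell sketch, not the task in this PR: `lazy_umount` is a hypothetical helper name, and it assumes `mountpoint` from util-linux is available.

```shell
# lazy_umount: hypothetical helper, not taken from the PR.
# Detaches a mount lazily (umount -l), but only when the path is
# currently a mount point; otherwise it prints a skip message.
lazy_umount() {
    target="$1"
    if mountpoint -q "$target" 2>/dev/null; then
        # -l detaches the filesystem now and cleans up references later.
        umount -l "$target"
    else
        echo "skip: $target is not a mount point"
    fi
}
```

Run as root, `lazy_umount /u01` would detach the mount if it exists; on a host without that mount it only prints the skip message.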

Remount /dev/shm if necessary
@mfielding mfielding requested a review from jcnars April 5, 2021 20:58
A few more objects that can be left behind if CRS deconfig scripts fail for some reason: shared memory segments, semaphores, and kernel modules (ACFS, ASMFD, etc.).  In the spirit of a brute cleanup, let's yank stragglers the hard way.
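
As a rough sketch of what yanking leftover shared memory segments and semaphores could look like in shell (an illustration, not the task added in this PR): the hypothetical helper `ipc_cleanup_cmds` reads Linux `ipcs` output on stdin and prints, without running them, one `ipcrm` command per object owned by a given user. The `oracle` owner name in the examples is an assumption.

```shell
# ipc_cleanup_cmds: hypothetical helper, not taken from the PR.
# Reads `ipcs -m` or `ipcs -s` output on stdin and prints the ipcrm
# commands that would remove objects owned by one user (dry run).
#   $1: ipcrm flag (-m for shared memory, -s for semaphores)
#   $2: owner name to match (third column of Linux ipcs output)
ipc_cleanup_cmds() {
    awk -v flag="$1" -v owner="$2" \
        '$3 == owner && $2 ~ /^[0-9]+$/ { print "ipcrm " flag " " $2 }'
}

# Example dry runs (owner name "oracle" is an assumption):
#   ipcs -m | ipc_cleanup_cmds -m oracle
#   ipcs -s | ipc_cleanup_cmds -s oracle
```

Printing the commands first makes it possible to review them before piping the output to `sh` as root.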
Forcibly unloading oracleasm will impact the ability to identify ASM devices, so delay kernel module removal until disks and asmlib are already removed.
Using pkill instead of a pipeline avoids error messages when these processes don't exist, and thus avoids the need for complicated return code processing.
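
A minimal sketch of the pkill approach, assuming procps `pkill` and its documented exit codes. `kill_if_running` is a hypothetical helper name and the exact-name match (`-x`) is an assumption; the PR's actual patterns are not shown here.

```shell
# kill_if_running: hypothetical helper, not taken from the PR.
# Sends SIGKILL to every process whose name exactly matches $1.
# pkill exits 0 if at least one process was signalled and 1 if none
# matched (or none could be signalled); both count as success here.
# Higher exit codes (syntax or fatal errors) are reported as failure.
kill_if_running() {
    pkill -KILL -x "$1"
    rc=$?
    [ "$rc" -le 1 ]
}
```

Compared with a `ps | grep | xargs kill` style pipeline, a missing process is simply exit code 1, with nothing written to stderr that would need filtering.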
@jcnars
Collaborator

jcnars commented Apr 20, 2021

Tested the cleanup branch's roles/brute-ora-cleanup/tasks/main.yml file on a non-RAC host and confirm that the newly added commits successfully execute the intended functionality.

The run is documented in this gpaste (internal):
https://paste.googleplex.com/6116459345346560

This LGTM from a non-RAC standpoint.
Aiming to test this on a RAC cluster in the coming days.

@jcnars
Collaborator

jcnars commented Apr 27, 2021

Able to test run this multiple times successfully for a single-node RAC cleanup.
https://paste.googleplex.com/6236106396794880 (internal)

@mfielding mfielding merged commit 21f1984 into master Apr 27, 2021
@mfielding mfielding deleted the cleanup branch April 27, 2021 17:19