Skip to content

mattwillsher/rhcs_fence

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 

Repository files navigation

Program: rhcs_fence
Author:  Madison Kelly (digimer@alteeve.com)
         Alteeve's Niche! - https://alteeve.com/w/
Date:    2012-10-26
Version: 0.2.6
License: GPL v2.0+


-=] Description:

This script is designed to be used as DRBD's 'fence-peer' handler. It ties
DRBD's fence call, using 'disk { fencing resource-and-stonith }' into Red Hat
Cluster Service's fenced daemon. This allows you to configure fencing once in
your cluster and use it for both the cluster and DRBD.

This program was based heavily on Lon Hohberger <lhh[a]redhat.com> 
"obliterate-peer.sh" script. This was created as a replacement fence handler
designed to add some features and intelligence to his script, but became a new
program in order to switch to perl.


-=] Supported Environments

This script supports two-node DRBD setups only, but the nodes themselves may be
part of a larger cluster. This script should be used when 
'fencing resource-and-stonith' is set only. The 'on <host> { }' name *must*
match the '<clusternode name="..."> as well.


-=] Limitations

As this handler insists on seeing the local disk as 'UpToDate' before it will
proceed with a fence. Thus, if you have a simultaneous and complete cluster
crash followed by the recovery of only one node, the recovered node will be in
a 'Consistent' state, which will abort the fence call. As such, this scenario
will recover human intervention to recover from.


-=] Notes:

This program takes certain steps to avoid dual-fencing;

- First, Timing:

This program will use the cluster's 'Node ID' as a base value for a delay prior
to fencing. If a node has 'Node ID: 1' (as seen with 'cman_tool status'), there
will be no delay and the fence will occur immediately. All other nodes will 
sleep for ((node_id x 2) + 5) seconds, up to a maximum of 30 seconds.

It is possible to override this behaviour by setting 'local_delay' in the head
configuration section of the script. If this is a non-0 value, the script will
pause for the defined number of seconds, ignoring the behaviour described 
above.

- Second, 'UpToDate' check;

When a fence call is made, this program checks the referenced resource minor
number's disk state to ensure it is 'UpToDate' as resported by '/proc/drbd'.
The fence call will abort if the disk state is 'Consistent' (or anything else).
This helps prevent accidentally fencing the original survivng node when the
cluster communication is up, but the storage network is not, avoiding a 
fence-loop.


-=] Failure Modes:

This program follows the "Fail Early, Fail Often" ethos. It will *only* fence
the peer if several conditions are met. Please test the functionality of this
script before going into production!

Exit codes;
 1   - Fence failed, see syslog
 7   - Fence succeeded
 255 - End of script hit, likely a program error

If you run into any trouble, please enable 'debug' mode by setting the internal
'debug' value to '1'. If you need help, please send the output of the syslog
of both nodes with debug enabled to the Linux Cluster mailing list or DRBD 
Users mailing list.


-=] Getting Help:

By email:     digimer@alteeve.com / https://alteeve.com
IRC:          #linux-cluster and #drbd on freenode.net
Mailing list: https://www.redhat.com/mailman/listinfo/linux-cluster
              http://lists.linbit.com/mailman/listinfo/drbd-user

About

Allows DRBD to fence it's peer using red hat cluster service.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Perl 100.0%