Puppet module for managing Heartbeat2 & Pacemaker


Overview

This module allows us to automagically configure HA environments using Heartbeat2 & Pacemaker. I strongly recommend reading http://clusterlabs.org/mediawiki/images/f/fb/Configuration_Explained.pdf before continuing.

You really shouldn't use this module unless you know what you're doing. Please ensure that you run puppetd with --noop after committing any changes to live setups.

If you attempt to use this module without having grokked the Configuration Explained PDF linked above and then complain about not understanding it (or worse, break a live system), I will print out a copy of said manual and give you a paper cut for each of its 136 pages. You have been warned!

Cluster Setup

ha::node { "<something>":
    authkey          => "<ha::authkey index>",
    autojoin         => "<autojoin setting (default: any)>",
    use_logd         => "<on|off (default: on)>",
    compression      => "<compression method (default: bz2)>",
    keepalive        => "<seconds (default: 1)>",
    warntime         => "<seconds (default: 6)>",
    deadtime         => "<seconds (default: 10)>",
    initdead         => "<seconds (default: 60)>",
}

ha::authkey { "<index>":
    method  => "<md5|sha1|crc>",
    key     => "<key> (not required for crc)",
    require => Ha::Node["<something>"],
}

ha::mcast { "<interface name>":
    group   => "<mcast group>",
    port    => "<mcast port> (default: 694)",
    ttl     => "<mcast ttl> (default: 1)",
    require => Ha::Node["<something>"],
}
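
As a rough illustration of how the three definitions above fit together, here is a minimal sketch for one node. Every value in it is an assumption made for the example: the node title `cluster1`, the interface `eth0`, the multicast group `239.0.0.1` and the key string are all hypothetical placeholders, and the timing values simply restate the defaults listed above.

```puppet
# Hypothetical example; none of these values come from the module itself.
ha::node { "cluster1":
    authkey   => "1",    # must match the title of the ha::authkey below
    keepalive => "1",    # seconds (same as the documented default)
    warntime  => "6",
    deadtime  => "10",
    initdead  => "60",
}

ha::authkey { "1":
    method  => "sha1",
    key     => "replace-with-a-shared-secret",  # placeholder, not a real key
    require => Ha::Node["cluster1"],
}

ha::mcast { "eth0":                # assumed interface name
    group   => "239.0.0.1",        # assumed multicast group
    port    => "694",
    ttl     => "1",
    require => Ha::Node["cluster1"],
}
```

The `authkey` value on the node and the title of the `ha::authkey` resource are presumably meant to match, which is why both are `"1"` here.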

Resource Management

There are two ways to manage resources in the ha module: the easy way, with special types customised to common situations, or the hard way, by directly manipulating Pacemaker's primitives, properties and so on. The former is much easier but requires that someone has already written an appropriate type; the latter requires far more knowledge of how Pacemaker works but lets you do anything you want.

ha_crm_property

Set a cluster-wide (crm_config) property.

ha_crm_property { "<property name>":
    value          => "<value>",
    ensure         => "(present|absent)",
    only_run_on_dc => "(true|false)",
}

Example:

ha_crm_property { "stonith-enabled":
    value  => "true",
    ensure => present,
}

Required Parameters:

  • namevar: The name of the property
  • value: The value of the property
  • ensure: Whether this property should exist in the CIB

Optional Parameters:

  • only_run_on_dc: Should Puppet only attempt to manage this resource if the node is the cluster DC (default: true)
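
A second sketch, this time with the optional parameter spelled out. `no-quorum-policy` is a standard Pacemaker cluster option, but whether `ignore` is the right value depends entirely on your cluster, so treat the value as an assumption made for the example rather than a recommendation.

```puppet
# Illustrative only: pick the value that suits your own cluster.
ha_crm_property { "no-quorum-policy":
    value          => "ignore",
    ensure         => present,
    only_run_on_dc => "true",   # the documented default, shown explicitly
}
```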

ha_crm_primitive

Create a primitive (resource). In almost all cases, this resource will require additional parameters (ha_crm_parameter) in order to function correctly.

ha_crm_primitive { "<primitive name>":
    type                => "<class>:<provider>:<type>",
    ensure              => "(present|absent)",
    only_run_on_dc      => "(true|false)",
    priority            => "<integer>",
    target_role         => "(stopped|started|master)",
    is_managed          => "(true|false)",
    resource_stickiness => "<integer>",
    migration_threshold => "<integer>",
    failure_timeout     => "<integer>",
    multiple_active     => "(block|stop_only|stop_start)",
}

Example:

ha_crm_primitive { "fs_mysql":
    type   => "ocf:heartbeat:Filesystem",
    ensure => present,
}

Required Parameters:

  • namevar: The name of the primitive (used as a reference for most other ha:: types)
  • type: The primitive class (will almost always start with ocf: or lsb:)
  • ensure: Whether this primitive should exist in the CIB

Optional Parameters:

  • only_run_on_dc: Should Puppet only attempt to manage this resource if the node is the cluster DC (default: true)
  • priority: The priority of the resource
  • target_role: What state should the cluster attempt to keep this resource in?
  • is_managed: Is the cluster allowed to start and stop the resource?
  • resource_stickiness: How much does the resource prefer to stay where it is?
  • migration_threshold: How many failures may occur for this resource on a node before that node becomes ineligible to host it?
  • failure_timeout: How many seconds to wait before acting as if the failure had not occurred
  • multiple_active: What should the cluster do if it ever finds the resource active on more than one node?
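
To show how the optional parameters combine with the required ones, here is a sketch that extends the `fs_mysql` example above. The numeric values are assumptions picked purely for illustration; the string values are taken from the options listed in the template.

```puppet
# Illustrative values only; tune them for your own cluster.
ha_crm_primitive { "fs_mysql":
    type                => "ocf:heartbeat:Filesystem",
    ensure              => present,
    target_role         => "started",
    is_managed          => "true",
    resource_stickiness => "100",   # assumed: prefer to stay on the current node
    migration_threshold => "3",     # assumed: node becomes ineligible after 3 failures
    failure_timeout     => "60",    # assumed: forget a failure after 60 seconds
    multiple_active     => "stop_start",
}
```

As noted above, a primitive like this will in almost all cases still need its additional parameters set before it does anything useful.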